Systems Reliability Engineer (SRE) - Edge

Posted 4 Hours Ago
Be an Early Applicant
Hiring Remotely in Austin, TX
Remote
Hybrid
3-5 Years Experience
Artificial Intelligence • Information Technology • Security • Cybersecurity
The Role
The Systems Reliability Engineer will build and operate the Edge platform globally, focusing on automation and operational excellence. Responsibilities include improving service availability, utilizing monitoring tools, enhancing platform capabilities, and ensuring system performance. Candidates should have experience in SRE roles, knowledge of networking, and proficiency in either Go or Python.
Summary Generated by Built In

About the Role
We are looking for talented Systems Reliability Engineers to build and operate our Edge platform running in more than 320 cities in over 120 countries. Our SREs come from diverse technical backgrounds and have built up their knowledge working in different environments, but common factors across all of our reliability-focused engineers include a passion for automation, scalability, and operational excellence. We support our services in a "follow the sun" model with offices in East Asia, Europe and North America.
This is a superb opportunity to join a high-performing team and scale our high-growth network as Cloudflare's business grows. We live at the boundary between systems, network, and software, and love improving the glue that holds them together. Working with us, you will build tools to constantly improve service availability, performance, and operational velocity. You will nurture a passion for an "automate everything" approach that makes systems failure resistant and ready to scale.
SREs focus on the immediate state and functionality of the Cloudflare platform around the world, leveraging an array of monitoring, alerting and diagnostics tools while developing and enhancing the Cloudflare platform and its capabilities. We own a wide portfolio of applications and services, running a tight feedback loop of developer and operator patterns. The ideal SRE candidate has a passionate curiosity about how the Internet fundamentally works and has a strong knowledge of networking, Linux and TLS along with coding ability in Go or Python.
Requisite Skills

  • Aptitude for identifying problems, owning them and working with others to solve them
  • Linux systems experience
  • 3 years experience in an SRE role or a role with similar functions
  • Software development skills in some programming language such as Go or Python
  • Understanding of distributed software systems and large scale system design tradeoffs
  • Intermediate experience of common network protocols like DNS and HTTP


Examples of desirable skills, knowledge and experience

  • Experience with the Linux kernel and Linux software packaging
  • Performance analysis and debugging
  • Configuration management systems such as Saltstack, Chef, Puppet or Ansible
  • Load balancing and reverse proxies such as Nginx, Varnish, HAProxy, Squid or Apache
  • SQL databases
  • Time series databases such as OpenTSDB, Graphite, Prometheus or Grafana
  • Key/Value stores
  • Internetworking and BGP


Bonus Points

  • Experience with continuous / rapid release engineering
  • Strong tooling and automation development experience
  • Experience working in a 24/7/365 service environment
  • Experience working with large scale production distributed systems
  • A history of contributing to Open Source Software


Some tools that we use

  • Nginx
  • PostgreSQL
  • Docker
  • Prometheus
  • Grafana
  • Consul
  • Nomad
  • Temporal
  • Salt

Top Skills

Go
Python
The Company
Austin, TX
3,300 Employees
Hybrid Workplace
Year Founded: 2009

What We Do

Cloudflare protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare have all traffic routed through its intelligent global network, which gets smarter with each new site added.

Gallery

Gallery
Gallery
Gallery
Gallery
Gallery

Cloudflare Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Not Specified
Austin, TX

Sign up now Access later

Create Free Account

Please log in or sign up to report this job.

Create Free Account