Join a talented team as a Systems Reliability Engineer to enhance the Cloudflare platform's availability and performance using automation and monitoring tools.
Available Locations: Austin
About the Role
We are looking for talented Systems Reliability Engineers to build and operate our Edge platform running in more than 320 cities in over 120 countries. Our SREs come from diverse technical backgrounds and have built up their knowledge working in different environments, but common factors across all of our reliability-focused engineers include a passion for automation, scalability, and operational excellence. We support our services in a "follow the sun" model with offices in East Asia, Europe and North America.
This is a superb opportunity to join a high-performing team and scale our high-growth network as Cloudflare's business grows. We live at the boundary between systems, network, and software, and love improving the glue that holds them together. Working with us, you will build tools to constantly improve service availability, performance, and operational velocity. You will nurture a passion for an "automate everything" approach that makes systems failure-resistant and ready to scale.
SREs focus on the immediate state and functionality of the Cloudflare platform around the world, leveraging an array of monitoring, alerting and diagnostics tools while developing and enhancing the Cloudflare platform and its capabilities. We own a wide portfolio of applications and services, running a tight feedback loop of developer and operator patterns. The ideal SRE candidate has a passionate curiosity about how the Internet fundamentally works and has a strong knowledge of networking, Linux and TLS along with coding ability in Go, Rust, or Python.
Requisite Skills
- Aptitude for identifying problems, owning them and working with others to solve them
- Linux systems experience
- 3 years of experience in an SRE role or a role with similar functions
- Software development skills in some programming language such as Go, Rust, or Python
- Understanding of distributed software systems and large-scale system design tradeoffs
- Intermediate experience of common network protocols like DNS and HTTP
- Experience with the Linux kernel and Linux software packaging
- Performance analysis and debugging
- Configuration management systems such as Saltstack, Chef, Puppet or Ansible
- Workflow automation systems such as Temporal or Apache Airflow
- Load balancing and reverse proxies such as Nginx, Varnish, HAProxy, Squid or Apache
- SQL databases
- Time series databases such as OpenTSDB, Graphite, Prometheus or Grafana
- Key/Value stores
- Internetworking and BGP
- Experience with continuous / rapid release engineering
- Strong tooling and automation development experience
- Experience working in a 24/7/365 service environment
- Experience working with large-scale production distributed systems
- A history of contributing to Open Source Software
- Nginx
- PostgreSQL
- Docker
- Prometheus
- Grafana
- Consul
- Nomad
- Temporal
- Salt
Top Skills
Ansible
Apache Airflow
Chef
Consul
Docker
Go
Grafana
Linux
Nginx
Nomad
Postgres
Prometheus
Puppet
Python
Rust
Saltstack
SQL
Temporal