Site Reliability Engineering Manager

Posted 4 Days Ago
Chennai, Tamil Nadu
Senior level
Healthtech • Information Technology • Telehealth
Creating healthier futures for us all.
The Role
As a Site Reliability Engineering Manager, you will lead a team in ensuring the reliability and performance of critical systems. Responsibilities include mentoring engineers, defining service objectives, incident management, and system automation to support scalable and secure technology. Collaboration with cross-functional teams is crucial for aligning priorities and driving operational excellence.
Summary Generated by Built In

Join us as we work to create a thriving ecosystem that delivers accessible, high-quality, and sustainable healthcare for all.

We are looking for a Site Reliability Engineering Manager to lead our Cloud Infrastructure Engineering division. Cloud Infrastructure Engineering ensures the continuous availability of the technologies and systems that are the foundation of athenahealth’s services. We are directly responsible for thousands of servers, petabytes of storage, and handling thousands of web requests per second, all while sustaining growth at a meteoric rate. We enable an operating system for the medical office that abstracts away administrative complexity, leaving doctors free to practice medicine.

But enough about us; let’s talk about you!

As an SRE Manager, you will lead a team of Site Reliability Engineers responsible for ensuring the reliability, availability, and performance of mission-critical systems. You will manage a team that builds, automates, and maintains the infrastructure and systems that support scalable, secure, and high-performing services. In addition to managing technical operations, you will play a key role in optimizing systems, reducing downtime, and improving incident response procedures. The ideal candidate will be an experienced manager who can balance technical leadership with team development and operational efficiency.

The Team:

We are a group of Site Reliability Engineers who are passionate about reliability, automation, and scalability. We use an agile-based framework to execute our work, ensuring we are always focused on the most important and impactful needs of the business. We support systems and platforms in a hybrid cloud and make data-driven decisions about which environment best suits the needs of the business. We are relentless in automating away manual, repetitive work so we can focus on projects that help move the business forward.

Job Responsibilities

Team Leadership & Development

· Lead and mentor a team of SREs, providing guidance, coaching, and support to foster growth and career development.

· Build and grow a high-performing team focused on operational excellence, reliability, and scalability.

· Establish and maintain a strong team culture of collaboration, accountability, and continuous improvement.

· Work with cross-functional teams (Engineering, Product and Project Management) to align priorities and build effective working relationships.

Service Reliability & Performance

· Define and track Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs) for critical systems.

· Monitor and improve the reliability, availability, and performance of all production services and infrastructure.

· Own and drive efforts to improve incident management, root cause analysis, and postmortem documentation.

· Implement proactive monitoring, alerting, and incident response strategies.

System Automation & Scalability

· Lead efforts to automate and streamline operational processes, reduce manual toil, and improve system reliability.

· Identify and implement best practices for system design, capacity planning, and cost optimization.

· Work closely with engineering teams to build scalable, resilient, and efficient systems that can handle increasing load.

Collaboration & Cross-functional Engagement

· Collaborate with Engineering & Product teams to ensure reliability is baked into the development process, including reviewing code, design, and deployment practices.

· Advocate for reliability improvements across the engineering and product teams, ensuring a balance between speed and reliability.

· Work with other engineering managers to align on long-term goals, technical debt, and infrastructure investments.

Process & Efficiency Improvement

· Drive continuous improvements in incident management, deployment pipelines, and system observability.

· Champion the adoption of tools and processes that improve automation, monitoring, alerting, and reporting.

· Measure and track key operational metrics, using data to inform decision-making and drive improvements.

Qualifications

· At least 8 years of experience building, scaling, and supporting highly available systems and services.

· Around 3-4 years of experience managing and leading technical teams, including mentoring engineers and fostering team development.

· Strong experience with enterprise-grade middleware, e.g. web servers (Apache) and load balancers (NetScaler) hosted on a virtual machine cluster.

· Strong expertise in configuration management tools like Puppet.

· Experience with Infrastructure-as-Code, Linux, VMware, and API integration. Familiarity with Terraform desired.

· Proficiency in at least one scripting or programming language (Ansible, Python, Go, Ruby, etc.).

· Expertise in the delivery, maintenance, and support of Linux systems and infrastructure

· Experience with cloud platforms (AWS), containerization (Docker), and orchestration (Kubernetes).

· Familiarity with observability tools (e.g., Prometheus, Grafana, ELK stack, CloudWatch, Splunk)

· Experience implementing solutions using SRE and DevOps principles.

· Familiarity with telemetry and the latest monitoring and visualization tools.

· Expertise in promoting and driving system visibility to aid in the rapid detection and resolution of issues

· Bachelor's or master's degree in Computer Science, Engineering, or a related field.

· Experience in industries with high uptime requirements (e.g., financial services, healthcare, SaaS)

Behaviors & Abilities Required:

· Results-oriented: A strong focus on achieving operational excellence, meeting SLAs, and driving results.

· Proactive: Able to anticipate and address challenges before they become critical issues.

· Collaborative: Works well across teams and with multiple stakeholders.

· Innovative: Seeks opportunities to improve and optimize systems and processes.

About athenahealth

Here’s our vision: To create a thriving ecosystem that delivers accessible, high-quality, and sustainable healthcare for all. 

What’s unique about our locations? 
From a historic 19th-century arsenal to a converted landmark power plant, all of athenahealth’s offices were carefully chosen to represent our innovative spirit and promote the most positive and productive work environment for our teams. Our 10 offices across the United States and India, plus numerous remote employees, all work to modernize the healthcare experience, together.
 
Our company culture might be our best feature. 
We don't take ourselves too seriously. But our work? That’s another story. athenahealth develops and implements products and services that support US healthcare: It’s our chance to create healthier futures for ourselves, for our family and friends, for everyone.  

 

Our vibrant and talented employees — or athenistas, as we call ourselves — spark the innovation and passion needed to accomplish our goal. We continue to expand our workforce with amazing people who bring diverse backgrounds, experiences, and perspectives at every level, and foster an environment where every athenista feels comfortable bringing their best selves to work. 

 

Our size makes a difference, too: We are small enough that your individual contributions will stand out — but large enough to grow your career with our resources and established business stability. 
 
Giving back is integral to our culture. Our athenaGives platform strives to support food security, expand access to high-quality healthcare for all, and support STEM education to develop providers and technologists who will provide access to high-quality healthcare for all in the future. As part of the evolution of athenahealth’s Corporate Social Responsibility (CSR) program, we’ve selected nonprofit partners that align with our purpose and let us foster long-term partnerships for charitable giving, employee volunteerism, insight sharing, collaboration, and cross-team engagement. 

 

What can we do for you? 
Along with health and financial benefits, athenistas enjoy perks specific to each location, including commuter support, employee assistance programs, tuition assistance, employee resource groups, and collaborative workspaces — some offices even welcome dogs.  

 

In addition to our traditional benefits and perks, we sponsor events throughout the year, including book clubs, external speakers, and hackathons. And we provide athenistas with a company culture based on learning, the support of an engaged team, and an inclusive environment where all employees are valued.

 

We also encourage a better work-life balance for athenistas with our flexibility. While we know in-office collaboration is critical to our vision, we recognize that not all work needs to be done within an office environment, full-time. With consistent communication and digital collaboration tools, athenahealth enables employees to find a balance that feels fulfilling and productive for each individual situation. 

Top Skills

Cloud Infrastructure
Site Reliability Engineering
The Company
Chennai, Tamil Nadu
6,600 Employees
Hybrid Workplace
Year Founded: 1997

What We Do

At athenahealth, it’s our vision to create a thriving ecosystem that delivers accessible, high-quality, and sustainable healthcare for all. With a thoughtful balance of humanity and technology, we’re able to uncover meaningful healthcare insights that can help create healthier futures for our families, our communities, and ourselves.

athenahealth partners with healthcare organizations across the care continuum to drive clinical and financial results. Our expert teams build modern technology on an open, connected ecosystem, yielding insights that make a difference for our customers and their patients. We offer medical record, revenue cycle, patient engagement, care coordination, and population health services. We combine insights from our network of more than 160,000 providers and approximately 117 million patients with deep industry knowledge and perform administrative work at scale.

Our cloud-based and on-premises solutions deliver measurable financial and clinical results for healthcare organizations of all shapes and sizes. That’s why our top-performing clients are beating industry benchmarks across the board.

For more information, please visit www.athenahealth.com

Why Work With Us

Our vibrant and talented employees spark the innovation and passion needed to accomplish our goals. We continue to expand our workforce with amazing people who bring diverse backgrounds, experiences, and perspectives at every level, and foster an environment where we each feel comfortable bringing our best selves to work.
