Harris healthcare

Sr Site Reliability Engineer

Reposted 20 Days Ago
Remote
Hiring Remotely in India
Senior level
The Sr Site Reliability Engineer will automate infrastructure provisioning, manage CI/CD pipelines, optimize monitoring solutions, and ensure service reliability, collaborating with U.S. SREs on incident response and operational readiness.

Business Unit:

STChealth focuses on vaccine intelligence and immunization data management, connecting public and private healthcare sources to deliver real-time immunization information.

Its platform is used at thousands of locations, and the company emphasizes data integrity, real-time analytics, and better decision-making in public health. Headquarters: Phoenix, Arizona (US).

Job Summary:

The Site Reliability Engineer (SRE) supports a U.S. public health SaaS platform that processes protected health information (PHI) under HIPAA. The role emphasizes automation, monitoring, and reliability engineering for regulated environments. The SRE will partner closely with U.S.-based teams to enhance observability, CI/CD automation, and operational maturity in non-production and staging systems, while maintaining compliance with HIPAA, SOC2, and corporate data protection standards.

Core Responsibilities

- Automate infrastructure provisioning, configuration, and maintenance using Terraform, Ansible, and Python.
- Build, enhance, and maintain CI/CD pipelines using Jenkins, GitHub Actions, or AWS CodePipeline for continuous delivery and consistency across environments.
- Implement and optimize monitoring solutions using Datadog, Prometheus, Grafana, and ELK/EFK stacks to ensure high service reliability.
- Develop alerting strategies and escalation paths aligned to service-level objectives (SLOs) and key performance indicators (KPIs).
- Build custom scripts and automation for patching, validation, and system health checks.
- Partner with U.S. SREs and Engineering teams on environment management, change control, and incident response improvements.
- Analyze logs and performance metrics to identify stability issues, optimize cloud costs, and drive continuous improvement.
- Maintain detailed runbooks, SOPs, and documentation supporting operational readiness and knowledge transfer.
- Contribute to open-source or internal tooling that enhances automation, monitoring, or observability capabilities.
- Conduct periodic reliability reviews, performance tests, and failover simulations to validate readiness.
- Support adoption of infrastructure-as-code, immutable environments, and container orchestration (Docker/Kubernetes).
- Promote DevOps and SRE best practices across the engineering organization.

Tools & Technologies

AWS (EC2, S3, Lambda, CloudWatch, IAM, RDS, ECS/EKS), Terraform, Ansible, Python, Bash, Jenkins, GitHub Actions, Docker, Kubernetes, Prometheus, Grafana, ELK/EFK, Loki, Jira, Confluence.

Qualifications

- 5–7 years in SRE, DevOps, or Infrastructure Engineering.
- Bachelor’s degree in computer science or a related field preferred, or equivalent experience.
- Experience supporting U.S. healthcare or other regulated SaaS systems (HIPAA, SOC2, ISO27001).
- Strong scripting and automation (Ansible, Jenkins, Python, Bash, Terraform, CloudFormation).
- Understanding of CI/CD, networking, and secure cloud architecture.
- Proven collaboration with U.S. teams across time zones; clear written and spoken English.
- Familiarity with EHR, HL7/FHIR, or state/federal public health systems preferred.
- Knowledge of data privacy frameworks (HIPAA, HITRUST, GDPR) and ITIL-based change/incident management.

Work Model

- Aligns with U.S. Eastern hours for daily collaboration, stand-ups, and sprint planning.
- Documents work thoroughly to ensure audit readiness and operational transparency.
- Works closely with U.S. SRE leadership on automation priorities, sprint goals, and production readiness activities.

Soft Skills

- Analytical problem-solver with attention to detail.
- Self-driven, collaborative, and process-oriented.
- Excellent communication and time management across distributed teams.
- Passionate about automation, reliability, and continuous improvement.

Example Contributions

- Automated patching pipeline for pre-production validation of security updates.
- Designed Grafana dashboards that reduced alert noise by 40%.
- Built Python scripts that automated AWS cleanup, saving 15% of cloud spend.
- Implemented environment consistency checks that improved deployment success rates.
- Introduced CI/CD optimizations that reduced release time by 25%.

Work Mode: Remote

Shift Timings: 6:30pm to 3:30am IST
Location: Mumbai – Remote

Benefits:

  • Annual Public Holidays as applicable
  • 30 days total leave per calendar year
  • Mediclaim policy
  • Lifestyle Rewards Program
  • Group Term Life Insurance
  • Gratuity
  • ...and more!

Top Skills

Ansible
AWS
Bash
Confluence
Docker
ELK/EFK
GitHub Actions
Grafana
Jenkins
Jira
Kubernetes
Loki
Prometheus
Python
Terraform
