Harman

Senior Site Reliability Engineer - SaaS Cloud Products

Hyderabad, Telangana

Senior level

Design, implement, and maintain scalable cloud-native architectures for SaaS products, leading incident management efforts, optimizing system performance, creating observability solutions, and automating infrastructure through IaC. Stay updated with industry trends and develop internal tools for enhanced SRE practices.

HARMAN’s engineers and designers are creative, purposeful and agile. As part of this team, you’ll combine your technical expertise with innovative ideas to help drive cutting-edge solutions in the car, enterprise and connected ecosystem. Every day, you will push the boundaries of creative design, and HARMAN is committed to providing you with the opportunities, innovative technologies and resources to build a successful career.

A Career at HARMAN

As a technology leader that is rapidly on the move, HARMAN is filled with people who are focused on making life better. Innovation, inclusivity and teamwork are a part of our DNA. When you add that to the challenges we take on and solve together, you’ll discover that at HARMAN you can grow, make a difference and be proud of the work you do every day.

## Key Responsibilities

### Reliability and Performance Management

- Design, implement, and maintain highly available, scalable, and resilient cloud-native architectures for mission-critical SaaS products.
- Develop and implement SLOs, SLIs, and SLAs to measure and improve service reliability.
- Continuously optimize system performance and resource utilization across multiple cloud platforms.
- Fine-tune application performance by analyzing code, traces, and database queries.

### Incident Management and Troubleshooting

- Lead incident response efforts, effectively troubleshooting complex issues to minimize downtime and impact.
- Reduce Mean Time to Recover (MTTR) through proactive monitoring, automated alerting, and efficient problem-solving techniques.
- Conduct thorough Root Cause Analysis (RCA) for all major incidents and implement preventive measures.

### Observability and Monitoring

- Design and implement end-to-end observability solutions across our distributed systems.
- Develop and maintain comprehensive monitoring strategies using tools such as the ELK Stack, Prometheus, and Grafana.
- Create and optimize product status dashboards to provide real-time visibility into system health and performance.

### Automation and Infrastructure as Code (IaC)

- Implement Infrastructure as Code practices using tools like Terraform.
- Develop and maintain automated deployment pipelines and CI/CD workflows.
- Create self-healing systems and automate routine operational tasks to reduce manual intervention.

### Cloud-Agnostic Architecture

- Design and implement cloud-agnostic solutions that can operate efficiently across multiple cloud providers.
- Develop expertise in event-driven architectures and related technologies (e.g., Apache Kafka/Azure Event Hubs, Redis, MongoDB Atlas, Azure IoT Hub).
- Implement and manage containerized applications using Kubernetes across different cloud environments.

### Continuous Improvement

- Regularly review and refine operational practices to enhance efficiency and reliability.
- Stay updated with the latest industry trends and technologies in SRE, cloud computing, and DevOps.
- Contribute to the development of internal tools and frameworks to support SRE practices.

## Requirements

- Strong knowledge of cloud platforms (Azure) and their associated services.
- Expertise in observability tools (ELK Stack, Dynatrace, Prometheus).
- Expertise in containerization technologies such as Docker and Kubernetes.
- Understanding of event-driven architecture and database technologies (MongoDB Atlas, Azure SQL, PostgreSQL).
- Proficiency in IaC and CI/CD tools such as Terraform and GitHub Actions.
- Proficiency in one or more programming languages: Python, .NET, or Java.
- Strong understanding of networking concepts, load balancing, and security practices.

HARMAN is proud to be an Equal Opportunity / Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

Top Skills: .NET, Java, Python
