Fortive

Site Reliability Engineer

Posted 13 Days Ago
In-Office or Remote
Hiring Remotely in India
Mid level
Ensure availability and performance of customer-facing platform by provisioning and maintaining Windows/Linux/Kubernetes infrastructure, implementing IaC (Terraform/ARM/CloudFormation), automating workflows, monitoring with observability tools, managing backups and DR, analyzing logs and metrics, and promoting security and ITIL best practices across DevOps, DBA, and development teams.
The summary above was generated by AI

Job Summary 

As a Site Reliability Engineer, you will play a critical role in ensuring the availability and performance of our customer-facing platform. You will work closely with DevOps, DBA, and Development teams to provision and maintain infrastructure, deploy and monitor our applications, and automate workflows. Your contributions will have a direct impact on customer satisfaction and overall user experience. 


Responsibilities and Deliverables 

  • Manage, monitor, and maintain highly available systems (Windows and Linux).
  • Analyze metrics and trends to ensure performance and rapid scalability. 
  • Address routine service requests while identifying ways to automate and simplify. 
  • Create infrastructure as code using Terraform, ARM templates, or CloudFormation.
  • Maintain data backups and disaster recovery plans. 
  • Adhere to security best practices through all stages of the software development lifecycle.
  • Follow and champion ITIL best practices and standards. 
     

Organizational Alignment

  • Reports to the Senior SRE Manager.
  • This role involves close collaboration with DevOps, DBA, and security teams. 
     

Technical Proficiencies 

  • Hands-on experience with AWS is a must-have. 
  • Proficiency in analyzing application, IIS, system, and security logs, as well as CloudTrail events.
  • Experience with CI/CD tools such as Jenkins and GitHub Actions.
  • Experience maintaining and administering Windows, Linux, and Kubernetes. 
  • Experience in automation using scripting languages such as PowerShell, Bash, or Python. 
  • Good understanding of networking concepts (VPC, subnets, PrivateLink, peering).
  • Familiarity with configuration management using Ansible, Azure Automation or similar. 
  • Familiarity with observability tools such as New Relic, AppDynamics, or Datadog.

 
Experience 

  • 3+ years of experience in an SRE or system administration role.
  • Demonstrated ability to build and support highly available Windows/Linux servers.
  • 2+ years of experience working with cloud technologies, including AWS and Azure.
  • Comfortable using Scrum, Kanban, or Lean methodologies. 

Education 

  • Bachelor’s Degree or College Diploma in Computer Science, Information Systems, or equivalent experience. 
