Site Reliability Engineer

Fortive

Posted 13 Days Ago

In-Office or Remote

Hiring Remotely in India

Mid level

Ensure availability and performance of customer-facing platform by provisioning and maintaining Windows/Linux/Kubernetes infrastructure, implementing IaC (Terraform/ARM/CloudFormation), automating workflows, monitoring with observability tools, managing backups and DR, analyzing logs and metrics, and promoting security and ITIL best practices across DevOps, DBA, and development teams.

Job Summary

As a Site Reliability Engineer, you will play a critical role in ensuring the availability and performance of our customer-facing platform. You will work closely with DevOps, DBA, and Development teams to provision and maintain infrastructure, deploy and monitor our applications, and automate workflows. Your contributions will have a direct impact on customer satisfaction and overall user experience.

Responsibilities and Deliverables

Manage, monitor, and maintain highly available systems (Windows and Linux).

Analyze metrics and trends to ensure performance and rapid scalability.

Address routine service requests while identifying ways to automate and simplify.

Create infrastructure as code using Terraform, ARM Templates, or CloudFormation.

Maintain data backups and disaster recovery plans.

Adhere to security best practices through all stages of the software development lifecycle.

Follow and champion ITIL best practices and standards.

Organizational Alignment

Reports to the Senior SRE Manager.

This role involves close collaboration with DevOps, DBA, and security teams.

Technical Proficiencies

Hands-on experience with AWS is a must-have.

Proficiency in analyzing application, IIS, system, and security logs, as well as CloudTrail events.

Experience with CI/CD tools such as Jenkins and GitHub Actions.

Experience maintaining and administering Windows, Linux, and Kubernetes.

Experience in automation using scripting languages such as PowerShell, Bash, or Python.

Good understanding of networking concepts (VPC, subnet, private link, peering).

Familiarity with configuration management using Ansible, Azure Automation, or similar.

Familiarity with observability tools such as New Relic, AppDynamics, or Datadog.

Experience

3+ years of experience in an SRE or System Administration role.

Demonstrated ability to build and support high-availability Windows/Linux servers.

2+ years of experience working with cloud technologies, including AWS and Azure.

Comfortable using Scrum, Kanban, or Lean methodologies.

Education

Bachelor’s Degree or College Diploma in Computer Science, Information Systems, or equivalent experience.
