Genesys

Operations Reliability Engineer – Cloud Infrastructure

In-Office
Chennai, Tamil Nadu, IND
Mid level
Support reliability and operational health of AWS-centric cloud and Windows/Linux environments: troubleshoot incidents, validate remediation and patching, improve monitoring and alerting, assist IaC (Terraform) validation, and collaborate with Cloud, Security, IAM, Network, and ServiceNow teams to reduce noise and automate remediation.
The summary above was generated by AI

Genesys empowers organizations of all sizes to improve loyalty and business outcomes by creating the best experiences for their customers and employees. Through Genesys Cloud, the AI-powered Experience Orchestration platform, organizations can accelerate growth by delivering empathetic, personalized experiences at scale to drive customer loyalty, workforce engagement, efficiency and operational improvements.

We employ more than 6,000 people across the globe who embrace empathy and cultivate collaboration to succeed. While we offer benefits and perks comparable to those of larger tech companies, our employees have the independence to make a larger impact on the company and take ownership of their work. Join the team and help create the future of customer experience.

Location: India 
Level: P2 – Professional Track 
Timezone: 2nd shift (local time) 

Overview 

As an Operations Reliability Engineer with a specialization in Cloud Infrastructure, you will support the reliability, stability, and operational health of enterprise cloud and compute environments. This role focuses primarily on AWS infrastructure, with supporting responsibility for Azure and Windows-based systems. 

You will assist in incident detection, troubleshooting, patching, vulnerability remediation, and operational monitoring across AWS services, virtual machines, and Windows/Linux operating systems. In addition to hands-on support, you will contribute to improving monitoring accuracy, reducing alert noise, and enhancing the quality of operational data used across observability and AIOps platforms. 

This role blends cloud operations with reliability practices, including event correlation, automated remediation validation, and participation in early automation efforts. You will collaborate across Cloud Engineering, Security, IAM, and Network teams to strengthen operational maturity and contribute to long-term automation and reliability initiatives. 

Responsibilities 

General Reliability Operations 

  • Resolve cloud and OS-related incidents through hands-on troubleshooting, escalating complex issues to senior reliability engineers or cloud architects as needed. 

  • Monitor cloud observability and event management platforms to identify anomalies, performance degradation, and emerging risks. 

  • Perform incident triage and basic event correlation to determine probable cause and appropriate remediation. 

  • Validate automated remediation workflows and identify recurring manual steps that may be candidates for future automation. 

  • Participate in documenting runbooks, troubleshooting steps, and operational patterns used by automation and engineering teams. 

  • Execute manual and automated patching and vulnerability remediation workflows, troubleshooting failed updates and validating remediation outcomes. 

  • Suggest improvements to alert thresholds, noise-reduction logic, and correlation rules based on observed patterns. 

  • Contribute operational data and timelines to post-incident reviews. 

  • Collaborate with Cloud Engineering, Security, IAM, Network, and ServiceNow teams to support continuous operational improvement. 

  • Ensure event, alert, and configuration data aligns with monitoring, governance, and CMDB standards. 

Cloud & Windows Infrastructure Responsibilities 

  • Troubleshoot AWS operational issues, including EC2 performance, storage behavior, network connectivity, IAM policy misconfigurations, and service-level degradation. 

  • Perform OS-level diagnostics and remediation primarily on Windows Server environments, with supporting responsibilities across Linux systems. 

  • Analyze telemetry from AWS CloudWatch, OS logs, and vulnerability management tools to identify trends and underlying issues. 

  • Support patch management processes with a primary focus on Windows Server environments, while assisting with Linux patching and lifecycle management, ensuring compliance and remediation validation. 

  • Assist with tagging compliance checks, IAM access troubleshooting, backup validation, and account hygiene activities. 

  • Validate cloud configuration updates, patching workflows, and basic resilience or recovery tests with guidance from senior engineers. 

  • Contribute to infrastructure-as-code (Terraform) improvements by validating modules, testing updates, and providing operational feedback. 

  • Participate in cloud readiness activities for new services, infrastructure changes, or disaster recovery exercises. 

  • Provide troubleshooting guidance and share knowledge with junior team members or adjacent support teams, particularly on Windows Server administration best practices. 

Requirements 

  • Bachelor’s degree in IT or related field, or equivalent experience. 

  • 3+ years of experience in cloud operations, systems administration, or infrastructure support roles. 

  • Hands-on experience with AWS services (EC2, VPC, IAM, EBS, monitoring). 

  • Familiarity with Azure cloud environments is a plus. 

  • Working knowledge of Windows Server administration is a must; familiarity with Linux systems is a plus. 

  • Experience with cloud monitoring tools, OS log analysis, patching workflows, and vulnerability remediation processes. 

  • Understanding of cloud governance basics including tagging standards, IAM access principles, backups, and cost awareness. 

  • Ability to make light scripting updates (PowerShell, Python, YAML/JSON) to support operational tasks. 

  • Solid troubleshooting and analytical skills with the ability to interpret telemetry and system logs. 

  • Effective communication skills to collaborate across engineering and operations teams. 

  • Motivation to develop deeper skills in automation, AIOps, infrastructure-as-code, and cloud reliability engineering. 

Additional Information 

  • Working Hours: This role follows second-shift local hours (3:00 PM – 12:00 AM IST) to provide operational overlap with US-based teams. 

  • On-Call Support: Participation in a shared, rotational on-call schedule is required. 

If a Genesys employee referred you, please use the link they sent you to apply.

About Genesys:

Genesys® empowers more than 8,000 organizations worldwide to create the best customer and employee experiences. With agentic AI at its core, Genesys Cloud™ is the AI-Powered Experience Orchestration platform that connects people, systems, data and AI across the enterprise. As a result, organizations can drive customer loyalty, growth and retention while increasing operational efficiency and teamwork across human and AI workforces. To learn more, visit www.genesys.com.

Reasonable Accommodations:

If you require a reasonable accommodation to complete any part of the application process, or are limited in your ability to access or use this online application and need an alternative method for applying, you or someone you know may contact us at [email protected].

You can expect a response within 24–48 hours. To help us provide the best support, click the email link above to open a pre-filled message and complete the requested information before sending. If you have any questions, please include them in your email.

This email is intended to support job seekers requesting accommodations. Messages unrelated to accommodation—such as application follow-ups or resume submissions—may not receive a response.

Genesys is an equal opportunity employer committed to fairness in the workplace. We evaluate qualified applicants without regard to race, color, age, religion, sex, sexual orientation, gender identity or expression, marital status, domestic partner status, national origin, genetics, disability, military and veteran status, and other protected characteristics.

Please note that recruiters will never ask for sensitive personal or financial information during the application phase.

Top Skills

AIOps
AWS
Azure
CloudWatch
CMDB
EBS
EC2
IAM
JSON
Linux
Observability
PowerShell
Python
ServiceNow
Terraform
VPC
Vulnerability Management Tools
Windows Server
YAML

Genesys Chennai, Tamil Nadu, IND Office

Park, Block C, 7th Floor, Plot No. 40, M.G.R Salai, Perungudi, Chennai, Tamil Nadu, India, 600 096
