Citi

Senior Incident Optimization Specialist – CPU, Memory, Unix, Linux, OS agents, Windows

Reposted Yesterday
In-Office
Chennai, Tamil Nadu, IND
Senior level
The Optimization Lead oversees efficiency frameworks in production operations, focusing on AI-enabled optimizations, analytics, operational workflows, and leadership, ensuring actionable insights and continuous improvement across data handling.

Position Summary

The Senior Incident Optimization Specialist is a specialized technical leadership role. It requires deep expertise in the technology foundations of core infrastructure services across platforms and devices (CPU, memory, Unix, Linux, OS agents, and Windows) that enable the enterprise's business and product applications to function.

This position is critical to the success of the Incident Reduction Program, delivering solutions that optimize and automate operations workflows.

You will be responsible for building automated incident remediation workflows and achieving measurable incident reduction through intelligent alert optimization, correlation, and automation while preserving the critical observability required for business-critical mainframe applications and batch processing. This role offers the unique opportunity to modernize event management for legacy systems using cutting-edge AIOps platforms and automation technologies.

Key Responsibilities

  • Incident & Alert Analysis: Conduct in-depth analysis of mainframe and batch processing alerts to identify chronic issues, reduce operational noise, and develop strategies to address high-volume incident generators, including recurring job failures.
  • Intelligent Event Management: Design and implement domain-specific correlation, de-duplication, and suppression rules on AIOps and event management platforms. Develop logic that understands mainframe subsystem relationships and cascading batch job dependencies.
  • Automation & Self-Healing: Architect and develop automation playbooks for incident data enrichment, automated job restarts, and self-healing capabilities for common mainframe and batch processing failures.
  • Observability Enhancement: Assess monitoring gaps in mainframe and batch environments, proposing enhancements to ensure critical business processes have appropriate alerting coverage and align with enterprise standards.
  • Cross-Functional Collaboration: Partner closely with mainframe operations, batch scheduling, and application development teams to validate correlation logic, define automation initiatives, and provide expert guidance on modern event management practices.
  • Quality Assurance: Continuously validate the effectiveness of implemented rules and automation. Establish feedback loops with operational teams to conduct post-implementation reviews and drive iterative improvements.

Required Qualifications

  • Education: Bachelor’s degree in Computer Science, Information Technology, Computer Engineering, or a related technical field.
  • Experience: A minimum of 8 years of hands-on experience building and supporting the technology foundations of core infrastructure services across platforms and devices: CPU, memory, Unix, Linux, OS agents, and Windows.
  • Event Management & Incident Reduction: Proven track record, with quantifiable results, in event management, alert tuning, and incident reduction within complex legacy environments spanning CPU, memory, Unix, Linux, OS agents, and Windows platforms. Direct, hands-on experience with modern AIOps and event management platforms is required.
  • Technical Expertise:
    • Deep understanding of the technology foundations of core infrastructure services across platforms and devices (CPU, memory, Unix, Linux, OS agents, and Windows) that enable the enterprise's business and product applications to function.
  • Automation & Scripting: Hands-on experience developing robust automation solutions using relevant scripting languages and modern automation frameworks.
  • Data Analysis: Proficiency in log analysis, pattern recognition, and using query languages for data analysis on log aggregation platforms.
  • Problem-Solving & Analytical Skills: Excellent analytical abilities with a systematic approach to troubleshooting complex
  • Communication & Leadership: Exceptional communication skills with the ability to bridge Unix, Linux, Windows, and technology teams, foster collaboration, and present technical concepts to diverse audiences.

Role Significance & Business Impact (Note)
This role is critical due to the scale and complexity of production data handled, requiring the Optimization Lead to examine and derive insights from approximately 4,000,000 (40 lakhs) operational data points annually across multiple technology sources.
At this scale:
• Manual or reactive analysis is not feasible
• Advanced data analytics and AI-driven frameworks become mandatory, not optional
• Accuracy in event correlation, prioritization, and response directly impacts system stability, customer experience, and operational cost
The Optimization Lead ensures that high-volume production data is transformed into actionable intelligence, enabling proactive issue identification, noise reduction, efficiency gains, and continuous improvement across production operations.

 

------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Infrastructure

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Most Relevant Skills

Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

 

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.

Top Skills

Ansible
AppDynamics
Azure ML
Control-M
Databricks
Dynatrace
ELK Stack
Geneos
Grafana
Power BI
Prometheus
Python
ServiceNow
ServiceNow Automation Frameworks
Snowflake
Splunk
SQL
Tableau

Citi Chennai, Tamil Nadu, IND Office

C P Ramaswamy Road, Chennai, Tamil Nadu, India, 600018
