Ensono

Manager of Monitoring Operations

Posted 2 Days Ago
Hybrid
Chennai, Tamil Nadu, IND
Expert/Leader
Lead and manage the enterprise monitoring operations team to ensure availability, performance, and reliability of infrastructure and applications. Oversee BMC Helix, OpenShift, Prometheus/Grafana, and Entuity monitoring; manage upgrades, capacity, alerting quality, SOPs, DR tests, incident escalations, ITIL alignment, and stakeholder reporting.
The summary above was generated by AI

Job Description: Manager – Monitoring Operations

Role Summary

The Manager – Monitoring Operations will lead and manage the enterprise monitoring operations team responsible for the availability, performance, and reliability of IT infrastructure and applications. This role will oversee the day-to-day operations of the BMC Helix on-premises monitoring tool deployed on Red Hat OpenShift Container Platform (OCP), network and device monitoring using Park Place Entuity, and OS monitoring using Prometheus and Grafana, ensuring high service quality, operational excellence, and continuous improvement.

The role requires strong people management skills, deep technical expertise in systems monitoring platforms, and experience operating monitoring solutions in containerized environments.

Key Responsibilities

· Lead, mentor, and manage a team of monitoring engineers/analysts, defining goals, KPIs, shift coverage, and on-call rotations.

· Drive skill development through performance reviews, training initiatives, and continuous learning plans.

· Act as the escalation point for major monitoring incidents and outages, guiding quick workarounds to prevent monitoring gaps and loss of metrics.

· Ensure operational excellence aligned with ITIL practices (Incident, Problem, Change) and adherence to security, compliance, and operational standards.

· Manage upgrades, patches, capacity planning, and health checks across the monitoring estate to maintain high availability and performance.

· Oversee server (Windows/Linux/AIX), network, database, and synthetic URL monitoring for the enterprise and for global clients’ private cloud.

· Collaborate with Container Platform, Core Infrastructure, and Network teams on platform stability, scaling, resilience, and resource allocation.

· Optimize alert quality, reduce alert fatigue, standardize dashboards/alerting frameworks, and deliver actionable insights.

· Maintain SOPs, runbooks, and operational documentation; provide regular reports on platform health, incidents, and SLA compliance.

· Serve as the primary stakeholder contact for all monitoring services.

· Conduct annual disaster-recovery (DR) tests for the monitoring estate to validate resilience, recovery procedures, and business continuity readiness.




Required Experience & Qualifications

Experience

· 10+ years of overall IT industry experience, including 5+ years in monitoring operations in medium-to-large organizations.

· Hands-on operational expertise with at least two of the following monitoring platforms/tools:

o BMC Helix Monitoring (SaaS or On-Prem)

o Red Hat OpenShift Container Platform (OCP) or Kubernetes Cluster Management

o Prometheus, Exporters, OTEL Collectors, and Grafana

o Park Place Entuity Network and Hardware Monitoring

· Proven experience in monitoring architecture design, capacity planning, performance tuning, and integration with ITSM tools for automated ticketing workflows.

· Strong knowledge of ITIL processes and operational best practices.

Leadership & Soft Skills

· Strong people-management and leadership capabilities

· Excellent communication and stakeholder-management skills

· Ability to handle high-pressure situations and lead incident response

· Strategic mindset with a focus on operational maturity and optimization

Education & Certifications

· Bachelor’s degree in Computer Science, Information Technology, or equivalent

· Relevant certifications (preferred, not mandatory):

o Red Hat OpenShift / Kubernetes

o BMC Helix

o Foundation certifications in ITIL and/or AI

Nice-to-Have

· Exposure to hybrid or multi-cloud environments

· Experience with automation, scripting, APIs, and AI-driven service improvements

· Application Performance Monitoring (APM) experience
