Job Summary
Responsible for planning and designing new software and web applications. Analyzes, tests and assists with the integration of new applications. Documents all development activity. Assists with training non-technical personnel. Has in-depth experience, knowledge and skills in own discipline. Usually determines own work priorities. Acts as a resource for colleagues with less experience.
Job Description
About the Role:
We are seeking an experienced Sr. System Analyst to join our growing Global Operational Intelligence team. You will play a key role in building intelligent systems that help reduce alert noise, detect anomalies, correlate events, and proactively surface operational insights across our large-scale streaming infrastructure.
You’ll work at the intersection of machine learning, artificial intelligence, observability, and IT operations, collaborating closely with Platform Engineers, SREs, Incident Managers, Operators and Developers to integrate smart detection and decision logic directly into our operational workflows.
This role offers a unique opportunity to push the boundaries of AI/ML in large-scale operations. We welcome forward-thinking, innovative people who want to stay ahead of the curve, bring new ideas to life, and improve the reliability of the streaming infrastructure that powers millions of users globally.
What You’ll Do:
Analyze, design, and tune machine learning models for big-data processing, using a range of system analysis methods aligned with our design patterns in cloud environments (AWS, Google Cloud, Azure)
Perform system testing and quality assurance, including oversight of quality engineering
Apply NLP and ML techniques to classify and structure logs and unstructured alert messages
Develop and maintain real-time and batch data pipelines to process alerts, metrics, traces, and logs
Use Python, SQL, and time-series query languages (e.g., PromQL) to manipulate and analyze operational data
Collaborate with engineering teams to deploy models via API integrations, automate workflows, and ensure production readiness
Contribute to the development of self-healing automation, diagnostics, and ML-powered decision triggers
Design and validate entropy-based prioritization models to reduce alert fatigue and elevate critical signals
Conduct A/B testing, offline validation, and live performance monitoring of ML models
Build and share clear dashboards, visualizations, and reporting views to support SREs, engineers, and leadership
Research and diagnose complex application problems and identify system improvements in an enterprise environment
Test systems on a regular basis to ensure quality and function, and write instruction manuals for those systems
Collaborate on the design of hybrid ML/AI + rule-based systems to support dynamic correlation and intelligent alert grouping
Document business processes and algorithm changes to support continuous improvement and the assessment of complexity in patterns
Prepare cost-benefit analyses of the systems platform and its features, including the value chain attributed to each deployed feature, and provide recommendations on features that are not used
Demonstrate a proactive, solution-oriented mindset with the ability to navigate ambiguity and learn quickly
Participate in on-call rotations and provide operational support as needed
Qualifications:
Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, Statistics or a related field
5+ years of experience building and deploying ML solutions in production environments
2+ years working with AIOps, observability, or real-time operations data
Strong coding skills in Python (including pandas, NumPy, Scikit-learn, PyTorch, or TensorFlow)
Experience working with SQL, time-series query languages (e.g., PromQL), and data transformation in pandas or Spark
Familiarity with LLMs, prompt engineering fundamentals, or embedding-based retrieval (e.g., sentence-transformers, vector DBs)
Strong grasp of modern ML techniques including gradient boosting (XGBoost/LightGBM), autoencoders, clustering (e.g., HDBSCAN), and anomaly detection
Experience managing structured + unstructured data, and building features from logs, alerts, metrics, and traces
Familiarity with real-time event processing using tools like Kafka, Kinesis, or Flink
Strong understanding of model evaluation techniques including precision/recall trade-offs, ROC, AUC, calibration
Comfortable working with relational (PostgreSQL), NoSQL (MongoDB), time-series (InfluxDB, Prometheus), and graph (GraphDB) databases
Ability to collaborate effectively with SREs, platform teams, and participate in Agile/DevOps workflows
Clear written and verbal communication skills to present findings to technical and non-technical stakeholders
Comfortable working across Git, Confluence, JIRA, & collaborative agile environments
Nice to Have:
Experience building or contributing to an AIOps platform (e.g., Moogsoft, BigPanda, Datadog, Aisera, Dynatrace, BMC)
Experience working in streaming media, OTT platforms, or large-scale consumer services
Exposure to Infrastructure as Code (Terraform, Pulumi) and modern cloud-native tooling
Working experience with Conviva, Touchstream, Harmonic, New Relic, Prometheus, & event-based alerting tools
Hands-on experience with LLMs in operational contexts (e.g., classification of alert text, log summarization, retrieval-augmented generation)
Familiarity with vector databases (e.g., FAISS, Pinecone, Weaviate) and embeddings-based search for observability data
Experience using MLflow, SageMaker, or Airflow for ML workflow orchestration
Knowledge of LangChain, Haystack, RAG pipelines, or prompt templating libraries
Exposure to MLOps practices (e.g., model monitoring, drift detection, explainability tools like SHAP or LIME)
Experience with containerized model deployment using Docker or Kubernetes
Use of JAX, Hugging Face Transformers, or LLaMA/Claude/Command-R models in experimentation
Experience designing APIs, including GraphQL, in Python or Go to expose models as services
Cloud proficiency in AWS/GCP, especially for distributed training, storage, or batch inferencing
Contributions to open-source ML or DevOps communities, or participation in AIOps research/benchmarking efforts
Certifications in cloud architecture, ML engineering, or data science specializations
We believe that benefits should connect you to the support you need when it matters most, and should help you care for those who matter most. That's why we provide an array of options, expert guidance and always-on tools that are personalized to meet the needs of your reality—to help support you physically, financially and emotionally through the big milestones and in your everyday life.
Please visit the benefits summary on our careers site for more details.
Education
Bachelor's Degree
While possessing the stated degree is preferred, Comcast also may consider applicants who hold some combination of coursework and experience, or who have extensive related professional experience.
Certifications (if applicable)
Relevant Work Experience
5-7 Years
Comcast is an equal opportunity workplace. We will consider all qualified applicants for employment without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability, veteran status, genetic information, or any other basis protected by applicable law.