Trimble

Site Reliability Engineer

Posted 3 Days Ago
In-Office
Chennai, Tamil Nadu
Mid level
The Site Reliability Engineer will deploy and maintain AI/ML systems, support CI/CD pipelines, monitor performance, and contribute to infrastructure automation.
The summary above was generated by AI

Job Location: Chennai, India

Job Title: DevOps Engineer II (AI Ops / ML Ops)

Work Mode: Onsite

Our Department: Trimble's Construction Management Solutions (CMS) division is dedicated to transforming the construction industry. We provide technology solutions that streamline and optimize workflows for preconstruction, project management, and field operations. By connecting the physical and digital worlds, we help our customers improve productivity, efficiency, and project outcomes.

Are you passionate about deploying, monitoring, and scaling machine learning systems in production environments and eager to contribute to robust AI infrastructure within a collaborative team?

What You Will Do

This role offers an opportunity to work in AI/ML Development and Operations (DevOps) engineering within a dynamic team that values reliability and continuous improvement. The successful candidate will contribute to the deployment and maintenance of AI/ML systems in production, gaining hands-on experience with MLOps best practices and infrastructure automation. The position provides a structured environment for developing core competencies in ML system operations, DevOps practices, and production ML monitoring, with direct guidance from experienced professionals.

  • Assist in the deployment and maintenance of machine learning models in production environments under direct supervision, learning containerization technologies like Docker and Kubernetes.

  • Support CI/CD pipeline development for ML workflows, including model versioning, automated testing, and deployment processes using tools like Azure DevOps.

  • Monitor ML model performance, data drift, and system health in production environments, implementing basic alerting and logging solutions.

  • Contribute to infrastructure automation and configuration management for ML systems, learning Infrastructure as Code (IaC) practices with tools like Terraform or CloudFormation.

  • Collaborate with ML engineers and data scientists to operationalize models, ensuring scalability, reliability, and adherence to established MLOps procedures and best practices.

What Skills & Experience You Should Bring

Required:

  • 2 to 5 years of professional experience in a DevOps, MLOps, or systems engineering environment.

  • Bachelor's degree in Computer Science, Engineering, Information Technology, or a closely related technical field. Trimble's Professional ladder typically requires four or more years of formal education.

  • Must have: experience with Microsoft Azure and its services, including ML/AI (Azure ML, Azure DevOps, etc.).

  • Foundational knowledge of DevOps principles and practices, with understanding of CI/CD concepts and basic system administration.

  • Proficiency with Python or other scripting languages (Shell / Bash / PowerShell / Perl) for automation scripting and system integration.

  • Understanding of containerization technologies (Docker) and basic orchestration concepts (Kubernetes fundamentals).

  • Familiarity with version control systems (Git) and collaborative development workflows.

  • Basic understanding of machine learning concepts and the ML model lifecycle from development to production.

Preferred:

  • Familiarity with MLOps tools and frameworks (MLflow, Kubeflow, DVC, or similar).

  • Basic experience with monitoring and observability tools (Prometheus, Grafana, ELK stack).

  • Understanding of Infrastructure as Code (IaC) tools like Terraform or Ansible.

  • Experience with Windows/Linux system administration and command-line tools.

  • Knowledge of database systems and data pipeline technologies.

  • Exposure to model serving frameworks (TensorFlow Serving, TorchServe, ONNX Runtime).

  • Basic understanding of security best practices for ML systems and data governance.

How to Apply: Please submit an online application for this position by clicking on the ‘Apply Now’ button located in this posting.

Application Deadline: Applications may be accepted for at least 30 days from the posting date.

Join a Values-Driven Team: Belong, Grow, Innovate. 

At Trimble, our core values of Belong, Grow, and Innovate aren't just words—they're the foundation of our culture. We foster an environment where you are seen, heard, and valued (Belong); where you have an opportunity to build a career and drive our collective growth (Grow); and where your innovative ideas shape the future (Innovate). We believe in empowering local teams to create impactful strategies, ensuring our global vision resonates with every individual. Become part of a team where your contributions truly matter. 

Trimble’s Privacy Policy

If you need assistance or would like to request an accommodation in connection with the application process, please contact [email protected].

Top Skills

Azure DevOps
Azure ML
Docker
ELK Stack
Git
Grafana
Kubernetes
Azure
Prometheus
Python
Terraform

Trimble Chennai, Tamil Nadu, IND Office

Rajiv Gandhi Street, Chennai, Tamil Nadu, India, 600113

Trimble Tharamani, Tamil Nadu, IND Office

No. 4 Rajiv Gandhi Salai, Tharamani, Chennai, India, 600 113
