AHEAD

AI Platform Engineer

Posted Yesterday
Remote
Hiring Remotely in India
Senior level
The AI Platform Engineer will design and optimize AI/ML infrastructure, automate pipelines, and manage Kubernetes for effective machine learning model deployment and collaboration.
The summary above was generated by AI
AHEAD builds platforms for digital business. By weaving together advances in cloud infrastructure, automation and analytics, and software delivery, we help enterprises deliver on the promise of digital transformation.

At AHEAD, we prioritize creating a culture of belonging, where all perspectives and voices are represented, valued, respected, and heard. We create spaces to empower everyone to speak up, make change, and drive the culture at AHEAD. 

We are an equal opportunity employer and do not discriminate based on an individual's race, national origin, color, gender, gender identity, gender expression, sexual orientation, religion, age, disability, marital status, or any other protected characteristic under applicable law, whether actual or perceived.

We embrace all candidates who will contribute to the diversification and enrichment of ideas and perspectives at AHEAD.

We are seeking an experienced AI Platform Engineer to design, deploy, and optimize AI/ML infrastructure, AI workflows, and automated pipelines. This role focuses on building scalable environments for training and deploying machine learning models, leveraging modern orchestration, automation, and GPU acceleration technologies. You will collaborate with data scientists and platform engineers to drive efficient resource utilization and scalable operations across cloud and hybrid environments.

Key Responsibilities

  • Kubernetes for AI/ML: Architect and manage Kubernetes clusters tailored to AI/ML workloads.
  • GPU Orchestration: Implement Run:ai and operators for GPU resource orchestration and workload scheduling.
  • Automation & Pipelines: Develop and maintain Python-based automation scripts and ML pipelines; automate infrastructure provisioning with Terraform and configuration management with Ansible.
  • Notebooks & Collaboration: Create and manage Jupyter Notebooks for experimentation and collaboration.
  • NVIDIA Integration: Integrate and optimize NVIDIA Enterprise Suite components (CUDA, NeMo Framework, Triton, TensorRT, GPU drivers) for accelerated computing.
  • MLOps Practices: Establish and maintain MLOps best practices for model lifecycle management, CI/CD, and monitoring (e.g., MLflow, Kubeflow).
  • Collaboration: Work closely with data scientists and platform engineers to ensure efficient resource utilization and scalability across environments.

Required Skills & Experience

  • Strong proficiency in Python and experience with ML frameworks (TensorFlow, PyTorch).
  • Hands-on experience with Kubernetes and container orchestration.
  • Familiarity with Run:ai or similar GPU scheduling platforms.
  • Expertise in Terraform and Ansible for infrastructure automation.
  • Experience with Jupyter Notebooks for ML development.
  • Knowledge of NVIDIA Enterprise Suite (CUDA, NeMo Framework, Triton, GPU drivers).
  • Solid understanding of MLOps principles and tools (e.g., MLflow, Kubeflow).
  • Background in deploying and scaling AI workloads in cloud or hybrid environments.

Qualifications

  • 4+ years in platform architecture or solutions architecture, with 2+ years focused on AI/ML workloads.
  • Experience with high-performance computing (HPC) environments.
  • Familiarity with distributed training and model optimization techniques.
  • Certification in Kubernetes or cloud platforms (AWS, Azure, GCP).

Why AHEAD:

Through our daily work and internal groups like Moving Women AHEAD and RISE AHEAD, we value and benefit from diversity of people, ideas, experience, and everything in between.

We fuel growth by stacking our office with top-notch technologies in a multi-million-dollar lab, by encouraging cross-department training and development, and by sponsoring certifications and credentials for continued learning.

India Employment Benefits include: 

Top Skills

Ansible
CUDA
GPU Drivers
Jupyter Notebooks
Kubeflow
Kubernetes
MLflow
NeMo Framework
Python
PyTorch
Run:ai
TensorFlow
Terraform
Triton
