Specialist, HPC Systems Research & Development

Posted 20 Days Ago
Industrial Estate, Mambalam Guindy, Chennai, Tamil Nadu
1-3 Years Experience
Hardware • Semiconductor
The Role
Specialist role in developing system-level HPC technologies for next-generation clusters used in KLA tools leveraging AI for semiconductor manufacturing. Responsibilities include exposing limitations in existing solutions, developing distributed frameworks for scaling out image processing & AI loads, and evaluating hardware for prototyping.

Company Overview

KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents systems and solutions for the manufacturing of wafers and reticles, integrated circuits, packaging, printed circuit boards and flat panel displays. The innovative ideas and devices that are advancing humanity all begin with inspiration, research and development. KLA focuses more than average on innovation and we invest 15% of sales back into R&D. Our expert teams of physicists, engineers, data scientists and problem-solvers work together with the world’s leading technology providers to accelerate the delivery of tomorrow’s electronic devices. Life here is exciting and our teams thrive on tackling really hard problems. There is never a dull moment with us.

Group/Division

KLA Advanced Computing Labs’ (ACL) mission in India is to deliver advanced parallel computing research and software architectures for AI + HPC + Cloud solutions to accelerate the performance of KLA’s products. ACL explores high-risk approaches, pioneering technologies, and novel methods to accelerate KLA’s algorithms and contribute to KLA’s HPC technology roadmap. Located in the IIT Madras Research Park in Chennai, India, we engage leading thinkers in academia, industry and KLA’s business units to create innovative parallel computing methods to enable KLA’s business growth.

Job Description

KLA’s AI Advanced Computing Labs is looking for an extraordinary HPC System R&D Engineer to join its team and develop system-level HPC technologies that will form the foundation of next-generation clusters used in KLA tools that leverage AI to push the boundaries of process control for semiconductor manufacturing. The technologies will be developed and demonstrated on on-prem clusters that serve as testbeds for next-generation KLA tools.

 

Your Day-to-day Roles

  • Expose the limitations of existing solutions, based on clusters of CPUs & GPUs, for deploying AI-based solutions on on-prem & cloud infrastructures at scale.
  • Develop distributed frameworks and system-level solutions that enable scaling out image processing & AI loads from a single GPU to multi-node clusters with multiple GPUs.
  • Install and benchmark pre-release hardware for early-stage evaluation and prototyping, identifying (or developing) relevant workloads.

Minimum Qualifications

  • Master’s / PhD in Computer Science or a related field; bachelor’s degree holders with relevant experience and an extraordinary track record will also be considered.
  • Deep understanding of operating systems, computer networks, and high-performance applications.
  • Good mental model of the architecture of a modern distributed system comprising CPUs, GPUs, and accelerators.
  • Experience with deployments of deep-learning frameworks such as TensorFlow and PyTorch on large-scale on-prem or cloud infrastructures.
  • Strong background in modern and advanced C++ concepts.
  • Strong scripting skills in Bash, Python, or similar.
  • Good communication skills.

Things to Make us go Wow!

  • Experience in heterogeneous programming languages such as CUDA and Triton.
  • Experience with model development on DL frameworks such as TensorFlow and PyTorch.
  • Experience with building open-source operating systems and software stacks on pre-release hardware.
  • Solid understanding of container infrastructure such as Docker or Singularity, and Kubernetes.
  • Active participation in C++ standards bodies or similar.

We offer a competitive, family-friendly total rewards package. We design our programs to reflect our commitment to an inclusive environment, while ensuring we provide benefits that meet the diverse needs of our employees.

KLA is proud to be an equal opportunity employer.

Top Skills

Deep Learning
High Performance Computing
Machine Learning
The Company
HQ: Milpitas, CA
10,001 Employees
On-site Workplace

What We Do

KLA develops industry-leading equipment and services that enable innovation throughout the electronics industry. We provide advanced process control and process-enabling solutions for manufacturing wafers and reticles. In close collaboration with leading customers across the globe, our expert teams of physicists, engineers, data scientists and problem-solvers design solutions that move the world forward.
