Job Description/Preferred Qualifications
About the Company
KLA is a global leader in yield management and process control solutions for the semiconductor industry. With decades of innovation, KLA enables the world’s top chipmakers to accelerate next‑generation device manufacturing with precision and efficiency. Our software teams build advanced platforms powering electron-beam inspection systems—solving complex physics and data‑intensive challenges using high-performance, scalable computing.
This role focuses on designing, developing, and optimizing distributed, high-throughput software systems operating on advanced HPC infrastructure. The position requires strong technical ownership, hands-on Linux C++ development skills, deep performance engineering experience, and collaboration across multidisciplinary teams.
Responsibilities
- Design and develop high-performance distributed software systems for large-scale HPC environments.
- Build and optimize Linux C/C++ components for compute-intensive and timing-critical workloads.
- Implement parallel/distributed computing frameworks using MPI, OpenMP, UCX, or similar technologies.
- Containerize and orchestrate compute workloads using Docker/Singularity with Kubernetes or SLURM.
- Profile, debug, and tune system performance using VTune, Nsight, perf, gdb, and related tools.
- Drive architectural discussions, code quality, and engineering best practices.
- Collaborate with algorithms, hardware, and systems teams to deliver tightly integrated solutions.
- Mentor team members in HPC concepts, system debugging, and performance optimization.
Skills and Experience
- Strong hands-on expertise in C/C++ development on Linux, including systems-level programming.
- Proven experience building or optimizing HPC or distributed computing systems.
- Solid understanding of concurrency, multi-threading, networking, IPC, and Linux OS internals.
- Experience with profiling/debugging tools such as VTune, Nsight, perf, ftrace, gdb.
- Experience with Docker/Singularity and orchestration frameworks (Kubernetes, SLURM).
- Knowledge of CPU/GPU architectures, high-bandwidth interconnects, and distributed storage systems.
- Experience using or optimizing MPI, OpenMP, UCX, SHMEM, or similar parallel programming models.
- Exposure to GPU compute frameworks (CUDA/ROCm) or GPU-aware communication libraries.
- Familiarity with deep learning or ML pipeline workflows.
- Proficiency in Python and Bash scripting.
- Background in distributed microservices, observability tools, or large-scale system deployments.
Minimum Qualifications
Education & Experience
- Bachelor’s or Master’s degree.
- Typically 6+ years of hands-on experience in HPC, Linux systems programming, or distributed systems development.
We offer a competitive, family-friendly total rewards package. We design our programs to reflect our commitment to an inclusive environment, while ensuring we provide benefits that meet the diverse needs of our employees.
KLA is proud to be an equal opportunity employer.
Be aware of potentially fraudulent job postings or suspicious recruiting activity by persons posing as KLA employees. KLA never asks for any financial compensation to be considered for an interview, to become an employee, or for equipment. Further, KLA does not work with any recruiters or third parties who charge such fees, either directly or on behalf of KLA. Please ensure that you have searched KLA’s Careers website for legitimate job postings. KLA follows a recruiting process that involves multiple interviews, in person or by video conference, with our hiring managers. If you are concerned that a communication, an interview, an offer of employment, or an employee is not legitimate, please send an email to [email protected] to confirm that the person you are communicating with is an employee. We take your privacy very seriously and handle your information confidentially.

