
Capco

Data Engineer (Databricks + PySpark)

Sorry, this job was removed at 06:09 a.m. (IST) on Thursday, Apr 30, 2026
Hybrid
Pune, Maharashtra


Job Title: Data Engineer (PySpark / Databricks)

Experience: 5–9 Years
Location: Pune (Hybrid – Capco Office)

Job Summary

We are looking for a skilled Data Engineer with strong expertise in PySpark, Databricks, and modern data engineering practices. The ideal candidate will have hands-on experience in building scalable data pipelines, working with large datasets, and leveraging cloud-based data platforms.

Key Responsibilities

- Design, develop, and maintain scalable ETL/ELT data pipelines
- Work extensively with PySpark and Apache Spark for large-scale data processing
- Build and manage workflows using Apache Airflow
- Develop and optimize data solutions on Databricks (Jobs, Delta Lake)
- Work with cloud-based data lakes (S3 or equivalent)
- Write efficient and complex SQL queries for data transformation and analysis
- Run and manage Spark workloads on EMR Serverless or other managed Spark platforms
- Ensure data quality, reliability, and performance optimization of pipelines

Must Have Skills

- Strong hands-on experience with PySpark and Apache Spark internals
- Experience with Databricks (Jobs, Delta Lake)
- Proficiency in Apache Airflow for workflow orchestration
- Solid experience building ETL/ELT pipelines at scale
- Strong SQL skills and experience with Data Warehouse (DWH) systems
- Experience running Spark workloads on EMR Serverless or managed Spark platforms
- Hands-on experience with cloud data lakes (S3 or equivalent)

Good to Have Skills

- Experience with Delta Lake / Apache Iceberg
- Exposure to streaming frameworks (Spark Structured Streaming, Kafka)
- Familiarity with CI/CD pipelines for data engineering workflows
- Knowledge of data governance, cataloging, and lineage tools

