Citi

Senior PySpark Data Engineer

Posted Yesterday
In-Office
Pune, Mahārāshtra
Senior level
The Senior PySpark Data Engineer will design and maintain data pipelines, optimize Spark jobs, mentor junior engineers, and ensure data integrity.
The summary above was generated by AI

About the Role

We are seeking a highly skilled and experienced Senior PySpark Data Engineer to join our dynamic data engineering team. The ideal candidate will have a strong background in building and managing large-scale data processing systems and a proven track record of working with cutting-edge Big Data technologies. You will be responsible for designing, developing, and maintaining our data pipelines, ensuring they are efficient, reliable, and scalable to meet our growing business needs.

Key Responsibilities

  • Design, develop, and maintain robust, scalable, and high-performance data pipelines using PySpark.
  • Develop, schedule, and monitor complex data workflows using orchestration tools like Apache Airflow.
  • Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver high-quality data solutions.
  • Optimize and tune Spark jobs for performance and efficiency.
  • Implement data quality checks and ensure data integrity across all data pipelines.
  • Design and implement data models for optimal storage and retrieval.
  • Mentor junior data engineers and promote best practices in data engineering.
  • Ensure compliance with data governance and security policies.
  • Troubleshoot and resolve data-related issues in a timely manner.

Required Qualifications

  • 6+ years of relevant professional experience in a data engineering role.
  • Extensive hands-on experience with PySpark and advanced Python programming skills.
  • Proven experience with Big Data ecosystems, including Cloudera and/or Databricks.
  • Hands-on experience with distributed query engines like Starburst (Trino/Presto).
  • Proficient in designing and managing complex workflows using scheduling tools, particularly Apache Airflow.
  • Strong expertise in SQL and experience with relational and non-relational databases.
  • Solid understanding of data warehousing concepts, ETL/ELT processes, and data modeling techniques.
  • Experience working in a Linux/Unix environment.
  • Experience with GitHub and CI/CD pipelines.

Education:

  • Bachelor’s degree/University degree or equivalent experience

This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.

------------------------------------------------------

Job Family Group: Technology

------------------------------------------------------

Job Family: Applications Development

------------------------------------------------------

Time Type: Full time

------------------------------------------------------

Most Relevant Skills: Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills: For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.

Citi Chennai, Tamil Nadu, IND Office

C P Ramaswamy Road, Chennai, Tamil Nadu, India, 600018
