Citi

Python Data Engineer

Posted Yesterday
In-Office
Chennai, Tamil Nadu
Senior level
The Python Data Engineer is responsible for designing and developing data pipelines, ensuring data quality, collaborating with data teams, and optimizing database performance.
The summary above was generated by AI

The Engineer Intmd Analyst is an intermediate-level position responsible for a variety of engineering activities, including the design, acquisition, and development of software and infrastructure in coordination with the Technology team. The overall objective of this role is to ensure that quality standards are met within existing and planned frameworks.
Responsibilities:

  • Design, develop, and optimize scalable data pipelines and ETL/ELT processes using Apache Spark (preferably with Scala or Python) to ingest, transform, and load large datasets from diverse sources.
  • Write, optimize, and troubleshoot complex SQL queries, stored procedures, and functions for data extraction, transformation, and reporting within relational and analytical databases.
  • Develop and maintain data models, schema definitions, and database objects in various data storage solutions (e.g., data warehouses, data lakes, operational databases).
  • Ensure data quality, integrity, accuracy, and consistency across all data assets through robust validation and monitoring mechanisms.
  • Collaborate closely with data scientists, data analysts, business intelligence developers, and application teams to understand data requirements and deliver appropriate data solutions.
  • Monitor data pipeline performance, identify bottlenecks, and implement optimizations to improve efficiency and reduce processing times.
  • Manage data lifecycle, including data archival, retention, and compliance with data governance policies and security standards.
  • Participate in code reviews, contribute to documentation, and adhere to engineering best practices.
  • Troubleshoot and resolve data-related issues in production environments.
  • Contribute to the evaluation and selection of new data technologies and tools.

Qualifications:

  • Experience: 5+ years of professional experience in data engineering, backend development with a strong data focus, or a related field.
  • Data Acumen: Strong understanding of data warehousing concepts, dimensional modeling, and data lake architectures.
  • Problem-Solving: Excellent analytical and problem-solving skills, with keen attention to detail.
  • Communication: Good verbal and written communication skills, with the ability to articulate technical concepts to both technical and non-technical audiences.
  • Teamwork: Ability to work effectively in a collaborative team environment and contribute positively to team goals.
  • Agile: Experience working in an Agile/Scrum development methodology.

Education:

  • Bachelor’s degree/University degree or equivalent experience

Technical Skills:

  • Big Data Processing: Strong proficiency with Apache Spark (DataFrames API, Spark SQL) using Scala or Python.
  • Databases: Expert-level SQL skills. Extensive experience with relational databases (e.g., PostgreSQL, Oracle, SQL Server, MySQL) and experience with cloud-native data warehouses (e.g., Snowflake, Google BigQuery, AWS Redshift) or data lake technologies (e.g., Delta Lake).
  • Programming Languages: Strong proficiency in Python or Scala.
  • ETL/ELT Tools: Experience with ETL/ELT methodologies and tools, including data orchestration tools (e.g., Apache Airflow, Azure Data Factory, AWS Step Functions, GCP Cloud Composer).
  • Cloud Platforms: Exposure to major cloud platforms (AWS, Azure, GCP) and their storage and compute services (e.g., S3, ADLS, GCS, EC2, Azure VMs, Kubernetes).
  • Version Control: Proficiency with Git and standard version control workflows.
  • Data Modeling: Experience in designing and implementing efficient and scalable data models.
  • Performance Tuning: Ability to optimize Spark jobs, SQL queries, and database performance.
  • Linux/Unix: Familiarity with Linux/Unix environments for scripting and job execution.

------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Systems & Engineering

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Most Relevant Skills

Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

 

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.

Top Skills

Apache Airflow
Spark
AWS
AWS Redshift
AWS Step Functions
Azure
Azure Data Factory
Delta Lake
GCP
GCP Cloud Composer
Git
Google BigQuery
Linux
MySQL
Oracle
Postgres
Python
Scala
Snowflake
SQL
SQL Server
Unix

Citi Chennai, Tamil Nadu, IND Office

C P Ramaswamy Road, Chennai, Tamil Nadu, India, 600018
