Citi

Data Engineer - Controls Technology

Posted 10 Days Ago
2 Locations
Mid level
The Data Engineer will design and implement data pipelines, manage cloud integrations, ensure data governance and security, and collaborate with teams to optimize data solutions.
The summary above was generated by AI

We are seeking a highly skilled and hands-on Data Engineer to join Controls Technology to support the design, development, and implementation of our next-generation Data Mesh and Hybrid Cloud architecture. This role is critical in building scalable, resilient, and future-proof data pipelines and infrastructure that enable the seamless integration of Controls Technology data within a unified platform. The Data Engineer will work closely with the Data Mesh and Cloud Architect Lead to implement data products, ETL/ELT pipelines, hybrid cloud integrations, and governance frameworks that support data-driven decision-making across the enterprise.

Key Responsibilities:

Data Pipeline Development:

  • Design, build, and optimize ETL/ELT pipelines for structured and unstructured data.
  • Develop real-time and batch data ingestion pipelines using distributed data processing frameworks.
  • Ensure pipelines are highly performant, cost-efficient, and secure.

Apache Iceberg & Starburst Integration:

  • Work extensively with Apache Iceberg for data lake storage optimization and schema evolution.
  • Manage Iceberg Catalogs and ensure seamless integration with query engines.
  • Configure and maintain Hive Metastore (HMS) for Iceberg-backed tables and ensure proper metadata management.
  • Utilize Starburst and Stargate to enable distributed SQL-based analytics and seamless data federation.
  • Tune performance for large-scale querying and federated access to structured and semi-structured data.

Data Mesh Implementation:

  • Implement Data Mesh principles by developing domain-specific data products that are discoverable, interoperable, and governed.
  • Collaborate with data domain owners to enable self-service data access while ensuring consistency and quality.

Hybrid Cloud Data Integration:

  • Develop and manage data storage, processing, and retrieval solutions across AWS and on-premises environments.
  • Work with cloud-native tools such as AWS S3, RDS, Lambda, Glue, Redshift, and Athena to support scalable data architectures.
  • Ensure hybrid cloud data flows are optimized, secure, and compliant with organizational standards.

Data Governance & Security:

  • Implement data governance, lineage tracking, and metadata management solutions.
  • Enforce security best practices for data encryption, role-based access control (RBAC), and compliance with policies such as GDPR and CCPA.

Performance Optimization & Monitoring:

  • Monitor and optimize data workflows, query performance, and resource utilization.
  • Implement logging, alerting, and monitoring solutions using CloudWatch, Prometheus, or Grafana to ensure system health.

Collaboration & Documentation:

  • Work closely with data architects, application teams, and business units to ensure seamless integration of data solutions.
  • Maintain clear documentation of data models, transformations, and architecture for internal reference and governance.

Required Technical Skills:

Programming & Scripting:

  • Strong proficiency in Python, SQL, and Shell scripting.
  • Experience with Scala or Java is a plus.

Data Processing & Storage:

  • Hands-on experience with Apache Spark, Kafka, Flink, or similar distributed processing frameworks.
  • Strong knowledge of relational (PostgreSQL, MySQL, Oracle) and NoSQL databases (DynamoDB, MongoDB).
  • Expertise in Apache Iceberg for managing large-scale data lakes, schema evolution, and ACID transactions.
  • Experience working with Iceberg Catalogs and Hive Metastore (HMS), and integrating Iceberg-backed tables with query engines.
  • Familiarity with Starburst and Stargate for federated querying and cross-platform data access.

Cloud & Hybrid Architecture:

  • Experience working with AWS data services (S3, Redshift, Glue, Athena, EMR, RDS).
  • Understanding of hybrid data storage and integration between on-prem and cloud environments.

Infrastructure as Code (IaC) & DevOps:

  • Experience with Terraform, AWS CloudFormation, or Kubernetes for provisioning infrastructure.
  • CI/CD pipeline experience using GitHub Actions, Jenkins, or GitLab CI/CD.

Data Governance & Security:

  • Familiarity with data cataloging, lineage tracking, and metadata management.
  • Understanding of RBAC, IAM roles, encryption, and compliance frameworks (GDPR, SOC 2, etc.).

Required Soft Skills:

  • Problem-Solving & Analytical Thinking - Ability to troubleshoot complex data issues and optimize workflows.
  • Collaboration & Communication - Comfortable working with cross-functional teams and articulating technical concepts to non-technical stakeholders.
  • Ownership & Proactiveness - Self-driven, detail-oriented, and able to take ownership of tasks with minimal supervision.
  • Continuous Learning - Eager to explore new technologies, improve skill sets, and stay ahead of industry trends.

Qualifications:

  • 4-6 years of experience in data engineering, cloud infrastructure, or distributed data processing.
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Technology, or a related field.
  • Hands-on experience with data pipelines, cloud services, and large-scale data platforms.
  • Strong foundation in SQL, Python, Apache Iceberg, Starburst, Apache Airflow orchestration, and cloud-based data solutions (AWS preferred).

------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Technology Project Management

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Citi is an equal opportunity and affirmative action employer.

Qualified applicants will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Citigroup Inc. and its subsidiaries ("Citi") invite all qualified interested applicants to apply for career opportunities. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View the "EEO is the Law" poster. View the EEO is the Law Supplement.

View the EEO Policy Statement.

View the Pay Transparency Posting

Top Skills

Apache Iceberg
Spark
Athena
AWS CloudFormation
AWS S3
CloudWatch
DynamoDB
EMR
Flink
GitHub Actions
GitLab CI/CD
Glue
Grafana
Java
Jenkins
Kafka
Kubernetes
MongoDB
MySQL
Oracle
Postgres
Prometheus
Python
RDS
Redshift
Scala
Shell Scripting
SQL
Terraform

Citi Chennai, Tamil Nadu, IND Office

C P Ramaswamy Road, Chennai, Tamil Nadu, India, 600018
