Citi

Senior Data Engineer - Assistant Vice President

Reposted 21 Days Ago
In-Office
Chennai, Tamil Nadu, IND
Senior level
This role involves designing and optimizing scalable data solutions, implementing ETL processes, and ensuring data quality while collaborating with teams.
The summary above was generated by AI

We are seeking a highly skilled and motivated Big Data Engineer to join our dynamic team. The ideal candidate will have extensive experience in designing, developing, and optimizing scalable data solutions using the Hadoop ecosystem, with a strong focus on PySpark and Hive. This role is crucial for building robust ETL pipelines, ensuring data quality, and driving performance improvements across our Big Data initiatives.

Key Responsibilities
  • Design, develop, and maintain efficient and scalable Big Data solutions using PySpark, Apache Hive, and Hadoop ecosystem tools (e.g., Sqoop).

  • Write and maintain pipeline code in Python; strong Python knowledge is required.

  • Implement and optimize ETL (Extract, Transform, Load) processes and data warehousing solutions, including fact and dimension tables and Type 2 Slowly Changing Dimensions (SCD-2).

  • Conduct in-depth data analysis, troubleshoot complex data issues, and ensure the accuracy, reliability, and integrity of data.

  • Optimize Big Data workflows, including Spark job tuning and Hive query optimization, leveraging partitioning strategies and indexing techniques in distributed storage systems.

  • Perform rigorous unit testing and validation of data pipelines and transformations.

  • Collaborate with data scientists, analysts, and other engineers to understand data requirements and deliver robust data solutions.

Technical Qualifications
  • Big Data Technologies: Demonstrated proficiency with Apache Hadoop, Apache Hive, and PySpark for data processing and analysis.

  • Data Warehousing & Modeling: Strong understanding and practical experience with data warehousing concepts, dimensional modeling, and SCD-2 implementation.

  • ETL Development: Proven experience in designing and developing ETL pipelines; familiarity with various ETL tools is an advantage.

  • Database & SQL: Advanced SQL knowledge, including complex joins, subqueries, and performance tuning of SQL queries.

  • Scripting: Proficient in shell scripting for automation of batch processes.

  • DevOps & CI/CD: Experience with CI/CD tools such as Bitbucket and Jenkins.

  • BI Tools: Familiarity with business intelligence (BI) reporting tools like Tableau.

Desirable Qualifications (Added Advantage)
  • Experience and/or certifications with major cloud platforms and their Big Data services (e.g., AWS, Azure Databricks, Google Cloud).

  • Advanced knowledge of Unix shell scripting for system administration and automation.

Skills & Attributes
  • Excellent critical thinking and problem-solving skills with a strong analytical mindset.

  • Ability to work independently and collaboratively in a fast-paced environment.

  • Strong communication skills to articulate technical concepts and solutions effectively.

Education:

  • Bachelor’s degree/University degree or equivalent experience

If you are a passionate Big Data Engineer looking to make a significant impact, we encourage you to apply!

This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.

------------------------------------------------------

Job Family Group: Technology

Job Family: Applications Development

Time Type: Full time

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

 

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.

Citi Chennai, Tamil Nadu, IND Office

C P Ramaswamy Road, Chennai, Tamil Nadu, India, 600018
