The Senior Data Engineer is responsible for developing data pipelines, improving data models, and integrating new data management technologies while consulting on complex projects.
Job Title: Senior Data Engineer/Developer
Number of Positions: 2
Job Description:
The Senior Data Engineer will be responsible for designing, developing, and maintaining scalable data pipelines and building out new API integrations to support continuing increases in data volume and complexity. They will collaborate with analytics and business teams to improve data models that feed business intelligence tools, increasing data accessibility and fostering data-driven decision making across the organization.
Responsibilities:
- Design, construct, install, test and maintain highly scalable data management systems and data pipelines.
- Ensure systems meet business requirements and industry practices.
- Build high-performance algorithms, prototypes, predictive models, and proofs of concept.
- Research opportunities for data acquisition and new uses for existing data.
- Develop data set processes for data modeling, mining and production.
- Integrate new data management technologies and software engineering tools into existing structures.
- Create custom software components and analytics applications.
- Implement and update disaster recovery procedures.
- Collaborate with data architects, modelers, and IT team members on project goals.
- Provide senior-level technical consulting to peer data engineers during data application design and development for highly complex and critical data projects.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience.
- 5-8 years of proven experience as a Senior Data Engineer or in a similar role.
- Experience with big data tools such as Hadoop, Spark, Kafka, Ansible, Chef, Terraform, Airflow, and Protobuf RPC.
- Expert-level SQL skills for data manipulation (DML) and validation (DB2).
- Experience with data pipeline and workflow management tools.
- Experience with object-oriented and functional scripting languages such as Python, Java, and Go.
- Strong problem-solving and analytical skills.
- Excellent verbal communication skills.
- Good interpersonal skills.
- Ability to provide technical leadership for the team.
Top Skills
Airflow
Ansible
Chef
Hadoop
Hive
Kafka
Protobuf RPC
Python
Scala
Spark
SQL
Terraform