
Synechron

Cloud Data Engineer – Python, Spark/Scala, AWS & Data Warehousing

In-Office or Remote
2 Locations
Senior level

Job Summary

Synechron is seeking a highly skilled Data Engineer to lead the design, development, and optimization of enterprise data pipelines and analytics solutions. The role involves working on big data platforms, leveraging cloud-native AWS services, and integrating AI-driven applications to support business intelligence and operational excellence. You will develop scalable, secure, high-performance data architectures, working closely with stakeholders to meet data governance and compliance requirements as well as strategic objectives.

Software Requirements

Required:

  • 4+ years of hands-on experience with Python, PySpark, and Scala for building scalable data processing jobs

  • Expertise in big data platforms such as Spark, Hadoop, Hive, and related processing frameworks

  • Proven experience in end-to-end ETL pipeline development including ingestion, transformation, and data validation

  • Strong familiarity with cloud data ecosystems on AWS, including EMR, S3, Glue, CloudFormation, CDK, and Data Pipeline

  • Experience with relational databases: PostgreSQL, Oracle, SQL Server

  • Working knowledge of metadata management, data lineage, and data governance tools

Preferred:

  • Experience working with NoSQL databases such as DynamoDB or Cassandra

  • Exposure to AI/ML applications, especially with Generative AI and frameworks like LangChain, Llama, or Hugging Face

  • Familiarity with model management and prompt engineering

  • Knowledge of containerization (Docker) and Kubernetes for scalable deployment

Overall Responsibilities

  • Design, develop, and support data pipelines and architectures to enable enterprise analytics, reporting, and data governance

  • Build scalable batch and streaming data workflows supporting business-critical functions

  • Optimize performance, latency, and throughput of data processing jobs through profiling and tuning

  • Implement security, privacy, and regulatory standards in data pipelines and repositories

  • Collaborate with data scientists, BI teams, and applications teams to ensure data quality and availability

  • Automate data ingestion, transformation, and deployment processes for operational efficiency

  • Ensure high system reliability and availability, supporting infrastructure and platform monitoring

  • Lead or contribute to cloud migration and data modernization initiatives

  • Stay updated on emerging data technologies, automation, and AI/ML advancements to incorporate into existing platforms

Technical Skills (By Category)

Programming Languages (Essential):

  • Python, Scala, and PySpark for big data processing

Preferred:

  • Shell scripting (e.g., Bash) for automation

Frameworks & Libraries:

  • Spark and the Hadoop ecosystem, including Hive (Kafka preferred)

  • Data validation, lineage, and governance tools

  • AI/ML frameworks such as LangChain and Hugging Face (preferred)

Databases & Data Storage:

  • Relational: PostgreSQL, Oracle, SQL Server

  • NoSQL: DynamoDB, Cassandra (preferred)

Cloud Technologies:

  • AWS: EMR, S3, Glue, CloudFormation, CDK, Lambda, Data Pipeline, CloudWatch

Data Governance & Security:

  • Metadata management, data lineage, security best practices, and compliance standards (e.g., PCI DSS, GDPR)

Experience Requirements

  • 4+ years designing and implementing enterprise data pipelines in cloud environments

  • Proven experience working with big data frameworks and AWS cloud solutions

  • Strong understanding of data governance, security standards, and regulatory compliance in enterprise contexts

  • Experience integrating AI/ML solutions or supporting AI workflows (preferred)

  • Past involvement in data migration, modernization, or platform automation projects

Day-to-Day Activities

  • Architect, develop, and optimize complex data pipelines and architectures supporting enterprise analytics

  • Implement data ingestion, transformation, and validation workflows across diverse data sources

  • Troubleshoot and resolve pipeline performance and security issues

  • Automate infrastructure provisioning, deployment, and management in cloud environments

  • Collaborate with data scientists, BI teams, and application developers to align data solutions with business goals

  • Monitor system health, perform root cause analysis, and implement efficiency improvements

  • Document data architecture, lineage, and governance procedures

  • Stay current on new tools, frameworks, and AI/ML innovations relevant for data engineering

Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field

  • 4+ years of experience in cloud data engineering, big data processing, and ETL development

  • Proven track record of successfully supporting large-scale, secure, and compliant data solutions

  • Relevant certifications, such as AWS Data Analytics or other big data certifications, are advantageous

Professional Competencies

  • Strong analytical and troubleshooting capabilities for data processing and pipeline issues

  • Excellent collaboration and stakeholder management skills

  • Leadership qualities for guiding junior team members and influencing best practices

  • Ability to adapt to evolving industry standards, tools, and compliance regulations

  • Results-driven with a focus on data quality, security, and operational reliability

Synechron's Diversity & Inclusion Statement

Diversity and inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative, ‘Same Difference’, is committed to fostering an inclusive culture that promotes equality, diversity, and an environment respectful to all. As a global company, we strongly believe that a diverse workforce helps build stronger, more successful businesses. We encourage applicants of all backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.

All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

Top Skills

AWS
Big Data
Hive
Hugging Face
LangChain
Llama
OBIEE
Oracle SQL
PL/SQL
PySpark
Python
SAS
Scala
Spark
Tableau
Teradata
Unix Shell Scripting


