The Data Engineer III will build and maintain cloud-native data infrastructure, focusing on ETL pipelines, CI/CD, and data governance. Responsibilities include evolving data models, automating deployments, and ensuring data quality.
For nearly 40 years, PDI has helped convenience retailers and petroleum wholesalers adapt to changes in the industry by leveraging the latest technologies. Simplifying the complexity in your world is our main focus. That's why we're delivering an integrated portfolio of global, cloud-based solutions and services to meet our customers' needs today and well into the future.
From the back office to fuel logistics and digital commerce, PDI solutions deliver measurable value across the supply chain. We are proud to support over 1,500 customers in 50+ countries, powering 200,000+ sites worldwide.
The Opportunity: We’re looking for a seasoned Data Engineer III who is passionate about building scalable and resilient cloud-native data infrastructure, with a focus on governance, CI/CD, automation, and platform maturity. You will be a key contributor in evolving our modern data stack, ensuring operational excellence and code quality across ETL pipelines, metadata frameworks, and real-time/batch data services.
You’ll work at the intersection of data engineering, DevOps, and governance, setting standards across code repositories, orchestrators (Airflow), compute layers (Glue/EMR), and ingestion tools (DMS, Kafka, etc.).
Key Responsibilities:
- Maintain and evolve OLTP (Postgres) and OLAP (Redshift) data models and data lakes by evaluating new feature requirements, ensuring alignment with dimensional modeling best practices, and executing schema changes via Liquibase pipelines.
- Develop and maintain metadata-driven data pipeline frameworks that support validation, logging, auditing, and job orchestration.
- Standardize and govern Bitbucket/Git repositories, manage branching strategies, and enforce code review and CI pipelines for ETL/data jobs.
- Design and implement CI/CD workflows for data services using tools like Jenkins, Liquibase, and Shell/Python scripting.
- Support automated deployment of ETL, Airflow DAGs, Glue jobs, and DB schema changes across environments (QA, Stage, Prod).
- Collaborate with DataOps and DevOps teams to maintain infrastructure as code (IaC) standards and shared configuration patterns.
- Build and scale data quality frameworks, including pre/post validations, job restorability, and alerting (CloudWatch, SNS).
- Implement data masking and access control standards (RBAC, column-level masking) across Redshift and Iceberg.
- Optimize DMS/Kafka-based CDC pipelines and help reduce dependency through automation or zero-ETL patterns.
- Define standards for data retention, archival, and operational efficiency across OLTP/OLAP environments.
- Partner with data engineers and analysts to align platform standards with business needs and analytical readiness.
Qualifications:
- 8+ years of experience in data engineering or platform engineering with exposure to production-grade data pipelines and systems.
- Deep expertise in Python and SQL, with strong understanding of pipeline design patterns and modular codebases.
- 3+ years of experience with CI/CD tooling (e.g., Jenkins, Liquibase, Bitbucket Pipelines) and managing deployment pipelines for data workloads.
- Solid understanding of AWS cloud services: S3, Glue, Redshift, DMS, Lambda, EMR, IAM, CloudWatch.
- Experience with workflow orchestration tools like Airflow (DAG scheduling, dependency mapping, alerts).
- Hands-on experience maintaining data lakehouse platforms (e.g., Apache Iceberg, Delta Lake) and managing batch vs. streaming ingestion.
- Experience managing schema changes, migrations, and rollback strategies across databases (Postgres, Redshift).
- Strong understanding of data security practices, including PII masking, row/column-level controls, and audit logging.
- Familiarity with dimensional modeling and differences between OLTP vs. OLAP patterns.
- Strong documentation and process-driven mindset to define standards and maintain operational transparency.
Behavioral Competencies:
- Ensures Accountability
- Manages Complexity
- Communicates Effectively
- Balances Stakeholders
- Collaborates Effectively
PDI is committed to offering a well-rounded benefits program designed to support and care for you and your family throughout your life and career. This includes a competitive salary, market-competitive benefits, and a quarterly perks program. We encourage a good work-life balance with ample time off and, where appropriate, hybrid working arrangements. Employees have access to continuous learning, professional certifications, and leadership development opportunities. Our global culture fosters diversity and inclusion and values authenticity, trust, curiosity, and diversity of thought, ensuring a supportive environment for all.
Top Skills
Airflow
Apache Iceberg
AWS
Bitbucket
CloudWatch
Delta Lake
DMS
EMR
Glue
Jenkins
Lambda
Liquibase
Python
Redshift
S3
SQL