Sopra Steria

PySpark Technical Lead

Job Posted 11 Days Ago Reposted 11 Days Ago
Chennai, Tamil Nadu
Senior level
As a Data Engineer, collaborate with Data Scientists to design and implement machine learning pipelines. Utilize PySpark for data preparation and use AWS EMR and S3 for data storage. Manage ETL workflows with StreamSets and optimize pipelines for performance and reliability. Ensure secure data access through proper IAM configurations.
The summary above was generated by AI

Company Description

About Sopra Steria
Sopra Steria, a major Tech player in Europe with 56,000 employees in nearly 30 countries, is recognized for its consulting, digital services and software development. It helps its clients drive their digital transformation and obtain tangible and sustainable benefits. The Group provides end-to-end solutions to make large companies and organizations more competitive by combining in-depth knowledge of a wide range of business sectors and innovative technologies with a fully collaborative approach. Sopra Steria places people at the heart of everything it does and is committed to putting digital to work for its clients in order to build a positive future for all. In 2023, the Group generated revenues of €5.8 billion.
The world is how we shape it.

Job Description

We are seeking a highly skilled and motivated Data Engineer to join our dynamic team. As a Data Engineer, you will collaborate closely with our Data Scientists to develop and deploy machine learning models. Proficiency in the skills listed below will be crucial in building and maintaining pipelines for training and inference datasets.

Responsibilities: 

• Work in tandem with Data Scientists to design, develop, and implement machine learning pipelines. 

• Utilize PySpark for data processing, transformation, and preparation for model training. 

• Leverage AWS EMR and S3 for scalable and efficient data storage and processing. 

• Implement and manage ETL workflows using StreamSets for data ingestion and transformation.

• Design and construct pipelines to deliver high-quality training and inference datasets. 

• Collaborate with cross-functional teams to ensure smooth deployment and real-time/near real-time inferencing capabilities. 

• Optimize and fine-tune pipelines for performance, scalability, and reliability. 

• Ensure IAM policies and permissions are appropriately configured for secure data access and management. 

• Implement Spark architecture and optimize Spark jobs for scalable data processing. 

 

Total Experience Expected: 6-8 years

Qualifications

Professional degree

Additional Information

Requirements: 

Mandatory

• Proficiency in advanced SQL (window functions), Spark architecture, PySpark or Scala with Spark, and Hadoop.

• Proven expertise in designing and deploying data pipelines.

• Strong problem-solving skills and ability to work effectively in a collaborative team environment. 

• Excellent communication skills and ability to translate technical concepts for non-technical stakeholders.

Desirable

• Hands-on experience with Airflow, S3, and StreamSets or similar ETL tools (can be trained locally).

• Understanding of real-time or near real-time inferencing architectures. 

• Basic knowledge of Kafka, AWS IAM, AWS EMR, and Snowflake.

At our organization, we are committed to fighting against all forms of discrimination. We foster a work environment that is inclusive and respectful of all differences.

All of our positions are open to people with disabilities.

Top Skills

Advanced SQL
Airflow
AWS EMR
Hadoop
Kafka
PySpark
S3
Scala
Snowflake
StreamSets

Sopra Steria Chennai, Tamil Nadu, IND Office

2/G-2 SIPCOT IT Park, Siruseri 603103, Kelambakkam (Off Chennai), Kanchipuram District, Chennai, Tamil Nadu, India
