H1

Sr. Big Data Engineer

Posted 9 days ago (reposted 9 days ago)
Remote
Hiring Remotely in India
Senior level
Design and maintain scalable data pipelines for complex datasets, enhance data processing systems, and collaborate across teams to ensure data quality and performance optimization.
The summary above was generated by AI
At H1, we believe access to the best healthcare information is a basic human right. Our mission is to provide a platform that can optimally inform every doctor interaction globally. This promotes health equity and builds needed trust in healthcare systems. To accomplish this, our teams harness the power of data and AI technology to unlock groundbreaking medical insights and convert those insights into actions that result in optimal patient outcomes and accelerate an equitable and inclusive drug development lifecycle. Visit h1.co to learn more about us.

Data Engineering is responsible for the development and delivery of our most important asset—our data. With thousands of data sources from around the world, the team ensures that data is accurate, normalized, and delivered at a velocity that keeps up with real-world changes. As we expand our markets and the scope of data we provide to our customers, our team must scale to meet that demand.

WHAT YOU'LL DO AT H1
We’re looking for a seasoned Senior Data Engineer who is operating at a high level and is either ready or nearly ready to step into a Staff-level individual contributor role. You will take ownership of designing and scaling the systems and pipelines that power H1’s data platform. You will work cross-functionally with other engineers, product managers, and stakeholders to deliver high-performance, reliable, and maintainable data solutions. This is an opportunity to play a key role in shaping the future of our data infrastructure while mentoring others and driving best practices.

You will:
- Design, develop, and maintain scalable data extraction frameworks that ingest structured and unstructured data from diverse sources.
- Build and optimize robust ETL/ELT pipelines using big data technologies, especially Apache Spark on cloud platforms (preferably AWS EMR).
- Improve the efficiency, reliability, and performance of data processing systems through thoughtful design and continuous optimization.
- Transform, clean, and normalize complex datasets for downstream use, ensuring high standards of data quality and consistency.
- Partner with senior engineers to evolve H1’s data architecture and infrastructure in support of product and platform scalability.
- Lead data integration efforts across multiple systems, ensuring accuracy and seamless collaboration across teams.
- Monitor and troubleshoot data flows and pipelines, proactively identifying and resolving performance issues.
- Maintain clear documentation of systems, workflows, and processes to promote transparency and operational excellence.
- Participate in code reviews and promote a culture of engineering excellence, mentorship, and continuous improvement.
- Collaborate closely with cross-functional teams to align technical execution with business goals.

ABOUT YOU

You are a seasoned data engineer with a track record of building and maintaining large-scale data systems. You’re excited by the opportunity to work on complex problems, enjoy collaborative work, and are passionate about building high-quality, performant solutions that impact real-world healthcare outcomes.

- You have an understanding of Large Language Models (LLMs) and their applications.
- It’s a bonus if you’re familiar with model training and fine-tuning, particularly in NLP (Natural Language Processing) contexts.
- You possess a basic knowledge of network, security, and encryption protocols such as HTTP/HTTPS/TLS.
- You’re able to work collaboratively across teams and communicate effectively with both technical and non-technical stakeholders.
- You have strong analytical and problem-solving skills with a focus on data quality and performance optimization.
- You have a passion for writing clean, efficient code and following best practices.

REQUIREMENTS

- 6+ years of experience in data engineering, working with large-scale data systems and pipelines.
- Proficiency in programming languages such as Python, Java, or similar.
- Strong SQL skills, including the ability to write optimized, complex queries for large datasets using advanced SQL features such as GROUP BY, HAVING, window functions, and complex joins.
- Experience with big data tools like Apache Spark, particularly on cloud platforms, with a preference for AWS EMR.
- Experience with Docker or other containerization technologies.


Not meeting all the requirements but still feel like you’d be a great fit? Tell us how you can contribute to our team in a cover letter! 

H1 OFFERS
- Full suite of health insurance options, in addition to generous paid time off
- Pre-planned company-wide wellness holidays
- Retirement options
- Health & charitable donation stipends
- Impactful Business Resource Groups
- Flexible work hours & the opportunity to work from anywhere
- The opportunity to work with leading biotech and life sciences companies in an innovative industry with a mission to improve healthcare around the globe

Top Skills

Spark
AWS
Docker
Java
Python
SQL
