Lead Data Engineer (GenAI / LLM Applications)

Clario

Reposted 7 Days Ago

In-Office or Remote

Hiring Remotely in Bangalore, Bengaluru Urban, Karnataka

Senior level

The role involves designing and maintaining scalable data architectures and pipelines, collaborating with teams to deliver data-driven solutions, and optimizing complex SQL across various databases.

The summary above was generated by AI

We are looking for a skilled and motivated Lead Engineer to join our Data Science and Delivery group at Clario, a part of Thermo Fisher Scientific. This role combines software development, data engineering, and analytical problem‑solving to design, build, and maintain scalable data platforms that support clinical trial operations and business intelligence. You will work across the full software development lifecycle (SDLC)—from requirements gathering through production support—collaborating closely with data scientists, analysts, product managers, and engineering teams to deliver high‑quality, data‑driven solutions.

What We Offer

Competitive compensation aligned with local market practices
Comprehensive health and wellness benefits
Paid time off and company holidays
Opportunities for professional development, learning, and career growth
The flexibility of working from Bangalore or remotely within India, while collaborating with global teams

What You’ll Be Doing

Design, develop, and maintain scalable software architectures and data pipelines that integrate with analytical and operational systems.
Write clean, reusable, and well‑tested Python code using frameworks such as Flask and related libraries.
Leverage AI‑assisted development tools, including GitHub Copilot and LangChain, to design, build, and integrate LLM‑powered solutions such as retrieval‑augmented generation (RAG) pipelines, intelligent agents, and automated workflows using AWS Bedrock or similar services.
Develop and optimize complex SQL across Oracle, MS SQL Server, PostgreSQL, and Snowflake, including procedures, functions, views, analytical functions, and dynamic SQL.
Design and implement ETL pipelines using Snowflake and related data processing technologies.
Implement scheduling and orchestration using Apache Airflow or similar workflow orchestration frameworks.
Establish and maintain data quality frameworks, versioning, and governance practices to ensure data reliability, integrity, and compliance.
Develop and maintain data architectures and models for both structured and unstructured data sources.
Troubleshoot production issues and drive continuous improvement in software quality, performance, and reliability.
Deploy, manage, and support solutions on AWS, including storage, compute, and pipeline services.
Create source‑to‑target mappings and support data and code migration initiatives.
Partner with stakeholders to gather requirements, translate business needs into technical solutions, and produce clear, well‑structured documentation.
Collaborate with product managers, analysts, and cross‑functional teams to deliver data‑driven insights and reporting using tools such as Plotly and Power BI.

What We Look For

Bachelor’s or higher degree in Computer Science, Information Technology, or a related technical field.
5+ years of professional experience in software engineering, data engineering, or data‑focused development roles.
Strong proficiency in Python, including frameworks and libraries such as Django or Flask, pandas, NumPy, Plotly, and ag‑Grid.
Strong SQL expertise with Oracle, MS SQL Server, PostgreSQL, and/or Snowflake.
Proven experience writing complex SQL, including analytical and window functions, subqueries, all join types, DML/DDL/TCL statements, CASE expressions, and performance tuning.
Working knowledge of cloud platforms, with a preference for AWS (S3, EC2, Secrets Manager, Bedrock, Lambda).
Experience using AI‑assisted development tools and frameworks such as GitHub Copilot and LangChain for building LLM‑powered applications and workflows.
Experience with Git‑based version control systems and CI/CD pipelines.
Familiarity with data modeling concepts for both structured and unstructured data.
Strong analytical thinking, problem‑solving abilities, and communication skills.
Willingness to work across all phases of the SDLC, including requirements gathering, design, development, deployment, and production support.
Preferred: exposure to the clinical trial lifecycle or clinical data management; experience with data visualization tools (Plotly, Power BI), front‑end technologies (HTML5, CSS3, JavaScript), and collaboration tools (Jira, Confluence, Microsoft Teams); and hands‑on data analysis or data cleansing using programming languages, SQL, and Excel.

At Clario, our purpose is to transform lives by unlocking better evidence. It’s a cause that unites and inspires us. It’s why we come to work—and how we empower our people to make a positive impact every day. Whether you’re starting your clinical data career or building long‑term expertise, your work helps bring life‑changing therapies to patients faster.
