StarTree

Site Reliability Engineer (Pinot)

Sorry, this job was removed at 08:43 a.m. (IST) on Tuesday, Feb 18, 2025
Remote
Hiring Remotely in India

At StarTree, we're a group of passionate individuals who want to improve the lives of many by developing tools and technologies that support availability and speed in the world of real-time analytics.

Our aim is to make it simple for every company to delight their users - external and internal - and create new revenue streams from their data, by building the world’s most comprehensive and accessible cloud analytics system.

About the role:

StarTree is seeking exceptional Site Reliability Engineers (SREs) to manage, tune, and debug large-scale, highly available distributed systems. You will work with a team of passionate and talented engineers on the automation, tuning, and troubleshooting of Apache Pinot and SQL databases. We are looking for motivated, hardworking, and focused individuals who have a real passion for operational excellence, data systems, and automation.

Responsibilities:

  • Leverage various monitoring and alerting services to solve intricate programming problems at scale.
  • Manage and tune multiple critical customer-facing Apache Pinot clusters
  • Monitor availability, read/write latencies, and other key telemetry to proactively identify SLO misses and help mitigate issues
  • Build rapport with customers and work closely with them to mitigate and resolve incidents
  • Execute disaster recovery strategies with minimal downtime
  • Collaborate with other engineers to understand and troubleshoot systems and use the experience gained to influence the roadmap of other teams

Requirements:

  • 5+ years of experience as an engineer (SRE, SDET, or development)
  • Experience managing highly available, production-facing distributed systems and in-depth knowledge of Java are a plus
  • Experience with cloud platforms such as AWS, GCP, or Azure
  • Experience with Kubernetes and container orchestration
  • Familiarity with streaming systems, such as Kafka, Pulsar, Flume, Flink, Spark, or similar
  • Knowledge of standard methodologies related to security, performance, and disaster recovery
  • Strong troubleshooting and critical thinking skills

About StarTree:  

StarTree is a cloud-based software company that enables business customers to derive advanced insights from real-time and historical data. StarTree was founded by the core software engineering team and inventors of Apache Pinot, which currently powers hundreds of user-facing applications at companies across industries, including LinkedIn, Uber, Target, 7Eleven, Etsy, Walmart, WePay, Factual, Weibo, and more. StarTree Cloud has enabled even more companies to deploy and operate real-time analytics at scale, including Stripe, Sovrn, Roadie, Just Eat Takeaway.com, Dialpad, Guitar Center, Blinkit, and more.

StarTree recently announced our Series B Funding with investment from GGV Capital, Sapphire Ventures, Bain Capital Ventures, and CRV. We have been named one of The Information's 50 Most Promising Startups and one of CRN's 10 Coolest Cloud Computing Startup Companies of 2022!
