Speechify

Senior Software Engineer, ML Platform & Data Acquisition (Python)

Posted 3 Days Ago
Remote
Senior level
The Senior Software Engineer, Data Acquisition is responsible for finding new sources of audio data, operating and extending the cloud infrastructure, collaborating with scientists, and contributing to the AI team's dataset roadmap.
The summary above was generated by AI

Mission

Speechify is the easiest way to listen to the world’s information. Articles on the web, documents in the cloud, books on your phone. We absorb it all and let you listen to it at your desk, on the go, at your own speed, and with tools that make learning easier, deeper, and faster.

What streaming services have done for audio entertainment, we’re doing for audio information. And whatever we’re doing seems to be working. We’re #1 in our category, and experiencing exponential growth.

Overview

We're hiring for the Data Acquisition side of our AI team at Speechify. This role is responsible for all aspects of data collection to support our model training operations. We build high-quality datasets at petabyte scale and low cost through a tight integration of infrastructure, engineering, and research work. We are looking for a skilled Senior Software Engineer to join us.

What You’ll Do

  • Be scrappy in finding new sources of audio data and bringing them into our ingestion pipeline.
  • Operate and extend the cloud infrastructure for our ingestion pipeline, currently running on GCP and managed with Terraform.
  • Collaborate closely with our Scientists to shift the cost/throughput/quality frontier, delivering richer data at bigger scale and lower cost to power our next-generation models.
  • Collaborate with others on the AI Team and Speechify Leadership to craft the AI Team’s dataset roadmap to power Speechify’s next-generation consumer and enterprise products.

An Ideal Candidate Should Have

  • BS/MS/PhD in Computer Science or a related field.
  • 5+ years of industry experience in software development.
  • Proficiency with bash/Python scripting in Linux environments.
  • Proficiency in Docker and Infrastructure-as-Code concepts, and professional experience with at least one major cloud provider (we use GCP).
  • Experience with web crawlers and large-scale data processing workflows is a plus.
  • Ability to handle multiple tasks and adapt to changing priorities.
  • Strong communication skills, both written and verbal.

What We Offer

  • A fast-growing environment where you can help shape the company and product.
  • An entrepreneurial-minded team that supports risk, intuition, and hustle.
  • A hands-off management approach so you can focus and do your best work.
  • An opportunity to make a big impact in a transformative industry.
  • Competitive salaries, a friendly and laid-back atmosphere, and a commitment to building a great asynchronous culture.
  • Opportunity to work on a life-changing product that millions of people use.
  • Build products that directly impact and support people with learning differences like dyslexia, ADD, low vision, concussions, autism, and more.
  • Work in one of the fastest-growing sectors of tech, the intersection of artificial intelligence and audio.

 

Think you’re a good fit for this job? 

Tell us more about yourself and why you're interested in the role when you apply.
And don’t forget to include links to your portfolio and LinkedIn.

Not looking but know someone who would make a great fit? 

Refer them! 

Speechify is committed to a diverse and inclusive workplace. 

Speechify does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Top Skills

Python
