Uniphore is one of the largest B2B AI-native companies—decades-proven, built-for-scale and designed for the enterprise. The company drives business outcomes, across multiple industry verticals, and enables the largest global deployments.
Uniphore infuses AI into every part of the enterprise that impacts the customer. We deliver the only multimodal architecture centered on customers that combines Generative AI, Knowledge AI, Emotion AI, workflow automation and a co-pilot to guide you. We understand better than anyone how to capture voice, video and text and how to analyze all types of data.
As AI becomes more powerful, every part of the enterprise that impacts the customer will be disrupted. We believe the future will run on the connective tissue between people, machines and data: all in the service of creating the most human processes and experiences for customers and employees.
Job Description:
Uniphore is The Business AI Company. We enable businesses to rapidly adopt, significantly transform, and immediately unlock value through AI, by providing a platform that allows business users to effortlessly harness agentic AI, tapping into enterprise knowledge that is grounded in their own proprietary data. We enable the Global top 1000 customers to achieve their business outcomes through our core principles of providing composable, sovereign and secure AI.
Key Responsibilities:
- Develop production-grade ASR and Speech AI systems end to end for real-time voice agents.
- Design and optimize streaming architectures including end-pointing, end-of-turn detection, and conversational turn-taking.
- Fine-tune LLMs and build Retrieval-Augmented Generation (RAG) pipelines and agentic systems for enterprise conversational AI use cases.
- Drive integration and benchmarking of allied modules such as voice activity detection (VAD), language identification (LID), speaker diarization, and paralinguistic modeling (emotion, prosody).
- Own the benchmarking strategy against third-party ASR and LLM systems, and define evaluation frameworks (accuracy, latency, cost).
- Collaborate with engineering teams to deploy scalable, low-latency, and cost-efficient speech/NLP systems.
Qualifications:
- Master’s or Ph.D. in Computer Science, Electrical/Computer Engineering, or a related field.
- A minimum of about 2 years of industry experience in AI, NLP, vision, or ASR for candidates with a Master’s degree.
- Expertise in Python and deep learning frameworks such as PyTorch or TensorFlow, with experience using modern ML tooling (e.g., Hugging Face, vLLM).
- Experience optimizing streaming ASR (real-time factor (RTF), decoding strategies, end-pointing latency).
- Ability to drive benchmarking, system optimization, and deployment readiness.
- Strong publication record at leading conferences.
- [Good to have] Prior experience in pre-training, fine-tuning, agentic systems, and reinforcement learning.
Location preference:
Uniphore is an equal opportunity employer committed to diversity in the workplace. We evaluate qualified applicants without regard to race, color, religion, sex, sexual orientation, disability, veteran status, and other protected characteristics.
For more information on how Uniphore uses AI to unify—and humanize—every enterprise experience, please visit www.uniphore.com.



