Deepgram Launches Flux - The World’s First Conversational Speech Recognition Model

2:30 PM on Thursday, October 2

The Associated Press

SAN FRANCISCO--(BUSINESS WIRE)--Oct 2, 2025--

VapiCon 2025 – Deepgram, the world’s most realistic and real-time Voice AI platform, today announced at VapiCon 2025 the launch of Flux, the world’s first conversational speech recognition (CSR) model designed specifically for real-time voice agents. Unlike traditional automatic speech recognition (ASR), which was built for transcription use cases like captions or meeting notes, Flux is trained to understand the nuances of dialogue. It doesn’t just capture what was said. It knows when a speaker has finished, when to respond, and how to keep the flow of conversation natural and engaging.

The global voice AI agents market is projected to reach nearly $47.5 billion by 2034, growing at a compound annual rate of about 34.8%. This growth is driven primarily by the enterprise shift toward automated customer self-service, smarter agent assist tools, and embedded conversational experiences across industries. But traditional speech-to-text (STT) systems weren’t designed to participate in live dialogue. To recreate conversational flow, developers have been forced to piece together transcription, voice activity detection, and turn-taking logic: a patchwork that leads to latency, errors, and frustrating user experiences.

Flux eliminates these problems by embedding turn-taking directly into recognition. It transforms speech recognition from simply transcribing words to modeling the flow of dialogue itself. This provides developers with the tools to build responsive, human-like voice agents without the complexity of workaround code or endless threshold tuning.
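For illustration only, the kind of client-side workaround described above can be sketched as a fixed silence-timeout end-of-turn detector. This is not Deepgram code and not part of Flux; the class name, parameters, and default values below are all hypothetical, chosen to show why threshold tuning is awkward.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SilenceEndpointer:
    """Hypothetical client-side turn detector using a fixed silence timeout."""
    energy_threshold: float = 0.02   # assumed: frame energy at or above this is speech
    silence_timeout_ms: int = 700    # assumed: silence required to declare end of turn
    frame_ms: int = 20               # assumed: duration of one audio frame
    _silence_ms: int = 0
    _in_turn: bool = False

    def push(self, frame_energy: float) -> Optional[str]:
        """Feed one frame's energy; return an event name or None."""
        if frame_energy >= self.energy_threshold:
            self._silence_ms = 0
            if not self._in_turn:
                self._in_turn = True
                return "start_of_turn"
            return None
        if self._in_turn:
            self._silence_ms += self.frame_ms
            if self._silence_ms >= self.silence_timeout_ms:
                self._in_turn = False
                self._silence_ms = 0
                return "end_of_turn"
        return None


def run(energies: List[float], **kwargs) -> List[Tuple[int, str]]:
    """Run the detector over per-frame energies; return (time_ms, event) pairs."""
    ep = SilenceEndpointer(**kwargs)
    events = []
    for i, energy in enumerate(energies):
        event = ep.push(energy)
        if event:
            events.append((i * ep.frame_ms, event))
    return events


# One utterance containing a 400 ms mid-sentence pause, then 1 s of silence.
energies = [0.5] * 10 + [0.0] * 20 + [0.5] * 10 + [0.0] * 50
print(run(energies, silence_timeout_ms=700))  # one turn, detected late
print(run(energies, silence_timeout_ms=300))  # the pause splits it into two turns
```

With the longer timeout the pause is tolerated, but the end of the turn is reported only after the full timeout has elapsed. With the shorter timeout the response could start sooner, but the speaker is cut off mid-sentence. That trade-off is the tuning problem the paragraph above refers to.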

What Flux Delivers:

Embedded turn-taking intelligence - Conversation-aware recognition that handles timing inside the model itself, with context-aware turn detection and native barge-in handling for fluid exchanges.
Lightning-fast performance - Ultra-low latency where it matters most with ~260ms end-of-turn detection, plus distinct events to support eager response generation before a turn is complete.
Simpler development - Turn-complete transcripts and structured conversational cues replace fragile client-side logic, so teams can ship production-ready agents in weeks, not months.
Enterprise-ready scalability - Nova-3 level accuracy, GPU-efficient concurrency with 100+ streams per GPU, and predictable costs that avoid the hidden overhead of bolted-on systems.
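As a rough sketch of how an agent might use distinct events for eager response generation, the example below prepares a reply speculatively and commits it only once the turn is confirmed complete. The event names (`eager_end_of_turn`, `turn_resumed`, `end_of_turn`) and the `EagerReplyPlanner` class are assumptions made for illustration; they are not taken from Deepgram's documentation.

```python
from typing import Callable, List, Optional, Tuple


class EagerReplyPlanner:
    """Hypothetical consumer of turn events; event names are illustrative only."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self.generate = generate  # e.g. a call into a language model
        self.draft: Optional[Tuple[str, str]] = None  # (transcript, reply)
        self.spoken: List[str] = []

    def on_event(self, kind: str, transcript: str = "") -> None:
        if kind == "eager_end_of_turn":
            # Speculatively prepare a reply before the turn is confirmed complete.
            self.draft = (transcript, self.generate(transcript))
        elif kind == "turn_resumed":
            # The speaker kept talking, so the speculative reply is discarded.
            self.draft = None
        elif kind == "end_of_turn":
            # Reuse the draft only if it was built from the same transcript.
            if self.draft is not None and self.draft[0] == transcript:
                reply = self.draft[1]
            else:
                reply = self.generate(transcript)
            self.spoken.append(reply)
            self.draft = None


calls: List[str] = []

def fake_generate(text: str) -> str:
    """Stand-in for a real reply generator; records what it was asked."""
    calls.append(text)
    return "re: " + text

planner = EagerReplyPlanner(fake_generate)
planner.on_event("eager_end_of_turn", "I want to book")
planner.on_event("turn_resumed")
planner.on_event("eager_end_of_turn", "I want to book a flight")
planner.on_event("end_of_turn", "I want to book a flight")
print(planner.spoken)
```

In this sketch the first draft is thrown away when the speaker resumes, and the second draft is reused at the confirmed end of turn, so no extra generation call is needed at that point.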

“At Vapi, our mission has always been to give engineering teams a platform to build their conversational front-door,” said Jordan Dearsley, Founder, CEO, Vapi. “Deepgram’s launch of Flux is a perfect example of that vision coming to life. By embedding turn-taking directly into recognition, Flux solves one of the hardest challenges in conversational AI. We’re thrilled Deepgram chose VapiCon to introduce this breakthrough, and we can’t wait to see the incredible voice agents developers create with it.”

“Flux redefines what speech recognition can do for real-time AI,” said Scott Stephenson, CEO and Co-Founder, Deepgram. “For decades, ASR was built to listen and record. Flux is different — it listens, understands, and guides conversations with human-like timing. It’s the foundation voice agents have been waiting for and is our latest milestone towards solving the Audio Turing Test.”

"At Lindy, our mission is to build the world's most capable AI employees, and voice is a big part of this," said Flo Crivello, Founder and CEO, Lindy. "Deepgram has been our partner of choice since the earliest days, and Flux brings things to the next level: there is simply nothing coming close on the market in terms of latency or conversation awareness. It's enabled us to deliver the smoothest, most natural, interruption-free conversations for our customers."

Who It’s For:

Voice AI builders - Developers, engineering leads, and AI teams creating real-time agents.
Enterprise innovators - Leaders modernizing customer experience with agent assist and conversational AI platforms.
Ecosystem partners - Platform providers, consultancies, and cloud architects looking to integrate CSR into larger AI stacks.

Flux is generally available (GA) today. Developers can start building with CSR immediately.

To celebrate the launch, Deepgram is announcing OktoberFLUX – making Flux FREE to use for the entire month of October. Developers can use Flux to build and test real-time voice agents at no cost, with support for up to 50 concurrent connections. The goal: remove every barrier to experimentation so teams can experience how conversational speech recognition changes what’s possible in voice AI. To learn more, please visit: https://deepgram.com/flux

About Deepgram

Deepgram is the world’s most realistic and real-time Voice AI platform, offering speech-to-text (STT), text-to-speech (TTS), and full speech-to-speech (STS) capabilities, all powered by our enterprise-grade runtime. 200,000+ developers build with Deepgram’s voice-native foundational models – accessed through cloud APIs or as self-hosted / on-premises APIs – due to our unmatched accuracy, low latency, and pricing. Customers include technology ISVs building voice products or platforms, co-sell partners working with large enterprises, and enterprises solving internal use cases. Having processed over 50,000 years of audio and transcribed over 1 trillion words, Deepgram understands voice better than any other organization in the world. To learn more, visit www.deepgram.com, read our developer docs, or follow @DeepgramAI on X and LinkedIn.

View source version on businesswire.com: https://www.businesswire.com/news/home/20251002758871/en/

CONTACT: PR Contact:

Nicole Gorman

Gorman Communications, for Deepgram

M: 508-397-0131

KEYWORD: UNITED STATES NORTH AMERICA CALIFORNIA

INDUSTRY KEYWORD: SOFTWARE TECHNOLOGY ARTIFICIAL INTELLIGENCE VOIP

SOURCE: Deepgram

PUB: 10/02/2025 02:30 PM/DISC: 10/02/2025 02:29 PM