Why Word Error Rate Alone Can't Tell You How Good Your Voice AI Really Is
Word Error Rate (WER) has long been the gold standard for measuring voice AI accuracy. But as voice technology evolves from transcription to conversation, WER alone tells an increasingly incomplete story.
What WER Measures
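WER is the number of word-level errors divided by the number of words in the reference transcript, expressed as a percentage (it can exceed 100% when the output contains many extra words). Three kinds of error count: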
- Substitutions (wrong word)
- Insertions (extra words)
- Deletions (missing words)
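To make the definition concrete, here is a minimal sketch of a WER calculation in Python. It assumes the standard formulation, (substitutions + insertions + deletions) divided by the number of reference words, computed with a word-level edit distance. The function name `word_error_rate` and the simple lowercase-and-split tokenization are illustrative assumptions, not a description of any particular vendor's scoring pipeline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: (S + I + D) / number of reference words.

    Assumption: tokens are produced by lowercasing and splitting on
    whitespace. Real scoring tools usually normalize text more carefully.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word is missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # Two insertions against a one-word reference: WER above 100%.
    print(word_error_rate("yes", "oh yes yes"))
```

Note that the score depends only on the words. Nothing about how they were spoken enters the calculation.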
A 5% WER sounds impressive, and for transcription, it is. But voice AI has moved beyond transcription.
What WER Misses
Emotional Context
"That's fine" transcribed correctly has 0% WER. But was it said with enthusiasm or sadness? That emotional context determines meaning far more than the words themselves.
Prosody
Pitch, pace, and pauses carry meaning. A customer who pauses before saying "yes" is communicating something different from one who responds immediately. WER captures none of this, because both answers produce the same transcript.
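One way to see what the transcript leaves out is to look at timing. The sketch below assumes a speech recognizer that returns per-word start and end times in seconds. The `Word` type, the `response_latency` and `long_pauses` helpers, and the 0.5-second threshold are all hypothetical choices made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Word:
    """A recognized word with hypothetical start/end times in seconds."""
    text: str
    start: float
    end: float


def response_latency(question_end: float, answer: list[Word]) -> float:
    """Seconds between the end of the question and the first answer word."""
    if not answer:
        raise ValueError("answer must contain at least one word")
    return answer[0].start - question_end


def long_pauses(words: list[Word], threshold: float = 0.5) -> list[tuple[str, str, float]]:
    """Return (previous word, next word, gap) for gaps of at least threshold seconds."""
    pauses = []
    for before, after in zip(words, words[1:]):
        gap = after.start - before.end
        if gap >= threshold:
            pauses.append((before.text, after.text, round(gap, 3)))
    return pauses


if __name__ == "__main__":
    # Both answers transcribe to "yes", so WER treats them identically.
    immediate = [Word("yes", 10.2, 10.5)]
    hesitant = [Word("yes", 12.4, 12.7)]
    print(response_latency(10.0, immediate))
    print(response_latency(10.0, hesitant))
```

Both answers score 0% WER against the reference "yes". The difference between them lives entirely in the timestamps.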
Conversational Flow
Natural conversation includes "um," "uh," and incomplete thoughts. These aren't errors; they're signals. Systems that penalize natural speech patterns miss crucial information.
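As a rough sketch of treating fillers as signals rather than noise, the snippet below counts them instead of stripping them out. The filler list, the `disfluency_profile` function, and the fields it returns are assumptions for illustration. A production system would need a vocabulary and tokenizer suited to its language and audience.

```python
import re
from collections import Counter

# Assumption: a small, English-only filler vocabulary for illustration.
FILLERS = {"um", "uh", "er", "hmm"}


def disfluency_profile(transcript: str) -> dict:
    """Summarize how often filler words appear in a verbatim transcript."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    fillers = Counter(token for token in tokens if token in FILLERS)
    filler_count = sum(fillers.values())
    return {
        "total_words": len(tokens),
        "filler_count": filler_count,
        # Share of all tokens that are fillers; 0.0 for an empty transcript.
        "filler_rate": filler_count / len(tokens) if tokens else 0.0,
        "by_filler": dict(fillers),
    }


if __name__ == "__main__":
    print(disfluency_profile("Um, I think, uh, it's fine. Um."))
```

A transcript cleaned of fillers would score well on WER against a cleaned reference, yet this profile would come back empty.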
Better Metrics for Voice AI
Emotional Accuracy
Does the system correctly identify the emotional tone of responses?
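One simple way to put a number on this question is to compare the system's emotion labels with labels assigned by human listeners. The sketch below assumes one label per response and uses plain accuracy plus per-emotion recall. The function names and the example labels are hypothetical, and this is not a description of how any specific product scores itself.

```python
from collections import defaultdict


def emotion_accuracy(human_labels: list, predicted_labels: list) -> float:
    """Fraction of responses where the predicted emotion matches the human label."""
    if len(human_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("at least one labeled response is required")
    correct = sum(h == p for h, p in zip(human_labels, predicted_labels))
    return correct / len(human_labels)


def per_emotion_recall(human_labels: list, predicted_labels: list) -> dict:
    """For each human-assigned emotion, the share the system also identified."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for human, predicted in zip(human_labels, predicted_labels):
        totals[human] += 1
        if human == predicted:
            hits[human] += 1
    return {label: hits[label] / totals[label] for label in totals}


if __name__ == "__main__":
    human = ["hesitation", "enthusiasm", "anger", "enthusiasm"]
    predicted = ["hesitation", "enthusiasm", "enthusiasm", "enthusiasm"]
    print(emotion_accuracy(human, predicted))
    print(per_emotion_recall(human, predicted))
```

Per-emotion recall is worth reporting alongside overall accuracy, since a system can score well overall while missing a rare but important emotion such as anger.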
Intent Recognition
Does it understand what the speaker means, not just what they said?
Conversational Quality
Does it feel natural to talk to? Do users open up or shut down?
The ReadingMinds Approach
We measure what matters for customer insight:
- Expression signal accuracy: Can Emma identify hesitation, enthusiasm, anger?
- Probe relevance: Does she ask the right follow-up questions?
- Insight quality: Do interviews surface actionable understanding?
WER matters for getting the words right. But for voice-based customer research, understanding what the speaker means matters more than word-level accuracy alone.
Experience voice-based research yourself: try the 3-Minute Live Test Drive or see what a report looks like.
About the author

Stu Sjouwerman
CEO and Co-Founder, ReadingMinds.AI
Stu founded KnowBe4 in 2010 and grew it into the world's largest security-awareness training platform before its acquisition by Vista Equity Partners in 2023. He co-founded ReadingMinds with Marcio Castilho and Alin Irimie, the same leadership team that built KnowBe4. Author of the USA Today bestseller Agent-Powered Growth and a regular contributor to Forbes Tech Council and Greenbook on AI, agentic marketing, and customer intelligence.