June 2, 2026
Microsoft has released MAI-Transcribe-1.5: an exceptionally fast speech transcription model at a speed factor of ~276x, while still achieving 2.4% on AA-WER (#3), leading the accuracy-speed Pareto frontier
MAI-Transcribe-1.5 is the latest speech transcription model from Microsoft AI (MAI). It ranks 3rd overall on the Artificial Analysis Word Error Rate (AA-WER) leaderboard at 2.4% WER, behind Alibaba’s Fun-Realtime-ASR-preview (1.7% WER) and ElevenLabs Scribe v2 (2.2% WER). The model stands out as the fastest STT model in the top 10 for accuracy, processing audio at ~276x real-time, more than double the speed of the second-fastest model in that group.
The new model supports keyword biasing (improved recognition of rarer vocabulary such as names and medical terminology) and 43 languages, including English, French, Arabic, Japanese, and Chinese.
See more details below ⬇️

MAI-Transcribe-1.5 ranks 2nd on VoxPopuli-Cleaned-AA (1.6% WER), 4th on Earnings22-Cleaned-AA (4.0% WER), and 5th on AA-AgentTalk (2.0% WER).
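
For readers less familiar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of reference words. The following is a minimal Python sketch of that conventional definition. It assumes lowercase, whitespace-based tokenization, and the function name `wer` is illustrative; the text normalization and dataset weighting used for AA-WER are not described here and may differ.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Conventional word error rate: word-level edit distance / reference length."""
    # Assumption: simple lowercase + whitespace tokenization, not AA-WER's normalizer.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of six reference words is dropped.
    print(wer("the cat sat on the mat", "the cat sat on mat"))
```

On this definition, a 2.0% WER corresponds to roughly one word error per 50 reference words.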

MAI-Transcribe-1.5 is the fastest of the top 10 models for accuracy, leading the accuracy-speed Pareto frontier with a speed factor of ~276x.

MAI-Transcribe-1.5 is available at $6 per 1,000 minutes of audio via Microsoft Foundry.
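
To put the speed factor and price in concrete terms, here is a back-of-the-envelope sketch in Python. It assumes that the speed factor means seconds of audio transcribed per second of processing, and that the price scales linearly with audio duration, with no minimums, rounding, or volume discounts. The function names are illustrative and are not part of any Microsoft API.

```python
SPEED_FACTOR = 276.0          # approximate speed factor reported above
PRICE_PER_1000_MIN_USD = 6.0  # price per 1,000 minutes of audio reported above


def processing_seconds(audio_minutes: float, speed_factor: float = SPEED_FACTOR) -> float:
    """Estimated processing time in seconds for a given audio duration."""
    return audio_minutes * 60.0 / speed_factor


def cost_usd(audio_minutes: float, price_per_1000_min: float = PRICE_PER_1000_MIN_USD) -> float:
    """Estimated cost in USD, assuming purely linear per-minute pricing."""
    return audio_minutes / 1000.0 * price_per_1000_min


if __name__ == "__main__":
    for minutes in (1, 60, 1000):
        print(
            f"{minutes} min of audio: "
            f"~{processing_seconds(minutes):.1f} s to process, "
            f"~${cost_usd(minutes):.3f}"
        )
```

Under these assumptions, one hour of audio takes roughly 13 seconds to process and costs about $0.36.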

See full results on the Artificial Analysis Speech to Text leaderboard: https://artificialanalysis.ai/speech-to-text
See our new Streaming Speech to Text leaderboard: https://artificialanalysis.ai/speech-to-text/streaming