April 3, 2026
MAI-Transcribe-1: Everything you need to know
Microsoft has released MAI-Transcribe-1, a speech transcription model that achieves 3.0% on AA-WER (#4 overall) and transcribes audio at roughly 69x real-time
The model was developed by the Superintelligence team at Microsoft AI (MAI) and supports 25 languages, including English, French, Arabic, Japanese, and Chinese. The MAI-Transcribe-1 API is currently available in public preview via Azure Speech on Microsoft Foundry.
On the Artificial Analysis Speech to Text (STT) leaderboard, MAI-Transcribe-1 achieves a 3.0% word error rate on AA-WER, placing it 4th overall for transcription accuracy behind Mistral’s Voxtral Small (2.9% AA-WER), Google’s Gemini 3.1 Pro High (2.9% AA-WER) and ElevenLabs’ Scribe v2 (2.3% AA-WER). It is also one of the faster high-accuracy transcription models available, processing audio at ~69x real-time.
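
For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The sketch below is a generic illustration of that calculation, not the AA-WER methodology: the function name and the simple lowercase-and-split normalization are assumptions made for the example, and the benchmark's own datasets and text normalization are not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Generic WER: word-level edit distance divided by reference length.

    Illustrative only. Normalization here is just lowercasing and
    whitespace splitting, which is an assumption for this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a 3.0% word error rate corresponds to about 3 word-level errors per 100 reference words.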

On speed, MAI-Transcribe-1 transcribes approximately 69 seconds of audio per second of processing, making it the fastest model in the top 5 by accuracy.

MAI-Transcribe-1 is available at $6 per 1,000 minutes of audio via Microsoft Foundry.
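
To put the price and speed figures in practical terms, the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate using only the two numbers quoted in this article ($6 per 1,000 minutes and ~69x real-time); the function and constant names are hypothetical, and the estimate assumes the benchmark speed factor holds for your audio and ignores network latency, queuing, and any billing minimums or rounding.

```python
# Figures quoted in the article; treat both as approximate.
PRICE_USD_PER_1000_MINUTES = 6.0
SPEED_FACTOR = 69.0  # seconds of audio transcribed per second of processing


def estimate_cost_and_time(audio_minutes: float) -> tuple[float, float]:
    """Return (estimated cost in USD, estimated processing time in seconds)."""
    cost_usd = audio_minutes * PRICE_USD_PER_1000_MINUTES / 1000
    processing_seconds = audio_minutes * 60 / SPEED_FACTOR
    return cost_usd, processing_seconds


if __name__ == "__main__":
    cost, seconds = estimate_cost_and_time(60)  # one hour of audio
    print(f"Estimated cost: ${cost:.2f}, processing time: {seconds:.0f} s")
```

For one hour of audio, this works out to roughly $0.36 and a little under a minute of processing.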

See full results on the Artificial Analysis Speech to Text leaderboard: https://artificialanalysis.ai/speech-to-text