Speech Arena Leaderboard | Artificial Analysis
Frequently Asked Questions
The top Text to Speech models by Elo rating are: 1. Sonic 3.5 (Elo 1210), 2. Gemini 3.1 Flash TTS (Elo 1210), 3. Realtime TTS-2 - Research Preview (Elo 1208), 4. Realtime TTS 1.5 Max (Elo 1195), 5. StepAudio 2.5 TTS (Elo 1183). Rankings are based on blind user votes in the Speech Arena.
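To make the Elo figures above easier to interpret, here is a minimal sketch of the textbook Elo update applied to a single blind pairwise vote. The page does not describe how the Speech Arena actually computes its ratings, so the function names, the 400-point scale, and the K-factor of 32 are assumptions taken from the standard Elo formulation, not the Arena's confirmed method.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B under standard Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one vote.

    k is a hypothetical K-factor; the Arena's real value is not stated.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    # Standard Elo is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Ratings taken from the list above: Sonic 3.5 (1210) vs Kokoro 82M v1.0 (1059).
    print(round(expected_score(1210, 1059), 2))
    # Two models tied at 1210, where the first one wins the vote.
    print(update_elo(1210, 1210, a_won=True))
```

Under these assumptions, a 151-point gap corresponds to the higher-rated model being preferred in roughly 70% of head-to-head votes, while two models tied at 1210 are expected to split votes evenly.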
Kokoro 82M v1.0 is the most affordable at $0.65 per 1M characters with an Elo score of 1059. Other affordable options include StyleTTS 2 at $2.82 per 1M characters.
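As a rough illustration of what per-character pricing means in practice, the sketch below converts a character count into an estimated cost using the two prices quoted above. The helper name `tts_cost_usd` is hypothetical, and the calculation assumes simple linear pricing with no minimum charge, free tier, or volume discount, none of which the page addresses.

```python
def tts_cost_usd(num_chars: int, price_per_million_usd: float) -> float:
    """Estimated cost in USD, assuming strictly linear per-character pricing."""
    return num_chars / 1_000_000 * price_per_million_usd


if __name__ == "__main__":
    # Prices per 1M characters, as listed above.
    prices = {"Kokoro 82M v1.0": 0.65, "StyleTTS 2": 2.82}
    chars = 500_000  # hypothetical workload size
    for model, price in prices.items():
        print(f"{model}: ${tts_cost_usd(chars, price):.3f}")
```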
The top 5 open weights Text to Speech models are: 1. Fish Audio S2 Pro (Elo 1123), 2. Step Audio EditX (Mar 2026) (Elo 1109), 3. Voxtral TTS (Elo 1073), 4. Kokoro 82M v1.0 (Elo 1059), 5. Magpie-Multilingual 357M (Feb 2026) (Elo 1055).