Text to Speech AI Model & Provider Leaderboard
Analysis and comparison of text-to-speech (TTS) models and API providers. Artificial Analysis evaluates text-to-speech models and hosting providers on quality (Arena ELO), generation speed (characters per second), and price (USD per 1M characters). For further details, see our methodology page.
Text to speech models & providers compared: TTS-1, TTS-1 HD, Studio, Journey, Neural2, WaveNet, Standard, Polly Long-Form, Polly Neural, Polly Standard, Azure Neural, MetaVoice v1, XTTS v2, StyleTTS 2, OpenVoice v2, Sonic English (Oct '24), 3.0 mini, Turbo v2.5, Multilingual v2, T2A-01-HD, T2A-01-Turbo, Zonos-v0.1, Kokoro 82M v1.0, Polly Generative, Flash v2.5, Dialog, Murf Speech Gen 2, Simba, Step TTS Mini, Mist V2, and LMNT.
Highlights
Summary Analysis
[Charts: Quality vs. Price; Quality vs. Speed; Speed vs. Price]
Quality
[Charts: Quality Arena ELO (Text to Speech Arena); Arena Win Rate]
Participate in the Speech Arena to contribute to the crowdsourced quality evaluations.
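For readers who want to relate an ELO gap to a head-to-head win rate, the following is a minimal sketch that assumes the conventional Elo logistic formula with a 400-point scale. The arena's actual rating method is described on the methodology page and may differ, so treat the function name `expected_win_probability` and the scale constant as illustrative assumptions rather than the leaderboard's implementation.

```python
def expected_win_probability(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected probability that model A is preferred over model B.

    Assumes the conventional Elo formula; the 400-point scale is an
    assumption, not a value confirmed by the leaderboard.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    # Ratings taken from the leaderboard table: TTS-1 HD (1156) vs. Polly Standard (798).
    p = expected_win_probability(1156, 798)
    print(f"Expected win probability for the higher-rated model: {p:.3f}")
```

Under this assumption, two models with equal ratings each have an expected win probability of 0.5, and the two directions of any pairing always sum to 1.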
Speed
[Charts: Characters Per Second; Characters Per Second, Variance; Characters per Second, Over Time]
Price
[Chart: Price]
Streaming
Streaming Support
[Table: Streaming Support by provider; provider names and values not preserved]
Model | Model Arena ELO | Characters per Second | Price per 1M Characters (USD) |
---|---|---|---|
TTS-1 HD | 1156 | 579.7 | $30.00 |
TTS-1 | 1141 | 471.8 | $15.00 |
Multilingual v2 | 1116 | 75.1 | $206.00 |
Turbo v2.5 | 1113 | 386.7 | $103.00 |
Sonic English (Oct '24) | 1110 | 33.7 | $46.70 |
Flash v2.5 | 1099 | 341.1 | $103.00 |
Kokoro 82M v1.0 | 1092 | 287.2 | $0.65 |
T2A-01-HD | 1075 | 79.5 | $50.00 |
Azure Neural | 1062 | 300.1 | $15.00 |
Polly Long-Form | 1062 | 352.3 | $100.00 |
Polly Generative | 1055 | 85.1 | $30.00 |
Studio | 1043 | 278.7 | $160.00 |
T2A-01-Turbo | 1042 | 79.5 | $30.00 |
Simba | 1022 | 115.7 | $187.50 |
Dialog | 1000 | 78.1 | $150.00 |
3.0 mini | 998 | 75.5 | $150.00 |
Zonos-v0.1 | 993 | 27.2 | $20.00 |
Murf Speech Gen 2 | 978 | 130.3 | $100.00 |
OpenVoice v2 | 977 | 9.5 | $8.33 |
LMNT | 974 | 318.5 | $43.60 |
Journey | 958 | 120.0 | $160.00 |
Mist V2 | 946 | 402.8 | $31.13 |
Step TTS Mini | 946 | 16.1 | $12.38 |
XTTS v2 | 899 | 35.9 | $40.44 |
StyleTTS 2 | 893 | 2.7 | $2.82 |
Polly Neural | 887 | 473.6 | $16.00 |
WaveNet | 874 | 431.1 | $16.00 |
Standard | 839 | 532.3 | $4.00 |
Neural2 | 835 | 576.5 | $16.00 |
Polly Standard | 798 | 1031.1 | $4.00 |
MetaVoice v1 | 787 | 2.0 | $123.97 |
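
The price and speed columns can be turned into a rough per-job estimate. The sketch below is illustrative only: it assumes cost scales linearly with character count and that the measured characters-per-second figure holds for the whole job, ignoring request overhead, rate limits, and streaming. The helper name `estimate_job` is hypothetical, not part of any provider's API.

```python
from typing import Tuple


def estimate_job(num_chars: int, price_per_million_usd: float,
                 chars_per_second: float) -> Tuple[float, float]:
    """Return (cost in USD, generation time in seconds) for one job.

    Assumes linear pricing and constant throughput, which is a
    simplification of how real providers bill and perform.
    """
    cost_usd = num_chars / 1_000_000 * price_per_million_usd
    seconds = num_chars / chars_per_second
    return cost_usd, seconds


if __name__ == "__main__":
    # Figures copied from the table above (price per 1M characters, characters per second).
    models = {
        "TTS-1": (15.00, 471.8),
        "Kokoro 82M v1.0": (0.65, 287.2),
        "Multilingual v2": (206.00, 75.1),
    }
    num_chars = 50_000  # hypothetical job size
    for name, (price, cps) in models.items():
        cost, secs = estimate_job(num_chars, price, cps)
        print(f"{name}: ${cost:.2f}, about {secs:.0f} s")
```

Estimates like these are best used for relative comparison between rows, since actual throughput varies over time, as the variance and over-time charts indicate.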