English Language AI Models Benchmark: Compare Multilingual LLM Performance
The top five English language AI models are Claude Opus 4.6 (max), Gemini 3.1 Pro Preview, Gemini 3 Flash, Claude Opus 4.5, and GPT-5.1 (high), ranked by their English language reasoning scores on the Artificial Analysis Multilingual Index.
To compare performance across all supported languages, see the full Multilingual AI Model Benchmark page.
🇬🇧 Top English language models
#1 Claude Opus 4.6 (max): 95
#2 Gemini 3.1 Pro Preview: 95
#3 Gemini 3 Flash: 95
#4 Claude Opus 4.5: 94
#5 GPT-5.1 (high): 94
Intelligence: Multilingual Index (English); higher is better
Speed: output tokens per second; higher is better
Price: USD per 1M tokens; lower is better
[Chart] Multilingual Index: English Language (Artificial Analysis Multilingual Index; higher is better)
[Chart] Multilingual Index: English Language vs. Price (Multilingual Index vs. USD per 1M tokens; most attractive quadrant highlighted). Models shown: Claude 4.5 Sonnet, Claude Opus 4.5, DeepSeek V3.2, Gemini 3 Pro Preview (high), GPT-5.2 (medium), gpt-oss-120B (high), Grok 4, Llama 4 Maverick, Magistral Medium 1.2, MiniMax-M2.1, MiniMax-M2.5.
[Chart] Multilingual Index: English Language vs. Output Speed (Multilingual Index vs. output tokens per second; most attractive quadrant highlighted). Models shown: Claude 4.5 Sonnet, Claude Opus 4.5, DeepSeek V3.2, Gemini 3 Pro Preview (high), gpt-oss-120B (high), Grok 4, Llama 4 Maverick, Magistral Medium 1.2, MiniMax-M2.1, MiniMax-M2.5.
[Chart] Multilingual Index: English Language vs. Context Window (Multilingual Index vs. context window token limit; most attractive quadrant highlighted). Models shown: Claude 4.5 Sonnet, Claude Opus 4.5, DeepSeek V3.2, Gemini 3 Pro Preview (high), GPT-5.2 (medium), gpt-oss-120B (high), Grok 4, K-EXAONE, K2-V2 (high), Llama 4 Maverick, Magistral Medium 1.2, MiniMax-M2.1, MiniMax-M2.5.
[Chart] Multilingual Global-MMLU-Lite: English Language (higher is better)
[Chart] Pricing: Input and Output Prices (USD per 1M tokens; input price and output price shown per model). Reasoning models are indicated by a lightbulb icon.
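Since input and output tokens are priced separately, comparing models on price usually means blending the two rates against an expected workload. A minimal sketch of that arithmetic, using hypothetical token counts and per-1M-token prices (not figures from this benchmark):

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the cost of one request from per-1M-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Example: 2,000 input tokens and 500 output tokens at hypothetical
# prices of $3 (input) and $15 (output) per 1M tokens.
cost = request_cost_usd(2_000, 500, 3.0, 15.0)
print(f"${cost:.4f}")  # $0.0135
```

Note that for reasoning models, 'thinking' tokens are typically billed at the output rate, so their effective cost per visible answer token can be substantially higher than the listed output price suggests.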
[Chart] Output Speed (output tokens per second; higher is better)
[Chart] Latency: Time to First Answer Token (seconds to first answer token received; accounts for reasoning model 'thinking' time; breakdown: input processing, and thinking for reasoning models when applicable)
[Chart] End-to-End Response Time (seconds to output 500 tokens, including reasoning model 'thinking' time; lower is better; breakdown: input processing time, 'thinking' time, outputting time)
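The end-to-end figure combines the latency metric above with generation time at the measured output speed: time to first answer token (input processing plus any 'thinking') plus the time to stream the answer tokens. A minimal sketch of that decomposition, with hypothetical component values:

```python
def end_to_end_seconds(ttft_seconds: float, output_tokens: int,
                       output_tokens_per_second: float) -> float:
    """Estimate end-to-end response time: time to first answer token
    (input processing + any 'thinking') plus the time to generate the
    remaining answer at the measured output speed."""
    return ttft_seconds + output_tokens / output_tokens_per_second

# Hypothetical model: 2.5 s to first answer token (including thinking),
# 100 output tokens/s, over the benchmark's 500-token answer length.
t = end_to_end_seconds(2.5, 500, 100.0)
print(f"{t:.1f} s")  # 7.5 s
```

This shows why a model with fast output speed can still have a poor end-to-end time: long 'thinking' inflates the first term regardless of how quickly tokens stream afterwards.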