QwQ 32B: Intelligence, Performance & Price Analysis
Analysis of Alibaba's QwQ 32B and comparison to other AI models across key metrics, including quality, price, performance (tokens per second and time to first token), context window, and more. Click on any model to compare API providers for that model. For more details, including our methodology, see our FAQs.
Comparison Summary
Intelligence: QwQ-32B is of higher quality than average, with an MMLU score of 0.758 and an Intelligence Index across evaluations of 49.
Price: QwQ-32B is cheaper than average, with a price of $0.65 per 1M tokens (blended 3:1).
QwQ-32B input token price: $0.50 per 1M tokens; output token price: $0.65 per 1M tokens.
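As a rough illustration of how a blended price can be derived from separate input and output prices, the sketch below assumes that "blended 3:1" means a weighted average with three input tokens for every output token. That reading, and the function names `blended_price` and `request_cost`, are assumptions made for this example; the exact methodology is the one described in the FAQs.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of per-1M-token prices (assumed meaning of 'blended 3:1')."""
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total_weight


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD of one workload, given prices quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Prices are the per-1M-token figures listed above.
    print(f"blended (3:1): ${blended_price(0.50, 0.65):.4f} per 1M tokens")
    print(f"3M input + 1M output tokens: ${request_cost(3_000_000, 1_000_000, 0.50, 0.65):.2f}")
```

A different input-to-output ratio can be tried by passing other weights, which is useful when a workload is more output-heavy than the assumed 3:1 mix.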
Speed: QwQ-32B is slower than average, with an output speed of 80.1 tokens per second.
Latency: QwQ-32B has lower latency than average, taking 0.61s to receive the first token (TTFT).
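Time to first token and output speed can be combined into a rough end-to-end estimate for a single response. The sketch below assumes a simple model in which total time is TTFT plus the number of output tokens divided by the output speed; the function name `estimate_response_time` is hypothetical, and real response times also depend on prompt length, load, and any reasoning tokens the model generates before its answer.

```python
def estimate_response_time(ttft_seconds: float, tokens_per_second: float,
                           output_tokens: int) -> float:
    """Approximate seconds to receive a full response: TTFT + generation time."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_seconds + output_tokens / tokens_per_second


if __name__ == "__main__":
    # TTFT and output speed are the figures quoted in this summary.
    for n in (100, 500, 2000):
        seconds = estimate_response_time(0.61, 80.1, n)
        print(f"{n} output tokens: about {seconds:.1f}s")
```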
Context Window: QwQ-32B has a smaller context window than average, with a context window of 130k tokens.
Highlights
Intelligence: Artificial Analysis Intelligence Index (higher is better)
Speed: output tokens per second (higher is better)
Price: USD per 1M tokens (lower is better)
Comparisons to QwQ-32B
o1
GPT-4o (Nov '24)
GPT-4o mini
o3-mini (high)
GPT-4.5 (Preview)
Llama 3.3 Instruct 70B
Llama 3.1 Instruct 405B
Llama 3.1 Instruct 8B
Gemini 2.0 Pro Experimental (Feb '25)
Gemini 2.0 Flash (Feb '25)
Claude 3.5 Haiku
Claude 3.7 Sonnet (Extended Thinking)
Claude 3.7 Sonnet (Standard)
Mistral Large 2 (Nov '24)
Mistral Small 3
DeepSeek R1
DeepSeek V3
Grok 3
Grok 3 Reasoning Beta
Nova Pro
MiniMax-Text-01
Further details