Comparison Summary
Model | Context window | Model Intelligence | Price | Output tokens/s | Latency (s) | End-to-End Response Time (s) | Thinking time (s)
---|---|---|---|---|---|---|---
Qwen3 235B (Reasoning) Base | 33k | 48 | $0.30 | 42.8 | 0.66 | 59.12 | 46.77
Qwen3 235B (Reasoning) | 131k | 48 | $0.39 | 84.4 | 0.59 | 30.21 | 23.70
Qwen3 235B (Reasoning) (FP8) | 41k | 48 | $0.25 | 35.3 | 0.57 | 71.40 | 56.66
Qwen3 235B (Reasoning) (FP8) | 41k | 48 | $0.35 | 37.5 | 0.81 | 67.41 | 53.28
Qwen3 235B (Reasoning) (FP8) | 41k | 48 | $0.30 | 34.3 | 0.30 | 73.14 | 58.27
Output tokens/s: output speed, in tokens per second.
Latency: time to first token, in seconds.
End-to-End Response Time: seconds to output 500 tokens, calculated from time to first token, 'thinking' time for reasoning models, and output speed.
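
The end-to-end figure can be reproduced from the other columns. The sketch below is a minimal illustration, not the site's actual calculation: it assumes the three components are simply summed, and that the final value in each table row is the 'thinking' time in seconds. The function name `end_to_end_seconds` and the `ROWS` list are hypothetical names introduced for this example.

```python
# Assumed formula (not confirmed by the source):
#   end-to-end = time to first token + thinking time + 500 / output speed

def end_to_end_seconds(ttft_s, thinking_s, tokens_per_s, n_tokens=500):
    """Estimate the seconds needed to receive n_tokens of output."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + thinking_s + n_tokens / tokens_per_s


# (latency s, assumed thinking time s, output tokens/s, reported end-to-end s),
# copied from the five table rows in order.
ROWS = [
    (0.66, 46.77, 42.8, 59.12),
    (0.59, 23.70, 84.4, 30.21),
    (0.57, 56.66, 35.3, 71.40),
    (0.81, 53.28, 37.5, 67.41),
    (0.30, 58.27, 34.3, 73.14),
]

for ttft, thinking, speed, reported in ROWS:
    estimate = end_to_end_seconds(ttft, thinking, speed)
    print(f"estimated {estimate:.2f} s, reported {reported:.2f} s")
```

Under these assumptions each estimate lands within 0.02 s of the reported value, which is consistent with rounding in the displayed inputs. It also shows that thinking time, not latency or output speed, accounts for most of the end-to-end time in every row.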