GPT-4o (March 2025, chatgpt-4o-latest) vs. GPT-4o Realtime (Dec '24)
Comparison between GPT-4o (March 2025, chatgpt-4o-latest) and GPT-4o Realtime (Dec '24) across intelligence, price, speed, context window and more.
For details relating to our methodology, see our Methodology page.
Model Comparison
Metric | GPT-4o (March 2025, chatgpt-4o-latest) | GPT-4o Realtime (Dec '24) | Analysis |
---|---|---|---|
Creator | OpenAI | OpenAI | Both models are from OpenAI |
Context Window | 128k tokens (~192 A4 pages of size 12 Arial font) | 128k tokens (~192 A4 pages of size 12 Arial font) | Both GPT-4o (March 2025, chatgpt-4o-latest) and GPT-4o Realtime (Dec '24) have the same context window size |
Release Date | March 2025 | December 2024 | GPT-4o (March 2025, chatgpt-4o-latest) was released more recently than GPT-4o Realtime (Dec '24) |
Image Input Support | Yes | No | GPT-4o (March 2025, chatgpt-4o-latest) supports image input, while GPT-4o Realtime (Dec '24) does not |
Open Source (Weights) | No | No | Both GPT-4o (March 2025, chatgpt-4o-latest) and GPT-4o Realtime (Dec '24) are proprietary |
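The context window row equates 128k tokens with roughly 192 A4 pages. As a rough sketch of the arithmetic behind that figure, the snippet below derives the implied tokens-per-page ratio and reuses it to estimate page counts for other token budgets. It assumes "128k" means 128,000 tokens; the function names are hypothetical, and the ratio is only as accurate as the page estimate quoted in the table.

```python
# Figures quoted in the comparison table (128k is assumed to mean 128,000).
CONTEXT_WINDOW_TOKENS = 128_000
APPROX_A4_PAGES = 192


def implied_tokens_per_page(context_tokens: int, pages: int) -> float:
    """Tokens per page implied by a (context window, page count) pair."""
    if pages <= 0:
        raise ValueError("pages must be positive")
    return context_tokens / pages


def estimate_pages(tokens: int, tokens_per_page: float) -> float:
    """Approximate number of pages that a given token budget covers."""
    if tokens_per_page <= 0:
        raise ValueError("tokens_per_page must be positive")
    return tokens / tokens_per_page


ratio = implied_tokens_per_page(CONTEXT_WINDOW_TOKENS, APPROX_A4_PAGES)
print(f"Implied tokens per page: {ratio:.0f}")
print(f"Pages for 64,000 tokens: {estimate_pages(64_000, ratio):.0f}")
```

Because both models share the same context window, the same ratio applies to either one.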
Intelligence
Artificial Analysis Intelligence Index
Artificial Analysis Intelligence Index by Model Type
Artificial Analysis Intelligence Index by Open Weights vs Proprietary
Artificial Analysis Coding Index
Artificial Analysis Math Index
Intelligence Evaluations
Intelligence vs. Price
Intelligence vs. Output Speed
Intelligence vs. End-to-End Response Time
- Input time: Time to receive the first response token
- Thinking time (only for reasoning models): Time reasoning models spend outputting tokens to reason prior to providing an answer. Amount of tokens based on the average reasoning tokens across a diverse set of 60 prompts (methodology details).
- Answer time: Time to generate 500 output tokens, based on output speed
Intelligence Index Token Use & Cost
Output Tokens Used to Run Artificial Analysis Intelligence Index
Intelligence vs. Output Tokens Used in Artificial Analysis Intelligence Index
Cost to Run Artificial Analysis Intelligence Index
Intelligence vs. Cost to Run Artificial Analysis Intelligence Index
Context Window
Intelligence vs. Context Window
Pricing
Pricing: Input and Output Prices
Pricing: Cached Input Prompts
Pricing: Image Input Pricing
Performance Summary
Output Speed vs. Price
Latency vs. Output Speed
Speed
Measured by Output Speed (tokens per second)
Output Speed
Output Speed by Input Token Count (Context Length)
Output Speed Variance

Output Speed, Over Time
Latency
Measured by Time (seconds) to First Token
Latency: Time To First Answer Token
Latency: Time To First Token
Time to First Token by Input Token Count (Context Length)
Time to First Token Variance

Time to First Token, Over Time
End-to-End Response Time
Seconds to output 500 Tokens, calculated based on time to first token, 'thinking' time for reasoning models, and output speed
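The calculation described above can be sketched as a sum of three components: input time, thinking time, and answer time. This is a minimal illustration, not the site's actual implementation. The function name and the example numbers are hypothetical, and it assumes that thinking time is the average reasoning token count divided by output speed, which the page implies but does not state outright.

```python
def end_to_end_response_time(
    time_to_first_token_s: float,
    output_speed_tps: float,
    reasoning_tokens: int = 0,
    answer_tokens: int = 500,
) -> dict:
    """Break end-to-end response time (seconds) into its three components.

    reasoning_tokens is 0 for non-reasoning models, so thinking time drops out.
    """
    if output_speed_tps <= 0:
        raise ValueError("output_speed_tps must be positive")

    input_time = time_to_first_token_s
    # Assumption: reasoning tokens are generated at the measured output speed.
    thinking_time = reasoning_tokens / output_speed_tps
    answer_time = answer_tokens / output_speed_tps

    return {
        "input_time_s": input_time,
        "thinking_time_s": thinking_time,
        "answer_time_s": answer_time,
        "total_s": input_time + thinking_time + answer_time,
    }


# Hypothetical measurements, not figures for either model on this page.
non_reasoning = end_to_end_response_time(0.5, 100.0)
reasoning = end_to_end_response_time(0.5, 100.0, reasoning_tokens=1000)
print(non_reasoning["total_s"], reasoning["total_s"])
```

Neither model compared here is listed as a reasoning model, so in this sketch only the input time and answer time terms would contribute for them.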
End-to-End Response Time by Input Token Count (Context Length)
End-to-End Response Time, Over Time