Gemini 3 Flash - Everything you need to know

Google has released Gemini 3 Flash Preview. It is 2x cheaper than Gemini 3 Pro Preview with only a 2-point drop in our Intelligence Index, making it the most intelligent model in its price range.

Google gave us pre-release access to Gemini 3 Flash Preview. The model scores 71 on the Artificial Analysis Intelligence Index, a 13-point improvement over Gemini 2.5 Flash (Sep), making it the most intelligent model for its cost. Gemini 3 Flash Preview has particularly strong knowledge and reasoning abilities, obtaining the highest score in our knowledge and hallucination benchmark, AA-Omniscience, and placing second in Humanity’s Last Exam. Google now holds the top two spots on both of these evaluations, cementing its position as a leader in model knowledge. This increased performance comes with a tradeoff: Gemini 3 Flash Preview uses more than twice as many tokens as Gemini 2.5 Flash (Sep) when running the Artificial Analysis Intelligence Index, making it one of the highest token-use models we’ve tested.

Artificial Analysis Intelligence Index (16 Dec 25)

Key takeaways:

📖 Significant intelligence improvements: Gemini 3 Flash Preview has significant improvements across nearly all evaluations in the Artificial Analysis Intelligence Index. It has particular strengths in reasoning settings, scoring second to Gemini 3 Pro Preview in Humanity’s Last Exam (35%), and third in both MMLU-Pro (89%) and GPQA Diamond (90%) (behind Gemini 3 Pro Preview and GPT-5.2 xhigh).
🧠 AA-Omniscience performance: Gemini 3 Flash Preview achieves the highest score in our knowledge and hallucination benchmark, AA-Omniscience. This is driven by increased accuracy (percentage correct) rather than a lower hallucination rate. The model has the highest knowledge accuracy of any model tested, but its hallucination rate is 91%, 3 percentage points higher than Gemini 2.5 Flash and Gemini 3 Pro Preview. We measure hallucination rate as how often the model answers incorrectly when it should have refused or admitted to not knowing the answer.

AA-Omniscience Index (16 Dec 25)
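
To see how a model can lead on accuracy and still post a 91% hallucination rate, the sketch below computes both metrics from per-question outcomes. It assumes one reading of the definition above: hallucination rate is the share of incorrect answers among all questions the model did not answer correctly (incorrect answers plus refusals). The function names and the counts are hypothetical and chosen only for illustration; they are not Artificial Analysis data.

```python
def accuracy(correct: int, incorrect: int, abstained: int) -> float:
    """Share of all questions answered correctly."""
    total = correct + incorrect + abstained
    return correct / total


def hallucination_rate(incorrect: int, abstained: int) -> float:
    """Share of non-correct responses that were wrong answers rather than refusals.

    Assumption: this is one interpretation of "answers incorrectly when it
    should have refused or admitted to not knowing the answer".
    """
    not_correct = incorrect + abstained
    return incorrect / not_correct if not_correct else 0.0


# Hypothetical outcome counts for 1,000 questions (illustration only).
correct, incorrect, abstained = 500, 455, 45

print(f"accuracy: {accuracy(correct, incorrect, abstained):.0%}")
print(f"hallucination rate: {hallucination_rate(incorrect, abstained):.0%}")
```

Under this reading the two metrics are largely independent: a model that almost never refuses will have a high hallucination rate regardless of how many questions it gets right.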

🖼️ Multimodal capabilities: Gemini 3 Flash Preview is a multimodal model that accepts text, images, video and audio as input. It achieves the second-highest score of any model on MMMU-Pro, a benchmark that tests reasoning abilities with image inputs, behind only Gemini 3 Pro Preview.
➕ Significantly increased token use: Gemini 3 Flash Preview uses ~160M tokens on the Artificial Analysis Intelligence Index, more than double the amount used by Gemini 2.5 Flash (Sep). This makes it one of the highest token-use models we have tested, exceeding other high-use models such as Kimi K2 Thinking and Grok 4 (thinking).

Output Tokens Used to Run Artificial Analysis Intelligence Index (16 Dec 25)

$ Cost efficiency: Despite its high token use, Gemini 3 Flash Preview is still the most cost-efficient model for its level of intelligence, measured by the overall cost to run the Artificial Analysis Intelligence Index. This is driven by low token prices of $0.50/$3.00 per 1M input/output tokens.

Intelligence vs Cost to Run Artificial Analysis Intelligence Index (16 Dec 25)
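
As a rough illustration of how the per-token prices translate into a bill, the sketch below multiplies token counts by the quoted prices. The function name is hypothetical, and the second example assumes the ~160M tokens reported for the Intelligence Index are all output tokens and ignores input tokens entirely, so it is an approximation of the output-side cost only, not the published cost to run the index.

```python
INPUT_PRICE_PER_M = 0.50   # USD per 1M input tokens (price quoted in the article)
OUTPUT_PRICE_PER_M = 3.00  # USD per 1M output tokens (price quoted in the article)


def run_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price: float = INPUT_PRICE_PER_M,
    output_price: float = OUTPUT_PRICE_PER_M,
) -> float:
    """Cost in USD for a workload, given token counts and per-1M-token prices."""
    return (input_tokens / 1_000_000) * input_price + (
        output_tokens / 1_000_000
    ) * output_price


# One million tokens in each direction.
print(run_cost_usd(1_000_000, 1_000_000))

# Output side only, assuming ~160M output tokens (input tokens not included).
print(run_cost_usd(0, 160_000_000))
```

Because output tokens cost six times as much as input tokens at these prices, a reasoning-heavy workload's bill is dominated by how many tokens the model generates.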

⚡ Speed: Gemini 3 Flash Preview is 22% slower than Gemini 2.5 Flash (Sep), measuring at 218 output tokens per second. This is still much faster than similarly intelligent models GPT-5.1 (high) (125 tokens/s), Kimi K2 Thinking (82 tokens/s) and DeepSeek V3.2 (Reasoning) (30 tokens/s).

Output Speed (16 Dec 25)
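
To put the output speeds above in practical terms, the sketch below converts tokens per second into the time needed to stream a response of a given length. The 10,000-token response length is a hypothetical value, and the calculation covers generation time only; it does not account for time to first token or network latency.

```python
# Output speeds in tokens per second, as quoted in the article.
OUTPUT_SPEEDS = {
    "Gemini 3 Flash Preview": 218,
    "GPT-5.1 (high)": 125,
    "Kimi K2 Thinking": 82,
    "DeepSeek V3.2 (Reasoning)": 30,
}


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to generate output_tokens at a steady output speed."""
    return output_tokens / tokens_per_second


RESPONSE_TOKENS = 10_000  # hypothetical response length, including reasoning tokens

for model, speed in OUTPUT_SPEEDS.items():
    seconds = generation_seconds(RESPONSE_TOKENS, speed)
    print(f"{model}: {seconds:.0f} s")
```

Since high token use and output speed interact, a model that generates more tokens per task needs a proportionally higher output speed to finish in the same wall-clock time.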

Other details: Gemini 3 Flash Preview has a 1 million token context window, and includes support for tool calling, structured outputs, and JSON mode.

Individual results across the 10 evals we run independently for the Artificial Analysis Intelligence Index: MMLU-Pro, GPQA Diamond, Humanity's Last Exam, LiveCodeBench, SciCode, AIME 2025, IFBench, AA-LCR, Terminal-Bench Hard, 𝜏²-Bench Telecom

Further analysis on Artificial Analysis:

https://artificialanalysis.ai/models/gemini-3-flash
