Gemini 3.1 Flash | hokai.io
Gemini 3.1 Flash is Google's mid-tier fast reasoning model, with a 1M-token context window, 86.9% on GPQA, and 180 tokens/sec output speed. Released to general availability in Dec 2025 at $0.50/$3 per 1M tokens.
Provider: Google · Family: Gemini 3.1
Context window: 1,000,000 tokens · Max output: 65,536
Input modalities: text, image, audio, video, pdf, tool-calls · Output: text, tool-calls
About Gemini 3.1 Flash
Gemini 3.1 Flash is Google's mid-tier reasoning model, released Dec 17, 2025. It is a dense Transformer with 40-50B parameters, a 1M-token context window, and up to 65,536 output tokens. It scores 86.9% on GPQA Diamond and 76.8% on MMLU-Pro, with an estimated 75% on SWE-bench Verified. It generates 180 tokens/sec with a p50 latency of 500 ms. Pricing is $0.50/$3 per 1M tokens, 4x cheaper than Pro. The model accepts text, images, audio, video, and PDFs. Function calling, structured output, Search, and code execution are generally available. Computer use is not available on Flash; it is offered only on Pro and Flash-Lite. The training cutoff is Jan 2025, and the safety profile is balanced. Use it for high-volume agentic loops where a slight quality drop justifies the cost savings. It is not the right choice for maximum reasoning quality (use Pro) or maximum savings (use Flash-Lite).
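To make the pricing concrete, here is a minimal Python sketch of a per-request cost estimate. It assumes that the quoted $0.50 applies to input tokens and $3 to output tokens, which is the usual reading of a two-part price but is not spelled out on this page. The function name and the agent-loop numbers are hypothetical and chosen only for illustration.

```python
# Prices quoted on this page, in USD per 1M tokens.
# Assumption: $0.50 is the input price and $3 is the output price.
INPUT_USD_PER_MTOK = 0.50
OUTPUT_USD_PER_MTOK = 3.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total_per_million = (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    )
    return total_per_million / 1_000_000


# Hypothetical agent loop: 200 steps, each with 8,000 input and 500 output tokens.
steps = 200
total = steps * estimate_cost_usd(8_000, 500)
print(f"${total:.2f}")  # $1.10
```

Under these assumptions, input tokens account for most of the bill in a loop like this one (4,000 of every 5,500 price units per step), so trimming the repeated context saves more than shortening the responses.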
Benchmarks
- MMLU-Pro: 76.8%
- GPQA Diamond: 86.9%
- SWE-bench Verified: 75% (estimated)
Frequently Asked Questions
What is Gemini 3.1 Flash?
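Gemini 3.1 Flash is Google's mid-tier reasoning model, released Dec 17, 2025. It is a dense Transformer with a 1M-token context window. It scores 86.9% on GPQA (vs 94.3% for Pro), generates 180 tokens/sec, and costs $0.50/$3 per 1M tokens, which makes it 4x cheaper than Pro with a modest quality drop.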
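As a rough illustration of what the speed figures mean in practice, the Python sketch below estimates wall-clock time for a response from the 180 tokens/sec throughput and the 500 ms p50 latency quoted on this page. It assumes the latency figure is the delay before the first token and that throughput stays constant for the whole response; neither is stated here, so treat the result as a back-of-the-envelope number. The function name is hypothetical.

```python
# Figures quoted on this page.
TOKENS_PER_SECOND = 180.0
# Assumption: the 500 ms p50 latency is the delay before the first token.
P50_LATENCY_SECONDS = 0.5
MAX_OUTPUT_TOKENS = 65_536


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate seconds to receive a full response of the given length."""
    if not 0 <= output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("output_tokens must be between 0 and 65,536")
    return P50_LATENCY_SECONDS + output_tokens / TOKENS_PER_SECOND


print(estimate_response_seconds(900))  # 5.5
# A response at the maximum output length, in minutes.
print(estimate_response_seconds(MAX_OUTPUT_TOKENS) / 60)
```

By this estimate a 900-token answer takes about 5.5 seconds, while a response at the 65,536-token output limit takes roughly six minutes, so long generations are dominated by throughput, not by the initial latency.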