Gemini 3.5 Pro: 2M Context Window, Deep Think (2026)
Gemini 3.5 Pro, announced May 2026 by Google DeepMind, offers a 2M-token context window and Deep Think reasoning. Limited Vertex preview; GA expected June 2026.
Gemini 3.5 Pro is Google DeepMind's frontier multimodal model, announced May 19, 2026 at Google I/O, targeting a 2M-token context window and Deep Think reasoning mode. As of June 9, 2026, access is limited to Vertex AI enterprise preview; official pricing has not been announced, with the Flash sibling priced at $1.50 input / $9 output per 1M tokens.
Gemini 3.5 Pro is Google DeepMind's frontier multimodal model, announced at Google I/O on May 19, 2026. It targets a 2M-token context window and a Deep Think reasoning mode. As of June 9, 2026, it is in limited Vertex AI enterprise preview with general availability expected in late June 2026. Official benchmarks and pricing have not been published; the Flash-tier sibling runs $1.50 input / $9 output per 1M tokens for comparison.
Provider: Google · Family: Gemini 3.5
Context window: 2,000,000 tokens
Input modalities: text, image, audio, video, tool-calls · Output: text, tool-calls
About Gemini 3.5 Pro
Gemini 3.5 Pro is Google DeepMind's frontier multimodal language model, announced at Google I/O on May 19, 2026. It represents the Pro tier of the Gemini 3.5 family and is positioned to absorb the use cases previously routed to Google's Ultra tier: the hardest reasoning tasks, deep multimodal work, and very long-context analysis. As of June 9, 2026, the model is in limited preview for Vertex AI enterprise customers; general availability is expected in late June 2026. Architecture details have not been publicly disclosed; prior Gemini Pro releases use dense Transformer architectures with Google's native multimodal training pipeline, and 3.5 Pro is built on the same lineage. Published benchmarks for Gemini 3.5 Pro are not available as of June 9, 2026, as the model has not been released to the general public for independent evaluation. For context: its predecessor Gemini 3.1 Pro scored 94.3% on GPQA Diamond, 80.6% on SWE-bench Verified, and 90.99% on MMLU-Pro, leading the GPQA leaderboard at launch. Gemini 3.5 Flash, the public sibling released the same day at Google I/O, scored 78% on SWE-bench Verified with a 42% improvement over Flash 3. One source citing internal data suggests Gemini 3.5 Pro will post 10-15 point SWE-bench gains over the 3.1 generation; these figures have not been independently verified. Gemini 3 Deep Think achieved 84.6% on ARC-AGI-2 and gold-medal performance at international math and programming olympiads in early 2026. Gemini 3.5 Pro targets a 2M-token context window, the largest announced for any production frontier model as of June 2026. This doubles the 1M offered by Gemini 3.5 Flash and GPT-5.5. Maximum output tokens have not been officially specified. Whether the 2M context achieves production-stable recall at extreme depths has not been publicly evaluated. Gemini 3.1 Pro demonstrated reliable long-context recall in prior evaluations as a reference point. Gemini 3.5 Pro accepts text, image, audio, and video in a single native API call, following the joint-reasoning multimodal design of Gemini 3.1 Pro. Deep Think reasoning mode, a specific feature of the Pro tier, offers extended chain-of-thought for hard reasoning tasks similar to the Gemini 3 Deep Think tier. Function calling, structured output, tool use, and Google Search grounding are expected based on Gemini 3.5 Flash's confirmed capabilities. Computer use and code execution capabilities have not been officially confirmed for 3.5 Pro as of June 9, 2026. Official pricing for Gemini 3.5 Pro has not been announced by Google. Based on historical Pro-to-Flash pricing ratios across prior Gemini generations, analysts estimate approximately $15.00 input / $60.00 output per 1M tokens. This estimate has not been confirmed. For reference, Gemini 3.5 Flash is priced at $1.50 input / $9.00 output per 1M, and Gemini 3.1 Pro costs $2.00 input / $12.00 output per 1M. Cost examples will be accurate after official launch pricing is confirmed. Gemini 3.5 Pro is accessible only through Vertex AI in limited enterprise preview as of June 9, 2026. It is not yet available via Google AI Studio, the Gemini API for general developers, AWS Bedrock, Azure, Together, or Fireworks. GA access is expected to open through both Vertex AI and the Gemini API, consistent with how prior Gemini Pro tiers launched. Authentication uses Google Cloud IAM on Vertex. Access requests go through the Vertex AI console. Gemini 3.5 Pro has no published model card as of June 9, 2026. The Gemini 3.5 Flash model card is available at deepmind.google/models/model-cards/gemini-3-5-flash/. Prior Gemini Pro safety training included filtering for CSAM, violent content, and harmful outputs, aligned with Google's AI Principles and safety framework. Deep Think reasoning is expected to include the same safety controls applied to Gemini 3.1 Deep Think, which outperformed Gemini 2.5 Pro on safety and tone metrics while keeping unjustified refusals low. Gemini 3.5 Pro is the natural choice for teams requiring the longest available context window (2M tokens) combined with Deep Think reasoning and Vertex AI integration. Organizations already on Vertex AI who need more context than Gemini 3.1 Pro's limits or need the strongest reasoning for scientific and mathematical tasks are the primary upgrade target. Teams who need a production model now should use Gemini 3.5 Flash (78% SWE-bench, $1.50/$9 per 1M) or Gemini 3.1 Pro ($2/$12 per 1M) until 3.5 Pro reaches GA. Teams not on Google Cloud should wait for Gemini API access before planning integrations around 3.5 Pro. Gemini 3.5 Pro's training data cutoff has not been published. Gemini 3.1 Pro's training date was February 2026 as a reference point. Google's pipeline includes deduplication, quality filtering, CSAM filtering, and safety filtering consistent with prior generations. API inputs are not used to train production models per Google's standard data policy. GDPR compliance and Google Cloud data residency options apply on Vertex. No system card URL is available for Gemini 3.5 Pro as of June 9, 2026. Gemini 3.5 Pro was announced at Google I/O on May 19, 2026, alongside Gemini 3.5 Flash (GA same day, $1.50/$9 per 1M) and Gemini Spark. Gemini 3 Pro was deprecated as of March 26, 2026. Gemini 3.1 Pro Preview became the active Pro tier thereafter. Gemini 3.5 Pro is positioned to succeed 3.1 Pro Preview as the primary Pro tier upon reaching GA, at which point benchmarks and official pricing will be confirmed.
Pricing
Official pricing not yet announced. Analyst estimate based on Flash-to-Pro ratio: approximately $15/$60 per 1M tokens. Gemini 3.5 Flash (the GA sibling) is $1.50/$9 per 1M. Gemini 3.1 Pro (the prior Pro tier) is $2/$12 per 1M. Confirm at official launch.
Key Features
- 2M-Token Context Window: The largest announced context window of any production frontier model as of June 2026. Handles entire codebases, multi-document research corpora, or book-length texts in one API call.
- Deep Think Reasoning Mode: Extended chain-of-thought reasoning for hard scientific, mathematical, and coding problems. Prior Deep Think tier scored 84.6% on ARC-AGI-2 and achieved gold-medal performance at international olympiads.
- Native Multimodal Joint Reasoning: Text, image, audio, and video processed in one API request. No separate model routing for different modalities, consistent with Gemini 3.1 Pro architecture.
- Vertex AI Enterprise Integration: Deep integration with Google Cloud services including BigQuery, Cloud Storage, and Workspace. Enterprise data residency, IAM-based access control, and provisioned throughput available.
- Function Calling and Structured Output: Native function calling and JSON structured output, consistent with Gemini 3.5 Flash's confirmed capabilities. Enables agentic workflows without custom orchestration.
Pros
- 2M-token context window, largest of any frontier model announced as of June 2026, eliminating chunking for full-codebase or document-corpus ingestion.
- Deep Think reasoning mode targets gold-medal standards on math and programming olympiad problems, following the prior Gemini 3 Deep Think tier's 84.6% ARC-AGI-2 score.
- Native multimodal joint reasoning across text, image, audio, and video in one request, with deep Vertex AI enterprise integration.
Cons
- Not GA as of June 9, 2026: production deployments are blocked until general availability launches in late June.
- Official benchmarks and pricing have not been published; planning around analyst estimates of ~$15/$60 per 1M carries risk.
- Closed weights, API-only: no self-hosting, fine-tuning of base weights, or air-gapped deployment.
Benchmarks
Frequently Asked Questions
What is Gemini 3.5 Pro and who built it?
Gemini 3.5 Pro is a frontier multimodal large language model developed by Google DeepMind, announced at Google I/O on May 19, 2026. It is the Pro tier of the Gemini 3.5 family, positioned above Gemini 3.5 Flash and designed to handle the hardest reasoning, multimodal, and long-context tasks that were previously routed to Google's Ultra tier. The model targets a 2M-token context window, the largest announced for any production frontier model as of June 2026. It includes a Deep Think reasoning mode for extended chain-of-thought on hard scientific, mathematical, and coding tasks. As of June 9, 2026, Gemini 3.5 Pro is in limited preview for Vertex AI enterprise customers only; general availability is expected in late June 2026. Architecture details and parameter counts have not been disclosed. Its predecessor Gemini 3.1 Pro scored 94.3% on GPQA Diamond and 80.6% on SWE-bench Verified for reference.
How much does Gemini 3.5 Pro cost per 1M tokens?
Google has not officially announced pricing for Gemini 3.5 Pro as of June 9, 2026. Analysts estimate approximately $15.00 per 1M input tokens and $60.00 per 1M output tokens, based on historical Pro-to-Flash pricing ratios across prior Gemini generations. These figures are unconfirmed. For reference, the GA sibling Gemini 3.5 Flash is priced at $1.50 input / $9.00 output per 1M tokens, and Gemini 3.1 Pro (the prior Pro tier) costs $2.00 input / $12.00 output per 1M. Worked cost examples are not available until official pricing is confirmed. Do not finalize budget projections based on estimated figures. Pricing will be published at the Vertex AI pricing page and the Gemini API pricing page upon general availability. Check cloud.google.com/vertex-ai/pricing for the confirmed rates at launch.
What is Gemini 3.5 Pro's context window and max output?
Gemini 3.5 Pro targets a 2,000,000-token (2M) context window, announced by Google at I/O 2026 on May 19. This is the largest context window announced for any production frontier model as of June 2026, doubling the 1M context offered by Gemini 3.5 Flash and GPT-5.5. Maximum output tokens have not been officially specified. Whether the 2M context achieves production-stable recall at extreme depths has not been publicly evaluated via needle-in-haystack or equivalent long-context benchmarks. For context, Gemini 3.1 Pro demonstrated reliable recall in prior long-context evaluations. Gemini 3.5 Flash supports 1M context with confirmed production availability. The 2M context positions 3.5 Pro to ingest large codebases, multi-book research corpora, or hours of audio transcript in a single API call without chunking.
How does Gemini 3.5 Pro compare on benchmarks vs GPT-5.5?
Published benchmark scores for Gemini 3.5 Pro are not available as of June 9, 2026, as the model has not been released for independent evaluation. GPT-5.5, by contrast, has confirmed scores: 88.7% SWE-bench Verified and 92.4% MMLU. Gemini 3.1 Pro (the current GA Pro tier) scored 94.3% GPQA Diamond and 80.6% SWE-bench Verified, leading the GPQA leaderboard at its February 2026 launch. Internal data cited by one source suggests Gemini 3.5 Pro will post 10-15 point SWE-bench Verified gains over the 3.1 generation; this has not been independently verified. GPT-5.5 and Gemini 3.5 Pro are expected to be direct competitors on agentic coding and reasoning benchmarks; a definitive comparison is only possible after 3.5 Pro reaches GA. Until then, Gemini 3.1 Pro is the reference point for Google's current benchmark standing.
Is Gemini 3.5 Pro open source or proprietary?
Gemini 3.5 Pro is proprietary and closed-weights, accessible only via API through Vertex AI. There is no HuggingFace release, no downloadable weights, and no self-hosting path. Google's open-weight family is the separate Gemma line (Gemma 4 launched April 2, 2026 with Apache 2.0 license), which is distinct from the Gemini Pro series. As of June 9, 2026, access to Gemini 3.5 Pro requires Vertex AI enterprise credentials through the limited preview program. Upon GA, the model will be available via the Gemini API (api.google.com) and Vertex AI using Google Cloud IAM or API keys. There are no commercial self-hosting, fine-tuning, or air-gapped deployment options. For teams requiring open weights, Gemma 4 or Llama 4 are the Google and Meta alternatives respectively.
What modalities does Gemini 3.5 Pro support?
Gemini 3.5 Pro supports text, images, audio, and video as inputs in a single native API request, following the joint-reasoning multimodal design established in Gemini 3.1 Pro. Output modalities are text and tool-calls. The model does not produce audio or video output. Function calling, structured output, and tool use are confirmed based on Gemini 3.5 Flash's capabilities and Gemini 3.1 Pro's feature set. Deep Think reasoning mode adds extended chain-of-thought for hard problems without changing the input/output modality contract. Computer use and code execution have not been officially confirmed for Gemini 3.5 Pro as of June 9, 2026. Google Search grounding is expected based on prior Gemini API support. Precise multimodal rate limits and maximum file sizes per modality will be published at GA.
Does Gemini 3.5 Pro train on user data?
Google does not use API inputs to train production Gemini models by default, consistent with Google Cloud's standard data policy. Enterprise customers on Vertex AI have access to data residency controls, allowing them to select processing regions (US, EU, Asia). Gemini 3.5 Pro's model card and specific data retention terms have not been published as of June 9, 2026; they are expected at GA. Google Cloud holds SOC 2 Type II and ISO 27001 certifications. HIPAA-eligible configurations are available on Vertex AI for healthcare workloads. GDPR compliance applies for EU users. Per the EU AI Act, Gemini 3.5 Pro is expected to carry general-purpose AI obligations with systemic risk reporting requirements, consistent with prior Gemini Pro tiers. Data governance terms on Vertex AI apply per the Google Cloud Data Processing Amendment.
Who is Gemini 3.5 Pro best for and who should avoid it?
Gemini 3.5 Pro is best for enterprise teams on Google Cloud who need the longest available context (2M tokens) for full-codebase or research corpus analysis without chunking, and for researchers who need Deep Think extended reasoning for scientific, mathematical, and olympiad-level problems. Organizations already using Gemini 3.1 Pro on Vertex and needing more context or stronger reasoning are the primary upgrade target. Teams who need a production model today (June 9, 2026) should use Gemini 3.5 Flash (GA, 78% SWE-bench, $1.50/$9 per 1M) or Gemini 3.1 Pro ($2/$12 per 1M) rather than planning around the preview. Teams outside Google Cloud should wait until the Gemini API endpoint launches at GA before designing integrations. Cost-sensitive text-only workloads should route to Gemini 3.5 Flash or Claude Haiku, not Pro. Any team building production infrastructure around Gemini 3.5 Pro before GA risks breaking changes.