Fugu: Sakana's Multi-Agent Orchestrator AI (2026)

Fugu is Sakana AI's fully autonomous orchestrator, launched June 2026. Fugu Ultra scores 73.7% on SWE-bench Pro, from $20/mo, with an OpenAI-compatible API.

Fugu is Sakana AI's fully autonomous orchestrator that hires a team of AI models per task instead of answering directly, launched June 22, 2026. Fugu Ultra hits 73.7% on SWE-bench Pro and 95.5% on GPQA Diamond, ahead of Claude Opus 4.8 and GPT-5.5 on Sakana's own comparisons. It costs $20 to $200 per month as a subscription, or $5/$30 per 1M tokens via its OpenAI-compatible API.

Fugu is Sakana AI's fully autonomous multi-agent orchestrator, launched June 22, 2026. It routes tasks across a pool of specialist AI agents through one OpenAI-compatible API. Fugu Ultra scores 73.7% on SWE-bench Pro, beating Claude Opus 4.8's 69.2%. Subscriptions start at $20 per month; API pricing is $5 per 1M input and $30 per 1M output tokens.

Maker: Sakana AI · Autonomy: fully autonomous · Maturity: GA

Underlying models: Sakana Conductor (proprietary orchestrator, trained via reinforcement learning), Pool of closed and open partner models (not fully disclosed by Sakana AI)

About Fugu

Fugu is Sakana AI's multi-agent orchestration system, delivered as a single OpenAI-compatible API endpoint rather than one monolithic model. Instead of answering every prompt itself, Fugu is a trained conductor model that decides which specialist models to call from an agent pool, whether to split a task into planning and execution steps, when to have another agent verify the result, and how to synthesize the final answer. Sakana AI, the Tokyo-based lab behind it, built Fugu on two internal research frameworks: TRINITY, which explores evolutionary coordination across Thinker, Worker, and Verifier roles, and Conductor, which uses reinforcement learning to discover collaboration strategies between agents rather than hand-wiring them. Fugu ships in two tiers. The base Fugu model is tuned for low latency and interactive use, acting as a default engine for chat and coding assistants. Fugu Ultra trades speed for quality, drawing on a larger pool of expert agents for complex, high-stakes work such as automated ML research, cybersecurity analysis, and multi-step patent investigation. What makes this an agent rather than a chatbot is that the orchestration is learned end-to-end: Fugu can recursively call itself and other agents without a human approving each intermediate step, deciding on its own when to delegate, when to verify, and when a first answer is shaky enough to warrant a second opinion. On benchmarks, Fugu Ultra scores 73.7% on SWE-bench Pro, 93.2% on LiveCodeBench v6, 95.5% on GPQA Diamond, 50.0% on Humanity's Last Exam, and 82.1% on TerminalBench 2.1, according to Sakana's June 2026 release materials. Sakana positions these scores as ahead of Claude Opus 4.8 (69.2% SWE-bench Pro), GPT-5.5 (58.6%), and Gemini 3.1 Pro (54.2%) on the same task, and roughly matching Anthropic's Fable 5 and Mythos Preview despite Fugu not being a single frontier-scale model itself. Sakana has not comprehensively disclosed every model in Fugu's underlying agent pool, framing the system's value as vendor independence rather than any single named backend. Fugu launched publicly on June 22, 2026 with three subscription tiers at $20, $100, and $200 per month, plus a standalone API priced at roughly $5 per 1M input tokens and $30 per 1M output tokens, with higher rates for context beyond 272K tokens. The API is OpenAI-format compatible, so teams can swap an existing endpoint URL and get orchestrated multi-agent behavior without rearchitecting their application. Standard-tier users can opt out of specific underlying providers; the Ultra tier uses a fixed agent pool with no opt-out. Fugu is not yet available in the EU or EEA while Sakana completes GDPR compliance work. Fugu is a newer product line for Sakana AI, following a $135M Series B in November 2025 (roughly $2.65B valuation, led by Mitsubishi UFJ, Lux Capital, and In-Q-Tel) that funded the lab's shift from research demos toward production AI systems. Sakana frames Fugu's core pitch as reduced single-vendor lock-in: by routing across multiple providers behind one endpoint, teams get insulated from any single vendor's outages, export-control changes, or policy shifts. An open-source alternative called Maestro solves the same orchestration problem from the opposite direction, self-hosted and fully transparent, versus Fugu's closed and centrally trained approach.

Pricing

Three subscription tiers at $20, $100, and $200 per month. Standalone API bills per token at roughly $5 per 1M input tokens and $30 per 1M output tokens, with higher rates above 272K tokens of context. No confirmed free tier as of the June 22, 2026 launch.

Key Features

Strengths

Weaknesses

Frequently Asked Questions

What is Fugu and what does it do?

Fugu is a multi-agent AI orchestration system built by Sakana AI, a Tokyo-based lab founded in 2023. Rather than answering every prompt with a single model, Fugu is a trained conductor that decides which specialist agents from its pool should handle a task, whether to split planning from execution, when to bring in a verifier agent, and how to combine the results into one answer. It launched publicly on June 22, 2026 through a single OpenAI-compatible API endpoint. Fugu Ultra, its flagship tier, scores 73.7% on SWE-bench Pro and 95.5% on GPQA Diamond. Sakana built it on two internal research frameworks, TRINITY and Conductor, both trained with reinforcement learning rather than hand-coded routing rules. It ships in two variants: a low-latency base tier and the higher-quality Ultra tier.

How much does Fugu cost?

Fugu offers three subscription tiers priced at $20, $100, and $200 per month as of its June 22, 2026 launch. A separate standalone API bills per token, at roughly $5 per 1M input tokens and $30 per 1M output tokens, with higher rates applying once a request's context exceeds 272,000 tokens. Sakana has not confirmed a free tier for Fugu. The subscription tiers appear to scale by usage volume and access to the Ultra model line rather than by a hard feature gate. Because Fugu orchestrates multiple underlying models per request, actual per-task cost can be less predictable than a single-model API, since a complex task might route through several specialist agents before returning an answer. Enterprise or high-volume pricing beyond the $200 tier has not been publicly disclosed.

Is Fugu fully autonomous?

Yes. Fugu is built to plan, delegate, and execute multi-step tasks without a human approving each intermediate step. It can recursively call itself and other agents from its pool for test-time scaling on harder problems, deciding on its own when a task needs a specialist, when an answer should be checked by a second agent, and when to synthesize a final response. This puts it at the fully-autonomous end of the spectrum compared to copilot-style tools that require step-by-step human sign-off. Sakana has not published details on any built-in human-in-the-loop checkpoint for destructive or high-stakes actions, so users running Fugu on sensitive workflows should add their own review step before acting on its output. Standard-tier users can opt out of specific underlying providers, but the Ultra tier runs a fixed agent pool with no such override.

What AI model powers Fugu?

Fugu itself is not one single model but a trained orchestrator, sometimes called a conductor, that calls out to a pool of both closed and open models from multiple providers. Sakana has not comprehensively named every model in that pool in its public materials, framing Fugu's value proposition around vendor independence rather than any one named backend. The orchestration layer itself, built on Sakana's TRINITY and Conductor research, is proprietary and trained end-to-end with reinforcement learning rather than a hard-coded routing script. Users cannot directly choose which specific underlying model handles a given sub-task; Fugu makes that routing decision itself based on the request. This is a deliberate design choice: Sakana argues that a learned router beats a manually assigned one at deciding which specialist to call for a given step.

What are the best alternatives to Fugu?

Claude Code is a strong alternative if you want a single-vendor coding agent that runs directly in your terminal or IDE rather than an API-based multi-model orchestrator. OpenClaw is worth considering if you want a transparent, self-directed autonomous agent you run and control yourself rather than a managed, closed orchestration service. Maestro, an open-source orchestration project, solves the same multi-model routing problem as Fugu but is self-hosted and fully transparent about which models it calls, versus Fugu's closed and centrally trained approach. Teams that need a single frontier model rather than a multi-agent system might also consider Claude Opus 4.8 or GPT-5.5 directly, especially if predictable per-token cost matters more than Fugu's benchmark edge.

Who is Fugu best for?

Fugu is best for engineering teams building coding assistants or agentic developer tools who want frontier-level SWE-bench performance without betting on a single model vendor. It also suits AI research teams running automated ML research, cybersecurity analysis, or multi-step scientific investigation, the exact use cases Sakana highlights for the Ultra tier. Enterprises that want to reduce single-vendor lock-in on model access, for continuity or compliance reasons, are a natural fit given Fugu's cross-provider routing design. Fugu is not a good fit for EU or EEA-based teams today, since it is unavailable there pending GDPR compliance work, or for teams that need full visibility into exactly which underlying model processed a given request, since Sakana has not disclosed the full agent pool.

How does Fugu compare on benchmarks?

Fugu Ultra scores 73.7% on SWE-bench Pro, 93.2% on LiveCodeBench v6, 95.5% on GPQA Diamond, 50.0% on Humanity's Last Exam, and 82.1% on TerminalBench 2.1, per Sakana's June 2026 release materials. Sakana positions these as ahead of Claude Opus 4.8 (69.2% SWE-bench Pro), GPT-5.5 (58.6%), and Gemini 3.1 Pro (54.2%) on the same benchmark, and describes Fugu as roughly matching Anthropic's Fable 5 and Mythos Preview despite not being one single frontier-scale model. These figures are Sakana-reported and have not been independently reproduced on public leaderboards as of this writing, so they should be read as vendor benchmarks rather than third-party verified scores. WebArena and GAIA scores have not been published for Fugu.

How do you get started with Fugu?

Sign up for one of Fugu's three subscription tiers ($20, $100, or $200 per month) through Sakana AI's website, or request API access if you want to integrate Fugu into an existing application. Because the API is OpenAI Chat Completions-compatible, developers with an existing OpenAI, Gemini, or Claude integration can typically switch by changing the endpoint URL and API key rather than rewriting request logic. First requests should start with a simple, well-scoped task to observe how Fugu splits and routes work before trusting it with a multi-step, high-stakes job. Standard-tier users should also check the provider opt-out settings if there are specific underlying models they want excluded from their agent pool. EU and EEA-based users cannot currently sign up, since the product is not yet available in those regions pending GDPR compliance work.

Visit Fugu Official Site