Name: Pulsar 16B: AIME 87.22 Open Reasoning Model (2026)
Brand: Multiverse Computing
Availability: InStock

Question 1

What is Pulsar 16B and who built it?

Accepted Answer

Pulsar 16B is an open reasoning model released on June 23, 2026 by Multiverse Computing, a Spanish AI infrastructure company founded in 2019 in San Sebastian. The model uses a hybrid Mamba2-Transformer architecture with Mixture-of-Experts layers, compressed from the NVIDIA Nemotron 3 Nano 30B base (31.6B total, 3.5B active parameters) using Multiverse Computing's proprietary CompactifAI quantum-inspired tensor network technology. The result is 16.15B total parameters with 3.1B active, achieved without retraining from scratch and with reasoning behavior preserved. On AIME 2025, Pulsar 16B scores 87.22, within a fraction of a point of the uncompressed 30B base and 15 points above OpenAI's gpt-oss-20B. On GPQA Diamond, it reaches 71.41, more than 12 points above gpt-oss-20B. The model was developed in collaboration with NVIDIA using the Model Optimizer and Megatron Bridge libraries, and validated on NVIDIA Blackwell accelerated computing infrastructure. It sits in Multiverse Computing's Pulsar model family, targeting the 15-20B total-parameter efficiency tier below the HyperNova 60B line.
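
The size reduction described above can be checked with simple arithmetic. The sketch below uses only the parameter counts quoted in this answer; the function name `reduction_pct` is illustrative and not part of any Multiverse Computing or NVIDIA tooling.

```python
def reduction_pct(before_b: float, after_b: float) -> float:
    """Percentage reduction when going from before_b to after_b (billions of parameters)."""
    if before_b <= 0:
        raise ValueError("before_b must be positive")
    return (before_b - after_b) / before_b * 100.0


# Parameter counts quoted in the answer above (in billions).
total_reduction = reduction_pct(31.6, 16.15)  # total parameters, base vs. Pulsar 16B
active_reduction = reduction_pct(3.5, 3.1)    # active parameters per token

print(f"total parameters reduced by {total_reduction:.1f}%")
print(f"active parameters reduced by {active_reduction:.1f}%")
```

On these figures the total parameter count falls by roughly 49%, while the active parameter count falls by roughly 11%, so most of the savings are in stored weights rather than per-token compute.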

Question 2

How much does Pulsar 16B cost per 1M tokens?

Accepted Answer

Pulsar 16B is released under the Apache 2.0 license, which means the weights are free to download and self-host. There is no per-token charge for self-hosted deployments; the only cost is the hardware running the model. The reported throughput on an NVIDIA Blackwell GPU in FP8 precision is 4,808 tokens per second, so the effective cost per 1M tokens is the hourly hardware cost multiplied by the hours needed to generate them (1,000,000 divided by tokens per second, divided by 3,600). On a single RTX 4090 (approximately $0.80 per hour on spot), 1M tokens costs under $0.20 in hardware time at peak throughput, which holds as long as the card sustains roughly 1,100 tokens per second or more. For teams that do not want to manage GPU infrastructure, Multiverse Computing offers the CompactifAI API with token-based pricing, available via AWS Marketplace and the CompactifAI portal. The company reports that this API runs up to 75% below the cost of comparable frontier proprietary models for coding and reasoning workloads. Specific CompactifAI API rates are not publicly listed as of June 2026; contact Multiverse Computing directly for a quote. There is no documented free API trial period.
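
The self-hosting cost arithmetic above can be sketched as follows. The inputs are the figures quoted in this answer; pairing the Blackwell FP8 throughput with the RTX 4090 spot price is an optimistic assumption made only for illustration, since no RTX 4090 throughput is stated here. Both function names are hypothetical.

```python
def cost_per_million_tokens(hourly_usd: float, tokens_per_second: float) -> float:
    """Hardware cost in USD to generate 1,000,000 tokens at a sustained throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    seconds_needed = 1_000_000 / tokens_per_second
    return hourly_usd * seconds_needed / 3600.0


def min_throughput_for_budget(hourly_usd: float, budget_usd: float) -> float:
    """Sustained tokens/s needed so that 1M tokens costs no more than budget_usd."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    return 1_000_000 * hourly_usd / (3600.0 * budget_usd)


# Optimistic case: $0.80/hour (RTX 4090 spot price quoted above) combined with
# the 4,808 tok/s figure, which was reported for Blackwell FP8, not the 4090.
best_case = cost_per_million_tokens(0.80, 4808)

# Throughput a $0.80/hour card must sustain to stay under $0.20 per 1M tokens.
floor_tps = min_throughput_for_budget(0.80, 0.20)

print(f"best case: ${best_case:.3f} per 1M tokens")
print(f"break-even throughput for $0.20: {floor_tps:.0f} tok/s")
```

Real costs depend on batch size, context length, and utilization, so the result is best treated as a lower bound rather than a quote.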

Question 3

What is Pulsar 16B's context window and max output?

Accepted Answer

Pulsar 16B inherits a 1,000,000-token context window from its NVIDIA Nemotron 3 Nano 30B base model. Multiverse Computing validated long-context recall using LongBench, AA-LCR, RULER suite variants, and needle-in-a-haystack tasks at progressively longer spans. Needle retrieval remains essentially perfect on both sides of the 100K token mark, and the model tracks the uncompressed 30B base closely on harder RULER tasks at extended lengths. For reference, the Nemotron 3 Nano base achieves RULER scores of 87.5% at 64K tokens, 82.92% at 128K, and 70.56% at 512K; Pulsar 16B matches these curves closely. The 4,808 tok/s throughput headline is measured at short-to-medium context lengths; real-world throughput drops significantly at context lengths above 256K tokens due to attention memory scaling. Maximum output tokens have not been separately documented for Pulsar 16B as of June 2026. Compared with proprietary competitors, the 1M context window exceeds those of GPT-4o (128K) and Claude Haiku 4.5 (200K), at no per-token cost for self-hosted workloads.

Question 4

How does Pulsar 16B compare on benchmarks vs gpt-oss-20B?

Accepted Answer

Pulsar 16B outperforms gpt-oss-20B on every major benchmark category reported in the June 2026 launch announcement. On AIME 2025 math reasoning, Pulsar 16B scores 87.22 versus gpt-oss-20B's 72.22, a 15-point gap. On GPQA Diamond science reasoning, Pulsar 16B reaches 71.41 against gpt-oss-20B's 58.88, a 12.5-point lead. On instruction-following (IFBench), Pulsar 16B leads by 14 points, and on function-calling (BFCL-v4), it leads by 11 points. These results are noteworthy because gpt-oss-20B has roughly 20B parameters versus Pulsar 16B's 16.15B, so Pulsar 16B achieves stronger results with fewer parameters, plausibly because compression preserves knowledge from the 30B base. Versus the uncompressed Nemotron 3 Nano 30B base, Pulsar 16B is within a fraction of a point on AIME 2025 and GPQA Diamond. No SWE-bench Verified or ARC-AGI 2 scores have been published for Pulsar 16B as of June 2026. The benchmark numbers were validated on NVIDIA infrastructure; independent third-party confirmation is pending at time of writing.
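
The two gaps for which both scores are quoted above can be recomputed directly. This is a minimal sketch using only the numbers in this answer; the IFBench and BFCL-v4 leads are omitted because only the differences, not the underlying scores, are given.

```python
# (Pulsar 16B score, gpt-oss-20B score) as quoted in the answer above.
scores = {
    "AIME 2025": (87.22, 72.22),
    "GPQA Diamond": (71.41, 58.88),
}

# Point gap in favor of Pulsar 16B, rounded to two decimals.
gaps = {
    name: round(pulsar - gpt_oss, 2)
    for name, (pulsar, gpt_oss) in scores.items()
}

for name, gap in gaps.items():
    print(f"{name}: Pulsar 16B leads by {gap:.2f} points")
```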

Question 5

Is Pulsar 16B open source or proprietary?

Accepted Answer

Pulsar 16B is fully open source under the Apache 2.0 license. The weights are downloadable at no cost from Hugging Face under the MultiverseComputingCAI organization. Apache 2.0 permits commercial use, modification, fine-tuning, redistribution, and integration into proprietary products without royalty obligations. There are no usage restrictions or community license terms that limit production deployments, making it one of the more permissive licenses in the open-weights model ecosystem. The model is available in three precision formats: BF16 (approximately 32 GB VRAM), FP8 (approximately 16 GB VRAM), and NVFP4 (approximately 8 GB VRAM). Unlike Meta's Llama license or Mistral's weight-access terms, Apache 2.0 places no cap on commercial usage or number of users. The underlying NVIDIA Nemotron 3 Nano 30B base model is also available on Hugging Face under a separate NVIDIA license; Pulsar 16B's Apache 2.0 applies to the compressed checkpoint specifically. NVIDIA's Model Optimizer and Megatron Bridge libraries used in the compression workflow are separately licensed by NVIDIA.
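
The VRAM figures for the three precision formats line up with a simple weights-only estimate: total parameters multiplied by bytes per parameter. The sketch below is an approximation under that assumption. It ignores the KV cache, activations, and the per-block scale factors that 4-bit formats carry, so actual memory use will be higher. The names `BYTES_PER_PARAM` and `weight_memory_gb` are illustrative.

```python
# Nominal storage per parameter for each format (assumption: weights only,
# no quantization scale overhead).
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "NVFP4": 0.5}


def weight_memory_gb(total_params_b: float, precision: str) -> float:
    """Approximate weight memory in GB (10^9 bytes) for a model of total_params_b billion parameters."""
    if precision not in BYTES_PER_PARAM:
        raise ValueError(f"unknown precision: {precision}")
    # billions of parameters x bytes per parameter = gigabytes
    return total_params_b * BYTES_PER_PARAM[precision]


# 16.15B total parameters, as quoted for Pulsar 16B.
estimates = {p: weight_memory_gb(16.15, p) for p in BYTES_PER_PARAM}

for precision, gb in estimates.items():
    print(f"{precision}: ~{gb:.1f} GB for weights alone")
```

Because this covers weights only, it is sensible to leave headroom on top of these numbers, especially for long-context workloads.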

Question 6

What modalities does Pulsar 16B support?

Accepted Answer

Pulsar 16B supports text input and text output only in its initial June 2026 release. Vision, audio, and video input are not included in this checkpoint. The model inherits function-calling and structured output generation capabilities from the NVIDIA Nemotron 3 Nano base, supporting parallel tool calls and JSON-structured outputs natively. Function definitions follow the Nemotron 3 Nano tool-calling schema rather than the OpenAI function-calling convention, which requires a prompt-format migration for pipelines built on GPT-family models. The broader Nemotron 3 family includes multimodal variants (the Nemotron 3 Omni series) that support vision, audio, and video, but those capabilities were not included in the Pulsar 16B compression workflow. Multiverse Computing has not announced a multimodal Pulsar variant as of June 2026. For vision understanding in the same infrastructure, teams can pair Pulsar 16B with a separate vision encoder or use NVIDIA's Nemotron 3 Omni models alongside the text-reasoning checkpoint.

Question 7

Does Pulsar 16B train on user data?

Accepted Answer

For self-hosted deployments of Pulsar 16B, no data is sent to Multiverse Computing or NVIDIA by default. The Apache 2.0 license covers the weights as a standalone artifact; running inference locally involves no external data transmission. Multiverse Computing cannot train on self-hosted users' inputs, because the weights are a static artifact with no telemetry layer. For CompactifAI API deployments, data handling is governed by the Multiverse Computing commercial agreement. The company has not published a comprehensive data retention policy specific to Pulsar 16B as of June 2026; enterprise teams should request a data processing agreement before using the managed API. No SOC 2 Type II, ISO 27001, or HIPAA certification has been announced for the Pulsar 16B checkpoint or the CompactifAI API as of June 2026. Teams in regulated industries should default to self-hosted deployment with their own data isolation controls until vendor certification is confirmed.

Question 8

Who is Pulsar 16B best for and who should avoid it?

Accepted Answer

Pulsar 16B is best for ML engineering teams running high-throughput agentic pipelines on NVIDIA hardware who need 30B-class reasoning at 16B parameters and the cost structure of open weights. The 4,808 tok/s FP8 throughput and 43% latency improvement over the 30B base make it well suited for high-volume reasoning calls where cost per token matters. Research teams that need AIME 87.22 or GPQA 71.41 performance within a 16 GB VRAM FP8 budget benefit directly. Organizations already using the Nemotron 3 Nano prompt format can migrate with minimal pipeline changes. Teams should avoid Pulsar 16B if they need vision or audio input; the model is text-only and requires a separate vision preprocessor for multimodal tasks, where NVIDIA Nemotron 3 Omni is the better fit. Compliance-heavy deployments in healthcare or finance that require SOC 2 certification, vendor SLAs, or audit trails should choose AWS Bedrock or Azure OpenAI with a certified proprietary model. Teams whose pipelines rely on ChatML or Llama-3 prompt templates face a migration burden due to the Nemotron-specific extra_id format; Qwen3-14B, at a comparable parameter count, or Llama 4 may be a lower-friction alternative.
