Multiverse Computing: Quantum-Inspired LLM Compression

Multiverse Computing (San Sebastian, 2019) compresses LLMs by up to 95% using quantum-inspired tensor networks. $215M raised. Models: Pulsar 16B, HyperNova 60B.

Multiverse Computing, founded in 2019 in San Sebastian, Spain, uses quantum-inspired tensor network compression to shrink large language models by up to 95% without retraining. Its CompactifAI platform serves compressed Llama, DeepSeek, Mistral, and Nemotron models via API at up to 75% lower cost than inference on frontier proprietary models.

Founded: 2019

About Multiverse Computing

Multiverse Computing is a Spanish AI infrastructure company founded in 2019 in San Sebastian, Basque Country. Its core product, CompactifAI, uses quantum-inspired tensor network mathematics to restructure the internal weight matrices of large language models into compact representations. The process reduces memory usage by as much as 93-95% while maintaining strong performance across reasoning, instruction-following, and tool-use benchmarks. Unlike traditional quantization, which only lowers numerical precision, CompactifAI restructures the model itself, so it can be applied across BF16, FP8, and NVFP4 formats.

The company works closely with NVIDIA and uses NVIDIA's Model Optimizer and Megatron Bridge libraries in its compression workflow. Its output models are validated on NVIDIA accelerated computing infrastructure and released in NVIDIA-compatible precision formats. The Pulsar 16B model, released June 23, 2026, delivers 30B-class reasoning performance at 16.15B parameters. Compared with its Nemotron 3 Nano 30B base, it shows a 43% throughput gain and cuts time-to-first-token from 2.18s to 1.24s on NVIDIA Blackwell GPUs.

CompactifAI is available as a self-serve API through AWS Marketplace and the Multiverse Computing portal. It serves compressed versions of Meta Llama, DeepSeek, Mistral, NVIDIA Nemotron, and OpenAI-family models to enterprise customers. Deployment options span managed cloud (AWS, Azure, GCP), private data centers, and on-device use of NVFP4 variants on consumer GPUs and embedded systems. The company reports that API costs run up to 75% below comparable frontier proprietary model inference for coding and reasoning workloads. Key products include CompactifAI (LLM compression API), FinOptimal (quantum-inspired portfolio optimization for financial institutions), and a growing lineup of open compressed models.

Customers include Iberdrola in energy and Bosch in automotive, among more than 100 global enterprises as of 2026. In June 2025, Multiverse Computing closed a EUR 189M ($215M) Series B round led by Bullhound Capital, with HP Tech Ventures, Toshiba, Santander Climate VC, and CDP Venture Capital joining. The Spanish government became a direct shareholder through a EUR 67M co-investment via the Society for Technological Transformation. By January 2026, the company reported annual recurring revenue of approximately EUR 100M. Press reports from early 2026 indicated that Multiverse was exploring a Series C of up to EUR 500M at a EUR 1.5B valuation, though no completed round had been confirmed at the time of writing.

As of June 2026, the company's AI model lineup includes LittleLamb (a compact tool-calling model), HyperNova 60B (a 50%-compressed GPT-OSS-120B derivative released in early 2026), and Pulsar 16B (a compressed NVIDIA Nemotron 3 Nano derivative released June 23, 2026). All open models are published on Hugging Face under the Apache 2.0 license via the MultiverseComputingCAI organization.
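
As a rough illustration of why restructuring a weight matrix into a tensor network can save memory, the sketch below counts parameters for a dense matrix versus a matrix product operator (MPO, also called a tensor-train matrix), one common tensor-network format. The profile does not say which format, layer shapes, or bond dimensions CompactifAI uses, so the 4096 x 4096 layer, its 16 x 16 x 16 reshaping, and the bond dimension of 64 are assumptions chosen only for illustration, and the function names are hypothetical.

```python
from math import prod


def dense_params(row_factors, col_factors):
    """Parameter count of a dense matrix of shape prod(rows) x prod(cols)."""
    return prod(row_factors) * prod(col_factors)


def mpo_params(row_factors, col_factors, bond_dim):
    """Parameter count of an MPO (tensor-train matrix) factorization.

    Core k has shape (r_{k-1}, m_k, n_k, r_k), with boundary bonds of
    size 1 and every internal bond set to bond_dim (an assumed uniform cap).
    """
    if len(row_factors) != len(col_factors):
        raise ValueError("row and column factor lists must have equal length")
    k = len(row_factors)
    bonds = [1] + [bond_dim] * (k - 1) + [1]
    return sum(
        bonds[i] * row_factors[i] * col_factors[i] * bonds[i + 1]
        for i in range(k)
    )


def memory_reduction(row_factors, col_factors, bond_dim):
    """Fraction of parameters removed by the MPO form relative to dense."""
    dense = dense_params(row_factors, col_factors)
    return 1.0 - mpo_params(row_factors, col_factors, bond_dim) / dense


if __name__ == "__main__":
    # Hypothetical layer: 4096 x 4096, reshaped as (16*16*16) x (16*16*16).
    rows = cols = [16, 16, 16]
    print(dense_params(rows, cols))
    print(mpo_params(rows, cols, bond_dim=64))
    print(f"{memory_reduction(rows, cols, bond_dim=64):.1%}")
```

With these assumed values the count drops from 16,777,216 to 1,081,344 parameters, a reduction of about 93.6%. The count says nothing about accuracy: how much of the original model's behavior survives depends on how well the truncated factorization approximates the original weights, which this sketch does not model.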
