Question 1

What is Scale AI and what do they build?

Accepted Answer

Scale AI is an AI data infrastructure company founded in San Francisco in 2016 that builds the training data layer behind frontier AI models and government AI systems. The company's core platform, the Data Engine, handles every stage of the data preparation pipeline: collection, curation, annotation, RLHF (reinforcement learning from human feedback), and model evaluation. Scale supports text, images, video, 3D point clouds, audio, and sensor fusion data types across verticals including autonomous vehicles, robotics, and large language model development. A separate product, Scale Donovan, serves US government and defense customers with a secure interface for extracting insights from classified and unclassified data sources. In April 2026, Scale acquired Illumina Computing Group to expand its defense analytics capabilities. Scale reported $2 billion in annual recurring revenue in 2025 and is valued at $29 billion following Meta's $14.3 billion investment in June 2025. The platform is accessible at scale.com, with enterprise and government contracts negotiated directly with Scale's sales team.

Question 2

Who founded Scale AI and who is the CEO?

Accepted Answer

Scale AI was co-founded in 2016 by Alexandr Wang and Lucy Guo after the pair met during a Quora internship in San Francisco. Wang was a 19-year-old MIT freshman and Guo was a 22-year-old Carnegie Mellon student; both dropped out to join Y Combinator's winter 2016 cohort. Guo left Scale in 2018. Both founders retained equity that later made them billionaires as the company grew to a $29 billion valuation. Wang served as CEO from founding through June 2025, when he moved to Meta as Chief AI Officer to lead Meta Superintelligence Labs as part of Meta's $14.3 billion investment. Jason Droege, who joined Scale in September 2024 as Chief Strategy Officer, was promoted to Interim CEO in June 2025. Droege brings over 20 years of technology leadership, including senior roles at Uber Eats and Axon. Wang remains on Scale's board of directors as of mid-2026.

Question 3

How much funding has Scale AI raised?

Accepted Answer

Scale AI has raised $15.9 billion in total funding across nine rounds from 62 investors. Early backers include Y Combinator from the winter 2016 cohort, followed by Dragoneer Investment Group, Tiger Global Management, and Index Ventures across subsequent series as Scale won clients including OpenAI, Google, Microsoft, and the US Department of Defense. Meta Platforms first invested in Scale's Series F round in May 2024, then made a far larger commitment in June 2025: $14.3 billion for a 49% minority stake, valuing Scale at over $29 billion and setting a record as the largest single private AI investment at that time. The Meta transaction provided existing shareholders with significant liquidity and reduced near-term pressure for an IPO. Scale remains a private company as of mid-2026 and has not filed for an IPO. Analysts estimate a potential public offering between 2027 and 2029. At $2 billion in 2025 ARR, the $29 billion valuation works out to roughly 14.5 times recurring revenue, which is the figure public-market investors would weigh in any listing.
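
The valuation quoted above follows directly from the deal terms. As a quick sanity check, the sketch below divides the investment by the stake acquired; the function name is illustrative, and it assumes the simplest reading of the deal (the full $14.3 billion buys exactly 49% of the company, with no other terms affecting the price).

```python
def implied_valuation_bn(investment_bn: float, stake_fraction: float) -> float:
    """Return the company valuation implied by paying `investment_bn`
    (in billions of USD) for `stake_fraction` of the equity.

    Assumption: the whole investment purchases the stated stake and nothing else.
    """
    if not 0 < stake_fraction <= 1:
        raise ValueError("stake_fraction must be in (0, 1]")
    return investment_bn / stake_fraction


# Figures taken from the answer above: $14.3B for a 49% stake.
valuation = implied_valuation_bn(14.3, 0.49)
print(f"Implied valuation: ${valuation:.1f}B")  # prints "Implied valuation: $29.2B"
```

The result, about $29.2 billion, is consistent with the "over $29 billion" figure reported for the transaction.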

Question 4

What products does Scale AI make?

Accepted Answer

Scale AI's primary product is the Data Engine, available in two modes: Rapid (Scale provides a managed contractor workforce alongside its software, handling recruitment, quality control, and workflow) and Self-Serve (enterprise customers use Scale's tooling with their own annotation teams). The Data Engine covers the full machine learning data lifecycle: collection, curation, annotation, RLHF for language model fine-tuning, and model evaluation. Scale Donovan is the government and defense product: a secure, auditable interface for US military and intelligence customers to analyze classified and unclassified data, deployed under contracts with the Chief Digital and Artificial Intelligence Office. Scale also offers test and evaluation services, combining human red-teamers and LLM-assisted techniques to identify risks in AI models before deployment. In April 2026, Scale acquired Illumina Computing Group's defense analytics platform, adding government-specific tooling to the portfolio. Task pricing for the Data Engine ranges from cents per simple classification to several dollars for complex 3D bounding boxes, with enterprise and government contracts negotiated separately.

Question 5

Where is Scale AI headquartered and how big is the team?

Accepted Answer

Scale AI is headquartered in San Francisco, California, where it has been based since its founding in 2016. The company maintains government-cleared facilities for its Donovan and defense-focused teams, separate from its main commercial offices. Full-time headcount is approximately 1,200 employees covering engineering, sales, operations, and management roles. Total headcount including contractors reaches 6,693 as of May 2026, with the contractor workforce performing the bulk of the annotation and labeling work. Full-time headcount declined roughly 15% year-over-year in 2025 as AI-assisted labeling tools automated more of the annotation pipeline. Scale is actively hiring for government, physical AI, and enterprise applications roles in 2026. The company projects its international business to double in 2026 through government partnerships in allied nations, suggesting headcount growth outside the US.

Question 6

What is Scale AI's mission or research focus?

Accepted Answer

Scale AI's stated mission is to develop reliable AI systems for the world's most important decisions, a formulation updated in September 2025 to emphasize production-readiness as AI moves from research pilots to deployed products. The company publishes research on data quality methodology, model evaluation benchmarks, and red-teaming practices for frontier language models. Scale's test and evaluation practice functions as an independent auditor for some of the largest AI systems in production, using both human expert evaluators and automated LLM-based testing to identify failure modes and vulnerabilities. In 2026, Scale is expanding into physical AI training data through a global robotics data program that recruits contractors to produce point-of-view demonstrations for companies training AI-powered robots. The government mission through Scale Donovan connects the data work to national security applications, from intelligence analysis to warfighter decision support. Scale does not primarily identify as a research lab, but its data curation methodology and model evaluation standards influence how the industry measures AI system quality.

Question 7

Is Scale AI compliant with SOC 2, GDPR, and HIPAA?

Accepted Answer

Scale AI has a SOC 2 Type II attestation, meaning an independent auditor examined its security controls over an audit period against the availability, confidentiality, and processing integrity criteria. The company is also certified under ISO 27001:2022 for information security management, with the certificate available from Scale's trust center at trust.scale.com. Scale is HIPAA-eligible for customers handling protected health information. FedRAMP High authorization enables Scale to handle sensitive unclassified US government data on approved cloud infrastructure; classified workloads fall outside FedRAMP, which covers unclassified data only. UK Cyber Essentials certification covers Scale's British government work. GDPR compliance is addressed through Scale's data processing agreements for European enterprise customers. Scale does not train its own AI models on customer annotation data by default, and enterprise customers can negotiate zero-retention terms for sensitive data.

Question 8

Who are Scale AI's main competitors?

Accepted Answer

Scale AI's primary competitors in the AI training data market are Appen, Labelbox, and Surge AI. Appen is a publicly traded managed annotation service with a model similar to Scale's Rapid tier; Scale beats Appen on RLHF quality for frontier models and US government security clearances, while Appen offers lower costs on high-volume, lower-complexity tasks. Labelbox is an enterprise software platform that lets teams manage their own annotation workflows; Scale wins on managed workforce depth while Labelbox wins on self-serve flexibility for teams that want full pipeline control. Surge AI, a bootstrapped San Francisco company, crossed $1 billion in ARR by 2025, competing specifically on RLHF and preference-tuning for language model providers. Meta's June 2025 investment prompted Google, OpenAI, and Microsoft to reduce or exit their Scale relationships over vendor-neutrality concerns about Meta's 49% stake. Scale's clearest competitive barrier today is the US defense vertical, where FedRAMP High authorization, CDAO contracts, and government-cleared personnel create switching costs that commercial annotation competitors cannot match.

Scale AI: $29B Data Engine for AI Labs (Founded 2016)
