Claude 1.0: Anthropic's First LLM (2023) | hokai.io
Claude 1.0 by Anthropic, released March 2023. First-gen text-only LLM with 9,000-token context. Retired November 6, 2024. Trained on Constitutional AI.
Claude 1.0 was Anthropic's founding LLM, released March 14, 2023 in limited API beta. It had a 9,000-token context window (expanded to 100K in v1.3), accepted only text input, and was trained with Constitutional AI plus RLHF. No formal benchmarks were published, and pricing was per-token via the Anthropic API. The model was deprecated September 4, 2024 and fully retired November 6, 2024, with no API access remaining for any workload.
Provider: Anthropic · Family: Claude 1
Context window: 9,000 tokens
Input modalities: text · Output: text
About Claude 1.0
Claude 1.0 is the founding model of Anthropic's Claude family, released on March 14, 2023 via limited API beta to approved developers. Built on a dense Transformer architecture and trained using Anthropic's Constitutional AI method combined with reinforcement learning from human feedback, Claude 1.0 represented the company's first public deployment of its alignment-centric training philosophy. The model's parameter count was never disclosed, a pattern Anthropic maintained across all subsequent generations. Claude 1.0 sat at the entry tier of what would become a multi-generation model lineup, positioned for general text tasks with an explicit safety-first design.

Anthropic did not publish formal benchmark results for Claude 1.0 in any model card or technical report. The company's first detailed public benchmarks came with Claude 2 in July 2023, which showed improved scores versus the Claude 1 baseline on HumanEval, GSM8K, and MMLU, without specifying exact Claude 1.0 figures. Independent evaluators at the time placed performance in the GPT-3.5 tier for most text tasks, but no independently verified numbers exist for MMLU, HumanEval, or reasoning tasks for this model. The absence of published benchmarks aligned with Anthropic's early philosophy of avoiding arms-race comparisons.

Claude 1.0 launched with a 9,000-token context window, roughly equivalent to 6,000 to 7,000 words or about 20 to 25 pages of text. This was expanded substantially when claude-1.3 introduced a 100,000-token context in May 2023, an approximately eleven-fold increase; Anthropic applied the same per-token pricing to both the 9K and 100K tiers when the expansion arrived. A separate max-output-token limit was never documented. The original 9K window constrained the model to short-to-medium document processing, with no ability to ingest full books, large codebases, or extended conversation histories in a single pass.
Claude 1.0 was strictly text-to-text. It accepted text input and produced text output, with no support for images, audio, video, or native PDF parsing. Multimodal input capabilities were not introduced until Claude 3 in March 2024. Function calling existed only through structured prompting conventions, not a formal tool-use API. The model handled code generation, summarization, translation, and dialogue via text alone. No computer use, file upload, or web browsing was supported natively.

Claude 1.0 was priced on a per-token model through the Anthropic API. Exact historical per-million-token rates are no longer published following the model's retirement in November 2024. At launch in 2023, pricing was competitive with GPT-3.5 class models. Anthropic applied the same rate to both the 9K and 100K context variants when the context expansion was announced in May 2023, an unusual pricing decision that drew developer attention at the time. No free tier, batch discount, or prompt caching feature existed for Claude 1.0.

Access was available only via the Anthropic direct API during the model's operational period, with Amazon Bedrock adding support for the Claude 1 series during 2023. Authentication required an Anthropic API key. Google Vertex AI and Azure deployments were not available for the Claude 1 family. The API model IDs claude-1.0, claude-1.1, claude-1.2, and claude-1.3 all now return errors, as all four were retired on November 6, 2024. Anthropic announced the deprecation on September 4, 2024 and gave developers 60 days to migrate to claude-2.0 or newer models.

Claude 1.0 was trained using Constitutional AI, a method Anthropic published in December 2022. The technique trains the model to critique and revise its own outputs against a written set of guiding principles called a constitution, then fine-tunes on the revised outputs via supervised learning. A reinforcement learning phase follows, using AI-generated preference signals rather than purely human-labeled comparisons, which makes alignment more scalable. Anthropic published its first constitution in May 2023, with 75 principles drawn from sources including the UN Universal Declaration of Human Rights and Apple's terms of service. The model carried a strict safety posture, refusing a wider range of requests than later Claude generations, reflecting Anthropic's conservative alignment defaults during this period. No formal red-teaming partner list was published for Claude 1.0.

Claude 1.0 is fully retired and inaccessible via any API endpoint; no active production workload can use it. Developers who built on the claude-1.0 API model ID were required to migrate before November 6, 2024. The model is of historical interest as the origin of the Claude family and the first commercial deployment of Constitutional AI training, but it cannot serve any current inference use case. Anthropic has committed to long-term preservation of model weights, so the model may become accessible again in research contexts, but no public timeline exists.

Training data for Claude 1.0 consisted of a proprietary mix of publicly available Internet text, licensed third-party datasets, and human-annotated examples. The training data cutoff is estimated at early 2023, consistent with what Anthropic disclosed for Claude 2, its direct successor. Inputs to the API were retained under Anthropic's standard data policy during the model's active period. No SOC 2 Type II, HIPAA, or ISO 27001 certifications were held for the Claude 1 generation; those enterprise compliance tiers were added in later model generations alongside Anthropic's expanding enterprise business.
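The supervised phase of Constitutional AI described above can be sketched as a critique-and-revise loop whose outputs become fine-tuning data. This is an illustrative toy, not Anthropic's implementation: every function below is a hypothetical stand-in for a model call, and the single-principle constitution is invented for the example.

```python
# Illustrative sketch of the Constitutional AI supervised phase:
# the model critiques its own draft against a written principle,
# then revises it; the (prompt, revision) pairs become SFT data.
# All functions here are toy stand-ins, not Anthropic's implementation.

CONSTITUTION = [
    "Choose the response that is most helpful, honest, and harmless.",
]

def draft(prompt: str) -> str:
    # Stand-in for an initial model completion.
    return f"DRAFT answer to: {prompt}"

def critique(response: str, principle: str) -> str:
    # Stand-in for a self-critique conditioned on one principle.
    return f"Critique of '{response}' under: {principle}"

def revise(response: str, critique_text: str) -> str:
    # Stand-in for a revision that addresses the critique.
    return f"REVISED({response})"

def build_sft_pairs(prompts):
    pairs = []
    for p in prompts:
        r = draft(p)
        for principle in CONSTITUTION:
            c = critique(r, principle)
            r = revise(r, c)          # iteratively revise per principle
        pairs.append((p, r))          # fine-tune on the final revision
    return pairs

pairs = build_sft_pairs(["Explain photosynthesis simply."])
print(pairs[0][1])
```

The subsequent RL phase would then rank pairs of responses with an AI preference model rather than human labelers, which is what makes the approach scale.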
Pricing
Historical per-token pricing no longer published. Model retired November 6, 2024. No API access available.
Key Features
- Constitutional AI Training: First commercial model trained via Anthropic's Constitutional AI method, using a written set of 75 principles to guide self-critique and revision during training.
- 100K Context (v1.3): Context window expanded from 9,000 to 100,000 tokens with claude-1.3 in May 2023, eleven times the initial limit, at no additional per-token cost.
- Strict Safety Defaults: Wider refusal range than later Claude generations, reflecting Anthropic's conservative alignment posture during the 2023 period of AI development.
- Human/Assistant Prompt Format: Used the original Human/Assistant stop sequence format before the Messages API was introduced, establishing the conversational interface pattern.
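The Human/Assistant format listed above can be sketched as plain string assembly. The `build_prompt` helper is illustrative (not part of any SDK), but the `\n\nHuman:` / `\n\nAssistant:` markers and the stop-sequence convention match the legacy Text Completions interface the Claude 1 series used.

```python
# Sketch of the legacy Human/Assistant completion format used by the
# Claude 1 series (pre-Messages API). Prompts were plain strings with
# alternating "\n\nHuman:" / "\n\nAssistant:" markers, and generation
# stopped at the next "\n\nHuman:" stop sequence.

HUMAN = "\n\nHuman:"
ASSISTANT = "\n\nAssistant:"

def build_prompt(turns):
    """turns: list of (role, text) with roles 'human' or 'assistant'."""
    parts = []
    for role, text in turns:
        marker = HUMAN if role == "human" else ASSISTANT
        parts.append(f"{marker} {text}")
    parts.append(ASSISTANT)  # trailing marker cues the model to respond
    return "".join(parts)

prompt = build_prompt([("human", "Summarize this memo in two sentences.")])
print(repr(prompt))
# A request at the time would pair this with stop_sequences=["\n\nHuman:"].
```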
Pros
- First deployment of Constitutional AI training in a commercial LLM, establishing alignment practices that all later Claude models built on.
- Context window expanded to 100K tokens in May 2023 via claude-1.3, ahead of GPT-3.5 at the time.
- Strict safety posture produced lower rates of harmful content generation than many GPT-3.5 class contemporaries.
Cons
- Fully retired as of November 6, 2024. No API access is possible under any circumstances.
- No published benchmarks, making capability assessment against contemporaries reliant on informal comparisons only.
- Text-only input with no image, audio, or vision support in any version of the Claude 1 family.
Frequently Asked Questions
What is Claude 1.0 and who built it?
Claude 1.0 is the founding language model of the Claude family, built by Anthropic and released on March 14, 2023 via limited API beta to approved developers. Anthropic is a San Francisco AI safety company founded in January 2021 by Dario Amodei, Daniela Amodei, and five other former OpenAI researchers. Claude 1.0 uses a dense Transformer architecture and was the first commercial deployment of Anthropic's Constitutional AI training method, which guides model behavior using a written set of principles during training rather than relying solely on human feedback at every step. Parameter count was never disclosed, which became standard practice across all Anthropic model generations. The model was positioned for general text tasks with a strong safety focus and was available only to selected API partners during its initial release period. It sat at the entry point of what would become a multi-generation lineup including Claude 2, Claude 3, and Claude 4 families.
How much did Claude 1.0 cost per 1M tokens?
Exact historical per-million-token pricing for Claude 1.0 is no longer available in Anthropic's documentation following the model's retirement on November 6, 2024. During its active period in 2023, the model was priced on a per-token basis via the Anthropic API at rates competitive with GPT-3.5 class models. Anthropic made an unusual pricing decision when it expanded the context window to 100,000 tokens with claude-1.3 in May 2023, charging the same per-token rate as the original 9,000-token model. No free tier, batch discount, or prompt caching existed for the Claude 1 family. AWS Bedrock also offered the Claude 1 series at per-token pricing consistent with Anthropic's direct API rates during 2023. Since the model is fully retired, no current pricing applies.
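The per-token billing arithmetic itself is simple to sketch. The rates below are hypothetical placeholders, since Claude 1.0's real historical rates are no longer published:

```python
# Per-token billing arithmetic. The $2 / $6 per-million-token rates used
# in the example are HYPOTHETICAL; Claude 1.0's actual historical rates
# are no longer documented.

def request_cost(input_tokens, output_tokens,
                 in_rate_per_mtok, out_rate_per_mtok):
    return (input_tokens / 1_000_000) * in_rate_per_mtok \
         + (output_tokens / 1_000_000) * out_rate_per_mtok

# e.g. 8,000 input + 1,000 output tokens at illustrative $2 / $6 per MTok:
cost = request_cost(8_000, 1_000, 2.00, 6.00)
print(f"${cost:.4f}")  # → $0.0220
```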
What was Claude 1.0's context window and max output?
Claude 1.0 launched with a 9,000-token context window, equivalent to approximately 6,000 to 7,000 words or about 20 to 25 pages of text. This was the combined limit for input plus output in a single request. The model's specific max output token limit was not separately documented by Anthropic. The context window was expanded significantly to 100,000 tokens with the release of claude-1.3 in May 2023, an approximately eleven-fold increase that enabled ingesting full books, large codebases, or lengthy conversation histories. Anthropic notably applied the same per-token pricing to both the 9K and 100K context tiers when the expansion was announced. By comparison, the current Claude Opus 4.7 supports a 1,000,000-token context window, more than a hundred times the original Claude 1.0 limit.
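The word and page figures above follow from a common rule-of-thumb conversion, roughly 0.75 English words per token. Both constants below are heuristics, not part of any Anthropic specification:

```python
# Rough token-to-length arithmetic behind the figures above, using the
# common ~0.75 words-per-token heuristic (an approximation, not a spec).

TOKENS = 9_000
WORDS_PER_TOKEN = 0.75   # typical English-text heuristic
WORDS_PER_PAGE = 300     # rough single-spaced page

words = TOKENS * WORDS_PER_TOKEN
pages = words / WORDS_PER_PAGE
print(f"~{words:.0f} words, ~{pages:.0f} pages")   # ~6750 words, ~22 pages
```

Varying the words-per-page assumption between roughly 270 and 340 gives the 20-to-25-page range quoted in the text.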
How did Claude 1.0 compare on benchmarks vs GPT-3.5?
Anthropic did not publish formal benchmark scores for Claude 1.0. No MMLU, HumanEval, GSM8K, or SWE-bench results were released for any version of the Claude 1 family in a public model card or technical report. Anthropic's first detailed benchmark disclosures came with Claude 2 in July 2023, which the company described as improved over the Claude 1 baseline, without citing specific Claude 1 figures. Contemporary evaluators in 2023 placed Claude 1.0 in a comparable tier to GPT-3.5 for general text tasks, with lower harmful output rates attributed to Constitutional AI training. No independent third-party benchmark scores from that period are widely cited. The absence of benchmark data makes direct numerical comparison to GPT-3.5 impossible from current sources.
Was Claude 1.0 open source or proprietary?
Claude 1.0 was fully proprietary, with closed weights and API-only access. No model weights were made available for download, and no self-hosting option existed. Access required an Anthropic API key during the model's active period from March 2023 to November 2024. The model was available via the Anthropic direct API and, during 2023, via Amazon Bedrock as a managed cloud deployment. Google Vertex AI and Azure were not available for the Claude 1 series. Anthropic has committed to long-term preservation of model weights for retired models and has expressed intent to make past models accessible again in the future, but as of May 2026 no public access has been restored.
What modalities did Claude 1.0 support?
Claude 1.0 supported only text input and text output. There was no support for images, audio, video, PDFs, or file uploads of any kind. The Claude 1 family across all versions (1.0, 1.1, 1.2, 1.3) remained strictly text-to-text throughout its operational period. Multimodal input capabilities first appeared in the Claude 3 family in March 2024, more than a year after Claude 1.0's release. Function calling and tool use were not part of the Claude 1 API feature set. The Messages API that introduced structured tool use came with Claude 2 and later generations. For any task requiring image analysis, code execution, file parsing, or audio processing, Claude 1.0 was not usable.
Did Claude 1.0 train on user data?
Anthropic's policy for the Claude 1 series was that API inputs were not used to train production models by default. Standard data retention applied during the model's active period, with inputs retained under Anthropic's API terms for abuse monitoring purposes. Enterprise zero-retention options that became prominent with Claude 3 and Claude 4 were not a feature of the Claude 1 API offering. The Claude 1 generation predated Anthropic's formal SOC 2 Type II, ISO 27001, and HIPAA certifications, which were added in later product cycles as the company expanded into regulated enterprise markets. For research and historical analysis, the model's training data consisted of a proprietary mix of public web text, licensed datasets, and human feedback annotations, with a cutoff estimated at early 2023.
Who was Claude 1.0 best for and who should have avoided it?
During its active period, Claude 1.0 was best suited for developers building text-based applications who wanted a safety-conscious alternative to GPT-3.5 with Constitutional AI alignment guarantees. Content moderation teams, summarization tools, and question-answering systems were common use cases. Teams requiring vision, audio, or document parsing should have avoided Claude 1.0 entirely, as it was text-only. Developers needing large context for legal documents or codebases would have been better served by claude-1.3 (100K) or Claude 2 (same context, stronger performance). As of November 6, 2024, the model is fully retired and inaccessible for any workload. Any developer referencing claude-1.0 in production code must migrate to an active model; Anthropic recommended claude-haiku-4-5-20251001 as the migration target for the Claude 1 family.
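The migration path above can be sketched as a mapping from a legacy completions-style request to a Messages-style request body. The `migrate_request` helper is illustrative and not part of any SDK; the target model ID simply follows the recommendation quoted in the answer:

```python
# Illustrative migration sketch: converting a legacy claude-1 Text
# Completions request into a Messages API request body. The helper is
# hypothetical, not an SDK function; the target model ID follows the
# recommendation stated above.

LEGACY_TO_CURRENT = {"claude-1.0": "claude-haiku-4-5-20251001"}

def migrate_request(legacy):
    """Map a legacy completions-style request dict to a Messages-style one."""
    human_text = (
        legacy["prompt"]
        .removeprefix("\n\nHuman:")
        .removesuffix("\n\nAssistant:")
        .strip()
    )
    return {
        "model": LEGACY_TO_CURRENT[legacy["model"]],
        "max_tokens": legacy["max_tokens_to_sample"],
        "messages": [{"role": "user", "content": human_text}],
    }

old = {
    "model": "claude-1.0",
    "prompt": "\n\nHuman: Summarize this memo.\n\nAssistant:",
    "max_tokens_to_sample": 300,
}
print(migrate_request(old)["model"])   # claude-haiku-4-5-20251001
```

The resulting dict matches the shape of a Messages API request, where the prompt string's Human turn becomes a `{"role": "user"}` message and `max_tokens_to_sample` becomes `max_tokens`.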