Anthropic Accuses Chinese AI Firms of Large-Scale Claude Model Distillation

CyberSecureFox 🦊

Anthropic has reported what it describes as a large-scale model distillation campaign targeting its Claude large language model (LLM), allegedly conducted by three Chinese AI companies: DeepSeek, Moonshot AI and MiniMax. According to the company, more than 16 million API calls were generated via over 24,000 fake or proxy accounts, despite Anthropic’s services not being officially available in mainland China.

The case highlights a rapidly emerging class of AI model extraction attacks, where adversaries do not hack infrastructure in the traditional sense, but instead attempt to “copy” the behavior and capabilities of frontier models through intensive API interaction.

What is AI model distillation and why it matters for security

Model distillation is a well-known machine learning technique in which a smaller or cheaper model (the “student”) is trained on the outputs of a larger, more capable model (the “teacher”). This approach is widely used to compress large models, reduce latency and cut inference costs.
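In its simplest form, the student is trained to match the teacher's full output distribution rather than a single correct label. A minimal sketch of that objective, using only the Python standard library (the temperature value and toy logits are illustrative, not tied to any particular model):

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to a probability distribution, softened by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student distributions.

    Matching the teacher's soft targets (not just its top answer) is what
    makes distillation so data-efficient: each teacher output carries
    information about the relative plausibility of every alternative.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * (math.log(pi + 1e-12) - math.log(qi + 1e-12))
               for pi, qi in zip(p, q))

# A student slightly off from the teacher incurs a small positive loss;
# gradient descent on this loss pulls the student toward the teacher.
loss = distillation_loss([1.8, 1.1, 0.2], [2.0, 1.0, 0.1])
```

When the "teacher" is a black-box API rather than an in-house model, the attacker only sees sampled text, not logits, but the principle is the same: harvested outputs become the training signal.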

The technique becomes problematic when it is used to replicate commercially valuable capabilities of a proprietary model without authorization. Instead of investing billions of dollars in training and safety work, an organization can attempt to treat a competitor’s API as a shortcut, harvesting answers at scale to bootstrap its own LLM.

From a cybersecurity and intellectual property perspective, this is increasingly treated as a form of AI model IP theft or unauthorized “capability scraping,” even when the underlying API calls were paid for rather than the result of a classic intrusion.

How the alleged Claude model extraction campaign was organized

Anthropic states that the traffic originated from large networks of fake accounts and commercial AI proxy services that resell access to top LLMs. The company refers to this infrastructure as a “hydra cluster”: a distributed mesh of thousands of accounts designed to hide abnormal usage patterns and bypass API rate limits.

In at least one case, a single proxy network allegedly controlled more than 20,000 accounts simultaneously, mixing distillation-style requests with legitimate customer traffic. This blending of malicious and benign queries significantly complicates traditional anomaly detection based on simple spikes in usage or the behavior of individual accounts.

DeepSeek: testing reasoning and politically sensitive content

Anthropic attributes more than 150,000 interactions to DeepSeek-linked activity. The reported focus was on logical reasoning capabilities and on how Claude generates “politically safe” responses to sensitive topics. Such traffic is consistent with attempts to replicate not only raw language ability, but also content moderation strategies and safety alignment behavior.

Moonshot AI: autonomy, tools and vision capabilities

Moonshot AI is said to have generated over 3.4 million API calls, concentrating on Claude’s autonomous task handling, coding support, tool use and vision capabilities. This pattern aligns with efforts to reproduce complex reasoning chains, multi-step workflows and integrations that are crucial for deploying LLMs in real-world applications.

MiniMax: code-centric distillation at massive scale

The largest share of suspected distillation traffic, more than 13 million message exchanges, is linked by Anthropic to MiniMax. The emphasis was reportedly on code generation and code analysis. Nearly half of the traffic switched to the newest Claude version immediately after its release, suggesting a systematic effort to rapidly snapshot each fresh model iteration.

Geopolitics, advanced chips and national security concerns

Anthropic stresses that campaigns of this scale require access to high-end compute and specialized AI chips. The claims come amid ongoing US debates over export controls on GPUs and AI accelerators to China, and recent policy shifts allowing certain downgraded AI chips to be shipped.

From a national security standpoint, the risk is not only economic. Unregulated, distilled LLMs typically lack robust safety guardrails and content filters, making them more suitable for misuse in cyberattacks, disinformation operations, large-scale surveillance and offensive intelligence activities, where compliance and ethics are deprioritized.

Defensive measures: behavioral fingerprinting and LLM traffic classifiers

To counter model distillation and API abuse, Anthropic reports deploying behavioral fingerprinting and specialized traffic classifiers. These systems build baselines of normal customer behavior and then flag patterns typical of automated data harvesting: repetitive prompts, high request density, coordinated activity across many accounts and irregular routing through proxy networks.
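The signals listed above lend themselves to simple heuristic scoring before any machine-learned classifier gets involved. The sketch below is purely illustrative — the event schema, field names and thresholds are hypothetical, not a description of Anthropic's actual system:

```python
from collections import defaultdict

def flag_accounts(events, max_req_per_min=120, min_unique_ratio=0.2,
                  min_shared_ip_accounts=50):
    """Flag accounts whose API usage matches data-harvesting patterns.

    `events` is an iterable of dicts with keys "account", "ip", "minute"
    (a minute-resolution timestamp bucket) and "prompt_hash". All
    thresholds are illustrative placeholders.
    """
    per_account = defaultdict(lambda: {"count": 0, "prompts": set(),
                                       "minutes": defaultdict(int)})
    ip_accounts = defaultdict(set)
    for ev in events:
        a = per_account[ev["account"]]
        a["count"] += 1
        a["prompts"].add(ev["prompt_hash"])
        a["minutes"][ev["minute"]] += 1
        ip_accounts[ev["ip"]].add(ev["account"])

    flagged = set()
    for acct, a in per_account.items():
        burst = max(a["minutes"].values())           # peak requests per minute
        unique_ratio = len(a["prompts"]) / a["count"]  # prompt diversity
        # Repetitive prompt templates fired at high density are a
        # classic harvesting signature.
        if burst > max_req_per_min or unique_ratio < min_unique_ratio:
            flagged.add(acct)
    # Coordinated activity: many accounts funnelled through one proxy IP.
    for accounts in ip_accounts.values():
        if len(accounts) >= min_shared_ip_accounts:
            flagged |= accounts
    return flagged
```

The "hydra cluster" tactic described earlier defeats exactly this kind of per-account scoring by keeping each individual account under every threshold, which is why cross-account correlation (the shared-IP rule here, or richer behavioral fingerprints in practice) is the harder but more important half of the problem.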

When a request stream is classified as likely distillation, Anthropic indicates that Claude’s responses may be deliberately degraded or randomized, for example by limiting explanation depth, altering certain details or introducing extra context checks. This is part of a defense-in-depth strategy that complements standard controls such as rate limiting, API key governance and anomaly detection.
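Conceptually, such a control sits between the model and the API response. The sketch below shows one possible shape of it; the scoring input, threshold and degradation strategy (randomized truncation of explanation depth) are hypothetical choices for illustration, not Anthropic's actual mechanism:

```python
import random

def serve_response(full_answer, suspicion_score, threshold=0.8, rng=None):
    """Degrade replies for request streams classified as likely distillation.

    `suspicion_score` (0..1) would come from an upstream traffic
    classifier. For flagged streams, a random fraction of the explanation
    is dropped, so the harvested text becomes a noisier, less useful
    training signal. Purely illustrative.
    """
    if suspicion_score < threshold:
        return full_answer  # legitimate traffic gets the full answer
    rng = rng or random.Random()
    sentences = [s for s in full_answer.split(". ") if s]
    # Keep between a quarter and a half of the sentences, chosen randomly
    # so repeated queries do not reveal a stable truncation rule.
    keep = max(1, int(len(sentences) * rng.uniform(0.25, 0.5)))
    return ". ".join(sentences[:keep]) + "."
```

The design trade-off is the same one the article hints at: any degradation strong enough to poison a distillation corpus risks false positives against real customers, which is why this sits alongside, rather than in place of, rate limiting and account-level enforcement.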

Backlash and accusations of double standards in AI data use

The disclosure has triggered strong reactions online, including criticism directed at Anthropic itself. Commentators have highlighted prior controversies around the use of unlicensed books, web content and social media data to train large AI models, as well as legal disputes over web scraping from platforms like Reddit.

Critics argue that a company whose models rely heavily on third-party data obtained without direct compensation to authors is now condemning other firms for reusing its own model outputs in a similar way. Some also note that, in this case, the Chinese firms reportedly paid for API access, whereas a large number of content creators used in LLM training corpora have not been remunerated.

The Claude distillation story illustrates how AI model protection is becoming a core domain of cybersecurity. Organizations deploying LLMs need to treat API endpoints as high-value assets, protected not only from classic breaches, but also from capability theft via large-scale querying, account farms and proxy clusters. Building resilient defenses now—combining strict access policies, continuous monitoring, behavioral analytics and transparent data governance—will be critical for any enterprise that relies on AI, both to safeguard its own intellectual property and to ensure the trustworthiness of the AI systems it chooses to adopt.
