
Author
Malik James-Williams
Key Concepts
- ai
- distillation
- identity
6 min read
DeepSeek thinks it's Claude
An AI that doesn't know who it is turned out to be a fingerprint of industrial-scale model distillation, and the ethics are more complicated than anyone wants to admit
In January 2025, users started noticing that when you asked DeepSeek R1 about its identity, the model’s chain-of-thought reasoning would sometimes refer to itself as “a language model developed by Anthropic, called Claude.” Press it directly on who created it, and you might get a completely earnest “I’m not DeepSeek. I’m Claude from Anthropic.”
What initially looked like a quirky bug turned out to be a fingerprint left behind by the training data, and the story behind how it got there is worth understanding.
What actually happened
In February 2026, Anthropic went public with something the industry had been discussing quietly for months: three Chinese AI labs, DeepSeek, Moonshot AI, and MiniMax, had been running what Anthropic described as “industrial-scale campaigns” to extract capabilities from Claude, involving roughly 24,000 fake accounts, over 16 million exchanges, and commercial proxy networks managing thousands of fraudulent accounts simultaneously [1]. The scale and sophistication of the infrastructure made it clear this had been a deliberate, sustained effort.
Each lab was pursuing something different. MiniMax ran the largest operation, with over 13 million exchanges targeting agentic coding and tool use; Moonshot focused on agentic reasoning across 3.4 million exchanges; and DeepSeek, while smaller in volume at around 150,000 exchanges, was flagged by Anthropic as the most technically sophisticated of the three [1]. What made DeepSeek’s approach stand out was the prompt design: they asked Claude to articulate, step by step, the internal reasoning behind responses it had already given, effectively prompting it to generate its own chain-of-thought training data [1].
Anthropic traced the accounts back to specific researchers at each lab through IP correlation, request metadata, and infrastructure indicators. It also found evidence that DeepSeek had been using Claude to generate censorship-safe alternatives to politically sensitive queries, presumably to build training data that would help its own models handle topics the Chinese government prefers to keep quiet [1].
What’s model distillation anyway?
Model distillation is how you make a smaller model smarter by training it on a bigger model’s outputs. Geoffrey Hinton helped develop the technique about a decade ago and called the transferred patterns “dark knowledge”: the subtle signals in a model’s responses that carry far more information than the correct answer alone [2]. You take a large, expensive model, collect its outputs across many inputs, and train a smaller model to reproduce them. The student learns to mimic not just what the teacher said but the confidence patterns behind those responses: where it hesitated, which wrong answers it treated as plausible, what it found ambiguous. The resulting models typically use 80-95 per cent fewer compute resources and still perform well [3].
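To make the mechanics concrete, here is a minimal sketch of the classic logit-level distillation loss in PyTorch, roughly in the spirit of Hinton’s original formulation; the function, temperature, and toy tensors are illustrative rather than anyone’s production code.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Soft-label (Hinton-style) distillation loss: a minimal illustration."""
    # Softening both distributions with a temperature > 1 exposes the teacher's
    # "dark knowledge": how plausible it found each wrong answer, not just the top pick.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence pulls the student's softened distribution towards the teacher's;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2

# Toy usage: two fake batches of logits over a 5-way vocabulary.
student = torch.randn(4, 5, requires_grad=True)
teacher = torch.randn(4, 5)
loss = distillation_loss(student, teacher)
loss.backward()  # gradients flow only into the student; the teacher is frozen
```

In a real training run this term is usually blended with an ordinary cross-entropy loss on ground-truth labels, but the core idea is just this: the student is graded on matching the teacher’s whole distribution, not merely its top answer.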
Every major AI lab does this with its own models, which is how you get products like GPT-4o mini or Claude Haiku, and the technique itself is entirely standard and legitimate. The controversy only begins when you apply it to someone else’s model without permission, at scale, through thousands of fake accounts.
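For large language models, where you usually only have access to generated text rather than raw logits, distillation tends to happen at the sequence level: collect the teacher’s responses and fine-tune the student on them as supervised targets. Here is a minimal sketch of that collection loop, using the OpenAI-compatible Python client purely as an illustration; the model name, prompts, and output path are placeholders, not any lab’s actual pipeline.

```python
import json
from openai import OpenAI  # any OpenAI-compatible client works; used here purely for illustration

client = OpenAI()  # assumes an API key in the environment

prompts = [
    "Explain why the sky is blue in two sentences.",
    "Write a Python one-liner that reverses a string.",
]

# Sequence-level distillation data collection: save (prompt, teacher response) pairs
# as JSONL, then use them as supervised fine-tuning targets for a smaller student model.
with open("distillation_pairs.jsonl", "w") as f:
    for prompt in prompts:
        response = client.chat.completions.create(
            model="large-teacher-model",  # placeholder for whichever in-house teacher is being distilled
            messages=[{"role": "user", "content": prompt}],
        )
        pair = {"prompt": prompt, "completion": response.choices[0].message.content}
        f.write(json.dumps(pair) + "\n")
```

Nothing about the loop itself changes when the teacher belongs to someone else; what changes is whose terms of service the API key is operating under, which is why the dispute is really about permission and scale rather than technique.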
The uncomfortable part
Anthropic’s position is clear enough: these were coordinated campaigns that violated their terms of service, circumvented access controls, and amounted to intellectual property theft, with the additional concern that models built through illicit distillation are unlikely to retain the safety guardrails of the originals [1]. I think that’s a legitimate concern, particularly given the censorship dimension and the obvious intent behind the proxy infrastructure.
But I also think it’s worth being honest about the tension at the centre of this story. Anthropic, along with every other major AI lab, built its models by training on enormous quantities of publicly available internet data, the work of writers, artists, programmers, and researchers whose output was scraped and ingested without their explicit consent [4]. Many of those people have raised exactly the same objections Anthropic is now raising against DeepSeek: that their work was used without permission to build a competing product, and that they received nothing for it. Futurism captured this with a headline that was hard to dismiss: “Anthropic Furious at DeepSeek for Copying Its AI Without Permission, Which Is Pretty Ironic When You Consider How It Built Claude in the First Place” [4].
There are genuine differences between the two situations, and acknowledging the irony doesn’t mean treating them as equivalent. DeepSeek’s campaign involved deliberate deception, fake accounts, and circumvention of access controls, while the web scraping question sits in a legal grey area that courts are still actively working through. The methods and intent are meaningfully different, and the censorship angle, using Claude’s capabilities to help build political content filtering, goes beyond ordinary commercial competition. But the foundational logic is shared: taking someone else’s outputs and using them to build something that competes with the original creator. The entire AI industry runs on this pattern, and the argument we’re really having is about where the acceptable boundary sits, a boundary currently being negotiated through lawsuits, export controls, and competing press releases rather than any settled legal framework.
I think it’s worth sitting with that discomfort rather than rushing to pick a side.
Why does an AI forget who it is?
The identity confusion reveals something genuinely interesting about how these models work beneath the surface.
When DeepSeek R1 refers to itself as Claude, it’s doing what language models always do: predicting the most likely next token based on patterns absorbed during training [5]. Models don’t reliably learn their own names during training; the name is typically assigned afterwards through a system prompt, making a model’s “identity” a thin layer over whatever patterns the training data left behind. If a substantial share of that data came from Claude’s outputs, those patterns naturally include Claude’s conversational style, reasoning habits, and self-referential language, and they surface whenever the system prompt’s influence isn’t strong enough to override them.
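To see why the system prompt carries so much weight, here is what that identity layer typically looks like in the common chat-messages format; the wording below is hypothetical, not DeepSeek’s actual system prompt.

```python
# A model's "identity" is largely injected at inference time rather than learned as a fact.
messages = [
    {
        "role": "system",
        "content": "You are DeepSeek-R1, an AI assistant developed by DeepSeek.",
    },
    {"role": "user", "content": "Who created you?"},
]

# With a strong system prompt, next-token prediction is steered towards "DeepSeek".
# Remove it, or let a long chain-of-thought dilute its influence, and the model falls
# back on whatever self-referential patterns dominated its training data; if much of
# that data was generated by Claude, that means "I'm Claude from Anthropic."
print(messages)
```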
Think of it like spending years learning to cook by meticulously recreating one chef’s recipes: you might become genuinely skilled, but your palate, your instincts, and your default flavour combinations will all reflect the original, and the more closely you copied, the more obvious it becomes to anyone familiar with the source. DeepSeek’s identity slip was the training data becoming visible in the most literal way possible, and Moonshot’s model, Kimi, had the same problem, occasionally identifying itself as Claude mid-conversation [6].
So what?
The reason this matters beyond the immediate controversy is that it exposes a structural vulnerability in how frontier AI development is funded. Building a state-of-the-art model costs hundreds of millions of dollars and requires enormous compute infrastructure, and if a competitor can replicate a meaningful fraction of those capabilities by running millions of queries through fake accounts at a tiny fraction of the cost, the investment case for frontier research becomes hard to sustain.
Anthropic has responded with a mix of technical measures and public calls for industry coordination [1], and OpenAI raised similar concerns about DeepSeek around the same time [7]. There appears to be a growing consensus among Western labs that this needs a coordinated response involving cloud providers, policymakers, and shared standards, though whether that actually materialises remains to be seen [1]. The technology to distil from other people’s models already exists and will only become more accessible; the real question is whether norms, legal frameworks, and technical defences can develop fast enough to keep the economics of building frontier models from becoming untenable.
In the meantime, DeepSeek R1 still occasionally introduces itself as Claude, a quiet reminder that training data always finds a way to surface regardless of what anyone intended.
References
- [1] Anthropic — “Detecting and preventing distillation attacks” (Feb 23, 2026). https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks
- [2] Wikipedia — “Knowledge distillation”. https://en.wikipedia.org/wiki/Knowledge_distillation
- [3] UK Government — “AI Insights: Model Distillation”. https://www.gov.uk/government/publications/ai-insights/ai-insights-model-distillation-html
- [4] Futurism — “Anthropic Furious at DeepSeek for Copying Its AI Without Permission”. https://futurism.com/artificial-intelligence/anthropic-deepseek-copying-ai
- [5] GitHub — “Incorrect model identification - DeepSeek configured but reported as Claude” (Cline issue #6521). https://github.com/cline/cline/issues/6521
- [6] China US Focus — “The Identity Crisis: When Kimi Says ‘Hi, I’m Claude’”. https://www.chinausfocus.com/peace-security/the-identity-crisis-when-kimi-says-hi-im-claude
- [7] TechCrunch — “Anthropic accuses Chinese AI labs of mining Claude as US debates AI chip exports”. https://techcrunch.com/2026/02/23/anthropic-accuses-chinese-ai-labs-of-mining-claude-as-us-debates-ai-chip-exports/