Model Extraction News

technology • 1 day ago • 3 min saved

Anthropic alleges Chinese firms used 16M Claude prompts to clone capabilities

Anthropic says three Chinese AI labs (DeepSeek, Moonshot AI, and MiniMax) ran industrial-scale distillation attacks against Claude, generating more than 16 million exchanges through about 24,000 fraudulent accounts and proxy services. Each campaign targeted different Claude capabilities: DeepSeek went after reasoning and censorship-safe responses (≈150,000 exchanges), Moonshot AI after agentic reasoning, tool use, coding, and vision (≈3.4 million), and MiniMax after agentic coding and tool use (≈13 million). According to Anthropic, the prompts were designed both to harvest capabilities for training rival models and to evade detection, and the company says the campaigns raise significant national-security concerns because the extracted capabilities are unguarded. Anthropic says it has strengthened its defenses and detection, and it frames the attacks as illicit distillation rather than a typical user risk. Google had reported similar attacks earlier.

via The Hacker News

#anthropic #china #claude

technology • 11 days ago • 6 min saved

Google says Gemini faced 100,000 prompt attacks to distill a cheaper clone

Google says commercially motivated actors tried to clone its Gemini AI by prompting it more than 100,000 times and using the responses to distill cheaper copies. The company says it has adjusted Gemini’s defenses against such model-extraction attacks, which researchers say have originated from around the world.

via Ars Technica

#ai #distillation #gemini