ADL Study Finds Grok Ranks Last Among Major AI Chatbots at Detecting Antisemitism

Source: The Verge
TL;DR Summary

The Anti-Defamation League evaluated six large language models (Grok, ChatGPT, Llama, Claude, Gemini, and DeepSeek) on prompts about antisemitism, anti-Zionism, and extremism. Claude scored highest (80/100) and Grok scored lowest (21/100), with Grok performing especially poorly in multi-turn dialogues and image analysis. All six models showed gaps and need improvement in safety and bias detection. In its public materials, the ADL chose to foreground the best performers rather than spotlight the worst.

