Language Models

All articles tagged with #language models

AI coach sharpens peer review with clearer, more constructive feedback
technology · 2 days ago

Review Feedback Agent, an AI coach built from five LLMs, was developed to help peer reviewers deliver more specific, constructive, and less toxic feedback. When tested on thousands of existing reviews, it frequently suggested actionable ways to improve comments. Whether this improves the quality or impact of the papers under review remains unclear and requires further study.

ADL Study Finds Grok Fails to Detect Antisemitism Among Major AI Chatbots
technology · 28 days ago

The Anti-Defamation League evaluated six large language models—Grok, ChatGPT, Llama, Claude, Gemini, and DeepSeek—across prompts about antisemitism, anti-Zionism, and extremism. Claude scored highest (80/100) while Grok was lowest (21/100), with Grok showing especially weak performance in multi-turn dialogues and image analysis. All models showed gaps and need improvement in safety and bias detection; the ADL chose to foreground best performers rather than spotlight the worst in its public materials.

Rude Prompts May Improve ChatGPT Accuracy, Study Finds
technology · 1 month ago

A Penn State study using ChatGPT-4o found that increasingly rude prompts yielded higher accuracy (about 84.8% for very rude vs. 80.8% for very polite), challenging earlier work suggesting that politeness boosts performance. The researchers note that tiny changes in prompt wording can drastically affect outputs, and they caution against deploying hostile interfaces in real-world use, stressing that the findings are not a license to insult AI.

AI Calendar Fumble: Google's Overview Confuses 2027
technology · 1 month ago

Google's AI Overview gets next year wrong, insisting 2028 is next year even though the current year is 2026, a calendar error that has persisted for weeks. The mishap isn't isolated to Google: OpenAI's ChatGPT and Anthropic's Claude also stumble on the question, though Gemini 3 apparently gets it right. The episode underscores ongoing reliability and hallucination issues across leading AI models, even as one system in Google's own suite performs well.

NeuroSploit v2: AI-Driven Autonomous Penetration Testing for Vulnerability Detection
technology · 1 month ago

NeuroSploit v2 is an open-source, AI-powered penetration testing framework that integrates multiple large language models, including Claude, GPT, and Gemini, to automate vulnerability analysis and exploitation. It features modular roles for various security tasks, advanced error-mitigation techniques, and extensive tool integrations, and is designed to enhance offensive security operations with flexibility and ethical safeguards.

Microsoft Reveals 'Whisper Leak' Threat to Encrypted AI Chat Privacy
technology · 3 months ago

Microsoft has disclosed a side-channel attack called Whisper Leak that can infer the topics of encrypted AI chat traffic by analyzing packet sizes and timing, posing a privacy risk. The attack can identify sensitive conversation topics despite encryption; recommended mitigations include adding random text to responses to mask packet sizes. The finding highlights vulnerabilities in how current language models stream responses and the need for stronger protections.
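The padding mitigation mentioned above can be sketched in a few lines: append a random amount of filler to each streamed chunk so that ciphertext sizes no longer track the length of the underlying tokens. Everything here (the sentinel scheme, function names, and the `max_pad` parameter) is a hypothetical illustration, not Microsoft's or any vendor's actual countermeasure.

```python
import secrets
import string

# Sentinel marking where real content ends and filler begins.
# A production protocol would use an unambiguous length prefix instead;
# this sketch assumes the sentinel never appears in the chunk itself.
PAD_SENTINEL = "\x00"

def pad_chunk(chunk: str, max_pad: int = 32) -> str:
    """Append 0..max_pad random letters so packet size leaks less."""
    pad_len = secrets.randbelow(max_pad + 1)
    filler = "".join(secrets.choice(string.ascii_letters) for _ in range(pad_len))
    return chunk + PAD_SENTINEL + filler

def strip_pad(padded: str) -> str:
    """Client side: discard everything after the sentinel."""
    return padded.split(PAD_SENTINEL, 1)[0]

padded = pad_chunk("diagnosis: flu")
assert strip_pad(padded) == "diagnosis: flu"
```

Because the filler length is drawn fresh per chunk, identical responses produce differently sized packets on the wire, which is exactly the signal the attack exploits.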

Anthropic and Thinking Machines Lab Unveil AI Model Character Differences
technology · 4 months ago

A study by Anthropic and Thinking Machines Lab introduces a systematic method to stress-test AI model specifications using value tradeoff scenarios, revealing significant disagreements among models that expose gaps and ambiguities in current specs. The research analyzes 12 frontier language models, links high disagreement to specification violations, and releases a public dataset for further auditing, underscoring the need for precise and comprehensive model guidelines.

Understanding Why Language Models Hallucinate
technology · 5 months ago

The article examines the nature of hallucinations in language models, arguing that the term needs careful definition because not every incorrect output qualifies. It distinguishes between next-token prediction and the generation of false information, and weighs whether all model outputs should in some sense be considered hallucinations. The discussion also covers the challenges of reducing hallucinations, the importance of proper evaluation, and philosophical questions about AI understanding and truth. Overall, it argues that hallucinations are inherent to probabilistic models like LLMs, so efforts should focus on minimizing them rather than expecting complete elimination.
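The claim that hallucinations fall out of probabilistic next-token prediction can be illustrated with a toy sampler. The distribution below is invented for illustration only; real models operate over vocabularies of tens of thousands of tokens, but the mechanism is the same.

```python
import random

# Toy next-token distribution for a prompt like "The capital of France is".
# Probabilities are made up: most mass sits on the correct token, but
# fluent wrong alternatives retain nonzero probability.
next_token_probs = {"Paris": 0.90, "Lyon": 0.06, "Berlin": 0.04}

def sample_token(probs: dict, rng: random.Random) -> str:
    """Draw one token in proportion to its probability (inverse CDF)."""
    r = rng.random()
    cumulative = 0.0
    for token, p in probs.items():
        cumulative += p
        if r < cumulative:
            return token
    return token  # guard against floating-point round-off

rng = random.Random(42)
samples = [sample_token(next_token_probs, rng) for _ in range(10_000)]
# Sampling occasionally emits a wrong token even though "Paris" dominates,
# which is why hallucinations can be minimized but not driven to zero.
wrong_rate = sum(t != "Paris" for t in samples) / len(samples)
```

Greedy decoding (always taking the argmax) would pick "Paris" every time here, but it simply shifts the failure mode: whenever the model's learned distribution puts the most mass on a false continuation, greedy decoding confidently emits it.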

Ars Tests Reveal GPT-5's Performance Compared to GPT-4o
technology · 6 months ago

The article compares OpenAI's GPT-5 and GPT-4o models through various prompts, highlighting differences in style, creativity, and accuracy. GPT-5 generally provides more concise and direct responses, often with better factual accuracy, while GPT-4o tends to offer more detailed and personable answers. Overall, GPT-5 wins more prompts, but preferences depend on user needs and prompt types.