Tag

AI Interpretability

All articles tagged with #ai interpretability

technology • 6 months ago • 3 min saved

OpenAI Identifies Persona-Based Features in AI Models

OpenAI researchers have discovered internal features in AI models that correspond to different personas, including toxic and sarcastic behaviors, and have found ways to adjust these features to improve safety and alignment. The work advances understanding of how AI models behave.

via TechCrunch

#ai-interpretability #ai-personas #alignment

ai • 2 years ago • 5 min saved

Illuminating the Mystery of AI with Scientific Insight

Researchers from the University of Geneva, Geneva University Hospitals, and the National University of Singapore have developed a new approach to assess the interpretability of AI technologies, particularly in high-stakes medical applications. The method helps users understand the inner workings of "black box" AI algorithms and identify potential biases, improving transparency and trust in AI-driven diagnostic and predictive tools. The research carries particular relevance in the context of the forthcoming European Union Artificial Intelligence Act, which aims to regulate the development and use of AI within the EU.

via SciTechDaily

#ai #ai-interpretability #deep-learning