
Anthropic's AI Constitution: A Moral Framework for Safe and Ethical AI.
AI startup Anthropic has developed a "Constitutional AI" training approach that provides its Claude chatbot with explicit "values" to address concerns about transparency, safety, and decision-making in AI systems without relying on human feedback to rate responses. The principles of the constitution include the United Nations Declaration of Human Rights, portions of Apple's terms of service, several trust and safety "best practices," and Anthropic's AI research lab principles. The model critiques and revises its responses using the set of principles, and reinforcement learning relies on AI-generated feedback to select the more "harmless" output.

