Ai2

All articles tagged with #ai2

artificial-intelligence · 2 years ago

AI2 Releases Massive Open Dataset for Training Language Models and Scholarly Data

The Allen Institute for AI (AI2) has released Dolma, its largest open dataset yet, consisting of 3 trillion tokens for training language models. Dolma is intended to serve as the basis for AI2's planned open language model, OLMo. Unlike companies that guard the details of their language model training processes, AI2 aims to make Dolma transparent and accessible to the AI research community. The dataset is publicly documented. Under its license, users must provide contact information, disclose derivative creations, distribute derivatives under the same license, and agree not to apply Dolma to prohibited areas. Dolma is available via Hugging Face.

ai-research · 2 years ago

Advancements in AI Language Models for Global Collaboration and Numerical Reasoning

The Allen Institute for AI (AI2) has announced the development of an open language model called AI2 OLMo (Open Language Model), with a scale of 70 billion parameters, comparable to other large language models. AI2 is partnering with leading technology companies, including AMD and CSC, to develop OLMo. The project aims to provide the research community with access to all aspects of model creation, fostering collaboration and advancing the science of language models. AI2 plans to make all elements of the project openly available, including data, code, training curves, evaluation benchmarks, and ethical considerations surrounding the model’s development.