OpenScholar: An open, citation-aware AI for synthesizing scientific literature

Source: Nature
TL;DR Summary

OpenScholar is a fully open, retrieval-augmented language-model pipeline for synthesizing scientific literature, backed by an up-to-date data store (OSDS) of 45 million papers. It combines a bi-encoder retriever, a cross-encoder reranker, and a self-feedback loop with citation verification to generate citation-backed long-form answers. On ScholarQABench, which spans computer science, physics, biomedicine and neuroscience, OpenScholar-8B and OpenScholar-GPT-4o consistently outperform baselines (including GPT-4o) on correctness, coverage and citation accuracy, often matching or surpassing expert responses. The system also offers lower costs and full open-source access, including a public demo.
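
To make the retrieve, rerank, and verify stages concrete, here is a minimal toy sketch in Python. It is not OpenScholar's code. The neural bi-encoder and cross-encoder are replaced by simple word-overlap scorers, the self-feedback loop is reduced to a single citation check, and every name in it (`retrieve`, `rerank`, `revise`, `support`, the three-passage `store`) is hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Bi-encoder stand-in (assumption): encode a text on its own as word counts."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, store, k):
    """Stage 1: rank passages by similarity of independently computed vectors."""
    q = embed(query)
    ranked = sorted(store, key=lambda pid: cosine(q, embed(store[pid])), reverse=True)
    return ranked[:k]


def cross_score(query, passage):
    """Cross-encoder stand-in (assumption): score the pair jointly as the
    fraction of query tokens that occur in the passage."""
    q_tokens = tokenize(query)
    p_tokens = set(tokenize(passage))
    return sum(t in p_tokens for t in q_tokens) / len(q_tokens) if q_tokens else 0.0


def rerank(query, candidates, store, k):
    """Stage 2: reorder the retrieved candidates with the joint scorer."""
    ranked = sorted(candidates, key=lambda pid: cross_score(query, store[pid]), reverse=True)
    return ranked[:k]


def support(claim, passage):
    """Fraction of the claim's tokens found in the passage (toy support measure)."""
    c_tokens = tokenize(claim)
    p_tokens = set(tokenize(passage))
    return sum(t in p_tokens for t in c_tokens) / len(c_tokens) if c_tokens else 0.0


def revise(draft, candidates, store, threshold=0.5):
    """Stage 3: keep supported citations, re-attribute a claim to a better
    passage when one exists, and drop claims that nothing supports."""
    revised = []
    for claim, cited in draft:
        if cited in store and support(claim, store[cited]) >= threshold:
            revised.append((claim, cited))
            continue
        best = max(candidates, key=lambda pid: support(claim, store[pid]), default=None)
        if best is not None and support(claim, store[best]) >= threshold:
            revised.append((claim, best))
    return revised


# Hypothetical three-passage data store and a draft with two faulty citations.
store = {
    "p1": "Dense retrieval uses a bi-encoder to embed queries and passages separately.",
    "p2": "A cross-encoder reranker scores each query passage pair jointly.",
    "p3": "Protein folding is studied in structural biology.",
}
query = "how does a cross-encoder reranker score passages"
candidates = retrieve(query, store, k=2)
top = rerank(query, candidates, store, k=1)
draft = [
    ("A cross-encoder reranker scores each pair jointly.", "p1"),  # wrong source
    ("Rerankers cure all diseases.", "p2"),                        # unsupported
]
final = revise(draft, candidates, store)
print(candidates, top, final)
```

The split mirrors the usual reason for pairing the two model types: the first stage scores each passage independently, so passage vectors could be precomputed for a large store, while the joint scorer is applied only to the short candidate list. In the toy check, the miscited claim is re-attributed to the passage that supports it and the unsupported claim is dropped.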
