Open-Source Evo 2 AI Maps Genome Features Across Life

Source: Ars Technica
TL;DR Summary

Ars Technica reports on Evo 2, an open-source large genome model trained on 8.8 trillion bases from bacteria, archaea, eukaryotes, and related viruses. The model can identify genes, regulatory DNA, and splice sites without task-specific tuning. Built on a StripedHyena 2 CNN, Evo 2 was trained in two stages, first on short, feature-rich segments and then on long-range sequences, and was released with model weights, training and inference code, and the OpenGenome2 dataset. It shows strong genome-annotation capabilities, recognizing features across the domains of life and the effects of some mutations. Its ability to design functional new proteins remains unproven, however, and early tests of regulatory sequence activity yielded only modest results. The researchers anticipate many possible uses and further specialization, and the code and data are open for community exploration.
