Evo 2: an AI that reads and writes genomes across all life
Arc Institute's 40B-parameter DNA foundation model predicts disease mutations and designs bacterial-scale genomes, making it the design-tool side of the biosecurity debate
Summary
Arc Institute released Evo 2, a DNA foundation model trained on over 9 trillion nucleotides from more than 100,000 species across the tree of life; the work was published in Nature in March 2026. At 40 billion parameters with a 1-megabase context and single-nucleotide resolution, Evo 2 both predicts disease-causing mutations and generates sequences as long as the genomes of simple bacteria. It was built with NVIDIA (hosted on BioNeMo) and researchers at Stanford, UC Berkeley and UCSF. Evo 2 is the design-tool half of the biosecurity-screening debate: a model that can write novel functional DNA is exactly what worries the labs pushing mandatory synthesis screening, since AI-generated sequences may evade conventional motif checks.
By the numbers
- 40B, model parameters.
- 9 trillion, nucleotides in training data.
- 100,000, species spanned (all domains of life).
- 1 megabase, context length, single-nucleotide resolution.
- Mar 2026, published in Nature; weights and code released.
Why it matters
Evo 2 marks generative biology crossing into genome-scale design: the predictive payoff (variant effect, drug targets) and the dual-use risk (designing novel sequences) arrive in the same open model. It is the concrete reason the AI×biology screening fight is live now, not hypothetical.
What to watch
- Independent validation of Evo 2's variant-effect and design claims.
- Whether release norms tighten for generative-biology models.
- Integration of design tools like Evo 2 into drug-discovery pipelines.