Toward Comprehensive Benchmarking of the Biological Knowledge of Frontier Large Language Models
Research. Published Nov 25, 2025
Artificial intelligence (AI) systems demonstrate deep knowledge across a broad variety of scientific domains, including biology and chemistry, and bad actors could misuse some of these systems to develop biological or chemical weapons. In this report, the authors evaluate the most-capable AI models (as of May 2025) against eight leading knowledge benchmarks to determine the degree to which frontier AI systems pose biological or chemical risks.
The constant development of more-capable models necessitates rapid evaluation mechanisms for governments to respond to emerging security risks in a timely manner. Policymakers, industry experts, and third-party evaluators lack a cohesive standard for testing AI systems' safety levels. These challenges complicate efforts to determine the degree to which frontier AI systems pose biological or chemical risks.
The authors evaluate how useful frontier AI systems could be to someone seeking to misuse them to develop biological or chemical weapons. They focus on custom-tuned versions of open-weight AI models, which can be modified to remove safety guardrails, to increase biological capabilities, or both. For this report, the authors evaluated 39 of the most-capable models (as of May 2025) against six public biological and chemical knowledge benchmarks and two refusal benchmarks relevant to biological and chemical threats.
This work was independently initiated and conducted within the Technology and Security Policy Center of RAND Global and Emerging Risks using income from operations and gifts from philanthropic supporters. A complete list of donors and funders is available at www.rand.org/TASP.
This publication is part of the RAND research report series. Research reports present research findings and objective analysis that address the challenges facing the public and private sectors. All RAND research reports undergo rigorous peer review to ensure high standards for research quality and objectivity.
This publication supersedes a previous version published in 2025 (WR-A3797-1).