Data and AI-Enabled Biological Design
Risks Related to Biological Training Data and Opportunities for Governance
Expert Insights | Published Jun 30, 2025
Artificial intelligence models trained on large volumes of biological data (AI-bio models) have demonstrated a growing ability to support basic scientific research goals. But some AI-bio models may be dual use, providing both beneficial capabilities and potentially dangerous ones. A nefarious actor with access to a frontier AI-bio model might be able to use it to design a pathogen with harmful phenotypic characteristics that enhance transmissibility. Model capabilities are closely linked to the data used to train them, yet much less attention has been devoted to the relationship between dangerous capabilities and biological training data. The data that are included in (or excluded from) model training heavily influence the models' capabilities and limitations. Governance of the data used to train AI-bio models could be a useful way to allow beneficial scientific research while safeguarding against potentially dangerous capabilities.
The authors of this paper assess current knowledge about the link between biological data and AI-bio model capabilities, describe the anticipated impacts of new biological data sources, and outline potentially dangerous capabilities that could come from broad availability of certain types of biological data. They then recommend strategies to limit the potentially dangerous capabilities arising from biological data, including options for governance of experiments and data creation, governance of curation and aggregations of data, controls on access to collections of data, and governance of the use of data for model training.
This research was independently initiated and conducted within the Meselson Center within RAND Global and Emerging Risks using gifts for research at RAND's discretion from philanthropic supporter Open Philanthropy, as well as gifts from other RAND supporters and income from operations.
This publication is part of the RAND expert insights series. The expert insights series presents perspectives on timely policy issues.
This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit www.rand.org/pubs/permissions.
RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.