Measuring Biological Capabilities and Risks of AI Agents
Generating and Interpreting Evidence from Agentic Evaluations
Expert Insights
Published Feb 10, 2026
This perspective examines biological agentic evaluations as an emerging tool for assessing the capabilities and risks of autonomous AI systems in biological contexts. Drawing on hands-on evaluation experience, it offers practical guidance on defining, designing, running, scoring, and interpreting evaluations, highlighting how design choices shape conclusions and policy relevance.
This effort was independently initiated and conducted by the Center on AI, Security, and Technology within RAND Global and Emerging Risks using income from operations and gifts and grants from philanthropic supporters.
This publication is part of the RAND expert insights series. The expert insights series presents perspectives on timely policy issues.
This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit www.rand.org/pubs/permissions.
RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.