Toward Comprehensive Benchmarking of the Biological Knowledge of Frontier Large Language Models
The authors evaluate the most capable artificial intelligence (AI) models (as of May 2025) against eight knowledge benchmarks to assess the degree to which frontier AI models risk helping bad actors create biological or chemical weapons.