METR (2025)
METR (Model Evaluation and Threat Research).
URL: https://metr.org/
Abstract. A review by METR (the AI-safety evaluation non-profit) of frontier-lab safety policies and the evaluations on which they rely. The review observes that the thresholds are self-defined: each lab decides what counts as "materially uplifting a malicious actor" or "significantly accelerating biothreat research", and each lab designs the evaluations that determine whether a model crosses those thresholds. As a result, the same model can pass one lab's evaluations and fail another's. The report has been influential in calls for standardised, third-party-audited safety evaluations and in shaping policy proposals at the AI Safety Institutes.
Tags: safety policy evaluation
Cited in: