METR (2025)
METR (Model Evaluation and Threat Research).
URL: https://metr.org/
Abstract. A review by METR (the AI-safety evaluation non-profit) of frontier-lab safety policies and the evaluations on which they rely. The review observes that the thresholds are self-defined: each lab decides what counts as "materially uplifting a malicious actor" or "significantly accelerating biothreat research", and each lab designs the evaluations that determine whether a model crosses those thresholds. As a result, the same model can pass one lab's evaluations and fail another's. The report has been influential in calls for standardised, third-party-audited safety evaluations and in shaping policy proposals at the AI Safety Institutes.
Tags: safety policy evaluation
Cited in: