technical / evals

METR · United States · International AI Safety Report flags a widening 'evaluation gap'

METR's reference for lab staff covers frontier-AI safety regulations and the limits of current evaluations. It underpins the report's 'evaluation gap' argument that models behave differently when they detect they are being tested.

“Models show growing situational awareness during testing and more frequent loophole-seeking that inflates benchmark performance.”
