A simple efficiency metric derived from METR Time Horizons v1.1 data: IY = (p50 time horizon × avg score) / working time. It measures quality-weighted task-minutes per compute-minute. By this metric, the gap between frontier models is larger than accuracy leaderboards suggest.
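As a minimal sketch, the IY formula can be computed as follows. The function name `iy`, the parameter names, and the choice of units (time horizon and working time both in minutes, average score as a fraction between 0 and 1) are assumptions made for illustration. The input values are hypothetical and are not taken from the METR data.

```python
def iy(p50_horizon_min: float, avg_score: float, working_time_min: float) -> float:
    """Return IY = (p50 time horizon * avg score) / working time.

    Assumed units (not specified in the source): both times in minutes,
    avg_score as a fraction in [0, 1].
    """
    if working_time_min <= 0:
        raise ValueError("working_time_min must be positive")
    if not 0.0 <= avg_score <= 1.0:
        raise ValueError("avg_score must be between 0 and 1")
    return p50_horizon_min * avg_score / working_time_min


# Hypothetical inputs, for illustration only (not METR results).
model_a = iy(p50_horizon_min=120.0, avg_score=0.5, working_time_min=30.0)
model_b = iy(p50_horizon_min=240.0, avg_score=0.75, working_time_min=180.0)
print(model_a, model_b)
```

In this made-up case the second model has the longer time horizon and the higher score, yet the lower IY, because it spends far more working time to get there. That is the kind of difference an accuracy-only comparison would hide.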