Open-world evaluations for measuring frontier AI capabilities [pdf]

(cruxevals.com)

2 points | by randomwalker 7 hours ago

No comments yet.