Why SWE-bench Verified no longer measures frontier coding capabilities
(openai.com)
7 points | by tedsanders | a day ago
No comments yet.