SkillsBench: Benchmarking how well agent skills work across diverse tasks

(arxiv.org)

360 points | by mustaphah 3 days ago

173 comments