If LLMs Only Predict the Next Token, Why Do They Work?

(sicheng.dev)

3 points | by sichengo 7 hours ago

8 comments