Show HN: Jam Storyteller – Attention? Memory Is All You Need

(github.com)

2 points | by amthorn 11 hours ago

1 comment

amthorn 11 hours ago

This demo uses standard transformer weights with a very small attention/KV component, but most temporal memory is handled by a stateful operator rather than a growing context window.
Outputs are similar to a transformer's, and it runs very fast on CPU with much lower memory use.
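
To make the memory argument concrete, here is a minimal toy sketch of the general idea, not the Jam Storyteller code. The class names `GrowingKVCache` and `StatefulOperator`, the `decay` parameter, and the exponential-moving-average update are all assumptions chosen for illustration; the actual operator in the demo may look quite different. The sketch only contrasts a memory that stores one entry per token with a memory that keeps a single fixed-size state.

```python
# Hypothetical illustration only; not the Jam Storyteller implementation.


class GrowingKVCache:
    """Transformer-style memory: keeps one entry for every token seen."""

    def __init__(self):
        self.entries = []

    def step(self, x):
        # Append the new token, then read back a uniform average over
        # the whole history (a stand-in for attention over the cache).
        self.entries.append(x)
        return sum(self.entries) / len(self.entries)

    def memory_size(self):
        return len(self.entries)


class StatefulOperator:
    """Recurrent-style memory: one fixed-size state, updated in place."""

    def __init__(self, decay=0.9):
        self.decay = decay  # assumed hyperparameter, for illustration
        self.state = 0.0

    def step(self, x):
        # Exponential moving average: old context fades, and nothing
        # is appended, so storage stays constant.
        self.state = self.decay * self.state + (1.0 - self.decay) * x
        return self.state

    def memory_size(self):
        return 1


def run(memory, tokens):
    """Feed every token through the memory and report its final size."""
    for t in tokens:
        memory.step(t)
    return memory.memory_size()


if __name__ == "__main__":
    tokens = [1.0] * 1000
    print("KV cache entries:", run(GrowingKVCache(), tokens))
    print("Stateful operator entries:", run(StatefulOperator(), tokens))
```

In this toy version the cache grows linearly with the number of tokens while the stateful operator stays at a single value, which is the kind of trade-off the comment describes: bounded memory in exchange for a lossy summary of the past.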