year: 2023
paper: https://arxiv.org/pdf/2309.16588.pdf
website: https://x.com/TimDarcet/status/1707769575981424866?s=20
code:
connections: memory token, ViT


Fits the “LLMs need tokens to think” worldview. Chain of thought might, in some cases, help only by serving as an additional source of registers, rather than anything else.

No idea why I can't access the oral https://iclr.cc/virtual/2024/oral/19794; it was a really great talk, but I didn't take proper notes…

https://github.com/lucidrains/vit-pytorch/commit/f7d59cecb5a368ff910ce93bf0c70daff8378ca7
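The vit-pytorch commit above adds register tokens to the ViT. The mechanism itself is just token bookkeeping: append a few learnable tokens (shared across the batch) to the patch tokens before the transformer blocks, then discard them afterward so only patch tokens (and the CLS token, if any) feed the head. A minimal numpy sketch of that bookkeeping, with function names of my own choosing and a zero-initialized placeholder standing in for what would be a learnable `nn.Parameter`:

```python
import numpy as np

def add_registers(patch_tokens, registers):
    # patch_tokens: (B, N, D); registers: (R, D), learnable and shared across the batch
    B = patch_tokens.shape[0]
    reg = np.broadcast_to(registers, (B,) + registers.shape)
    # registers are appended after the patch tokens and attend like any other token
    return np.concatenate([patch_tokens, reg], axis=1)

def drop_registers(tokens, num_registers):
    # after the last transformer block, registers are simply thrown away;
    # only the remaining tokens go to the head / feature extraction
    return tokens[:, :-num_registers, :]

B, N, D, R = 2, 196, 64, 4          # batch, patches, dim, register count
x = np.random.randn(B, N, D)        # patch embeddings (post patch-embed + pos-emb)
regs = np.zeros((R, D))             # placeholder; a real ViT learns these
y = add_registers(x, regs)
assert y.shape == (B, N + R, D)     # transformer blocks would run on y here
out = drop_registers(y, R)
assert out.shape == (B, N, D)
```

The point of the paper is that these extra tokens give the network somewhere to stash global scratch computation, so it stops hijacking low-information patch tokens (the high-norm artifact tokens) for that purpose.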