year: 2025/09
paper: http://arxiv.org/abs/2509.23745
website: https://generalist-locomotion.github.io/
code: https://github.com/lucidrains/locoformer?tab=readme-ov-file
connections: deepak pathak, robotic control
The gist: they trained a simple Transformer-XL in simulation on robots with many different bodies (cross-embodiment). When transferred to the real world, the robot gains the ability to adapt to insults. It was important that the XL memories extend beyond individual episodes, so that attention can stitch together experience across different bodies within the same context.
→ adaptation to OOD through domain randomization + ICL
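A minimal numpy sketch of the XL-style memory mechanism as I understand it from the note — the class and names are my own, not from the paper. The key detail is that the cache is *not* reset at episode boundaries:

```python
import numpy as np

class XLMemory:
    """Transformer-XL style rolling cache of past hidden states.
    Crucially, it is NOT reset at episode boundaries, so attention
    can span experience from multiple episodes/bodies."""

    def __init__(self, mem_len: int, d_model: int):
        self.mem_len = mem_len
        self.mem = np.zeros((0, d_model))

    def attend_context(self, current: np.ndarray) -> np.ndarray:
        # keys/values for attention: cached memory + current segment
        return np.concatenate([self.mem, current], axis=0)

    def update(self, hidden: np.ndarray) -> None:
        # append the new segment's hidden states, keep the last mem_len
        self.mem = np.concatenate([self.mem, hidden], axis=0)[-self.mem_len:]


# two "episodes", hypothetically with different bodies
mem = XLMemory(mem_len=4, d_model=2)
ep1 = np.ones((3, 2))           # episode 1 hidden states
ctx1 = mem.attend_context(ep1)  # no memory yet: 3 timesteps of context
mem.update(ep1)

ep2 = 2 * np.ones((3, 2))       # episode 2: memory persists across the boundary
ctx2 = mem.attend_context(ep2)  # 3 cached + 3 current = 6 timesteps
print(ctx1.shape[0], ctx2.shape[0])  # → 3 6
```

Because `ctx2` contains episode-1 states, in-context adaptation can draw on what happened in the previous body.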
My prediction about how they translate actions into a variety of robot actuators was wrong! (I thought they were most likely using something very similar to "one policy to control them all".)
Interesting that their simple scheme works: a joint action space spanning all actuator types, with only the dimensions legal for the current morphology actually used.
Chatty guesses the size of this action vector is on the order of O(10), probably 32. Also much smaller than I would have guessed!
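A sketch of that joint-action-space idea, assuming the guessed size of 32 — the morphology names and mask slices here are entirely hypothetical, just to show the mechanism:

```python
import numpy as np

ACTION_DIM = 32  # guessed size from the note above, not confirmed by the paper

# hypothetical per-morphology index masks into the joint action space
LEGAL_DIMS = {
    "quadruped_12dof": np.arange(0, 12),   # e.g. 12 joint targets
    "biped_10dof":     np.arange(12, 22),  # a different slice of the space
}

def decode_action(raw_action: np.ndarray, morphology: str) -> np.ndarray:
    """The policy always outputs the full joint action vector;
    only the dims legal for the current body are sent to actuators."""
    assert raw_action.shape == (ACTION_DIM,)
    return raw_action[LEGAL_DIMS[morphology]]

a = np.random.randn(ACTION_DIM)
print(decode_action(a, "quadruped_12dof").shape)  # → (12,)
print(decode_action(a, "biped_10dof").shape)      # → (10,)
```

The appeal is that the network architecture never changes across bodies; morphology only changes which output dimensions are read off.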