decision transformer

https://lilianweng.github.io/posts/2023-01-27-the-transformer-family-v2/#transformers-for-reinforcement-learning

The experiments compared Decision Transformer (DT) with several model-free RL baselines and showed that:

  • DT is more efficient than behavior cloning in the low-data regime;
  • DT can model the distribution of returns very well;
  • Having a long context is crucial for obtaining good results;
  • DT can work with sparse rewards.
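
The points about return modeling, context length, and sparse rewards are easier to picture with DT's input format in mind. This note does not spell that format out; as described in the Decision Transformer paper, each timestep is conditioned on the return-to-go (the sum of rewards from that step to the end of the episode), and the model sees only the last K timesteps. The sketch below is a minimal illustration under those assumptions, not the authors' implementation. The helper names `returns_to_go` and `context_window` and the toy episode are hypothetical.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Return-to-go at step t is the sum of rewards from t to the episode end."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates all later rewards.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def context_window(
    rtg: Sequence[float],
    states: Sequence[Any],
    actions: Sequence[Any],
    t: int,
    context_len: int,
) -> List[Tuple[float, Any, Any]]:
    """Last `context_len` (return-to-go, state, action) triples ending at step t."""
    start = max(0, t - context_len + 1)
    return [(rtg[i], states[i], actions[i]) for i in range(start, t + 1)]


# Toy sparse-reward episode (hypothetical): the only reward arrives at the end.
rewards = [0.0, 0.0, 0.0, 1.0]
states = ["s0", "s1", "s2", "s3"]
actions = ["a0", "a1", "a2", "a3"]

rtg = returns_to_go(rewards)
window = context_window(rtg, states, actions, t=3, context_len=2)
```

In this toy episode every step carries the same return-to-go even though the reward is only observed at the end, so the conditioning signal is present at every timestep. `context_len` stands in for the context length K that the third bullet refers to.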

off-policy
transformers in RL

