The experiments compared DT with several model-free RL baselines and showed that:
- DT is more data-efficient than behavior cloning in the low-data regime;
- DT can model the distribution of returns very well;
- Having a long context is crucial for obtaining good results;
- DT remains effective with sparse rewards.
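These findings all stem from DT's return-conditioned sequence modeling: the model sees returns-to-go (the sum of future rewards from each timestep) over a context window of the last K timesteps. The sketch below is illustrative only, assuming hypothetical helper names (`returns_to_go`, `build_context`); it is not the paper's implementation.

```python
def returns_to_go(rewards):
    """Suffix sums of the reward sequence: R_t = sum of r_{t'} for t' >= t."""
    rtg, total = [], 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    return rtg[::-1]

def build_context(states, actions, rtg, K):
    """Keep only the most recent K timesteps as model input (the context window)."""
    return states[-K:], actions[-K:], rtg[-K:]

# Even with sparse rewards, returns-to-go give a dense conditioning signal:
rewards = [0.0, 0.0, 1.0, 0.0, 2.0]
print(returns_to_go(rewards))  # [3.0, 3.0, 3.0, 2.0, 2.0]
```

The context-length ablation in the experiments corresponds to varying K in `build_context`: with K = 1 the model degenerates to a Markovian policy, which is where performance drops.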