year: 2019
paper: https://arxiv.org/abs/1910.04098
website: https://louiskirsch.com/metagenrl
code:
connections: meta learning, RL
Meta-learning the objective function:
Generalizes to environments that are quite different from the training distribution.
In some cases outperforms human-engineered algorithms such as PPO.
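
Toy sketch of what "meta-learning the objective function" means (not the paper's implementation, just the two-loop structure). Assumptions: the "policy" is a single scalar `theta`, the learned objective is a quadratic with parameters `alpha`, and an analytic `true_return` stands in for the signal that, as far as I understand, MetaGenRL gets from a learned critic. All names and constants here are made up for illustration.

```python
import numpy as np


def true_return(theta):
    # Hypothetical stand-in for the environment return; best policy is theta = 3.
    return -(theta - 3.0) ** 2


def learned_loss_grad(theta, alpha):
    # Gradient w.r.t. theta of the learned objective
    # L_alpha(theta) = alpha[0] * theta**2 + alpha[1] * theta.
    return 2.0 * alpha[0] * theta + alpha[1]


def inner_update(theta, alpha, inner_lr=0.1):
    # Inner loop: the agent improves itself using ONLY the learned objective.
    return theta - inner_lr * learned_loss_grad(theta, alpha)


def meta_train(steps=2000, inner_lr=0.1, meta_lr=0.5, seed=0):
    # Outer loop: adjust alpha so that one inner update raises the true return.
    rng = np.random.default_rng(seed)
    alpha = np.zeros(2)
    for _ in range(steps):
        theta = rng.uniform(-5.0, 5.0)  # a fresh, randomly initialised agent
        theta_new = inner_update(theta, alpha, inner_lr)
        # Chain rule through the inner update:
        # dJ/dalpha = dJ/dtheta_new * dtheta_new/dalpha
        dj_dtheta_new = -2.0 * (theta_new - 3.0)
        dtheta_new_dalpha = np.array([-inner_lr * 2.0 * theta, -inner_lr])
        alpha += meta_lr * dj_dtheta_new * dtheta_new_dalpha  # gradient ascent
    return alpha


if __name__ == "__main__":
    alpha = meta_train()
    theta = -4.0
    theta_new = inner_update(theta, alpha)
    print("learned objective parameters:", alpha)
    print("return before / after one update:",
          true_return(theta), true_return(theta_new))
```

The point of the split: the agent never sees `true_return` at update time, only the learned objective, so the same `alpha` can in principle be reused to train new agents.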