year: 2017/12
paper: https://arxiv.org/abs/1712.06564
website:
code:
connections: OpenAI-ES, SGD, Kenneth O. Stanley
If you have access to gradients or good proxy gradients, i.e. you can do supervised learning, ES's extra computational cost isn't worth it. But in RL, gradients are noisy anyway, and ES can reduce the noise in its own gradient estimate simply by averaging over more offspring, which parallelizes well. ES becomes competitive once domain noise degrades SGD's gradient quality enough that ES's approximation isn't much worse. A minimal sketch of this variance argument is below.
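A minimal sketch of an antithetic ES gradient estimator in the OpenAI-ES style, just to make the "more offspring → less noise" point concrete. The function names, the toy noisy reward, and all hyperparameters here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def es_gradient(f, theta, sigma=0.1, n_offspring=100, rng=None):
    """Estimate the gradient of E_eps[f(theta + sigma * eps)] via antithetic sampling.

    Each offspring pair (theta + sigma*eps, theta - sigma*eps) gives one noisy
    estimate of the smoothed gradient; averaging over n_offspring pairs shrinks
    the estimator's variance, which is the knob the summary above refers to.
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal((n_offspring, theta.size))
    diffs = np.array([f(theta + sigma * e) - f(theta - sigma * e) for e in eps])
    return (diffs[:, None] * eps).sum(axis=0) / (2.0 * sigma * n_offspring)

# Toy check (hypothetical reward): a quadratic with additive evaluation noise,
# standing in for a noisy RL return. More offspring -> lower spread of the estimate.
def noisy_reward(x):
    return -np.sum(x ** 2) + np.random.normal(scale=0.5)

theta = np.ones(10)
for n in (10, 100, 1000):
    grads = np.stack([es_gradient(noisy_reward, theta, n_offspring=n) for _ in range(20)])
    print(f"offspring={n:5d}  mean per-dim std of gradient estimate: {grads.std(axis=0).mean():.4f}")
```

Running it should show the per-dimension standard deviation of the gradient estimate dropping roughly as 1/sqrt(n_offspring), and each extra offspring is an independent rollout, so the averaging is embarrassingly parallel.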