year: 2025/05
paper: https://arxiv.org/abs/2505.11581
website: https://x.com/kenneth0stanley/status/1924650124829196370
code: https://github.com/akarshkumar0101/fer
connections: representation, Kenneth O. Stanley, Akarsh Kumar, Jeff Clune, Joel Lehman
Link to original GATO paper showing negative transfer – learning a new game is harder when it is done after learning other games in parallel.
It is a challenge to just not get worse!
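A toy sketch of how that kind of negative transfer can be measured (not the actual GATO setup – architecture, tasks, and hyperparameters below are all made up): train one copy of a network on several tasks interleaved, keep another copy untrained, then give both the same step budget on a brand-new task and compare.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task():
    # One synthetic regression problem stands in for one "game".
    W = torch.randn(16, 8)
    X = torch.randn(512, 16)
    y = X @ W + 0.1 * torch.randn(512, 8)
    return X, y

def make_net():
    return nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 8))

def train(net, X, y, steps, lr=1e-2):
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    losses = []
    for _ in range(steps):
        loss = nn.functional.mse_loss(net(X), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses

# Pre-train one net on several tasks, interleaved ("in parallel");
# the other net stays at its random initialization.
tasks = [make_task() for _ in range(5)]
pretrained, scratch = make_net(), make_net()
for _ in range(10):
    for X, y in tasks:
        train(pretrained, X, y, steps=20)

# Both nets now get the same budget on the same brand-new task.
# Negative transfer = the pre-trained net learning it slower or worse.
X_new, y_new = make_task()
pre = train(pretrained, X_new, y_new, steps=200)
scr = train(scratch, X_new, y_new, steps=200)
print(f"new-task loss  pretrained: {pre[-1]:.4f}  scratch: {scr[-1]:.4f}")
```

Whether a gap actually shows up at this tiny scale is not guaranteed; the point is the comparison protocol, not the numbers.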
This “loss of plasticity” in a network with continued use is a well-known phenomenon in deep learning, and it may even have biological parallels with aging brains at some level – old dogs and new tricks. But humans do manage to keep learning new things throughout life.
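A minimal sketch of how loss of plasticity is typically demonstrated (in the style of continual-learning experiments generally, not taken from this paper): train a single network on a long stream of unrelated tasks with a fixed step budget each, and log the error it reaches per task. Loss of plasticity shows up as later tasks plateauing at higher error than early ones.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.SGD(net.parameters(), lr=1e-2)  # one optimizer, never reset

for task in range(50):
    # Every task is a fresh random regression problem of the same shape.
    W = torch.randn(16, 1)
    X = torch.randn(256, 16)
    y = torch.tanh(X @ W)
    for _ in range(100):  # fixed training budget per task
        loss = nn.functional.mse_loss(net(X), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    if task % 10 == 0:
        print(f"task {task:2d}: loss after budget = {loss.item():.4f}")
```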
It may be necessary to give up some initial learning speed: “learn slow so you can learn fast”.
→ Connection to Questioning Representational Optimism in Deep Learning – The Fractured Entangled Representation Hypothesis? Sacrifice initial learning speed / task-specific performance in favour of building solid representations / building blocks that let you perform well / recombine / adapt to other tasks down the road. Maximize adaptability?
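The actual analysis lives in the fer repo linked above; as a rough, hypothetical sketch of the probing idea: for a CPPN-style network mapping pixel coordinates to intensity, render every hidden unit's activation over the whole coordinate grid and inspect the maps. Clean, reusable patterns would suggest a unified representation; noisy, disjointed ones a fractured entangled one. All names and sizes below are invented for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(2, 16), nn.Tanh(),
                    nn.Linear(16, 16), nn.Tanh(),
                    nn.Linear(16, 1))

# Coordinate grid covering the image plane.
xs = torch.linspace(-1, 1, 64)
grid = torch.cartesian_prod(xs, xs)  # shape (64*64, 2)

# Capture every hidden activation with forward hooks.
acts = {}
def hook(name):
    def fn(module, inp, out):
        acts[name] = out.detach()
    return fn

for i, layer in enumerate(net):
    if isinstance(layer, nn.Tanh):
        layer.register_forward_hook(hook(f"tanh{i}"))

with torch.no_grad():
    net(grid)

# Each column of an activation tensor is one neuron's "image": reshape it
# back to 64x64 and inspect it (e.g. with matplotlib's imshow).
for name, a in acts.items():
    maps = a.T.reshape(-1, 64, 64)  # (n_neurons, 64, 64)
    print(name, maps.shape)
```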