It's surprising that current AI/LLM agents are agents that don't develop, but instead imitate the whole of human intellectual behavior, and then try to distill the core of intelligence from that imitation.

Baby humans DON'T learn by imitation. They are not told how to move their hands, eyes, and muscles; there is no target for their actions. So imitation learning, learning from being told what to do, happens much later in our lives (but arguably still too early): school.
Instead, they learn from exploration, seeking out novel and interesting things (novelty, curiosity, exploration).

Supervised learning (SL) doesn't happen in nature.
There are no ground-truth targets. There is only: I predicted X, Y happened, A was my action, and R were the consequences. (RL)
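
A minimal sketch of that contrast (the field names are illustrative, not from any particular library): supervised learning needs someone to hand over the correct answer, while raw experience only records what was predicted, what was done, and what followed.

```python
# Sketch: the two kinds of learning signal. Names are illustrative only.
from dataclasses import dataclass
from typing import Any


@dataclass
class SupervisedExample:
    """What SL needs: an input paired with a ground-truth target."""
    x: Any          # observation
    target: Any     # the "correct" answer, supplied by someone who already knows it


@dataclass
class Experience:
    """What nature provides: no target, only prediction, action, and consequence."""
    observation: Any   # X: what I saw
    prediction: Any    # what I expected to happen
    action: Any        # A: what I did
    outcome: Any       # Y: what actually happened
    reward: float      # R: the consequence I felt


if __name__ == "__main__":
    # SL: someone must hand us the label "cat".
    labeled = SupervisedExample(x="pixels of a cat", target="cat")

    # Experience: no label anywhere, only what happened.
    exp = Experience(
        observation="ball rolling toward me",
        prediction="it will stop before it reaches me",
        action="stay still",
        outcome="it hit my foot",
        reward=-1.0,
    )
    print(labeled)
    print(exp)
```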

Social and cultural learning builds on that BASIS; it is what compounds, and it is what allowed humans to stand out over time.

In contrast to human learning, we could have many instances explore the space and take their pooled experiences as starting points, without having to repeat the exploration process from scratch each time! (death)
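
The idea can be made concrete with a toy sketch (everything here, the random-walk environment included, is made up for illustration): several explorer instances dump their trajectories into one shared buffer, and anything that starts afterwards inherits that buffer instead of exploring from zero.

```python
# Sketch: pooled exploration across instances. Toy environment, made-up names.
import random


def explore(instance_id: int, n_steps: int, seed: int) -> list[tuple]:
    """One instance wanders randomly and records (instance, state, action, reward)."""
    rng = random.Random(seed)
    state = 0
    trajectory = []
    for _ in range(n_steps):
        action = rng.choice([-1, +1])
        state += action
        reward = 1.0 if state == 5 else 0.0  # toy goal: reach state 5
        trajectory.append((instance_id, state, action, reward))
    return trajectory


if __name__ == "__main__":
    shared_buffer: list[tuple] = []

    # Many instances explore (here sequentially, for simplicity)...
    for i in range(10):
        shared_buffer.extend(explore(instance_id=i, n_steps=100, seed=i))

    # ...and a fresh instance inherits all of it instead of starting from zero.
    rewarding = [t for t in shared_buffer if t[3] > 0]
    print(f"{len(shared_buffer)} experiences pooled, "
          f"{len(rewarding)} of them already reached the goal.")
```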

Gradient descent finds solutions to the problems it has seen.
But it doesn't fundamentally lead to good generalization. It will solve a class of problems, but the underlying representations it learns are not good.
Learning new things may catastrophically interfere with the old (catastrophic forgetting).
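
A toy illustration of that failure mode, under the assumption of the simplest possible model (a single shared scalar weight trained with plain SGD): fitting task B overwrites the solution to task A, and the loss on task A shoots back up.

```python
# Sketch: catastrophic forgetting with plain gradient descent on one shared weight.
import numpy as np

rng = np.random.default_rng(0)


def make_task(slope: float, n: int = 100) -> tuple[np.ndarray, np.ndarray]:
    """Toy regression task: y = slope * x plus a little noise."""
    x = rng.uniform(-1, 1, size=n)
    y = slope * x + 0.01 * rng.normal(size=n)
    return x, y


def train(w: float, x: np.ndarray, y: np.ndarray, lr: float = 0.1, steps: int = 500) -> float:
    """Plain gradient descent on squared error; returns the updated weight."""
    for _ in range(steps):
        grad = np.mean(2 * (w * x - y) * x)
        w -= lr * grad
    return w


def loss(w: float, x: np.ndarray, y: np.ndarray) -> float:
    return float(np.mean((w * x - y) ** 2))


if __name__ == "__main__":
    task_a = make_task(slope=2.0)    # task A wants w near 2
    task_b = make_task(slope=-3.0)   # task B wants w near -3

    w = 0.0
    w = train(w, *task_a)
    print(f"after task A: loss_A = {loss(w, *task_a):.4f}")

    w = train(w, *task_b)            # keep training the same weight on task B
    print(f"after task B: loss_A = {loss(w, *task_a):.4f}  (task A forgotten)")
    print(f"              loss_B = {loss(w, *task_b):.4f}")
```

Because both tasks must be solved by the same parameter, the second round of training simply pulls it away from the first solution; real networks have many parameters, but without something that protects or replays old tasks, the same interference shows up.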