In-context learning (ICL) occurs when models adapt their behavior based on examples provided in the prompt, without weight updates. One-shot and few-shot prompting are forms of ICL, while zero-shot uses only instructions without examples.
ICL differs from few-shot learning as a training paradigm - ICL is an emergent capability where models learn patterns from prompt examples, while few-shot learning explicitly trains models to learn from limited examples.
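To make the distinction concrete, a minimal sketch of the prompts involved (the sentiment task and labels are illustrative, not from any source):

```python
# Zero-shot: instruction only, no examples in the prompt.
zero_shot = (
    "Classify the sentiment of the review as positive or negative.\n"
    "Review: The plot dragged on forever.\n"
    "Sentiment:"
)

# Few-shot (ICL): same instruction plus in-prompt examples.
# The model picks up the input->label pattern at inference time,
# with no weight updates.
few_shot = (
    "Classify the sentiment of each review as positive or negative.\n"
    "Review: Loved every minute of it.\n"
    "Sentiment: positive\n"
    "Review: Total waste of money.\n"
    "Sentiment: negative\n"
    "Review: The plot dragged on forever.\n"
    "Sentiment:"
)
```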
Training a network to be better at in-context learning is essentially meta-learning - learning to learn.
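A minimal sketch of that idea, assuming PyTorch; the task distribution (random-slope linear regression) and all sizes are illustrative. The outer loop updates the weights (meta-learning), while each individual task must be picked up purely from the in-context examples via the hidden state:

```python
import torch
import torch.nn as nn

# Meta RNN: the weights are the meta variables (updated by the outer loop);
# the hidden state is where per-task "learning" happens, with no gradient steps.
rnn = nn.GRU(input_size=2, hidden_size=32, batch_first=True)
readout = nn.Linear(32, 1)
opt = torch.optim.Adam([*rnn.parameters(), *readout.parameters()], lr=1e-3)

for step in range(1000):                # outer loop: meta-training
    a = torch.randn(16, 1, 1)           # a fresh task (slope) per sequence
    x = torch.randn(16, 10, 1)          # 10 in-context examples per task
    y = a * x
    # Feed (x_t, y_{t-1}) pairs; the RNN must infer the slope from context.
    prev_y = torch.cat([torch.zeros(16, 1, 1), y[:, :-1]], dim=1)
    inp = torch.cat([x, prev_y], dim=-1)
    h, _ = rnn(inp)
    pred = readout(h)
    loss = ((pred - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```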
![[Language-Models-are-Few-Shot-Learners#^260d49]]
Different approaches:
The variable ratio problem:
Meta RNNs are simple, but they have many more meta variables than learned variables, making them overparametrized and prone to overfitting (see the rough count after this list).
Learned learning rules / Fast Weight Networks have far fewer meta variables than learned variables, but introduce a lot of complexity in the meta-learning network etc.
→ A variable-sharing and sparsity principle can be used to unify these approaches into a simple framework: VSML.
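A rough count to make the ratio concrete for a plain meta RNN (sizes are illustrative): the meta variables are the recurrent and input weights, while the only per-task learned variables are the activations carried in the hidden state.

```python
# Rough count of meta vs. learned variables for a plain meta RNN.
# Sizes are illustrative; the point is the O(H^2) vs. O(H) ratio.
H, D_in = 256, 16

meta_vars = H * H + H * D_in + H   # recurrent + input weights + bias (meta-learned)
learned_vars = H                   # hidden state, updated in-context ("learned")

print(f"meta variables:    {meta_vars}")     # 69888
print(f"learned variables: {learned_vars}")  # 256
print(f"ratio:             {meta_vars / learned_vars:.0f}x")  # ~273x
```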
Related papers:
Using Fast Weights to Attend to the Recent Past
HyperNetworks