year: 2022
paper: arxiv
website:
code:
connections: in-context learning, bayesian inference, pretraining
TLDR
Proposes that in-context learning is implicit Bayesian inference: during pretraining, each document is generated by a latent concept, and next-token prediction trains the model to infer a posterior over these latent concepts from context. At inference time, the prompt examples narrow down which concept is active, and the model predicts accordingly. ICL thus emerges as a natural consequence of the pretraining objective, not a separate learned skill.
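
The mechanism can be sketched with a toy mixture model (my own illustration, not the paper's code): each latent concept is a distribution over tokens, a prompt is scored under every concept, and prediction marginalizes over the resulting posterior. The concept/vocabulary sizes and the i.i.d.-token assumption are simplifications for the sketch; the paper uses HMM-generated documents.

```python
import numpy as np

# Toy sketch: pretraining data is a mixture over latent concepts,
# where each concept defines a distribution over tokens.
rng = np.random.default_rng(0)
n_concepts, vocab = 3, 5
concept_token_probs = rng.dirichlet(np.ones(vocab), size=n_concepts)
prior = np.ones(n_concepts) / n_concepts  # uniform prior over concepts

def posterior_over_concepts(prompt_tokens):
    """P(concept | prompt) via Bayes' rule, assuming i.i.d. tokens given the concept."""
    log_lik = np.sum(np.log(concept_token_probs[:, prompt_tokens]), axis=1)
    log_post = np.log(prior) + log_lik
    log_post -= log_post.max()          # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

def predict_next_token(prompt_tokens):
    """Marginal next-token distribution: sum_c P(token | c) * P(c | prompt)."""
    return posterior_over_concepts(prompt_tokens) @ concept_token_probs

# More prompt examples from one concept sharpen the posterior on it,
# which is the paper's account of why ICL improves with more demonstrations.
true_concept = 1
short = rng.choice(vocab, size=2, p=concept_token_probs[true_concept])
long_ = rng.choice(vocab, size=20, p=concept_token_probs[true_concept])
print(posterior_over_concepts(short))
print(posterior_over_concepts(long_))
```

A transformer trained on next-token prediction over such a mixture is pushed toward computing exactly this marginal, which is why no separate ICL objective is needed.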