
SGD

Aug 01, 2025

Stochastic gradient descent (SGD) can be regarded as a stochastic approximation of gradient descent optimization: it replaces the actual gradient, computed from the entire data set, with an estimate computed from a randomly selected subset of the data.
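A minimal sketch of this idea, assuming a least-squares objective; the function name, hyperparameters, and toy data below are illustrative, not taken from any particular library:

```python
import numpy as np

def sgd(X, y, lr=0.01, batch_size=32, epochs=10, seed=0):
    """Minimise mean squared error 0.5 * ||Xw - y||^2 / n with mini-batch SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # Shuffle once per epoch, then walk through mini-batches.
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            # Gradient estimated from the mini-batch, not the full data set.
            grad = Xb.T @ (Xb @ w - yb) / len(batch)
            w -= lr * grad
    return w

# Toy usage: recover known weights from noisy data.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=1000)
print(sgd(X, y))  # should be close to [2.0, -1.0, 0.5]
```

Each mini-batch gradient is an unbiased but noisy estimate of the full-data gradient, which is what makes the procedure a stochastic approximation of batch gradient descent.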


optimizers
optimization



Backlinks

  • Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning
  • REINFORCE
  • batch gradient descent
  • continual backpropagation
  • mini-batch gradient descent
  • momentum
