
SGD

Aug 01, 2025

Stochastic gradient descent (SGD) can be regarded as a stochastic approximation of gradient descent optimization: it replaces the actual gradient, computed from the entire data set, with an estimate computed from a randomly selected subset of the data.
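A minimal sketch of this idea, assuming a least-squares objective; the function name, hyperparameters, and toy data below are illustrative, not taken from any particular library:

```python
import numpy as np

def sgd(X, y, lr=0.01, batch_size=32, epochs=10, seed=0):
    """Minimise mean squared error 0.5 * ||Xw - y||^2 / n with mini-batch SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # Shuffle once per epoch, then walk through mini-batches.
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            # Gradient estimated from the mini-batch, not the full data set.
            grad = Xb.T @ (Xb @ w - yb) / len(batch)
            w -= lr * grad
    return w

# Toy usage: recover known weights from noisy data.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=1000)
print(sgd(X, y))  # should be close to [2.0, -1.0, 0.5]
```

Each mini-batch gradient is an unbiased but noisy estimate of the full-data gradient, which is what makes the procedure a stochastic approximation of batch gradient descent.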


optimizers
optimization



Backlinks

  • Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning
  • REINFORCE
  • batch gradient descent
  • continual backpropagation
  • mini-batch gradient descent
  • momentum
