Go through this in ~December

CMA-ES

CMA-ES stretches the search distribution along the direction of the elite (covariance adaptation) and widens or narrows it depending on whether the best solutions are far away or close by (step-size adaptation), in addition to moving the mean of the distribution towards the elite.
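A minimal sketch of that idea, not the full algorithm: it keeps only the mean update and a covariance estimate built from the elite's steps away from the old mean, and it leaves out CMA-ES's evolution paths, recombination weights, learning rates and separate step-size control. The function name `simplified_es_step`, the population sizes and the test objective are assumptions made for illustration.

```python
import numpy as np

def simplified_es_step(f, mean, C, pop_size, n_elite, rng):
    """One generation of a stripped-down, CMA-ES-flavoured update (minimisation)."""
    # Sample a population from the current search distribution N(mean, C).
    pop = rng.multivariate_normal(mean, C, size=pop_size)
    # Rank by objective value and keep the best n_elite samples.
    fitness = np.array([f(x) for x in pop])
    elite = pop[np.argsort(fitness)[:n_elite]]
    # Steps are measured from the OLD mean: if the elite lies far away in one
    # direction, the distribution stretches along it; if the elite is close
    # by, the distribution shrinks.
    steps = elite - mean
    new_C = steps.T @ steps / n_elite
    # Tiny jitter keeps the matrix numerically positive definite.
    new_C += 1e-12 * np.eye(len(mean))
    # Move the mean towards the elite.
    new_mean = elite.mean(axis=0)
    return new_mean, new_C

def shifted_sphere(x):
    # Hypothetical test objective with its minimum at (3, 3).
    return float(np.sum((x - 3.0) ** 2))

rng = np.random.default_rng(0)
mean, C = np.zeros(2), np.eye(2)
for _ in range(30):
    mean, C = simplified_es_step(shifted_sphere, mean, C, 50, 10, rng)
```

Estimating the covariance around the old mean rather than the new one is what lets the distribution grow along a promising direction instead of collapsing onto the current elite.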

Line 1: affine property of the Gaussian
Line 2: $C^{1/2}$ is the matrix square root of $C$
Line 3: eigendecomposition of $C$
Line 4: since $B$ is an orthogonal matrix and rotations of a standard normal distribution are still standard normal (rotating a sphere).
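The derivation these four lines annotate is missing from the notes. It is most likely the standard sampling identity below. The symbols are assumed to follow the usual convention: mean $m$, step size $\sigma$, covariance $C = B D^2 B^\top$ with $B$ orthogonal and $D$ diagonal.

$$
\mathcal{N}(m, \sigma^2 C) \sim m + \sigma\,\mathcal{N}(0, C) \sim m + \sigma\, C^{1/2}\,\mathcal{N}(0, I) \sim m + \sigma\, B D B^\top \mathcal{N}(0, I) \sim m + \sigma\, B D\,\mathcal{N}(0, I)
$$

A small NumPy sketch of the last form, with the hypothetical helper name `sample_gaussian`, checks that it reproduces the intended mean and covariance:

```python
import numpy as np

def sample_gaussian(mean, sigma, C, n_samples, rng):
    """Draw samples from N(mean, sigma^2 C) as mean + sigma * B D z."""
    # Eigendecomposition of the symmetric matrix C = B diag(eigvals) B^T.
    eigvals, B = np.linalg.eigh(C)
    # D holds the square roots of the eigenvalues (the axis lengths).
    D = np.sqrt(np.maximum(eigvals, 0.0))
    # z ~ N(0, I): one standard normal vector per row.
    z = rng.standard_normal((n_samples, len(mean)))
    # Row-wise version of B D z: scale each coordinate by D, then rotate by B.
    return mean + sigma * (z * D) @ B.T

rng = np.random.default_rng(0)
mean = np.array([1.0, -2.0])
C = np.array([[2.0, 0.8], [0.8, 1.0]])
samples = sample_gaussian(mean, 0.5, C, 200_000, rng)
empirical_cov = np.cov(samples, rowvar=False)  # should approach 0.25 * C
```

The $B^\top$ factor never appears in the code, which is exactly what Line 4 justifies.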

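Since storing and updating the covariance matrix scales as $O(n^2)$ in the number of parameters $n$, CMA-ES starts becoming impractical for more than ~10k parameters. Cheaper approximations exist, for example the limited-memory LM-MA-ES or the diagonal-covariance sep-CMA-ES.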
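To get a feel for the scaling, here is a rough sketch comparing the storage of a full covariance matrix with a diagonal one, plus the $O(n)$ sampling step that a diagonal covariance allows. The helper names `covariance_bytes` and `sample_diagonal` are made up for illustration, and the sketch covers only sampling, not the actual sep-CMA-ES update rules.

```python
import numpy as np

def covariance_bytes(n, diagonal=False):
    """Bytes needed to store a float64 covariance for n parameters."""
    entries = n if diagonal else n * n
    return entries * np.dtype(np.float64).itemsize

def sample_diagonal(mean, sigma, diag_var, rng):
    """Sample from N(mean, sigma^2 diag(diag_var)) in O(n) time and memory."""
    return mean + sigma * np.sqrt(diag_var) * rng.standard_normal(mean.shape)

n = 10_000
full_mb = covariance_bytes(n) / 1e6                 # 800.0 MB for the full matrix
diag_mb = covariance_bytes(n, diagonal=True) / 1e6  # 0.08 MB for the diagonal

rng = np.random.default_rng(0)
x = sample_diagonal(np.zeros(n), 0.3, np.ones(n), rng)
```

Beyond storage, the full version also needs an eigendecomposition for sampling, which costs $O(n^3)$ each time it is recomputed.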

CMA-ES = Covariance Matrix Adaptation Evolution Strategy

CMA-ES YouTube series + blog:
https://szhaovas.github.io/2023-02-06-cmaesall/
https://szhaovas.github.io/2022-09-06-cmaes/
https://szhaovas.github.io/2022-09-07-cmaes2/
https://szhaovas.github.io/2022-09-09-cmaes3/

https://inria.hal.science/hal-00808450v1/document