The mean preference is a bad estimate of preferences.

https://www.ethansmith2000.com/post/the-mean-preference-is-a-bad-estimate-of-preferences

Solution (maybe?): mixture of gaussians
Challenges: Granularity tradeoff. (also harder to train / less efficient, maybe?)

this is prlly a related observation – behaviour of llms

LLMs are optmized to minimize risk $E [L (f (x_{i}), y_{i})]$ , which explains a lot in their behavior, kowing that’s a name for this loss
because like they always try the most “average” … well the least risky thing
rather than trying to think out of the box / take creative leaps (ok maybe it’s not that deep in this case and rlly just abt preferences here)
noticed that w/ gemini just now
asked to create a note in my notes style, but it clinged to the style we had previously in the convo, till i nudged it.
kinda interesting
ig the prompt is literally the condition, and ofc even if u have explicit instructions, the weight of the longer priortr interaction in a different style dominates still initially.

Graph View

The mean preference is a bad estimate of preferences.