Information-theoretically: KL-divergence
the difference between optimal return and that achieved by the agent
more generally: how much worse you did than a chosen comparator (best fixed action, best policy, …)
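The comparator-based reading can be made concrete with a small sketch. It assumes the comparator is the best fixed action in hindsight and that rewards are available as a full table; the function name, the reward table, and the action sequence are all hypothetical and chosen only for illustration. The quantity computed is what the bandit literature usually calls regret, which is an assumption about what these notes are defining.

```python
from typing import Sequence


def shortfall_vs_best_fixed_action(
    rewards: Sequence[Sequence[float]], chosen: Sequence[int]
) -> float:
    """Return (best fixed action's total reward) - (agent's total reward).

    rewards[t][a] is the reward action a would have earned in round t
    (hypothetical full-information table); chosen[t] is the action the
    agent actually took in round t.
    """
    if len(rewards) != len(chosen):
        raise ValueError("rewards and chosen must cover the same rounds")
    if not rewards:
        return 0.0

    n_actions = len(rewards[0])

    # Total reward the agent actually collected.
    achieved = sum(rewards[t][a] for t, a in enumerate(chosen))

    # Total reward of the single action that would have done best in hindsight.
    best_fixed = max(sum(row[a] for row in rewards) for a in range(n_actions))

    return best_fixed - achieved


# Illustrative data: two actions, four rounds.
rewards = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 0.0],
    [1.0, 0.0],
]
chosen = [1, 1, 0, 1]

print(shortfall_vs_best_fixed_action(rewards, chosen))  # 1.0
```

Here always playing action 0 would have earned 3.0, while the agent earned 2.0, so the shortfall is 1.0. Swapping in a different comparator (for example, the best policy rather than the best fixed action) changes only how `best_fixed` is computed.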