test

Testing

bullet
- nested
not nested

Note

bullet

nested

not nested

Testing
Test

Note

$testing = \nabla λγ$

Note

Line

Anotherone

Note

Line
Anotherone

Outer

Outer

Inner

Inner
Inner

Outer

Should render without errors (works with MathJax, not with KaTeX; writing a plugin for this is more effort than just adapting this manually once MathJax 4.0 comes out):

$$\displaylines{a \\ b}$$

$$\displaylines{c \\ d}$$

Should be centered:
center

Big callout

Stuff

Stuff

Agent $i_{1}$ goes first and picks an action $a^{i_{1}}$, aiming for positive advantage $A_{π}^{i_{1}}(o, a^{i_{1}}) > 0$
Agent $i_{2}$, knowing $a^{i_{1}}$, chooses $a^{i_{2}}$ for positive $A_{π}^{i_{2}}(o, a^{i_{1}}, a^{i_{2}}) > 0$
Agent $i_{3}$, knowing $(a^{i_{1}}, a^{i_{2}})$, …

Stuff

Stuff $i_{1}$ goes first and picks an action $a^{i_{1}}$, aiming for positive advantage $A_{π}^{i_{1}}(o, a^{i_{1}}) > 0$. Agent $i_{2}$, knowing $a^{i_{1}}$, chooses $a^{i_{2}}$ for positive $A_{π}^{i_{2}}(o, a^{i_{1}}, a^{i_{2}}) > 0$. Agent $i_{3}$, knowing $(a^{i_{1}}, a^{i_{2}})$, …

Agent

if you can read this, it's not working as intended

Referencing a header with a link in the title should work
[[test#text]]
Referencing a section with alt text works:
Stuff
Referencing a section without alt text should also work!
^f39253

Sanity check:

$? = \begin{cases} a & \text{if condition} \\ b & \text{if other condition} \end{cases}$

This should work inline too!
$c = \begin{cases} a \\ b \end{cases}$

[[test]]

$↑$ this should display as a link even if there’s a tab in the line above it lol

I have NO idea what’s wrong here:

Hazard Rate

The hazard rate (or failure rate) $h(t)$ represents the instantaneous rate of failure at time $t$, given survival up to that time. For the exponential distribution with density $p(t) = λ e^{-λ t}$, it is:
$h(t) = \frac{p(t)}{P(T > t)} = \frac{λ e^{-λ t}}{e^{-λ t}} = λ$
The constant hazard rate $λ$ is a direct consequence of the memoryless property: the failure rate doesn’t change over time. This means an exponentially distributed component is just as likely to fail in the next instant whether it’s brand new or has been running for years.

The relationship $P(T < t + dt \mid T > t) \approx λ \cdot dt$ (given I’ve lasted time $t$, what’s the probability I’ll fail within the next $dt$?) tells us that for a small time interval $dt$, the probability of failure is approximately $λ$ times the length of that interval, regardless of how long the component has already survived.

$$
\begin{align*}
P(T < t + dt \mid T > t) &= 1 - P(T > t + dt \mid T > t) \\
&= 1 - P(T > dt) \quad \text{(memoryless property)} \\
&= 1 - e^{-\lambda dt}
\end{align*}
$$

When we do a [[taylor expansion]] of the exponential and make a small-$dt$ approximation, we get the hazard rate:

$$
\begin{align*}
P(T < t + dt \mid T > t) &= 1 - \left[1 - \lambda dt + \frac{1}{2}\lambda^{2} dt^{2} - \dots\right] \\
&\approx \lambda dt
\end{align*}
$$

$\implies P(t < T \le t + dt) \approx \lambda P(T > t)\,dt$

What about this? Bruh, this fixes it.. it’s the nested callout latex…

Hazard Rate

The hazard rate (or failure rate) $h(t)$ represents the instantaneous rate of failure at time $t$, given survival up to that time. For the exponential distribution with density $p(t) = λ e^{-λ t}$, it is:
$h(t) = \frac{p(t)}{P(T > t)} = \frac{λ e^{-λ t}}{e^{-λ t}} = λ$
The constant hazard rate $λ$ is a direct consequence of the memoryless property: the failure rate doesn’t change over time. This means an exponentially distributed component is just as likely to fail in the next instant whether it’s brand new or has been running for years.

The relationship $P(T < t + dt \mid T > t) \approx λ \cdot dt$ (given I’ve lasted time $t$, what’s the probability I’ll fail within the next $dt$?) tells us that for a small time interval $dt$, the probability of failure is approximately $λ$ times the length of that interval, regardless of how long the component has already survived.

$P(T < t + dt \mid T > t) = 1 - P(T > t + dt \mid T > t) \overset{\text{memoryless}}{=} 1 - P(T > dt) = 1 - e^{-λ dt}$
When we do a [[taylor expansion]] of the exponential and make a small-$dt$ approximation, we get the hazard rate:
$1 - e^{-λ dt} = 1 - [1 - λ dt + \frac{1}{2} λ^{2} dt^{2} - \dots] \approx λ dt$
$\implies P(t < T \leq t + dt) \approx λ P(T > t)\, dt$
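The small-$dt$ approximation can be checked numerically. The sketch below is only an illustration: the function name `conditional_failure_prob` and the values of `lam`, `t`, and `dt` are assumptions chosen for the example, not part of the note. It computes $P(T < t + dt \mid T > t)$ from the exponential survival function and compares it with $λ \cdot dt$ for several ages $t$.

```python
import math

def conditional_failure_prob(lam: float, t: float, dt: float) -> float:
    """P(T < t + dt | T > t) for T ~ Exponential(lam), via the survival function."""
    survival_t = math.exp(-lam * t)            # P(T > t)
    survival_t_dt = math.exp(-lam * (t + dt))  # P(T > t + dt)
    return 1.0 - survival_t_dt / survival_t

# Illustrative values (assumptions): rate 0.5, interval 1e-3.
lam, dt = 0.5, 1e-3
for t in (0.0, 1.0, 10.0):
    p = conditional_failure_prob(lam, t, dt)
    # The probability should barely depend on t and stay close to lam * dt.
    print(f"t={t:5.1f}  P(fail in next dt)={p:.6e}  lam*dt={lam * dt:.6e}")
```

The printed probability should be essentially the same for every $t$ (memorylessness) and differ from $λ \cdot dt$ only by the second-order term $\frac{1}{2} λ^{2} dt^{2}$.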

This callout should be properly indented (the indentation of that numbered list shouldn’t stop after the numbered list ends, since there’s an empty newline (with `>`)):

Training procedure

Evolution loop (CMA-ES):

1. Start of a generation: Sample a population of parameter vectors $θ$
2. For each individual:
   - Load parameters $θ$
   - Initialize fresh graph: random connections via $P(A_{ij} = 1) \sim \mathcal{N}_{[0, 1]}(μ^{\text{conn}}, σ^{\text{conn}})$
   - Run development phase if enabled ($T_{\text{SA}}$ steps of spontaneous activity)
   - Run multiple episodes, keeping the network graph between episodes
   - Return fitness (average reward over episodes)
3. CMA-ES updates its distribution based on fitnesses
4. Repeat for … generations

Each individual gets its own graph that persists across episodes but not across generations.
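The shape of the evolution loop above can be sketched as follows. This is a hedged toy, not the actual training code: it swaps CMA-ES for a plain Gaussian evolution strategy with truncation selection so that it runs on the standard library alone, and `make_graph`, `run_episode`, `evaluate`, `evolve`, the toy reward, and all default values are hypothetical stand-ins. What it does preserve is the lifetime structure: one fresh graph per individual, reused across that individual's episodes and discarded afterwards.

```python
import random

def make_graph(theta, rng, n_nodes=4):
    """Fresh random adjacency matrix; the connection probability is a toy stand-in."""
    p_conn = min(max(0.5 + 0.1 * theta[0], 0.0), 1.0)
    return [[1 if rng.random() < p_conn else 0 for _ in range(n_nodes)]
            for _ in range(n_nodes)]

def run_episode(theta, graph, rng):
    """Toy episode: flips one edge in place and returns a reward peaking at theta = (1, -1)."""
    i, j = rng.randrange(len(graph)), rng.randrange(len(graph))
    graph[i][j] = 1 - graph[i][j]  # stands in for lifetime structural change
    return -((theta[0] - 1.0) ** 2 + (theta[1] + 1.0) ** 2)

def evaluate(theta, rng, n_episodes=3):
    graph = make_graph(theta, rng)  # fresh graph for this individual
    # The same graph object is reused, so changes persist across episodes.
    rewards = [run_episode(theta, graph, rng) for _ in range(n_episodes)]
    return sum(rewards) / n_episodes  # fitness = average reward over episodes

def evolve(n_generations=30, pop_size=16, sigma=0.3, seed=0):
    rng = random.Random(seed)
    mean = [0.0, 0.0]
    for _ in range(n_generations):
        # Sample a population of parameter vectors around the current mean.
        population = [[m + sigma * rng.gauss(0, 1) for m in mean]
                      for _ in range(pop_size)]
        # Rank by fitness; graphs are not carried over to the next generation.
        scored = sorted(population, key=lambda th: evaluate(th, rng), reverse=True)
        elite = scored[: pop_size // 4]
        # Simplified distribution update: move the mean to the elite average.
        mean = [sum(th[k] for th in elite) / len(elite) for k in range(len(mean))]
    return mean

best = evolve()
print(best)
```

A real CMA-ES run would also adapt the step size and covariance of the sampling distribution; only the mean is updated here to keep the sketch short.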

Information flow (per timestep):
$h^{t + 1} = f_{θ}^{h} (G^{t})$
$e_{ij}^{t + 1} = f_{θ}^{e} (e_{ij}^{t}, h_{i}^{t + 1}, h_{j}^{t + 1}, r^{t})$
$w_{ij} = e_{ij, 0}$
$v^{t + 1} = \tanh(\hat{v}^{t} \cdot w^{t}), \quad \hat{v}_{i} = \begin{cases} o_{i} & \text{if } i \in \text{Input} \\ v_{i} & \text{otherwise} \end{cases}$ (repeat `rnn_iters` times)
Actions = $v_{\text{output\_nodes}}$: argmax (discrete) or raw concatenated activations (continuous)
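The activation update can be illustrated with a minimal sketch. It assumes that `w[i][j]` is the weight of the edge from node `i` to node `j` (the note does not fix the index convention) and uses a hypothetical three-node chain; `propagate` and the toy values are illustrative names, not the original implementation.

```python
import math

def propagate(v, w, obs, input_nodes, rnn_iters=3):
    """Iterate v <- tanh(v_hat . w), clamping input nodes to the observation each step."""
    n = len(v)
    for _ in range(rnn_iters):
        # v_hat_i = o_i on input nodes, v_i everywhere else
        v_hat = [obs[i] if i in input_nodes else v[i] for i in range(n)]
        # v_j^{t+1} = tanh(sum_i v_hat_i * w_ij)  (assumed convention: w[i][j] is edge i -> j)
        v = [math.tanh(sum(v_hat[i] * w[i][j] for i in range(n))) for j in range(n)]
    return v

# Toy graph (assumption): node 0 = input, node 1 = hidden, node 2 = output, chain 0 -> 1 -> 2.
w = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0]]
obs = {0: 0.5}
v = propagate([0.0, 0.0, 0.0], w, obs, input_nodes={0}, rnn_iters=2)

output_nodes = [2]
action = max(output_nodes, key=lambda i: v[i])  # argmax over output nodes (discrete case)
print(v, action)
```

With two iterations the observation reaches the output node through the hidden node, which is why `rnn_iters` has to be at least the path length from inputs to outputs for the action to depend on the observation.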

Lifetime dynamics (no gradient updates):

Structural changes: $P (A_{ij} \leftarrow 1) = f_{θ}^{+} (h_{i}, h_{j})$ , $P (A_{ij} \leftarrow 0) = f_{θ}^{-} (e_{ij})$

Weight changes: via edge state updates $e_{ij}^{t + 1} = f_{θ}^{e} (...)$

Both use evolved rules $θ$ fixed at birth

Core challenge: Discover both structural rules (which connections to form) and learning rules (how to update weights) using only episodic rewards - no supervision on topology or weights.

The comments should be aligned:

Task code length is roughly 200-400+ LOC, with a Gym-env-like structure:

class Env(R2D2Env):
  """[10-30 lines of task description docstring]"""
  def __init__(self):      # 30-80 lines: Initialize environment objects
  def create_box/...():    # 5-10 lines each: Helper methods for creating objects
  def reset():             # 10-20 lines: Reset environment state
  def step():              # 10-30 lines: Update environment dynamics
  def get_task_rewards():  # 10-30 lines: Calculate reward components
  def get_terminated():    # 5-15 lines: Check termination conditions
  def get_success():       # 3-10 lines: Check success conditions

       z (BF = +2)
      / \
     y   T3
    / \
   x   T2
  / \
 T0  T1
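The diagram shows a left-left imbalance, which in an AVL tree is repaired by a single right rotation at `z`. The sketch below assumes the convention balance factor = height(left) - height(right), treats the subtrees `T0` to `T3` as empty, and uses illustrative names (`Node`, `rotate_right`) that are not taken from the note.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def height(node):
    """Number of nodes on the longest path down from node (0 for an empty subtree)."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    """Assumed convention: height(left) - height(right)."""
    return height(node.left) - height(node.right)

def rotate_right(z):
    """Right rotation at z: y becomes the subtree root, T2 moves under z."""
    y = z.left
    z.left = y.right  # T2 is re-attached as the left child of z
    y.right = z
    return y

# Left-left chain from the diagram, with T0..T3 empty (assumption).
x = Node("x")
y = Node("y", left=x)
z = Node("z", left=y)
bf_before = balance_factor(z)

root = rotate_right(z)
print(bf_before, root.key, balance_factor(root))
```

After the rotation `y` is the subtree root with `x` on its left and `z` on its right, and the in-order sequence T0, x, T1, y, T2, z, T3 is unchanged.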
