Occasional writing on my projects and ideas.
Memory and Implicit Communication Emergent multi-agent signaling with spatial memory
Online Transformer RL Training a transformer with PPO to solve POMDPs
Vanilla Actor Critic Implementing an actor-critic from scratch in JAX