What the HECK is the difference between various implementations of cross-entropy and transformer padding?
Trying to rewrite the whole guided latent diffusion thing is hard, so instead here are some smushy images of butterflies.
I graduated from Georgia Tech’s Online Master’s in Computer Science and they promoted me straight to the top.
Run some fighting game environments on an EC2 instance, then pull the experience into Google Colab to train an RL agent.
Using MLflow with MinIO and PostgreSQL as storage backends, and Hydra to manage configuration and perform sweeps.
Personal updates: PhD applications and OMSCS worries. Bonus: a puzzle about flipping coins.
I am a sick man who enjoys watching a little lunar module softly touch down on the surface in a simulated environment. See how to do it with PPO.
Somewhere in this blog I mentioned I would talk about baking? Ok, here it is. Look at that bagel, mmmmmm. I got the recipe from GPT-5, which OpenAI has been sitting on since early 2023 and gave me early access to.
Training a ‘spider’ to awkwardly walk to the right using DDPG.
I’ve written before about both the Gumbel-max trick and variational autoencoders. The world demanded a post that combined the two, so here it is.
Read about me fumbling around recreating the Vision Transformer (ViT), one of the components behind DALL-E/CLIP.
Behind this disaster of a title lies the secret to quickly sampling from a categorical distribution in Python!
My attempt at explaining VAEs in an effort to understand new text-to-image systems.
I found the 2021-2022 crypto hype cycle and the subsequent FTX collapse to be fascinating and infuriating. A couple of people wrote books about it; I settled for a blog post.