Projects

PyTorch

Fast, Modern, and Low Precision PyTorch Optimizers

optimi enables accurate low precision training via Kahan summation, supports fully decoupled weight decay, and features fast implementations of modern optimizers.
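As a rough illustration of the Kahan summation idea mentioned above, here is a minimal pure-Python sketch. It is not optimi's implementation: the function names `naive_sum` and `kahan_sum` are hypothetical, and it uses ordinary Python floats rather than low-precision tensors. It only shows how a running compensation term can recover small additions that plain accumulation rounds away.

```python
def naive_sum(values):
    """Plain accumulation: small terms can be lost to rounding."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan (compensated) summation: track the rounding error and feed it back."""
    total = 0.0
    compensation = 0.0  # running estimate of the error lost so far
    for x in values:
        y = x - compensation            # apply the correction to the new term
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total


# Each 1e-16 is too small to change 1.0 on its own in float64,
# so the naive sum never moves, while the compensated sum does.
values = [1.0] + [1e-16] * 100_000
naive = naive_sum(values)
kahan = kahan_sum(values)
print(naive, kahan)
```

The same principle applies when small optimizer updates are added to weights stored in a low-precision format: a separate compensation buffer keeps the part of each update that the weight itself cannot represent.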

fast.ai

fastxtend

Train fastai Models Faster (and Other Useful Tools)

Train fastai models faster with fastxtend's fused optimizers, Progressive Resizing callback, integrated FFCV DataLoader, and integrated PyTorch Compile support.

PyTorch

Commented Transformers

Highly Commented Implementations of Transformers

Highly commented PyTorch implementations of GPT-2 and BERT, along with Bidirectional Attention, Causal Attention, and Causal Cross Attention, written for the Creating a Transformer From Scratch series.

Jul 1, 2023

Creating a Transformer From Scratch

Part One: The Attention Mechanism

You cannot create a Transformer without Attention. In this post, I will show you how to write an Attention layer from scratch in PyTorch. By the end of this post, you will be familiar with all three flavors of Attention: Bidirectional, Causal, and Cross Attention, and should be able to write your own implementation of the Attention mechanism in code.
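To give a sense of what the three flavors have in common, here is a minimal sketch of single-head scaled dot-product attention in plain Python. It is not the code from the post, which uses PyTorch: the `attention` and `softmax` helpers are hypothetical names, sequences are plain lists of lists, and the learned projections and multiple heads are left out. Bidirectional Attention corresponds to `causal=False` with queries, keys, and values from the same sequence; Causal Attention sets `causal=True`; Cross Attention passes keys and values from a different sequence than the queries.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]


def attention(queries, keys, values, causal=False):
    """Single-head scaled dot-product attention on lists of vectors."""
    d = len(keys[0])
    out = []
    for i, q in enumerate(queries):
        # Similarity of this query with every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        if causal:
            # Position i may only attend to positions j <= i.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return out


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bidirectional = attention(x, x, x)
causal = attention(x, x, x, causal=True)
cross = attention(x, [[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]])
```

With the causal mask, the first position can only attend to itself, so its output is its own value vector.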

Jul 28, 2023

Creating a Transformer From Scratch

Part Two: The Rest of the Transformer

In this post, I will show you how to build the rest of the Transformer. By the end of this post, you will be familiar with all the pieces of a Transformer model and, combined with your knowledge of Attention, will be able to write an entire Transformer from scratch.

Jul 14, 2022

Discovering and Debugging a PyTorch Performance Decrease

Subclassed Tensors Reduce GPU Throughput up to Forty Percent

Over the past week, Thomas Capelle and I discovered, debugged, and created a workaround for a performance bug in PyTorch which reduced GPU throughput for image training by up to forty percent when using fastai. The culprit? Subclassed tensors.

Mar 11, 2022

Detecting Cloud Cover Via Sentinel-2 Satellite Data

My Top-10 Percent Solution to DrivenData's On CloudN Competition

In this post, I will give an overview of my solution, explore some of my alternate solutions which didn't perform as well, and briefly cover how to customize fastai to work on a new dataset.