fast.ai
Train fast.ai models faster with fastxtend's fused optimizers, Progressive Resizing callback, integrated FFCV DataLoader, and integrated PyTorch Compile support.
PyTorch
optimi enables accurate low precision training via Kahan summation, supports fully decoupled weight decay, and features fast implementations of modern optimizers.
PyTorch
Highly commented PyTorch implementations of GPT-2 and BERT, along with Bidirectional Attention, Causal Attention, and Causal Cross Attention, for the Creating a Transformer From Scratch series.
Jul 1, 2023
You cannot create a Transformer without Attention. In this post, I will show you how to write an Attention layer from scratch in PyTorch. By the end of this post, you will be familiar with all three flavors of Attention: Bidirectional, Causal, and Cross Attention, and should be able to write your own implementation of the Attention mechanism in code.
Jul 28, 2023
In this post, I will show you how to build the rest of the Transformer. By the end of this post, you will be familiar with all the pieces of a Transformer model and, combined with your knowledge of Attention, will be able to write an entire Transformer from scratch.
Jul 14, 2022
Over the past week, Thomas Capelle and I discovered, debugged, and created a workaround for a performance bug in PyTorch which reduced GPU throughput for image training by up to forty percent when using fastai. The culprit? Subclassed tensors.
Mar 11, 2022
In this post, I will give an overview of my solution, explore some of my alternative solutions that didn't perform as well, and give a quick overview of how to customize fastai to work on a new dataset.