Deep Learning Using Python
A Two-Day Workshop
This workshop is a hands-on introduction to deep learning in Python using PyTorch. The first day covers the foundations: what a neural network computes, how it is trained, and how to build and train multilayer perceptrons in PyTorch. The second day covers the two dominant modern architectures: convolutional neural networks for image data and transformers for text.
Day One
Session One: Introduction to Artificial Neural Networks
We implement artificial neurons from scratch using NumPy. We define and plot the common activation functions, then build a simple forward pass through a small network by hand. This gives a concrete computational picture of what a neural network is before we move to PyTorch.
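A from-scratch sketch of the kind of NumPy code this session builds (layer sizes and weights are illustrative):

```python
import numpy as np

def sigmoid(z):
    # Logistic activation: squashes any real input into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Rectified linear unit: zero for negative inputs, identity otherwise.
    return np.maximum(0.0, z)

# A single artificial neuron: weighted sum of inputs plus a bias,
# passed through an activation function.
def neuron(x, w, b, activation=sigmoid):
    return activation(np.dot(w, x) + b)

# Forward pass by hand through a tiny network: 3 inputs -> 4 hidden -> 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

x = np.array([0.5, -1.0, 2.0])
h = relu(W1 @ x + b1)        # hidden layer
y = sigmoid(W2 @ h + b2)     # output layer
print(y.shape)  # (2,)
```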
Session Two: Training Neural Networks
We introduce loss functions and work through the mechanics of training: forward pass, loss computation, backward pass, and parameter update. We cover gradient descent and its variants, and introduce PyTorch’s autograd for automatic differentiation.
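The four-step training cycle can be written out explicitly on a toy problem. A minimal sketch using PyTorch's autograd, with a made-up linear regression task and illustrative hyperparameters:

```python
import torch

# One explicit training loop: forward pass, loss, backward pass, update.
torch.manual_seed(0)
X = torch.randn(16, 3)                  # 16 samples, 3 features
true_w = torch.tensor([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * torch.randn(16)  # targets with a little noise

w = torch.zeros(3, requires_grad=True)  # parameters to learn
lr = 0.1                                # learning rate (illustrative)

for step in range(100):
    y_hat = X @ w                       # forward pass
    loss = ((y_hat - y) ** 2).mean()    # mean squared error loss
    loss.backward()                     # autograd fills in w.grad
    with torch.no_grad():
        w -= lr * w.grad                # gradient descent update
        w.grad.zero_()                  # clear gradients for the next step

print(loss.item())
```

In practice the update step is delegated to a `torch.optim` optimizer, but writing it by hand once makes clear what the optimizer is doing.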
Session Three: Multilayer Perceptrons with PyTorch
We build and train multilayer perceptrons using PyTorch’s nn.Module. The running example is MNIST digit classification. We cover the full training loop, monitoring loss and validation accuracy, and introduce skorch, which wraps PyTorch models in a scikit-learn-compatible interface.
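An nn.Module in the style of the MNIST example (layer widths are illustrative; the data loading and training loop are omitted here):

```python
import torch
from torch import nn

# A multilayer perceptron for 28x28 grayscale digits:
# flatten the image, two hidden layers, ten output logits.
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 10),          # one logit per digit class
        )

    def forward(self, x):
        return self.net(x)

model = MLP()
batch = torch.randn(32, 1, 28, 28)      # stand-in for a batch of MNIST images
logits = model(batch)
print(logits.shape)  # torch.Size([32, 10])
```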
Day Two
Session One: Convolutional Neural Networks
We introduce convolutional layers and build CNNs for image classification in PyTorch. We cover Conv2d, MaxPool2d, and BatchNorm2d, and train a CNN on a real image dataset.
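A sketch of the kind of CNN assembled in this session, using the three layer types named above (channel counts and input size are illustrative, not tied to a specific dataset):

```python
import torch
from torch import nn

# Two conv blocks (Conv2d -> BatchNorm2d -> ReLU -> MaxPool2d)
# followed by a linear classifier head.
class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(2),  # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),  # 16x16 -> 8x8
        )
        self.head = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = SmallCNN()
x = torch.randn(4, 3, 32, 32)   # a batch of 4 RGB 32x32 images
print(model(x).shape)  # torch.Size([4, 10])
```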
Session Two: Language Models and Transformers
We cover tokenisation, embeddings, and the self-attention mechanism. We work through the transformer architecture piece by piece, building toward the GPT-style decoder used in the following session.
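The tokens-to-vectors pipeline can be made concrete with a toy character-level tokeniser and an embedding lookup. (Real language models use subword tokenisers such as BPE; this vocabulary and embedding size are illustrative.)

```python
import torch
from torch import nn

# Character-level tokenisation: text -> integer IDs -> embedding vectors.
text = "hello world"
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}   # string -> integer ID
itos = {i: ch for ch, i in stoi.items()}       # integer ID -> string

ids = torch.tensor([stoi[ch] for ch in text])
emb = nn.Embedding(len(vocab), 16)             # 16-dimensional embeddings
vectors = emb(ids)
print(vectors.shape)  # torch.Size([11, 16])
```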
Session Three: Implementing GPT Models
We implement a minimal GPT from scratch in PyTorch and train it on a small text corpus. We then turn to the Hugging Face Transformers library and use pre-trained models for text classification and generation.
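A compressed sketch of the decoder shape built in this session, assuming illustrative sizes and using PyTorch's built-in transformer layers with a causal mask (the from-scratch version in the session builds these blocks by hand):

```python
import torch
from torch import nn

# Minimal GPT-style decoder: token + position embeddings, a stack of
# transformer layers with a causal mask, and a next-token prediction head.
class MiniGPT(nn.Module):
    def __init__(self, vocab_size=64, d_model=32, n_layers=2, max_len=128):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4,
                                           dim_feedforward=64,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        T = idx.shape[1]
        x = self.tok(idx) + self.pos(torch.arange(T, device=idx.device))
        mask = nn.Transformer.generate_square_subsequent_mask(T)
        x = self.blocks(x, mask=mask)   # causal mask: no attending ahead
        return self.head(x)             # logits for the next token

model = MiniGPT().eval()
idx = torch.randint(0, 64, (2, 10))     # batch of 2 sequences, length 10
print(model(idx).shape)  # torch.Size([2, 10, 64])
```

Training then minimises cross-entropy between these logits and the input sequence shifted by one position.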
Extra Topics
Attention Mechanism Fundamentals
A focused introduction to the attention mechanism and the transformer architecture. We motivate attention through the problems of long-range context in language and images, explain the query/key/value framework, implement scaled dot-product attention, and assemble a transformer block.
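Scaled dot-product attention written out directly from the query/key/value definitions (tensor sizes are illustrative): scores are QKᵀ/√d_k, softmaxed over the keys, then used to take a weighted sum of the values.

```python
import math
import torch

def attention(Q, K, V):
    # Scaled dot-product attention for single-head, unbatched inputs.
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)
    weights = torch.softmax(scores, dim=-1)   # each row sums to 1
    return weights @ V, weights

torch.manual_seed(0)
Q = torch.randn(5, 8)   # 5 query positions, head dimension 8
K = torch.randn(7, 8)   # 7 key positions
V = torch.randn(7, 8)   # one value vector per key
out, w = attention(Q, K, V)
print(out.shape, w.shape)  # torch.Size([5, 8]) torch.Size([5, 7])
```

The √d_k scaling keeps the dot products from growing with the head dimension, which would otherwise push the softmax into near-one-hot saturation.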