Deep Learning Ramp-Up: Starting My Journey from Linear Regression to GPT-2 Scale
Beginning my deep learning journey through Jacob Buckman's comprehensive ramp-up curriculum, from basic linear regres...
Building neural networks from scratch to advanced architectures
Single scalar input/output with manual backprop
PyTorch implementation with automatic differentiation
Vector inputs for regression tasks
Categorical classification with softmax loss
Single hidden layer neural networks
Multiple hidden layers for deep learning
MNIST digit classification
Advanced optimization with Adam
Convolutional neural networks
Residual networks for better training
Sequence modeling with feedforward networks
Text generation through autoregressive sampling
Transformer architecture for sequence modeling
GPU acceleration and optimization
Large-scale image classification
GPU-accelerated transformer training
Distributed training across multiple GPUs
GPT-2 scale language model training
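As a rough illustration of where the list above starts (single scalar input/output with manual backprop), here is a minimal sketch in plain Python. It is not taken from the curriculum or from my repository: the function name `fit_scalar_linear`, the toy data, the learning rate, and the step count are all assumptions chosen for the example. It fits y = w*x + b by gradient descent on mean squared error, with the gradients written out by hand instead of using automatic differentiation.

```python
# Hypothetical sketch: scalar linear regression with hand-derived gradients.
# Names, data, and hyperparameters are illustrative assumptions.

def fit_scalar_linear(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = (w * x + b) - y          # prediction minus target
            # loss = mean(err^2), so d(loss)/dw = 2*err*x/n and d(loss)/db = 2*err/n
            grad_w += 2.0 * err * x / n
            grad_b += 2.0 * err / n
        # Plain gradient descent update
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (no noise)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_scalar_linear(xs, ys)
print(f"w={w:.4f}, b={b:.4f}")
```

On this noise-free toy data the fitted parameters should approach w = 2 and b = 1. The next item in the list replaces the hand-written gradient lines with PyTorch's automatic differentiation.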
Deep dives, experiments, and connections to music technology
I'm working through Jacob Buckman's Deep Learning Ramp-Up curriculum and documenting what I learn along the way, from the perspective of my background in music technology and music information retrieval (MIR).
What makes this series different:
All code, experiments, and detailed notes are available in my GitHub repository.