
Deep Learning Ramp-Up: Starting My Journey from Linear Regression to GPT-2 Scale

Oriol Colomé Font

Embarking on a Deep Learning Journey

After successfully working through Sebastian Raschka’s LLM series, I’m excited to dive into another comprehensive learning journey: Jacob Buckman’s Deep Learning Ramp-Up curriculum. This structured approach will take me from the fundamentals of linear regression all the way to training GPT-2 scale models on multiple GPUs.

Why This Curriculum?

Jacob Buckman’s ramp-up appeals to me for several reasons:

  • First principles: it starts with manual backpropagation in NumPy
  • Progressive complexity: each exercise builds on the previous one
  • Real-world applications: it moves from synthetic data to MNIST, ImageNet, and Shakespeare
  • Performance focus: it covers GPU optimization and scaling considerations
  • Comprehensive coverage: it runs from basic regression to state-of-the-art architectures

The 18-Exercise Roadmap

The curriculum is structured into three main phases:

Phase 1: Foundations (Exercises 1-6)

Starting with synthetic data and building core understanding:

  1. Linear regression with NumPy - Manual backpropagation
  2. Linear regression with PyTorch - Automatic differentiation
  3. Vector input regression - Scaling to higher dimensions
  4. Classification with softmax - Categorical outputs
  5. Single feedforward layer - First neural networks
  6. Deep feedforward networks - Multiple hidden layers
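
As a rough illustration of what exercise 4 involves, here is a minimal NumPy sketch of softmax with a cross-entropy loss and its hand-derived gradient. It is not taken from the curriculum; the function names `softmax` and `cross_entropy_and_grad` and the toy logits are assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy_and_grad(logits, labels):
    """Mean cross-entropy loss and its gradient with respect to the logits."""
    n = logits.shape[0]
    probs = softmax(logits)
    # Loss is the negative log-probability assigned to the correct class.
    loss = -np.log(probs[np.arange(n), labels]).mean()
    # Gradient of the mean loss: (probs - one_hot(labels)) / n
    grad = probs.copy()
    grad[np.arange(n), labels] -= 1.0
    return loss, grad / n

# Toy batch (hypothetical values): two examples, three classes.
logits = np.array([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
labels = np.array([0, 2])
loss, grad = cross_entropy_and_grad(logits, labels)
```

Each row of `grad` sums to zero, which is a quick sanity check when deriving this gradient by hand.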

Phase 2: Real Data & Advanced Architectures (Exercises 7-13)

Moving to real datasets and sophisticated models:

  7. MNIST classification - Image recognition
  8. Adam optimizer - Advanced optimization
  9. Convolutional networks - Spatial feature learning
  10. ResNet architecture - Residual connections
  11. Shakespeare feedforward - Sequence modeling
  12. Autoregressive sampling - Text generation
  13. Causal transformer - Attention mechanisms
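
The "causal" part of the causal transformer can be sketched in a few lines of NumPy. This is an illustrative single-head version under simplifying assumptions (no learned projections, no batching), not the curriculum's implementation; the function name `causal_self_attention` is hypothetical.

```python
import numpy as np

def causal_self_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v have shape (seq_len, d). Position i may attend only to
    positions j <= i, so no information flows from future tokens.
    """
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    # True strictly above the diagonal, i.e. at "future" positions.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over each row; masked entries become 0.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Toy input (hypothetical): 4 positions with 8 features each.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out, weights = causal_self_attention(x, x, x)
```

Because the first position can only attend to itself, `out[0]` equals `x[0]` here, and `weights` is lower triangular.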

Phase 3: Scale & Performance (Exercises 14-18)

GPU optimization and large-scale training:

  14. GPU optimization - Performance tuning
  15. ImageNet ResNet - Large-scale image classification
  16. GPU transformer - Accelerated sequence modeling
  17. Multi-GPU training - Distributed computing
  18. GPT-2 scale training - Large language models

My Approach

As with my LLM series, I’ll be documenting this journey with:

🎵 Music Technology Connections

Drawing parallels between neural networks and the audio processing techniques I’ve used in music information retrieval (MIR) research.

🔬 Deep Experiments

Going beyond the basic exercises to explore:

  • Different activation functions and their effects
  • Regularization techniques and their impact
  • Visualization of learned representations
  • Performance comparisons across architectures

📊 Interactive Visualizations

Creating plots and diagrams to build intuition about:

  • Loss landscapes and optimization dynamics
  • Feature maps in convolutional layers
  • Attention patterns in transformers
  • Training dynamics across different scales

💭 Honest Reflections

Documenting the challenges, “aha” moments, and insights gained along the way.

Reference Materials

The curriculum references several excellent resources:

Getting Started

I’ll be maintaining all code and experiments in a dedicated GitHub repository, with each exercise in its own folder for easy navigation and reference.

The first exercise - implementing linear regression with manual backpropagation in NumPy - is particularly exciting because it forces a deep understanding of the mathematical foundations that are often abstracted away in modern frameworks.
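
To make that concrete, here is a minimal sketch of what such an exercise could look like. It is not taken from the curriculum: the function name `fit_linear_regression`, the learning rate, the step count, and the synthetic data (true slope 3.0, intercept 0.5) are all illustrative assumptions. The gradients of the mean squared error are written out by hand rather than computed by a framework.

```python
import numpy as np

def fit_linear_regression(x, y, lr=0.1, steps=500):
    """Fit y ≈ w * x + b by gradient descent with hand-derived gradients."""
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        pred = w * x + b          # forward pass
        err = pred - y            # residuals
        # Loss L = mean(err ** 2). By the chain rule:
        #   dL/dw = 2 * mean(err * x),  dL/db = 2 * mean(err)
        grad_w = 2.0 * np.dot(err, x) / n
        grad_b = 2.0 * err.mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic data (assumed for illustration): y = 3x + 0.5 plus small noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 0.5 + 0.01 * rng.normal(size=200)
w, b = fit_linear_regression(x, y)
```

With these settings the fitted `w` and `b` should land close to the true slope and intercept, which gives a simple way to check the hand-derived gradients.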

What’s Next

I’m planning to work through approximately one exercise per week, allowing time for:

  • Thorough understanding of each concept
  • Additional experiments and explorations
  • Detailed documentation and visualization
  • Connecting concepts to my background in music technology

This journey should complement my LLM series well, providing the foundational understanding needed to appreciate the sophisticated architectures used in modern language models.


Follow along as I work through this comprehensive deep learning curriculum, sharing insights, challenges, and connections to music technology along the way.

Tags: #DeepLearning #MachineLearning #PyTorch #NeuralNetworks #AI #LearningJourney
