🛠️ Steven Gong

Apr 19, 2026, 2 min read

CS480 — Introduction to Machine Learning

Fall 2025, taught by Gautam Kamath at the University of Waterloo.

Course page: http://www.gautamkamath.com/courses/CS480-fa2025.html
Sections: TTh 2:30–3:50 PM (DC 1302) / TTh 1:00–2:20 PM
Final: December 11, 12:30–3:00 PM

Lectures

1. Supervised Learning: Linear Models

Lec 1–2: Perceptron — convergence theorem, mistake bound, padding trick
Lec 3: Linear Regression — ERM, squared loss, normal equations, MLE under Gaussian noise, ridge, lasso, cross validation
Lec 4: k-Nearest Neighbours — Bayes optimal, Cover-Hart bound, curse of dimensionality
Lec 5: Logistic Regression — logit transform, cross-entropy, softmax multiclass
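
A minimal sketch of the perceptron update from Lec 1–2, assuming labels in {+1, -1} and using the padding trick (a constant 1 appended to each input so the bias becomes one more weight). The function names and the toy data are hypothetical, chosen only for illustration.

```python
def perceptron(points, labels, max_epochs=100):
    """Train a perceptron; labels are +1 or -1."""
    # Padding trick: append a constant 1 so the bias is just another weight.
    padded = [list(x) + [1.0] for x in points]
    w = [0.0] * len(padded[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(padded, labels):
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: stop
            break
    return w


def predict(w, x):
    """Classify a single unpadded point with the learned weights."""
    score = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if score > 0 else -1


# Hypothetical toy data: linearly separable in 2D.
X = [(2.0, 1.0), (1.0, 3.0), (-1.0, -2.0), (-2.0, -1.0)]
y = [1, 1, -1, -1]
w = perceptron(X, y)
```

On separable data like this, the loop stops after finitely many updates, which is what the convergence theorem and mistake bound guarantee.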

2. Kernels and Margins

Lec 6–7: Support Vector Machine — hard-margin, soft-margin, dual, support vectors, hinge loss
Lec 8: Kernel Method — feature maps, kernel trick, RBF/polynomial, Mercer’s condition
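
A small check of the kernel trick from Lec 8, assuming the homogeneous degree-2 polynomial kernel on 2D inputs. The kernel value computed in input space should match the inner product of an explicit feature map; `poly_kernel` and `feature_map` are illustrative names, not from the course.

```python
import math


def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: k(x, z) = (x . z)^2."""
    return sum(a * b for a, b in zip(x, z)) ** 2


def feature_map(x):
    """Explicit feature map for 2D inputs matching poly_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


x, z = (1.0, 2.0), (3.0, -1.0)
lhs = poly_kernel(x, z)  # computed without ever building the features
rhs = sum(a * b for a, b in zip(feature_map(x), feature_map(z)))
```

Both sides agree up to floating-point error, while the kernel side never constructs the 3-dimensional feature vectors.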

3. Trees and Ensembles

Lec 9: Decision Tree — entropy, Gini, information gain, pruning, Random Forest
Lec 10: Bagging & Boosting (AdaBoost) — variance reduction, Hedge algorithm, weighted experts
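
The split criteria from Lec 9 can be sketched in a few lines, assuming class labels stored in plain lists. The helper names and the yes/no toy labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of the empirical label distribution."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]
gain = information_gain(parent, ["yes", "yes"], ["no", "no"])
```

A split that separates the two classes perfectly recovers the full 1 bit of parent entropy.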

4. Neural Networks

Lec 11: Multilayer Perceptrons — XOR via nonlinearities, universal approximation, activations, backprop
Lec 12: Deep Networks — regularization, weight decay, dropout, batch/layer norm, data augmentation
Lec 13: Optimization + CNNs — SGD, momentum, AdaGrad, RMSProp, Adam
Lec 15: Recurrent Neural Networks
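
One way to see "XOR via nonlinearities" from Lec 11 is a two-unit ReLU network with hand-picked weights. This is only an illustrative construction under that assumption, not necessarily the one used in lecture.

```python
def relu(v):
    return max(0.0, v)


def xor_mlp(x1, x2):
    """Two hidden ReLU units, then a linear output layer."""
    h1 = relu(x1 + x2)        # active whenever at least one input is 1
    h2 = relu(x1 + x2 - 1.0)  # active only when both inputs are 1
    return h1 - 2.0 * h2      # cancels the (1, 1) case back to 0


outputs = {(a, b): xor_mlp(a, b) for a in (0, 1) for b in (0, 1)}
```

No single linear threshold unit can produce this truth table, which is why the hidden nonlinearity is needed.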

5. Unsupervised Learning

Lec 16: k-Means & Gaussian Mixture Models — fit via Expectation Maximization
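
A minimal sketch of Lloyd's algorithm for k-means on 1D data, which can be read as a hard-assignment analogue of the E and M steps. The function name, initial centers, and data are assumptions made for illustration.

```python
def kmeans_1d(points, centers, iters=20):
    """Alternate between assigning points and recomputing centers."""
    centers = list(centers)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: (p - centers[i]) ** 2)
            clusters[j].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # assignments stopped changing
            break
        centers = new_centers
    return centers


data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers = kmeans_1d(data, centers=[0.0, 5.0])
```

With these two well-separated groups the centers settle at the group means, 2.0 and 11.0.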

6. Generative Models

Lec 17: Autoencoders & Variational Autoencoders — ELBO, reparameterization trick
Lec 18: Generative Adversarial Networks
Lec 24: Normalizing Flows
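
For reference, the standard form of the ELBO and the reparameterization trick named in Lec 17, written with assumed notation (encoder $q_\phi(z \mid x)$, decoder $p_\theta(x \mid z)$, prior $p(z)$); the lecture's symbols may differ.

$$
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;-\; \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
$$

$$
z = \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)
$$

Writing $z$ as a deterministic function of $x$ and independent noise $\epsilon$ lets gradients flow through $\mu_\phi$ and $\sigma_\phi$ when the expectation is estimated by sampling.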

7. Trustworthy ML

Lec 19: Adversarial Robustness — FGSM, PGD, adversarial training
Lec 20: Differential Privacy — $(ε, δ)$-DP, Gaussian mechanism, DP-SGD
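
A sketch of the Gaussian mechanism from Lec 20, assuming the classical noise calibration $σ = \sqrt{2 \ln(1.25/δ)} \cdot Δ / ε$, which is stated for $0 < ε < 1$. The function names are hypothetical, and the lecture may use a different (tighter) calibration.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale under the classical analysis (requires 0 < epsilon < 1)."""
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("this calibration assumes 0 < epsilon < 1 and 0 < delta < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def gaussian_mechanism(value, sensitivity, epsilon, delta, rng=random):
    """Release value plus zero-mean Gaussian noise scaled to the L2 sensitivity."""
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return value + rng.gauss(0.0, sigma)


# Hypothetical query: a count with L2 sensitivity 1.
noisy_count = gaussian_mechanism(42.0, sensitivity=1.0, epsilon=0.5, delta=1e-5)
```

The noise scale grows linearly with the sensitivity and shrinks as ε increases, so looser privacy budgets give more accurate answers.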

8. Sequence Models & LLMs

Lec 21: Attention → Transformer
Lec 22–23: Large Language Models — BERT, GPT, scaling laws, chain of thought, RLHF
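
Scaled dot-product attention, the core operation behind Lec 21, sketched in plain Python with lists of lists standing in for matrices. This assumes a single head with no masking; the names and toy inputs are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V, computed one query row at a time."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return out


Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [1.0, 0.0]]  # identical keys give equal attention weights
V = [[2.0, 0.0], [4.0, 2.0]]
result = attention(Q, K, V)
```

Because both keys score the same against the query, the output is simply the average of the two value vectors.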

Readings

Textbook shorthand used in the schedule:

UML: Understanding Machine Learning (Shalev-Shwartz & Ben-David)
ESL: The Elements of Statistical Learning (Hastie, Tibshirani, Friedman)
ISL: An Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani) — “most recommended when available”
DL: Deep Learning (Goodfellow, Bengio, Courville)
D2L: Dive into Deep Learning (Zhang, Lipton, Li, Smola)
