(former note name: “machine learning basics”)
Resources for beginners.
This book is a really nice intro to ML, no heavy math prerequisites: https://udlbook.github.io/udlbook/
Yannic Kilcher has lots of good videos where he explains papers.
Artem Kirsanov if you’re interested in Neuroscience+ML
Steve Brunton - excellent math courses (linear algebra, differential equations, control) for pretty much anything.
For coding I can recommend tinygrad (much easier to use and set up than PyTorch, no 1 GB download, and it aims to be faster, …).
A recipe for training neural networks
Yes you should understand backprop
NYU Deep Learning SP21
Visualization of how NN learns to classify.
Visualizing and Debugging Neural Networks with PyTorch and Weights & Biases
(very old stuff)
Introductory article with CNNs.
Must have libraries
(pytorch)
Pytorch Lightning
wandb
Data
Datasets
Train, Test, Validation Datasets
(During validation you don’t optimize the model any further; you use the results to tune hyperparameters etc. Since that tuning can itself cause overfitting, the validation set is kept separate from the test set.)
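A minimal sketch of such a split in plain Python (the function name and the 80/10/10 fractions are just illustrative):

```python
import random

def train_val_test_split(data, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle and split a dataset into train / validation / test parts.

    The validation set is used to tune hyperparameters; the test set is
    touched only once, at the very end, to estimate generalization.
    """
    data = list(data)
    random.Random(seed).shuffle(data)
    n = len(data)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = data[:n_test]
    val = data[n_test:n_test + n_val]
    train = data[n_test + n_val:]
    return train, val, test

# Hypothetical toy dataset of 100 samples -> 80 / 10 / 10 split.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```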
Dataloaders & Datamodules
What is a dataloader? Guide
Custom Dataloader
Lightning DataModule
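In essence, a dataloader just shuffles, batches, and iterates a dataset. A plain-Python sketch of that idea (not the actual PyTorch API, just the concept):

```python
import random

def simple_dataloader(dataset, batch_size, shuffle=False, seed=0):
    """Toy stand-in for a DataLoader: yield the dataset in batches,
    optionally in shuffled order."""
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [dataset[i] for i in indices[start:start + batch_size]]

# Ten samples, batch size 4 -> the last batch is smaller.
batches = list(simple_dataloader(list(range(10)), batch_size=4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

A real PyTorch `DataLoader` adds worker processes, collation, pinned memory, etc. on top of this same loop.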
Preprocessing Data
Scale and normalize your data (Changing range / distribution)
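A minimal sketch of the two most common versions (in practice you’d use a library helper such as scikit-learn’s scalers; the function names here are just illustrative):

```python
def standardize(xs):
    """Z-score normalization: shift to zero mean and unit variance."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / var ** 0.5 for x in xs]

def min_max_scale(xs):
    """Rescale values into the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

data = [2.0, 4.0, 6.0, 8.0]
print(standardize(data))    # zero mean, unit variance
print(min_max_scale(data))  # [0.0, 0.333..., 0.666..., 1.0]
```

Important: fit the scaling statistics (mean/std or min/max) on the training set only, then apply them to validation and test data.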
Models
What types of ML tasks? Classification, Regression, Generation?
KNN, Trees, Kmeans, etc.: Classical approaches
https://wandb.ai/site/articles/tour-of-machine-learning-models
https://wandb.ai/site/articles/p-picking-a-machine-learning-model
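To get a feel for the classical approaches, here is a minimal k-NN classifier sketch (toy points and labels are made up for illustration):

```python
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(p, query)), label)
        for p, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Two toy clusters: class "a" near the origin, class "b" near (5, 5).
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (0.5, 0.5)))  # "a"
print(knn_predict(points, labels, (5.5, 5.5)))  # "b"
```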
Optimization / Training
Batch gradient descent - Mini-batch gradient descent
Training is done in batches (many examples at once) to exploit GPU parallelism.
Each training sample in the batch is still completely independent of the others.
Because each batch is only a sample of the dataset, you get an approximate rather than the exact (full-dataset) gradient.
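The approximate-gradient idea can be sketched with mini-batch gradient descent on a toy linear fit (all numbers here are made up for illustration):

```python
import random

# Toy data: y = 3x; fit the single weight w with mini-batch gradient
# descent on the mean squared error.
xs = [i / 100 for i in range(100)]
ys = [3.0 * x for x in xs]

w = 0.0
lr = 0.5
batch_size = 8
rng = random.Random(0)
indices = list(range(len(xs)))

for epoch in range(100):
    rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        # Per-sample gradients are independent; the batch averages them,
        # which only approximates the full-dataset gradient.
        grad = sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
        w -= lr * grad

print(w)  # converges to ~3.0
```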
Accuracy or loss?
Basically, if you optimized with respect to accuracy, you wouldn’t tell the model how close it was, just “bad” or “good” (accuracy is a step function, so there is no useful gradient to follow).
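A tiny illustration of that point: two predictions that are both “correct” get the same accuracy but different loss (cross-entropy shown; the probability vectors are made up):

```python
import math

def cross_entropy(probs, true_idx):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[true_idx])

def accuracy(probs, true_idx):
    """1 if the argmax is the true class, else 0: no notion of 'how close'."""
    return 1.0 if max(range(len(probs)), key=probs.__getitem__) == true_idx else 0.0

barely    = [0.45, 0.55]  # just over the decision boundary
confident = [0.05, 0.95]  # same predicted class, much more confident

# Accuracy can't tell these apart; cross-entropy rewards confidence.
print(accuracy(barely, 1), accuracy(confident, 1))            # 1.0 1.0
print(cross_entropy(barely, 1), cross_entropy(confident, 1))  # larger vs smaller
```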
Measuring Performance
Accuracy or loss?
Accuracy tells you what fraction you classified correctly; loss tells you by how much you were off:
explanation
There are multiple approaches to measuring the performance of a (classification) model:
softmax output (say the true class is index 1):
[0.3, 0.4, 0.3]
→ Correct argmax, so good accuracy, but low confidence → bad AUROC.
[0.1, 0.8, 0.1]
→ Correct argmax and high confidence → good accuracy, good AUROC.
In a nutshell: on highly imbalanced datasets, definitely prefer loss (or AUPRC) over plain accuracy.
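AUROC itself can be sketched as a rank statistic: the probability that a randomly chosen positive example is scored above a randomly chosen negative one (scores and labels below are made up):

```python
def auroc(scores, labels):
    """Rank-based AUROC: fraction of (positive, negative) pairs where the
    positive is scored higher (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up scores for the positive class plus true binary labels.
print(auroc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))  # 0.75
print(auroc([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0]))  # 1.0 (perfect ranking)
```

Note that AUROC only depends on the ranking of the scores, not on their absolute values, which is why a poorly calibrated but correctly ordered model can still score well.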