(former note name: “machine learning basics”)

Resources for beginners.

This book is a really nice intro to ML with no heavy math prerequisites: https://udlbook.github.io/udlbook/
Yannic Kilcher has lots of good videos where he explains papers.
Artem Kirsanov if you’re interested in Neuroscience+ML
Steve Brunton - excellent math lectures for pretty much any of the underlying math (linear algebra, differential equations, control, ...).
For coding, I can recommend tinygrad (much easier to use and set up than PyTorch (no 1 GB download), soon faster, …).

A recipe for training neural networks
Yes you should understand backprop
NYU Deep Learning SP21
Visualization of how NN learns to classify.
Visualizing and Debugging Neural Networks with PyTorch and Weights & Biases


(very old stuff)

Introductory article on CNNs.

Must-have libraries

(PyTorch)
PyTorch Lightning
wandb

Data

Datasets

Train, Test, Validation Datasets
(When validating, you don't optimize the model weights any further, but use the results to tune hyperparameters etc. Since that tuning can itself lead to overfitting, the validation set is kept separate from the test set.)
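
A minimal sketch of what the three splits can look like in PyTorch (the 70/15/15 ratio and the toy TensorDataset are just assumptions for illustration):

    import torch
    from torch.utils.data import TensorDataset, random_split

    # toy data: 1000 samples, 20 features, binary labels
    X, y = torch.randn(1000, 20), torch.randint(0, 2, (1000,))
    dataset = TensorDataset(X, y)

    n_train = int(0.7 * len(dataset))
    n_val = int(0.15 * len(dataset))
    n_test = len(dataset) - n_train - n_val

    # fixed generator so the split is reproducible
    train_set, val_set, test_set = random_split(
        dataset, [n_train, n_val, n_test],
        generator=torch.Generator().manual_seed(42),
    )
    # train_set: fit the weights; val_set: tune hyperparameters / early stopping;
    # test_set: only touch once, at the very end.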

Custom Dataset Pytorch
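
A rough sketch of a custom Dataset (the sample/label containers and the transform are placeholders, not a specific recipe):

    from torch.utils.data import Dataset

    class MyDataset(Dataset):
        def __init__(self, samples, labels, transform=None):
            self.samples = samples        # e.g. a list of file paths or raw arrays
            self.labels = labels
            self.transform = transform    # optional preprocessing callable

        def __len__(self):
            return len(self.samples)

        def __getitem__(self, idx):
            x = self.samples[idx]
            if self.transform is not None:
                x = self.transform(x)
            return x, self.labels[idx]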

Dataloaders & Datamodules

What is a dataloader? Guide
Custom Dataloader
Lightning DataModule
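
A minimal sketch of both, assuming the train_set / val_set from the split sketch above and an arbitrary batch size:

    from torch.utils.data import DataLoader
    import pytorch_lightning as pl

    # plain PyTorch: the DataLoader handles batching, shuffling and (optional) worker processes
    train_loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=4)

    # Lightning: a DataModule just groups the dataloaders (and any setup logic) in one place
    class MyDataModule(pl.LightningDataModule):
        def __init__(self, train_set, val_set, batch_size=32):
            super().__init__()
            self.train_set, self.val_set = train_set, val_set
            self.batch_size = batch_size

        def train_dataloader(self):
            return DataLoader(self.train_set, batch_size=self.batch_size, shuffle=True)

        def val_dataloader(self):
            return DataLoader(self.val_set, batch_size=self.batch_size)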

Preprocessing Data

Scale and normalize your data (Changing range / distribution)
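
For example, standardizing to zero mean / unit variance; the key point is to compute the statistics on the training data only and reuse them for val/test (tensor names below are made up):

    import torch

    train_X = torch.randn(800, 20) * 5 + 3          # toy data with some arbitrary range
    mean, std = train_X.mean(dim=0), train_X.std(dim=0)

    def standardize(x):
        return (x - mean) / (std + 1e-8)            # small eps avoids division by zero

    train_X_norm = standardize(train_X)             # now roughly zero mean, unit variance
    # apply the same standardize() to val/test data to avoid leakage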

Models

What types of ML tasks? Classification, Regression, Generation?
KNN, trees, k-means, etc.: classical approaches (see the sketch after the links below)
https://wandb.ai/site/articles/tour-of-machine-learning-models
https://wandb.ai/site/articles/p-picking-a-machine-learning-model
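
For a quick baseline, a classical model is often just a few lines with scikit-learn (toy dataset and k=5 are arbitrary choices here):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
    print("test accuracy:", knn.score(X_te, y_te))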

Optimization / Training

Batch gradient descent - Mini-batch gradient descent

Training is done in batches (many examples at once) to exploit GPU parallelism.
Each training sample in a batch is still completely independent of the others.
Because of this, you don't get the exact gradient over the whole dataset, only an approximation computed from the batch.
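
A minimal mini-batch training loop in PyTorch (model, optimizer, learning rate and batch size are placeholder choices):

    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset

    X, y = torch.randn(1000, 20), torch.randint(0, 2, (1000,))
    loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

    model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(5):
        for xb, yb in loader:              # each mini-batch yields an approximate gradient
            opt.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()                # gradient averaged over the samples in the batch
            opt.step()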

Accuracy or loss?

Basically, if you optimized with respect to accuracy, you wouldn't tell the model how close it was, just "bad" or "good"; the loss gives a graded (and differentiable) signal instead.
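
A toy illustration of that point (the logits below are made up): accuracy treats both mistakes the same, cross-entropy does not.

    import torch
    from torch.nn import functional as F

    target = torch.tensor([1])                       # true class is 1
    barely_wrong = torch.tensor([[2.0, 1.9, 0.1]])   # picks class 0, but class 1 was close
    very_wrong   = torch.tensor([[5.0, 0.1, 0.1]])   # confidently picks class 0

    for logits in (barely_wrong, very_wrong):
        acc = (logits.argmax(dim=1) == target).float().mean()
        loss = F.cross_entropy(logits, target)
        print(f"accuracy={acc.item():.0f}  loss={loss.item():.3f}")
    # accuracy is 0 in both cases, but the loss is much larger for the confident mistake,
    # so optimizing the loss tells the model *how* wrong it was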

Measuring Performance

Accuracy or loss?

Accuracy tells you how much you classified correctly, loss tells you by how much you were off:
explanation

There are multiple approaches to measuring the performance of a (classification) model:
softmax output:
[0.3, 0.4, 0.3] → correct argmax, so good accuracy, but low confidence (bad AUROC).
[0.1, 0.8, 0.1] → correct argmax and high confidence (good AUROC).
In a nutshell: on highly imbalanced datasets, definitely use the loss (or AUPRC) rather than accuracy.
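
A small sketch of why accuracy is misleading on imbalanced data (class ratio and the "always predict negative" model are made up for illustration):

    import numpy as np
    from sklearn.metrics import accuracy_score, roc_auc_score, average_precision_score

    y_true = np.array([0] * 95 + [1] * 5)   # 95% negatives, 5% positives
    y_score = np.zeros(100)                  # useless model: always predicts the majority class

    print("accuracy:", accuracy_score(y_true, y_score > 0.5))    # 0.95, looks great, means nothing
    print("AUROC:   ", roc_auc_score(y_true, y_score))            # 0.5, i.e. chance level
    print("AUPRC:   ", average_precision_score(y_true, y_score))  # ~0.05, the positive rate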