Introduction to machine learning

The Python ML ecosystem

In this book, we will use Python 3. For a good introduction, see, e.g., the free books A Whirlwind Tour of Python by Jake VanderPlas or Dive Into Python 3 by Mark Pilgrim.

This document is an example of a Jupyter notebook, which mixes code and results. When developing larger software projects, it is often better to use an IDE (integrated development environment), which keeps the code in separate files. I recommend Spyder, although many people use JupyterLab for a browser-based solution.

Software for data science and ML

We will leverage many standard libraries from Python's "data science stack", listed in the table below. For a good introduction to these, see, e.g., the free book Python Data Science Handbook by Jake VanderPlas, or the class Computational Statistics in Python by Cliburn Chan at Duke University. For an excellent book on scikit-learn, see Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow v2 by Aurélien Géron.

| Name | Functionality |
|---|---|
| NumPy | Vector and matrix computations |
| SciPy | Various scientific / math / stats / optimization functions |
| Matplotlib | Plotting |
| Seaborn | Extension of Matplotlib |
| Pandas | Manipulating tabular data and time series |
| Scikit-learn | Implements many "classical" ML methods |

Software for deep learning

Deep learning is about composing differentiable functions into more complex functions, represented as a computation graph, and then using automatic differentiation ("autograd") to compute gradients, which we can pass to an optimizer, to fit the function to data. This is sometimes called "differentiable programming".
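To make the idea concrete, here is a minimal sketch of reverse-mode automatic differentiation in plain Python, using only scalars. It is not how TensorFlow, JAX, or PyTorch are implemented; the `Value` class, the toy dataset, and the learning rate are all hypothetical choices made for illustration. Each arithmetic operation adds a node to a computation graph, `backward()` walks the graph in reverse to accumulate gradients, and a hand-written gradient descent loop plays the role of the optimizer.

```python
class Value:
    """A scalar node in a computation graph (hypothetical helper class)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data                  # result of the forward computation
        self.grad = 0.0                   # d(output)/d(this node), set by backward()
        self._parents = parents           # nodes this value was computed from
        self._local_grads = local_grads   # d(this node)/d(parent), one per parent

    @staticmethod
    def _wrap(other):
        return other if isinstance(other, Value) else Value(other)

    def __add__(self, other):
        other = Value._wrap(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __sub__(self, other):
        other = Value._wrap(other)
        return Value(self.data - other.data, (self, other), (1.0, -1.0))

    def __mul__(self, other):
        other = Value._wrap(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Topologically sort the graph so that every node is processed
        # only after all nodes that depend on it.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            # Chain rule: pass this node's gradient on to its parents.
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


def model(w, b, x):
    # A composition of two differentiable primitives: multiply, then add.
    return w * x + b


def loss_fn(w, b, data):
    # Mean squared error over the dataset.
    total = Value(0.0)
    for x, y in data:
        err = model(w, b, x) - y
        total = total + err * err
    return total * (1.0 / len(data))


# Toy data generated from y = 2x + 1 (an assumption for this example).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = Value(0.0), Value(0.0)
learning_rate = 0.05

for step in range(2000):
    w.grad, b.grad = 0.0, 0.0           # reset gradients from the previous step
    loss = loss_fn(w, b, data)          # forward pass builds a fresh graph
    loss.backward()                     # backward pass fills in w.grad and b.grad
    w.data -= learning_rate * w.grad    # plain gradient descent update
    b.data -= learning_rate * b.grad

print(f"w = {w.data:.4f}, b = {b.data:.4f}, loss = {loss.data:.2e}")
```

The libraries listed below follow the same forward / backward / update pattern, but operate on whole arrays and can run on accelerators.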

There are several libraries that can execute such computation graphs on hardware accelerators, such as GPUs. (Some libraries also support distributed computation, but we will not need to use this feature in this book.) We list a few popular libraries below.

| Name | Functionality | More info |
|---|---|---|
| TensorFlow 2.0 | Accelerated numpy-like library with autograd support. Keras API. | |
| JAX | Accelerated numpy, functional code transformations (autograd, JIT, VMAP, etc.) | |
| PyTorch | Similar to TF 2.0 | Official PyTorch tutorials |
| MXNet | Similar to TF 2.0. Gluon API. | Dive into Deep Learning book |

Software for probabilistic modeling

In this book, we will focus on probabilistic models, both supervised (conditional) models of the form $p(y|x)$ and unsupervised models of the form $p(z,x)$, where $x$ are the features, $y$ are the labels (if present), and $z$ are the latent variables. GMMs and PCA, which we discuss in the unsupervised learning notebook, are very simple examples of such latent variable models. However, to create more complex models, we need to move beyond scikit-learn. In addition, we will often need more than just gradient-based optimization, in order to handle discrete variables and randomly-shaped data structures.
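As a small illustration of a latent variable model $p(z,x)$, the sketch below defines a one-dimensional, two-component GMM using only the Python standard library. The mixture parameters are made up for this example. It shows ancestral sampling from $p(z,x) = p(z)\,p(x|z)$, and then uses Bayes' rule to compute the posterior $p(z|x)$ over the discrete latent variable.

```python
import math
import random

# Hypothetical parameters of a 1D Gaussian mixture with two components.
weights = [0.3, 0.7]   # prior p(z)
means = [-2.0, 3.0]    # mean of p(x | z)
stds = [1.0, 0.5]      # standard deviation of p(x | z)


def normal_pdf(x, mu, sigma):
    """Density of a univariate Gaussian N(x | mu, sigma^2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def sample_joint(rng):
    """Ancestral sampling: draw z from p(z), then x from p(x | z)."""
    z = rng.choices([0, 1], weights=weights)[0]
    x = rng.gauss(means[z], stds[z])
    return z, x


def posterior_z(x):
    """Bayes' rule: p(z | x) = p(z) p(x | z) / p(x)."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    evidence = sum(joint)   # p(x), obtained by summing out z
    return [j / evidence for j in joint]


rng = random.Random(0)
samples = [sample_joint(rng) for _ in range(5)]
for z, x in samples:
    print(f"z = {z}, x = {x:+.3f}")

post = posterior_z(2.5)
print("p(z | x = 2.5) =", [round(p, 6) for p in post])
```

Here the sum over $z$ is cheap because there are only two components; for richer models this marginalization is what makes inference hard, which is why more specialized libraries are needed.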

There are a variety of Python libraries for probabilistic modeling, some of which build on top of deep learning libraries, and extend them to handle stochastic functions and probabilistic inference. If the model is specified declaratively, using a domain-specific language (DSL) or an application programming interface (API), we will call it a "probabilistic modeling language" (PML). If the system uses a lower-level interface, and allows the creation of more flexible models (e.g., using stochastic control flow), we will call it a "probabilistic programming language" (PPL). We list a few examples below.
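To illustrate what "stochastic control flow" means, here is a minimal sketch written in plain Python with the standard `random` module, not in any of the libraries listed below. The function name and the parameter value are hypothetical. The number of random choices the program makes depends on the outcomes of earlier random choices, so the model cannot be written down as a fixed-size graph in advance.

```python
import random


def geometric_program(rng, p=0.25):
    """Generative program with stochastic control flow.

    Keeps flipping a coin with success probability p until it succeeds,
    and returns the number of failures. How many times the loop runs
    is itself random.
    """
    n = 0
    while rng.random() >= p:   # failure: flip again
        n += 1
    return n


rng = random.Random(0)
draws = [geometric_program(rng) for _ in range(20000)]
empirical_mean = sum(draws) / len(draws)
expected_mean = (1 - 0.25) / 0.25   # mean number of failures before the first success

print(f"empirical mean = {empirical_mean:.3f}, expected mean = {expected_mean:.3f}")
```

This sketch only runs the program forwards to generate samples. A PPL additionally provides inference, i.e., reasoning backwards from observed outputs to the random choices that produced them.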

| Name | Functionality |
|---|---|
| Pyro | PPL built on top of PyTorch. |
| NumPyro | Lightweight version of Pyro, using JAX instead of PyTorch as the backend. |
| TF Probability (TFP) | PPL on top of TensorFlow. |
| PyStan | Python interface to Stan, which has its own BUGS-like DSL for PGMs. Supports MCMC and VI. Custom C++ autodiff library. |
| PyMC | Similar functionality to PyStan without the C++ part. v3 uses Theano for autograd. v4 uses Aesara, a fork of Theano. |
| pgmpy | Python API for (non-Bayesian) discrete PGMs. No support for autodiff or GPUs. |

Installing software inside a Google Colab notebook

See .

Exploratory data analysis

See

Logistic regression

See

Linear regression

See

Deep neural networks

See

Unsupervised learning

See