In this book, we will use Python 3. For a good introduction, see, e.g., the free books A Whirlwind Tour of Python by Jake VanderPlas or Dive into Python 3 by Mark Pilgrim.
This document is an example of a Jupyter notebook, which mixes code and results. When developing larger software projects, it is often better to use an IDE (integrated development environment), which keeps the code in separate files. I recommend Spyder, although many people use JupyterLab for a browser-based solution.
We will leverage many standard libraries from Python's "data science stack", listed in the table below. For a good introduction to these, see, e.g., the free book Python Data Science Handbook by Jake VanderPlas, or the class Computational Statistics in Python by Cliburn Chan at Duke University. For an excellent book on scikit-learn, see Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (2nd edition) by Aurélien Géron.
Name | Functionality |
---|---|
NumPy | Vector and matrix computations |
SciPy | Various scientific / math / stats / optimization functions |
Matplotlib | Plotting |
Seaborn | Statistical visualization built on top of Matplotlib |
Pandas | Manipulating tabular data and time series |
Scikit-learn | Implements many "classical" ML methods |
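As a minimal sketch of how two of these libraries fit together, the (made-up) snippet below uses NumPy for a matrix-vector computation and Pandas to wrap the result in a labeled table:

```python
import numpy as np
import pandas as pd

# NumPy: vector and matrix computations
A = np.array([[1.0, 2.0], [3.0, 4.0]])
x = np.array([1.0, -1.0])
y = A @ x  # matrix-vector product

# Pandas: present the results as labeled tabular data
df = pd.DataFrame({"x": x, "Ax": y}, index=["row0", "row1"])
print(df)
print("column means:", df.mean().to_dict())
```

The `@` operator is NumPy's matrix-multiplication syntax; the `DataFrame` adds row and column labels, which becomes indispensable once data is more complex than a pair of arrays.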
Deep learning is about composing differentiable functions into more complex functions, represented as a computation graph, and then using automatic differentiation ("autograd") to compute gradients, which we can pass to an optimizer, to fit the function to data. This is sometimes called "differentiable programming".
There are several libraries that can execute such computation graphs on hardware accelerators, such as GPUs. (Some libraries also support distributed computation, but we will not need this feature in this book.) We list a few popular libraries below.
Name | Functionality | More info |
---|---|---|
TensorFlow 2 | Accelerated NumPy-like library with autograd support. Keras API. | |
JAX | Accelerated NumPy, plus composable function transformations (grad, jit, vmap, etc.) | |
Pytorch | Similar to TF 2.0 | Official PyTorch tutorials |
MXNet | Similar to TF 2.0. Gluon API. | Dive into deep learning book |
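These libraries automate gradient computation via autograd. To illustrate the idea without assuming any of them is installed, the sketch below hand-codes the reverse-mode gradient of a small composed function in plain NumPy, and checks it against a finite-difference approximation (the function and weights are made up for illustration):

```python
import numpy as np

# A tiny "computation graph": f(x) = sum(tanh(W @ x)).
W = np.array([[0.5, -1.0], [2.0, 0.1]])

def f(x):
    return np.tanh(W @ x).sum()

# Hand-coded reverse-mode gradient (this is what autograd automates):
# df/dx = W^T @ (1 - tanh(W @ x)^2), by the chain rule.
def grad_f(x):
    h = np.tanh(W @ x)
    return W.T @ (1.0 - h ** 2)

x = np.array([0.3, -0.7])

# Sanity check against central finite differences.
eps = 1e-6
fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
               for e in np.eye(2)])
print(grad_f(x), fd)  # the two should agree closely
```

An autograd library derives the `grad_f` step automatically from the definition of `f`, which is what makes it practical to fit functions with millions of parameters.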
In this book, we will be focusing on probabilistic models, both supervised (conditional) models of the form $p(y|x)$ and unsupervised models of the form $p(z,x)$, where $x$ are the features, $y$ are the labels (if present), and $z$ are the latent variables. GMMs and PCA, which we discuss in the unsupervised learning notebook, are very simple examples of such latent variable models. However, to create more complex models, we need to move beyond scikit-learn. In addition, we will often need more than just gradient-based optimization, in order to handle discrete variables and dynamically shaped data structures.
There are a variety of Python libraries for probabilistic modeling, some of which build on top of deep learning libraries, and extend them to handle stochastic functions and probabilistic inference. If the model is specified declaratively, using a domain specific language (DSL) or an application programming interface (API), we will call it a "probabilistic modeling language" (PML). If the system uses a lower level interface, and allows the creation of more flexible models (e.g., using stochastic control flow), we will call it a "probabilistic programming language" (PPL). We list a few examples below.
Name | Functionality |
---|---|
Pyro | PPL built on top of PyTorch. |
NumPyro | Lightweight version of Pyro, using JAX instead of PyTorch as the backend. |
TF Probability (TFP) | PPL built on top of TensorFlow. |
PyStan | Python interface to Stan, which uses its own DSL (inspired by BUGS) for specifying PGMs. Supports MCMC and VI. Uses a custom C++ autodiff library. |
PyMC | Similar functionality to PyStan, without the C++ part. v3 uses Theano for autograd; v4 will use TensorFlow. |
pgmpy | Python API for (non-Bayesian) discrete PGMs. No support for autodiff or GPUs. |
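To make the $p(z,x)$ notation concrete before reaching for any of these libraries, here is a minimal NumPy sketch of a latent variable model: a two-component 1d Gaussian mixture $p(z,x) = p(z)\,p(x|z)$, from which we draw samples by ancestral sampling and evaluate the marginal likelihood $p(x) = \sum_z p(z)\,p(x|z)$. The component parameters are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up parameters for a 2-component 1d GMM.
pi = np.array([0.3, 0.7])      # mixing weights p(z)
mu = np.array([-2.0, 1.0])     # component means
sigma = np.array([0.5, 1.0])   # component standard deviations

def gauss_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

# Ancestral sampling: z ~ p(z), then x ~ p(x|z).
z = rng.choice(2, size=1000, p=pi)
x = rng.normal(mu[z], sigma[z])

# Marginal likelihood p(x) = sum_z p(z) p(x|z), per sample.
px = sum(pi[k] * gauss_pdf(x, mu[k], sigma[k]) for k in range(2))
avg_loglik = np.log(px).mean()
print("average log-likelihood:", avg_loglik)
```

The PPLs above automate exactly this kind of bookkeeping (sampling, density evaluation, and inference over $z$) for far richer models than a hand-coded mixture.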