Resources

Below are some general resources for Python and IPython development, along with a number of tutorials and textbooks that will support the material for the first day of the course.

I've never used Python before!

Tutorials for scientific Python

Getting started can be a little daunting because so much is available out of the box. There is a short tutorial below that introduces the key features, and the following tutorials introduce the individual components.

What is an IPython notebook?

Scientific packages for Python

Python is commonly used with a wide set of scientific libraries. The most important are:

  • NumPy, which provides high-performance numerical operations (e.g. linear algebra)
  • SciPy, which builds on NumPy to provide a range of tools (e.g. linear filtering, image operations, statistical tools, computational geometry)
  • Pandas, which provides a rich data type for manipulating numerical data, with named columns and rows, sophisticated joining and grouping and I/O from many kinds of files and databases.
  • Matplotlib, which provides 2D graphing functionality, including scatterplots, histograms, pseudo-color maps, and more.
  • SymPy, which adds symbolic algebra to Python (for example, expression simplification and series expansion)
  • IPython, the interactive notebook tool you are using right now.
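As a small taste of the linear algebra support mentioned in the list above, here is a minimal sketch that solves a two-equation linear system with NumPy. The matrix and vector values are made up purely for illustration.

```python
import numpy as np

# Illustrative system (values are arbitrary):
#   3x +  y = 9
#    x + 2y = 8
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Solve A x = b without forming the inverse of A explicitly.
x = np.linalg.solve(A, b)

# Substitute the solution back in; the residual should be close to zero.
residual = A @ x - b
print(x, residual)
```

Using np.linalg.solve rather than multiplying by an explicit inverse is generally the more numerically stable habit.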

Other handy packages

  • sklearn provides clustering, preprocessing (e.g. patch extraction), classification (e.g. SVM), meta-classifiers (bagging/boosting) and validation/testing tools (cross-validation, ROC curves)
  • mdp provides low-dimensional projection and independent component analysis tools (PCA, LLE, ISOMAP, RBM)
  • theano provides GPU-accelerated, NumPy-style array computation, as well as automatic differentiation and other tools useful in deep learning.
  • pyMC is a powerful MCMC library, which can set up MCMC samplers from a simple specification, run them and produce sophisticated diagnostics. Includes a powerful Gaussian Process module, which can be used in MCMC runs.
  • statsmodels provides statistical models, hypothesis testing, regression and estimators.
  • scikit-image provides scientific image manipulation tools.
  • opencv provides video access and a rich set of image processing operations (contour extraction, morphological ops, motion tracking, etc.)
  • cython lets you write Python-like code which compiles to very fast C modules and integrates seamlessly with standard Python. Ideal for writing optimised inner loops.
  • NLTK is the Python natural language processing toolkit, including part-of-speech taggers, tokenisers and parse tree generators.
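To make the sklearn entry in the list above more concrete, here is a minimal sketch of classification with cross-validation. It assumes scikit-learn is installed (sklearn is its import name); the iris dataset, the default SVC settings and the choice of 5 folds are illustrative choices, not recommendations from this course.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Small built-in dataset: 150 flowers, 4 measurements each, 3 species.
X, y = load_iris(return_X_y=True)

# Support vector classifier with default settings (illustrative choice).
classifier = SVC()

# 5-fold cross-validation returns one accuracy score per fold.
scores = cross_val_score(classifier, X, y, cv=5)
mean_accuracy = scores.mean()
print(scores, mean_accuracy)
```

Reporting the mean over folds gives a less optimistic estimate of accuracy than scoring on the data the classifier was trained on.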

Specialised packages

  • GPy Flexible Gaussian Processes for Python.
  • tSNE Stochastic neighbour embedding, for low-dimensional visualisation.

Resources

There are a large number of freely available textbooks and courses which cover material in this course. The following resources are recommended; you don't need to read them all, but if you want to follow things up, these are good sources.

Python for science

API References

Python docs

IPython docs

NumPy docs NumPy provides essential numerical operations (e.g. efficient matrix operations)

NumPy for MATLAB users A guide for those familiar with MATLAB

SciPy docs SciPy provides many useful functions (statistical operations, computational geometry, interpolation)

Matplotlib docs Matplotlib provides scientific plotting tools

scikit-learn Scikit-learn provides ready-to-use machine learning tools.

Tutorials

IPython tutorial Interactive guide to IPython

Python for Data Using Python to process and visualise data

IPython and Pandas A video on using IPython and Pandas

Scientific analysis with Python A complete course of scientific analysis with Python

scikit-learn introduction A very good introduction to machine learning with the scikit-learn package

machine learning IPython notebooks A gallery of various ML and data processing notebooks.

Machine Learning / Statistics

Introduction to Statistical Learning A thorough introduction to statistical learning, with both a textbook and an accompanying video lecture series. Uses R for exercises. The classic textbook "The Elements of Statistical Learning" is also available for free online.

Deep Learning This freely available textbook goes into great depth on deep learning, but the introductory chapters in Part 1 are an excellent introduction to the concepts needed to understand machine learning topics.

Information Theory, Inference and Learning Algorithms A dense and insightful exploration of machine learning and information theory; requires serious study but explains the mathematical underpinnings of machine learning clearly and succinctly.

Machine Learning: A Probabilistic Perspective Probably the best all-round machine learning textbook. Covers a vast swathe of material in a fairly accessible manner. Not freely available.

Machine Learning Andrew Ng's excellent short course on machine learning, available as a Coursera on-demand video lecture series.

Probability and Statistics Cookbook If you need a formula in probability or statistics, it's probably in here. A very compact reference book.

Control theory

Control Theory for Humans A high-level but accessible introduction to control theory. Not freely available.

Manual Control -- theory and applications PDF An old (1964!) but clear and very thorough treatment of manual control (i.e. human operator performance).

Optimisation

Convex Optimization PDF A very mathematical but complete coverage of convex optimisation.

