This notebook was originally put together by [Jake Vanderplas](http://www.vanderplas.com) for PyCon 2014. [Peter Prettenhofer](https://github.com/pprett) adapted it for PyCon Ukraine 2014. Source and license info is on [GitHub](https://github.com/pprett/sklearn_pycon2014/).
Outline:
10:00 - 10:30 Preliminaries: Setup & introduction
10:30 - 11:15 Basic Principles of Machine Learning and the Scikit-learn Interface
11:15 - 12:00 Supervised learning in-depth
12:00 - 12:30 Unsupervised learning in-depth
12:30 - 13:00 Validation and Model Selection
This tutorial requires the following packages:
numpy version 1.5 or later: http://www.numpy.org/
scipy version 0.9 or later: http://www.scipy.org/
matplotlib version 1.0 or later: http://matplotlib.org/
scikit-learn version 0.14 or later: http://scikit-learn.org
ipython version 1.0 or later, with notebook support: http://ipython.org
The easiest way to get these is to use an all-in-one installer such as Anaconda from Continuum. These are available for multiple architectures.
If you're using Anaconda, simply type
conda install scikit-learn
Otherwise it's best to install from source (requires a C compiler):
git clone https://github.com/scikit-learn/scikit-learn.git
cd scikit-learn
python setup.py install
Scikit-learn requires NumPy and SciPy, and examples require Matplotlib.
Note: some examples used in this tutorial require the scripts in the `fig_code` directory, which can be found within the `notebooks` subdirectory of the GitHub repository at https://github.com/pprett/sklearn_pycon2014/
Linux: If you're on Linux, you can use your distribution's package manager (for example, by typing `apt-get install numpy` or `yum install numpy`).
Mac: If you're on OS X, there are similar tools such as MacPorts or Homebrew, which provide pre-compiled versions of these packages.
Windows: Installation on Windows can be challenging; the best bet is probably to use an all-in-one installer such as Anaconda, mentioned above.
In [1]:
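from __future__ import print_function  # makes print() work the same on Python 2 and 3

# Print the installed version of each required package
import numpy
print('numpy:', numpy.__version__)
import scipy
print('scipy:', scipy.__version__)
import matplotlib
print('matplotlib:', matplotlib.__version__)
import sklearn
print('scikit-learn:', sklearn.__version__)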
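The cell above only prints the installed versions. As a rough sketch, the same check can be automated against the minimum versions listed in the requirements. The helper names `version_tuple`, `meets_minimum`, and `check_requirements` are hypothetical and not part of the tutorial, and the version parsing is deliberately simple: it keeps only the leading numeric components and ignores suffixes such as `rc1`.

```python
import importlib
import re

# Minimum versions taken from the requirements list in this notebook.
# Keys are import names (assumed: scikit-learn imports as "sklearn",
# ipython as "IPython").
MINIMUM_VERSIONS = {
    "numpy": "1.5",
    "scipy": "0.9",
    "matplotlib": "1.0",
    "sklearn": "0.14",
    "IPython": "1.0",
}


def version_tuple(version):
    """Turn a string such as '1.21.0rc1' into (1, 21, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component with no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1' and '1.0' compare as equal.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Map each package name to 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        installed = getattr(module, "__version__", "0")
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report
```

Calling `check_requirements()` returns a dictionary keyed by import name, so any entry that is not `'ok'` points to a package that still needs installing or upgrading.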