Setting up Python for machine learning: scikit-learn and IPython Notebook

From the video series: Introduction to machine learning with scikit-learn

Agenda

What are the benefits and drawbacks of scikit-learn?
How do I install scikit-learn?
How do I use the IPython Notebook?
What are some good resources for learning Python?

Benefits and drawbacks of scikit-learn

Benefits:

Consistent interface to machine learning models
Provides many tuning parameters but with sensible defaults
Exceptional documentation
Rich set of functionality for companion tasks
Active community for development and support
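
The "consistent interface" and "sensible defaults" points can be illustrated with a minimal sketch. The iris dataset and the two classifiers below are choices made for illustration only, not something this outline prescribes; any scikit-learn estimator should follow the same fit/predict/score pattern.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Illustrative data: the iris dataset bundled with scikit-learn
X, y = load_iris(return_X_y=True)

# Two unrelated model types, both created with their default parameters
# (random_state is fixed only to make the tree reproducible)
models = [KNeighborsClassifier(), DecisionTreeClassifier(random_state=0)]

results = {}
for model in models:
    model.fit(X, y)                  # same training call for every model
    predictions = model.predict(X)   # same prediction call for every model
    results[type(model).__name__] = model.score(X, y)  # training accuracy

for name, accuracy in results.items():
    print(name, round(accuracy, 3))

# Defaults can be inspected on any estimator before tuning them
default_k = KNeighborsClassifier().get_params()["n_neighbors"]
print("default n_neighbors:", default_k)
```

Scoring on the training data is done here only to keep the sketch short; it is not a sound way to evaluate a model.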

Potential drawbacks:

Harder (than R) to get started with machine learning
Less emphasis (than R) on model interpretability

Installing scikit-learn

Option 1: Install the scikit-learn library and its dependencies (NumPy and SciPy)

Option 2: Install the Anaconda distribution of Python, which includes:

Hundreds of useful packages (including scikit-learn)
IPython and IPython Notebook
conda package manager
Spyder IDE
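
Whichever option is used, a quick check of what is actually installed can help. The following is a minimal sketch using only the standard library (Python 3.8 or later); the helper name installed_version and the package list are hypothetical choices for illustration.

```python
import importlib.util
from importlib import metadata


def installed_version(package, module=None):
    """Return the installed version string, or None if the module is missing.

    `package` is the distribution name (e.g. "scikit-learn") and `module`
    is the import name (e.g. "sklearn") when the two differ.
    """
    module = module or package
    if importlib.util.find_spec(module) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata was found under this name
        return "unknown"


# (distribution name, import name) pairs to check
packages = [("numpy", "numpy"), ("scipy", "scipy"), ("scikit-learn", "sklearn")]

report = {name: installed_version(name, module) for name, module in packages}
for name, version in report.items():
    print(name, "->", version if version is not None else "not installed")
```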

Using the IPython Notebook

Components:

IPython interpreter: enhanced version of the standard Python interpreter
Browser-based notebook interface: weave together code, formatted text, and plots

Installation:

Option 1: Install IPython and the notebook
Option 2: Included with the Anaconda distribution

Launching the Notebook:

Type ipython notebook at the command line to open the dashboard (newer installations use jupyter notebook instead)
Don't close the command line window while the Notebook is running

Keyboard shortcuts:

Command mode (gray border)

Create new cells above (a) or below (b) the current cell
Navigate between cells using the up and down arrow keys
Convert the cell type to Markdown (m) or code (y)
See keyboard shortcuts using h
Switch to Edit mode using Enter

Edit mode (green border)

Ctrl+Enter to run a cell
Switch to Command mode using Esc

IPython and Markdown resources:

nbviewer: view notebooks online as static documents
IPython documentation: focuses on the interpreter
IPython Notebook tutorials: in-depth introduction
GitHub's Mastering Markdown: short guide with lots of examples

Resources for learning Python

Codecademy's Python course: browser-based, tons of exercises
DataQuest: browser-based, teaches Python in the context of data science
Google's Python class: slightly more advanced, includes videos and downloadable exercises (with solutions)
Python for Informatics: beginner-oriented book, includes slides and videos

Comments or Questions?



In [1]:

    
from IPython.display import HTML

def css_styling():
    # Load the notebook's custom stylesheet and wrap it for rich display
    with open("styles/custom.css", "r") as f:
        styles = f.read()
    return HTML(styles)

css_styling()








