Think Like a Machine

Course Outline

Course Materials

Session 1

  • Machine learning and AI in the press -- how to separate the hype from the reality?
  • Examples of companies using machine learning to great benefit -- we've all heard of the recommender systems at Netflix and Amazon, but what else? Look at InformationWeek's Elite 100 companies for 2016, such as Capital One, and at the examples from Andreessen Horowitz (http://www.informationweek.com/elite100-2016.asp, http://aiplaybook.a16z.com/).
  • The economics of machine learning (a16z and HBS article)
  • The basic tasks of machine learning (Provost and Fawcett pp. 20-23)
  • EXERCISE: What should the data look like for each of these tasks?

Session 2

  • What makes machine learning different from other approaches to learning from experience?
    • Bottom up not top down (not rules but experience)
    • Lots of data but not necessarily big data
    • Data is just a table of numbers
    • Experience, tasks, models, cost functions, and optimization algorithms
    • Machine learning tasks are giant numerical optimization problems
    • Machine learning system = Model + Cost Function + Cost Optimization Algorithm
    • Example of how the results change when the knobs are moved
    • Key operational questions for machine learning -- the switches and knobs on your control panel.
    • The promise and limits of machine learning.
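
The decomposition "Machine learning system = Model + Cost Function + Cost Optimization Algorithm" can be sketched in a few lines of plain Python. This is only an illustrative toy, not the course's own code: the data, the one-parameter model, and the function names (`model`, `cost`, `cost_gradient`, `optimize`) are all assumptions made for the example. The learning rate plays the role of one "knob" on the control panel.

```python
# Hypothetical toy data: y is exactly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def model(theta, x):
    """Model: a line through the origin with slope theta."""
    return theta * x

def cost(theta):
    """Cost function: half the mean squared error over the data."""
    m = len(xs)
    return sum((model(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def cost_gradient(theta):
    """Derivative of the cost with respect to theta."""
    m = len(xs)
    return sum((model(theta, x) - y) * x for x, y in zip(xs, ys)) / m

def optimize(learning_rate, steps, theta=0.0):
    """Optimization algorithm: plain gradient descent on the cost."""
    for _ in range(steps):
        theta -= learning_rate * cost_gradient(theta)
    return theta

# Same model, same cost, same number of steps; only the knob changes.
slow = optimize(learning_rate=0.01, steps=20)
fast = optimize(learning_rate=0.1, steps=20)
print(f"learning_rate=0.01 -> theta={slow:.3f}, cost={cost(slow):.4f}")
print(f"learning_rate=0.1  -> theta={fast:.3f}, cost={cost(fast):.4f}")
```

With the larger learning rate the estimate reaches the true slope of 2 within the 20 steps, while the smaller one is still short of it. The three components are the same in both runs; only the setting of one knob differs.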

Session 3

  • Linear regression -- motivating examples
    • Wine data
    • Housing prices
    • ...
  • Visualize the data (whenever possible)
  • Define the inputs
  • Define the outputs
  • Define the model
    • How the model translates into a matrix
    • All you need to know about matrix algebra -- the multiplication rule
    • Matrix notation helps you parse expressions such as $J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (h_{\theta}(x^{(i)}) - y^{(i)})^2$
  • Define the parameters of the model
  • Define the cost of getting it wrong
  • Pick a method for minimizing the cost of getting it wrong and implement it
  • Check to see if your results make sense
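
The steps in this session (inputs, outputs, model, parameters, cost, minimization, sanity check) might look as follows in plain Python. This is a minimal sketch under stated assumptions: the data is made up (generated from y = 1 + 2x), the names `predict`, `cost`, and `gradient_descent` are hypothetical, and batch gradient descent is just one possible choice of minimization method. The `predict` function is the matrix multiplication rule written out by hand: each row of X times the parameter vector gives one prediction.

```python
def predict(X, theta):
    """Model h_theta(x): matrix-vector product of inputs X and parameters theta."""
    return [sum(x_j * t_j for x_j, t_j in zip(row, theta)) for row in X]

def cost(X, y, theta):
    """J(theta) = 1/(2m) * sum over i of (h_theta(x_i) - y_i)^2."""
    m = len(y)
    return sum((h - yi) ** 2 for h, yi in zip(predict(X, theta), y)) / (2 * m)

def gradient_descent(X, y, alpha=0.1, steps=2000):
    """Minimize J(theta) by repeatedly stepping against its gradient."""
    m, n = len(y), len(X[0])
    theta = [0.0] * n
    for _ in range(steps):
        errors = [h - yi for h, yi in zip(predict(X, theta), y)]
        grad = [sum(e * row[j] for e, row in zip(errors, X)) / m for j in range(n)]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

# Inputs: a leading column of ones (intercept) plus one feature (assumed data).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
# Outputs: generated from y = 1 + 2x, so the right answer is known in advance.
y = [1.0, 3.0, 5.0, 7.0]

theta = gradient_descent(X, y)

# Check that the results make sense: cost should fall and theta should be near [1, 2].
print("cost at theta = [0, 0]:", cost(X, y, [0.0, 0.0]))
print("fitted theta:", [round(t, 4) for t in theta])
print("cost at fitted theta:", cost(X, y, theta))
```

Because the data was generated from a known line, the final check is easy: the fitted parameters should land close to an intercept of 1 and a slope of 2.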

Session 4

  • Multi-variable and non-linear regression

Session 5

  • Logistic regression

Session 6

  • Neural networks for classification
  • Support Vector Machines for classification

Session 7

  • Evaluating and fine-tuning machine learning models
    • Bias and variance
    • Precision and Recall
    • Learning curves
    • Error analysis
    • Ceiling analysis
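
As a small illustration of precision and recall, the sketch below counts true positives, false positives, and false negatives for a binary classifier. The labels and the helper name `precision_recall` are made up for this example; returning 0.0 when a denominator is zero is a convention chosen here, not a universal rule.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical ground truth and predictions for eight examples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 0, 0, 1]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```

Here the classifier flags three examples as positive and two of them are right (precision 2/3), but it finds only two of the four actual positives (recall 1/2).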

Session 8

  • Measuring similarity using K-means

Session 9

  • Principal Component Analysis and dimensionality reduction

Session 10

  • Anomaly detection

Session 11

  • Recommender systems

Session 12

  • Representing and learning from text

Session 13

  • Techniques for handling big data
    • Stochastic gradient descent
    • Mini-batch gradient descent
    • Splitting up data for parallel processing
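
Stochastic and mini-batch gradient descent differ mainly in how many examples feed each parameter update. The sketch below is a toy under stated assumptions: the data is invented (y = 3x with no noise), the model has a single parameter, and `minibatch_gradient_descent` is a hypothetical name. Setting `batch_size=1` gives stochastic gradient descent; setting it to the dataset size gives ordinary batch gradient descent.

```python
import random

def minibatch_gradient_descent(xs, ys, batch_size, alpha=0.05, epochs=200, seed=0):
    """Fit y = theta * x, updating theta once per mini-batch of examples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    theta = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the squared-error cost, averaged over this batch only.
            grad = sum((theta * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            theta -= alpha * grad
    return theta

# Hypothetical noiseless data with true slope 3.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [3.0 * x for x in xs]

sgd_theta = minibatch_gradient_descent(xs, ys, batch_size=1)         # stochastic
mini_theta = minibatch_gradient_descent(xs, ys, batch_size=2)        # mini-batch
full_theta = minibatch_gradient_descent(xs, ys, batch_size=len(xs))  # full batch
print(sgd_theta, mini_theta, full_theta)
```

On data this small all three variants agree. The practical difference shows up on large datasets, where each stochastic or mini-batch update touches only a fraction of the data.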
