Think of this class like a journal club. I am here and I'll facilitate it, but I want you to read (see previous expectation) and discuss what you are reading with each other. In the age of the MOOC, it is the classroom community that justifies old-fashioned courses.
This is the part that I am really excited about...
Meeting on campus has benefits; meeting at IHME has benefits (for me). What about for others?
One/Two/Four exercise, or writing and sharing, depending on class size and dynamics
The web suggests several criteria: searching for hypotheses vs testing hypotheses; prediction vs inference; good marketing vs bad marketing; publishing in conferences vs publishing in journals; or simply sitting in a CS department vs sitting in a Stats department.
I think there is a more fundamental difference, which may betray my CMU math department upbringing: a foundation of mathematical logic vs a foundation of real analysis.
In [2]:
import IPython.display
In [3]:
IPython.display.Image("http://upload.wikimedia.org/wikipedia/en/c/c8/Alan_Turing_photo.jpg")
Out[3]:
In [4]:
IPython.display.YouTubeVideo("W7Rq-PEW5qM")
Out[4]:
In [4]:
IPython.display.Image('http://upload.wikimedia.org/wikipedia/en/1/1c/Stravinsky_picasso.png')
Out[4]:
In [12]:
import pandas as pd
df = pd.read_csv('https://github.com/aflaxman/AI4HM/raw/master/data/weather-numeric.csv')
In [13]:
df.head()
Out[13]:
In [14]:
def predict(s):
    if s['outlook'] == 'sunny':
        return 'no'
    else:
        return 'yes'
In [15]:
predict(df.loc[1]) # loc[1] selects the row whose index label is 1
Out[15]:
In [16]:
i = 0
predict(df.loc[i]) == df.play[i]
Out[16]:
In [17]:
for i in df.index:
    # count how many predictions are correct
    pass
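One way the counting loop above might be filled in is sketched here. To keep it self-contained, it uses a tiny hand-made table named `toy` (hypothetical rows, not the course's `weather-numeric.csv`) with the same `outlook` and `play` columns, and repeats the `predict` rule from earlier. The names `toy`, `n_correct`, and `accuracy` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in rows (NOT the course data); only the two columns
# that the rule and the check actually use are included.
toy = pd.DataFrame({
    'outlook': ['sunny', 'sunny', 'overcast', 'rainy', 'rainy'],
    'play':    ['no',    'yes',   'yes',      'yes',   'no'],
})

def predict(s):
    # same one-rule classifier as above: sunny means no play
    if s['outlook'] == 'sunny':
        return 'no'
    else:
        return 'yes'

# count how many predictions are correct
n_correct = 0
for i in toy.index:
    if predict(toy.loc[i]) == toy.play[i]:
        n_correct += 1

# fraction of rows where the rule agrees with the recorded outcome;
# float() keeps the division exact under Python 2 as well
accuracy = n_correct / float(len(toy))
print(n_correct, accuracy)
```

Swapping `toy` for `df` should give the same count for the real table, assuming its `outlook` and `play` columns hold the same kinds of string values.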
Experience indicates that some students may feel they do not yet know enough about the scope of AI/ML to develop a project, let alone an elevator pitch for it. Here is an example of a project that I hope someone does:
This all started with our work on smoking prevalence, the details of which do not fit into the elevator ride. But the key point is we want to know how much of the population is exhibiting this important risk factor. So we ask a representative sample, via telephone survey. And we start the questions off with a screening question, "have you smoked at least 100 cigarettes in your life?". Here is the problem: there are at least 3 common interpretations of this question: 50% think it means A, 25% B, 25% C. This is important to know, but finding out required hard work using qualitative methods, cognitive interviewing, think-aloud exercises, etc. Wouldn't it be cool if when you were developing a survey, you could just ask a computer for a list of possible interpretations of your candidate question? Project: make a computer do this, so that survey designers don't have to do all the hard work of cognitive interviewing. Or at least so that they are pretty sure things are going to work when they do the testing...
In [16]:
!cd /homes/abie/nbconvert/; cp /homes/abie/notebook/2013_03_31_ML4HM_Lecture_1.ipynb L1.ipynb; ./nbconvert.py --format reveal L1.ipynb
In [1]:
import ipynb_style
reload(ipynb_style)
ipynb_style.presentation()
Out[1]: