CSAL4243: Introduction to Machine Learning

Muhammad Mudassir Khan (mudasssir.khan@ucp.edu.pk)

Lecture 1: Introduction

Overview

What is Machine Learning?
The three different types of machine learning
An introduction to the basic terminology and notations
A roadmap for building machine learning systems
Would you have survived Titanic?
Applications
Competitions
Summary
Resources
Credits

What is Machine Learning?

AI Dreams: Need for Intelligent Machines

Human interest in intelligent machines
Replicate human functionalities
Traditional algorithms were hard coded

Problems with hard-coded algorithms



In [1]:

    
from IPython.display import Image
Image(filename='./images/challenges.jpeg', width=500)









    Out[1]:

Learning algorithms

How do humans learn?
- From observation/examples
- Trial and error
- Practice for improving performance



In [2]:

    
from IPython.display import YouTubeVideo
YouTubeVideo("TeFF9wXiFfs")









    Out[2]:



In [8]:

    
YouTubeVideo("M5pj2CrO-2w")









    Out[8]:

Machine learning means making computers (machines) learn from data.
Learning improves over time as more data becomes available.

Definition

Mitchell (1997) defines machine learning as follows: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

Example: playing checkers.

T = the task of playing checkers.

E = the experience of playing many games of checkers.

P = the probability that the program will win the next game.
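
The roles of T, E and P can be sketched with a toy learner. Everything below is a hypothetical illustration, not the checkers program: the task T is deciding whether an integer is "large", the experience E is a hand-picked list of labelled examples, and the performance measure P is accuracy on a fixed test set. On this particular data P rises as E grows; real learners do not always improve this smoothly.

```python
# Toy illustration of Mitchell's definition (all data here is hypothetical).
# T: decide whether an integer in 0..99 is "large" (true rule: x >= 50).
# E: labelled examples (x, label) shown to the learner a few at a time.
# P: accuracy on a fixed test set.

def learn_threshold(examples):
    """Place the decision threshold midway between the two classes seen so far."""
    largest_negative = max(x for x, label in examples if label == 0)
    smallest_positive = min(x for x, label in examples if label == 1)
    return (largest_negative + smallest_positive) / 2

def accuracy(threshold, test_set):
    """Fraction of test examples for which 'x >= threshold' matches the label."""
    correct = sum((x >= threshold) == (label == 1) for x, label in test_set)
    return correct / len(test_set)

test_set = [(x, int(x >= 50)) for x in range(100)]
experience = [(10, 0), (70, 1), (30, 0), (64, 1),
              (44, 0), (58, 1), (49, 0), (51, 1)]

accuracies = []
for n in (2, 4, 6, 8):  # the learner sees the first n examples
    threshold = learn_threshold(experience[:n])
    accuracies.append(accuracy(threshold, test_set))

print(accuracies)  # [0.9, 0.97, 0.99, 1.0]
```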

Examples

Email -----> spam / not spam?
Image -----> human?
Audio -----> text
English -----> Urdu
Text -----> audio
Ad, user -----> click / no click?
Self-driving cars

The three different types of machine learning



In [3]:

    
Image(filename='./images/01_01.png', width=500)









    Out[3]:

Learning from labeled data with supervised learning



In [5]:

    
Image(filename='./images/01_02.png', width=500)









    Out[5]:

Examples:

Spam or non-spam email
Cat or dog from a picture
Disease or no disease from test results
Positive or negative customer review
House price predicted from size
Stock prices

Regression for predicting continuous outcomes



In [3]:

    
Image(filename='./images/01_04.png', width=300)









    Out[3]:



In [7]:

    
Image(filename='./images/01_11.png', width=500)









    Out[7]:
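
As a minimal sketch of regression, the snippet below fits a straight line by ordinary least squares and uses it to predict a continuous value. The house sizes and prices are made-up numbers, and `fit_line` is a hypothetical helper written for this illustration rather than a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: house size and price, both in arbitrary units.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 6.0 + intercept  # predicted price for an unseen size of 6.0
print(round(slope, 2), round(intercept, 2), round(predicted, 2))  # 1.97 0.09 11.91
```

The output is a real number rather than a category, which is what separates regression from classification.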

Classification for predicting class labels



In [5]:

    
Image(filename='./images/01_03.png', width=300)









    Out[5]:



In [7]:

    
Image(filename='./images/01_12.png', width=600)









    Out[7]:
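
A minimal sketch of classification, assuming a 1-nearest-neighbour rule (one of many possible classifiers, chosen here only because it is short): a new point receives the class label of the closest training point. The training points and labels are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query`."""
    distances = [squared_distance(p, query) for p in train_points]
    return train_labels[distances.index(min(distances))]

# Hypothetical training set: two features per sample, two classes.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
                (5.0, 5.0), (5.5, 4.5), (6.0, 5.5)]
train_labels = ["negative", "negative", "negative",
                "positive", "positive", "positive"]

first = nearest_neighbor_predict(train_points, train_labels, (1.2, 1.4))
second = nearest_neighbor_predict(train_points, train_labels, (5.2, 5.1))
print(first, second)  # negative positive
```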

Discovering hidden structures with unsupervised learning

Finding patterns in unlabeled data

Finding subgroups with clustering



In [8]:

    
Image(filename='./images/01_06.png', width=300)









    Out[8]:
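
Clustering can be sketched with k-means, used here as one common example of a clustering algorithm. The version below works on one-dimensional data and, as a simplifying assumption, takes the first k points as the initial centroids. No labels are supplied; the groups emerge from the data alone. The data values are hypothetical.

```python
def k_means_1d(points, k, max_iter=100):
    """Plain k-means on 1-D data; the first k points are the initial centroids."""
    centroids = list(points[:k])
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: abs(p - centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = sum(members) / len(members)
    return labels, centroids

points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]  # unlabeled data
labels, centroids = k_means_1d(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```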

Solving interactive problems with reinforcement learning



In [7]:

    
Image(filename='./images/01_05.png', width=300)









    Out[7]:
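
The interaction loop of reinforcement learning can be sketched with tabular Q-learning on a tiny corridor world. The environment, the reward, and the values of `ALPHA` and `GAMMA` are all assumptions made for this illustration. The agent starts in cell 0, acts at random while learning, and receives a reward only when it reaches the last cell.

```python
import random

N_STATES = 5                 # corridor cells 0..4; cell 4 is the goal (terminal)
ACTIONS = ("left", "right")
ALPHA, GAMMA = 0.5, 0.9      # learning rate and discount factor (illustrative)

def step(state, action):
    """Deterministic environment: move one cell; reward 1 only on reaching the goal."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

rng = random.Random(0)
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(200):                      # episodes
    state = 0
    while state != N_STATES - 1:
        action = rng.choice(ACTIONS)      # explore with purely random actions
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Q-learning update: move the estimate toward reward + discounted future value.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# Greedy policy: in every non-terminal cell, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # ['right', 'right', 'right', 'right']
```

Unlike supervised learning, no example ever tells the agent the correct action; it only observes rewards that result from its own actions.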

An introduction to the basic terminology and notations



In [8]:

    
Image(filename='./images/01_08.png', width=450)









    Out[8]:

A roadmap for building machine learning systems



In [17]:

    
Image(filename='./images/01_09.png', width=500)









    Out[17]:
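
A typical workflow can be sketched in a few lines: hold out part of the data, fit a model on the rest, then evaluate on the held-out part. This is an assumed, simplified workflow rather than a transcription of the figure; the data, the helper names, and the threshold "model" are hypothetical.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train):
    """'Model': a threshold halfway between the mean of each class."""
    negatives = [x for x, y in train if y == 0]
    positives = [x for x, y in train if y == 1]
    return (sum(negatives) / len(negatives) + sum(positives) / len(positives)) / 2

def accuracy(threshold, rows):
    """Fraction of rows whose label matches the rule 'x >= threshold'."""
    return sum((x >= threshold) == (y == 1) for x, y in rows) / len(rows)

# 1. Collect (hypothetical) labelled data: one feature x and a 0/1 label y.
data = ([(float(x), 0) for x in range(0, 20)]
        + [(float(x), 1) for x in range(30, 50)])
# 2. Hold out a test set that the model never sees during training.
train, test = train_test_split(data)
# 3. Learn from the training set only.
threshold = fit_threshold(train)
# 4. Evaluate on the unseen test set.
test_accuracy = accuracy(threshold, test)
print(len(train), len(test), test_accuracy)  # 30 10 1.0
```

The two classes here are well separated by construction, so the perfect test accuracy says nothing about real data.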

Would you have survived Titanic?



In [4]:

    
from __future__ import print_function

import numpy as np
import tflearn

# Download the Titanic dataset
from tflearn.datasets import titanic
titanic.download_dataset('datasets/titanic_dataset.csv')

# Load CSV file, indicate that the first column represents labels
from tflearn.data_utils import load_csv
data, labels = load_csv('datasets/titanic_dataset.csv', target_column=0,
                        categorical_labels=True, n_classes=2)


# Preprocessing function
def preprocess(data, columns_to_ignore):
    # Delete the ignored columns, highest index first, so that
    # earlier deletions do not shift the remaining indices
    for col in sorted(columns_to_ignore, reverse=True):
        for row in data:
            row.pop(col)
    for row in data:
        # Convert the 'sex' field to float (index 1 once the 'name' column is removed)
        row[1] = 1. if row[1] == 'female' else 0.
    return np.array(data, dtype=np.float32)

# Ignore 'name' and 'ticket' columns (id 1 & 6 of data array)
to_ignore=[1, 6]

# Preprocess data
data = preprocess(data, to_ignore)

print(data)









    



Downloading Titanic dataset...
Succesfully downloaded titanic_dataset.csv 82865 bytes.
[[   1.            1.           29.            0.            0.
   211.3374939 ]
 [   1.            0.            0.91670001    1.            2.
   151.55000305]
 [   1.            1.            2.            1.            2.
   151.55000305]
 ..., 
 [   3.            0.           26.5           0.            0.
     7.2249999 ]
 [   3.            0.           27.            0.            0.
     7.2249999 ]
 [   3.            0.           29.            0.            0.
     7.875     ]]



In [2]:

    
# Build neural network
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

# Define model
model = tflearn.DNN(net)
# Start training (the log below shows the Adam optimizer, a gradient descent variant)
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)









    



Training Step: 819  | total loss: 0.50917 | time: 0.347s
| Adam | epoch: 010 | loss: 0.50917 - acc: 0.7704 -- iter: 1296/1309
Training Step: 820  | total loss: 0.49590 | time: 0.349s
| Adam | epoch: 010 | loss: 0.49590 - acc: 0.7746 -- iter: 1309/1309
--



In [3]:

    
# Let's create some data for DiCaprio and Winslet
dicaprio = [3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0000]
winslet = [1, 'Rose DeWitt Bukater', 'female', 17, 1, 2, 'N/A', 100.0000]
user = [2, 'user', 'female', 20, 0, 2, 'N/A', 50.0000]
# Preprocess data
dicaprio, winslet, user = preprocess([dicaprio, winslet, user], to_ignore)
# Predict surviving chances (class 1 results)
pred = model.predict([dicaprio, winslet, user])
print("DiCaprio Surviving Rate:", pred[0][1])
print("Winslet Surviving Rate:", pred[1][1])
print("user Surviving Rate:", pred[2][1])









    



DiCaprio Surviving Rate: 0.09548981487751007
Winslet Surviving Rate: 0.8807896375656128
user Surviving Rate: 0.7699385285377502
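
The rates printed above are entry 1 of the network's two-element softmax output for each passenger. As a minimal sketch, independent of tflearn, softmax turns raw scores into probabilities as follows; the input scores are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that are positive and sum to 1."""
    largest = max(scores)
    # Subtracting the largest score avoids overflow and does not change the result.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the classes [did not survive, survived].
probabilities = softmax([0.0, 2.0])
print(probabilities[1])  # about 0.88
```

Equal scores give equal probabilities, so `softmax([0.0, 0.0])` returns `[0.5, 0.5]`.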

Applications

## Algorithm that plays Atari Breakout



In [9]:

    
from IPython.display import YouTubeVideo
YouTubeVideo("V1eYniJ0Rnk")









    Out[9]:

## Machine learning algorithms outperform humans on some object recognition benchmarks



In [10]:

    
from datetime import timedelta
start=int(timedelta(hours=0, minutes=7, seconds=47).total_seconds())
YouTubeVideo("BfDQNrVphLQ", start=start, autoplay=1, theme="light", color="red")









    Out[10]:

## Scene understanding



In [15]:

    
Image(filename='./images/01_10.png', width=500)









    Out[15]:

## AlphaGo beats a human Go expert



In [11]:

    
YouTubeVideo("PQCrX1sQSzY")









    Out[11]:

## Self-driving cars



In [12]:

    
YouTubeVideo("MqUbdd7ae54")









    Out[12]:

Medical diagnostics
Music Generation
Art Generation (Mario levels)
Story Writing
Speech recognition (personal assistants, chat bots)
Face recognition

Competitions

Lung cancer detection --- prize $1,000,000

YouTube-8M video tagging --- prize $100,000

Detect and classify fish --- prize $150,000

ImageNet (Large Scale Visual Recognition) --- Fame

Efficient ConvNets for Semantic Segmentation --- prize undeclared

Titanic survival prediction --- Fame and ranking

Summary

Machine learning is at the heart of many of today's technologies.



In [13]:

    
start=int(timedelta(hours=0, minutes=1, seconds=23).total_seconds())
YouTubeVideo("jBLN1UJbiyw", start=start, autoplay=1, theme="light", color="red")









    Out[13]:

Resources

Course website: https://w4zir.github.io/ml17s/

Course resources

Credits

Raschka, Sebastian. Python Machine Learning. Birmingham, UK: Packt Publishing, 2015. Print.