Movile Lightning Talk

Machine Learning Engineering using Scikit-Learn

Overview

[Machine Learning in Data Science]
[Data Science Toolkit]
[Main Concepts]
[Types of learning]
[Machine Learning Workflow]
[Introduction to Scikit-Learn]
[Let’s go to iPython Notebook]
- Linear Regression (Supervised Learning)
- Decision Trees (Supervised Learning)
- GLM (Supervised Learning)
- K-Means (Unsupervised Learning)
- Random Forests (Ensemble Learning)
[Mindflow to apply Machine Learning algorithms]
[Contact Me]

[back to top]

Machine Learning in Data Science

"Data Science Venn Diagram". Licensed under CC BY-SA 3.0 via Wikimedia Commons

[back to top]

Main Concepts

Attributes (Predictors, Variables): Columns of a dataset (thinking of a key-value dataset)

Instances (Tuples, Records, Lines): Rows of a dataset

Class (Target): The column that holds the final value of each instance

Method (Technique, Algorithm, A.K.A. Algo): A function or algorithm designed to learn from some features of a dataset.

Model: A representation of the set of parameters learned by a method.

Data (Duh!)
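
To make these terms concrete, here is a minimal sketch that maps each one onto scikit-learn objects. The choice of the bundled iris dataset and of LogisticRegression is an assumption made for illustration only; any dataset and estimator would do.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Data: the iris dataset that ships with scikit-learn (illustrative choice).
iris = load_iris()

X = iris.data    # instances are rows, attributes are columns
y = iris.target  # class (target): one label per instance

print("instances:", X.shape[0])
print("attributes:", iris.feature_names)
print("classes:", list(iris.target_names))

# Method: the algorithm, not yet trained.
method = LogisticRegression(max_iter=1000)

# Model: the parameters the method learned from the data.
model = method.fit(X, y)
print("learned coefficients shape:", model.coef_.shape)
```

Here `fit` returns the estimator itself, so `model` and `method` are the same object; the distinction is only whether the parameters have been learned yet.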

[back to top]

Types of learning

Supervised Learning

Unsupervised Learning

Reinforcement Learning

[back to top]

Supervised Learning

A type of learning where a function is learned from labeled data: input examples (X) paired with known target outputs (y). The final output of the learned function is a predicted target (y) for new inputs.

Training: Examples X_train together with labels y_train. Testing: Given X_test, predict y_test.

Common tasks:

Classification
Regression
Ranking

A.K.A. the House of Predictive Analytics!!!
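
The training/testing split described above can be sketched as follows. This is only an illustration, assuming the iris dataset and a DecisionTreeClassifier; the 30% test size and the random seeds are arbitrary choices, not values from the talk.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out part of the labeled data to play the role of unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Training: learn from examples X_train together with labels y_train.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Testing: given X_test, predict the labels and compare them with y_test.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("test accuracy:", round(accuracy, 3))
```

Swapping the classifier for a regressor (and accuracy for a regression metric) gives the same workflow for regression tasks.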

[back to top]

Unsupervised Learning

In this type of learning there is no labeled data. The learning comes from the overall structure of the data itself, through some statistical or algorithmic approximation, instead of from known target values.

Examples X. Learn something about X.

Common tasks:

Dimensionality reduction
Clustering

A.K.A. learning from the data itself

A.K.A. no structured information a priori
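
As a rough sketch of the two common tasks, the snippet below runs clustering and dimensionality reduction on the same unlabeled X. The iris measurements, the choice of 3 clusters and the 2 components are assumptions made for the example; note that the labels y are never used.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Only the examples X are used; the labels are deliberately ignored.
X, _ = load_iris(return_X_y=True)

# Clustering: group instances by similarity (3 groups is an assumption).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Dimensionality reduction: project 4 attributes down to 2 components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print("cluster of first 5 instances:", cluster_ids[:5])
print("reduced shape:", X_2d.shape)
print("variance explained:", pca.explained_variance_ratio_.round(3))
```

The cluster ids are arbitrary integers: they say which instances belong together, not what the groups mean.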

[back to top]

Reinforcement Learning

Reinforcement learning is a term borrowed from robotics, where an agent makes decisions based on its environment (state), and for each action the agent receives a penalty (for bad actions) or a reward (for good actions). The main objective is to maximize the cumulative reward.

It uses a structure of rewards and penalties, with a model that keeps a memory of what it has learned.
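
Scikit-Learn does not cover reinforcement learning, so the sketch below uses plain Python. It is a toy illustration, not a method from the talk: the 5-cell corridor, the reward values, and the hyperparameters are all hypothetical, and tabular Q-learning is just one possible algorithm. The agent starts in the middle cell; the leftmost cell gives a penalty, the rightmost gives a reward, and the Q-table plays the role of the memory.

```python
import random


def train_corridor_agent(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical 5-cell corridor."""
    rng = random.Random(seed)
    n_states = 5
    # Memory: one value per (state, action); actions are 0 = left, 1 = right.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 2  # always start in the middle cell
        for _ in range(100):  # cap the episode length
            # Explore at random with probability epsilon (or on ties),
            # otherwise take the action with the highest remembered value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = state + (1 if action == 1 else -1)
            if next_state == 0:
                reward, done = -1.0, True   # penalty: fell off the left end
            elif next_state == n_states - 1:
                reward, done = 1.0, True    # reward: reached the right end
            else:
                reward, done = 0.0, False
            # Move the estimate toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train_corridor_agent()
# Greedy action for the three non-terminal cells (1 means "go right").
greedy_policy = [row.index(max(row)) for row in q_table[1:4]]
print(greedy_policy)
```

The discount factor `gamma` is what turns single rewards into a cumulative objective: an action is valued by the reward it gives now plus the discounted value of what can follow.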