Currently, this is just a place to store notes in prep for upcoming interviews with a few different firms doing various AI applications.




Topics I have covered in my online courses so far:

Big Data/Distributed Computing:

  • Spark (MLlib, Spark SQL, RDDs, DataFrames)
  • Using AWS Elastic MapReduce
  • Using Hadoop YARN

Deep Learning:

  • intro to TensorFlow
  • review of background and multiple types of Neural Networks

ML/DS topics:

  • Regression analysis
  • K-Means Clustering
  • Principal Component Analysis
  • Train/Test and cross validation
  • Bayesian Methods
  • Decision Trees and Random Forests
  • Multivariate Regression
  • Multi-Level Models
  • Support Vector Machines (SVM)
  • Reinforcement Learning
  • Collaborative Filtering
  • K-Nearest Neighbor
  • Bias/Variance Tradeoff
  • Ensemble Learning
  • Term Frequency / Inverse Document Frequency
  • Experimental Design and A/B Tests

Courses taken so far:

Udemy:

  • Data science and machine learning with Python - hands on!
  • Taming big data with Apache Spark and Python - hands on!
  • Python for Data Science and Machine Learning bootcamp (in progress)
  • Python for data structures, algorithms, and interviews!
  • Mastering Python

Others:

  • Machine Learning - Andrew Ng (Coursera/Stanford) (in progress)
  • Data Science, by Bill Howe (Coursera/U-Washington) (in progress)



Neural Networks

Neural networks are loosely modeled after biological neural networks and attempt to allow computers to learn in a manner similar to humans.

Neural networks attempt to solve problems that would normally be easy for humans, but hard for computers.

Use cases:

  • Pattern Recognition
  • Time series predictions
  • Signal processing
  • Anomaly detection
  • Control

Convolutional Neural Networks (CNN)

Inspired by the multilayered perceptron (MLP) and the animal visual cortex. "Designed to use minimal amounts of preprocessing. They have wide applications in image and video recognition, recommender systems and natural language processing" wiki

Multilayered Perceptron (MLP) "a type of feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron (or processing element) with a nonlinear activation function. MLP utilizes a supervised learning technique called backpropagation for training the network. MLP is a modification of the standard linear perceptron and can distinguish data that are not linearly separable." from Wikipedia

"A feedforward neural network is an artificial neural network wherein connections between the units do not form a cycle. As such, it is different from recurrent neural networks." wiki

"Backpropagation, an abbreviation for "backward propagation of errors", is a common method of training artificial neural networks used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of a loss function with respect to all the weights in the network, so that the gradient is fed to the optimization method which in turn uses it to update the weights, in an attempt to minimize the loss function.

Backpropagation requires a known, desired output for each input value in order to calculate the loss function gradient. It is therefore usually considered to be a supervised learning method, although it is also used in some unsupervised networks such as autoencoders. It is a generalization of the delta rule to multi-layered feedforward networks, made possible by using the chain rule to iteratively compute gradients for each layer. Backpropagation requires that the activation function used by the artificial neurons (or "nodes") be differentiable." wiki
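The backward-propagation-of-errors idea above can be sketched in plain NumPy on a tiny two-layer network. The XOR task, layer sizes, and learning rate here are arbitrary illustrative choices, not anything prescribed by the quoted text:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR inputs and targets (a classic not-linearly-separable problem)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    # differentiable activation, as backpropagation requires
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(size=(2, 4))  # input -> hidden weights
W2 = rng.normal(size=(4, 1))  # hidden -> output weights
lr = 1.0

def loss():
    out = sigmoid(sigmoid(X @ W1) @ W2)
    return float(((out - y) ** 2).mean())

loss_before = loss()
for _ in range(5000):
    # forward pass
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)
    # backward pass: chain rule through the sigmoid of each layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient descent step on each weight matrix
    W2 -= lr * h.T @ d_out
    W1 -= lr * X.T @ d_h
loss_after = loss()
```

Frameworks like TensorFlow compute these gradients automatically, but the chain-rule structure is the same.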

Radial Basis Function

"In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including function approximation, time series prediction, classification, and system control." wiki

In Python you could use neupy or scipy.interpolate.Rbf.
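A minimal RBF-network sketch in plain NumPy, fitting sin(x): the output is a linear combination of Gaussian basis functions, so the weights fall out of a linear solve. The kernel width, number of centers, and target function are all illustrative choices; scipy.interpolate.Rbf wraps this kind of machinery:

```python
import numpy as np

# training data: samples of sin(x); one basis center per training point
x = np.linspace(0.0, 2 * np.pi, 12)
y = np.sin(x)
centers = x
gamma = 2.0  # controls the width of each Gaussian basis function

def design_matrix(pts):
    # Phi[i, j] = exp(-gamma * (pts[i] - centers[j])^2)
    return np.exp(-gamma * (pts[:, None] - centers[None, :]) ** 2)

# network output = linear combination of the basis functions,
# so the output weights come from solving one linear system
weights = np.linalg.solve(design_matrix(x), y)

def rbf_net(pts):
    return design_matrix(pts) @ weights

fit_error = np.max(np.abs(rbf_net(x) - y))
```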

Random Forests

Can use 'from sklearn import tree', and pandas to organize the data going into the trees. Graphviz can be used to visualize the resulting trees.

  • Decision trees are very susceptible to overfitting
    • Construct many trees in a 'forest' and have them all 'vote' on the outcome classification
      • MUST randomly sample the data used to build each tree!
      • Also, randomize the attributes each tree considers at each split.
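A hedged sketch of the above with scikit-learn, using a synthetic dataset as a stand-in for real data (all hyperparameter values are illustrative): RandomForestClassifier does both randomizations for you, bootstrap-sampling the rows for each tree and trying a random subset of features at each split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# synthetic classification data stands in for a real dataset
X, y = make_classification(n_samples=300, n_features=8, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

forest = RandomForestClassifier(
    n_estimators=100,      # number of trees that 'vote'
    bootstrap=True,        # randomly sample the rows for each tree
    max_features="sqrt",   # randomize the attributes tried at each split
    random_state=42,
)
forest.fit(X_tr, y_tr)
accuracy = forest.score(X_te, y_te)
```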

Bayesian Methods: Concepts

Bayes' Theorem (not covering it here as it is in my PhD and MSc theses)

One good real-world application is a spam filter. Naive Bayes can be used to build a model that discriminates normal email (ham) from garbage (spam). There are lots of ways to improve it, but it works fairly well even in a basic form.

Spam Classifier/Filter with Naive Bayes

Supervised learning.

Steps

  • Read in emails and their ham/spam classification (the bulk of the code)
  • Vectorize emails into numbers representing each word
  • Get a Multinomial Naive Bayes object from sklearn
  • Fit vectorized emails
  • Check it worked with test cases
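The steps above can be sketched with scikit-learn; the toy emails, labels, and test cases are made up for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# toy corpus with labels: 1 = spam, 0 = ham
emails = [
    "win a free prize now", "free money click now",
    "limited offer win cash", "meeting at noon tomorrow",
    "project status report attached", "lunch with the team today",
]
labels = [1, 1, 1, 0, 0, 0]

vec = CountVectorizer()        # vectorize emails into word counts
X = vec.fit_transform(emails)

clf = MultinomialNB()
clf.fit(X, labels)             # fit the vectorized emails

# check it worked with test cases
pred = clf.predict(vec.transform(["free cash prize", "see you at the meeting"]))
```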

Visualizing data

Visualization of the data can enable one to discover much more than summary statistics like the mean, median, or mode; Anscombe's Quartet is the classic illustration of this.

  • An interesting option is seaborn, a library built on top of matplotlib
  • An entire article discusses the 7 major available Python-based visualization tools



Common models and tools in ML/AI:



Neural Network Models:


Types:

Feedforward (acyclic graphs)

  • Autoencoders
  • Denoising autoencoders
  • Restricted Boltzmann machines (stacked, they form deep-belief networks)

Convolutional

Deep convolutional networks are SOTA for images. There are many well-known architectures, including AlexNet and VGGNet.

Convolutional networks usually involve a combination of convolutional layers along with subsampling (pooling) and fully connected feedforward layers.

Recurrent

These handle time series data especially well. They can be combined with convolutional networks to generate captions for images.

Recursive

These handle natural language especially well.


  • MLP Multi-layer perceptron
  • CNN Convolutional Neural Network
  • RNN Recurrent Neural Network
  • RvNN Recursive Neural Network
  • LSTM Long Short Term Memory
  • FRN Fully recurrent network
  • HN Hopfield network
  • EN Elman network
  • JN Jordan network
  • ESN Echo state network
  • BRNN Bi-directional RNN

Stochastic gradient descent

These optimizers minimize the cost function while training the model, to give the highest accuracy.

  • Momentum SGD
  • AdaGrad
  • RMSprop
  • AdaDelta
  • Adam
  • Nesterov's Accelerated Gradient descent
  • Graves' RMSProp
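A minimal sketch of SGD with momentum, minimizing a one-dimensional quadratic; the learning rate, momentum coefficient, and step count are arbitrary illustrative choices:

```python
# minimize f(w) = (w - 3)^2; its gradient is 2 * (w - 3)
w = 0.0
v = 0.0      # velocity accumulated by the momentum term
lr = 0.1     # learning rate
beta = 0.9   # momentum coefficient

for _ in range(200):
    grad = 2.0 * (w - 3.0)
    v = beta * v - lr * grad   # momentum: blend past steps into the new one
    w = w + v                  # w converges toward the minimum at 3.0
```

Nesterov's variant evaluates the gradient at the look-ahead point w + beta * v instead; AdaGrad, RMSprop, AdaDelta, and Adam additionally adapt a per-parameter step size.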

Activation function

Outputs of perceptrons/neurons/nodes are generated by passing the weighted sum of inputs through an 'activation function'.

  • ReLU (simple rectifier; returns max(x, 0))
  • Sigmoid
  • Softmax
  • Maxout
  • Tanh
  • Identity
  • Leaky ReLU
  • Clipped ReLU
  • Exponential Linear Unit
  • Log Softmax
  • Softplus
  • Parametric ReLU
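A few of the activations above, sketched in plain NumPy (the alpha default for Leaky ReLU is an illustrative choice):

```python
import numpy as np

def relu(x):
    # simple rectifier: max(x, 0) elementwise
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.01):
    # like ReLU, but lets a small gradient through for negative inputs
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    # squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # turns a vector of scores into a probability distribution;
    # subtracting the max is for numerical stability only
    e = np.exp(x - np.max(x))
    return e / e.sum()
```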

Pre-training

Unsupervised training to estimate good initial weights for the network's nodes before supervised fine-tuning.

  • Denoising Auto-Encoder
  • Auto-encoder
  • Deep Boltzmann Machine
  • Restricted Boltzmann Machine Gibbs sampling
  • Restricted Boltzmann Machine Contrastive Divergence
  • Deep Belief Network
  • Gaussian unit
  • RELU unit

Data Normalization

  • standardization
  • PCA whitening
  • ZCA whitening
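A sketch of centering plus ZCA whitening in plain NumPy: after whitening, the covariance of the data is the identity. The synthetic correlated data is an illustrative stand-in. PCA whitening would use D^{-1/2} U^T, which whitens but also rotates; ZCA's extra factor of U rotates back so the whitened data stays close to the original.

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic correlated 2-D data
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 1.2], [0.0, 0.5]])

Xc = X - X.mean(axis=0)              # center each feature
cov = Xc.T @ Xc / len(Xc)

s, U = np.linalg.eigh(cov)           # eigendecomposition of the covariance
# ZCA whitening matrix: U D^{-1/2} U^T (epsilon avoids division by ~0)
W_zca = U @ np.diag(1.0 / np.sqrt(s + 1e-8)) @ U.T
Xw = Xc @ W_zca

cov_w = Xw.T @ Xw / len(Xw)          # should be ~identity after whitening
```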



Big Data Solutions/Distributed computing notes

Spark

  • Alternative to tools like MapReduce.
  • More flexible and can work with other file systems: Cassandra, AWS S3...
  • Keeps most data in memory to be faster, BUT RAM can overflow
  • MapReduce writes back to disk after each step, so it is slower...
  • MLlib provides powerful, easy-to-use ML & data mining tools
  • RDD Resilient Distributed Datasets (the OLD way; the latest version uses DataFrames like those in SQL and Pandas)



Misc Others:

The autoregressive integrated moving average (ARIMA) model is a generalization of the autoregressive moving average (ARMA) model. Both of these models are fitted to time series data either to better understand the data or to predict future points in the series (forecasting).
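The 'AR' piece can be illustrated with a toy AR(1) process in plain NumPy; the coefficient value and series length are arbitrary. A full ARIMA fit, adding the differencing ('I') and moving-average ('MA') terms, would typically use a library such as statsmodels.

```python
import numpy as np

rng = np.random.default_rng(1)
phi = 0.7                      # true AR(1) coefficient
y = np.zeros(2000)
for t in range(1, len(y)):
    # each value depends on the previous value plus noise
    y[t] = phi * y[t - 1] + rng.normal()

# least-squares estimate of phi: regress y[t] on y[t-1]
phi_hat = float(y[:-1] @ y[1:]) / float(y[:-1] @ y[:-1])
```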

