Currently, this is just a place to store notes in prep for upcoming interviews with a few different firms doing various AI applications.




Topics I have covered in my online courses so far:

Big Data/Distributed Computing:

  • Spark (MLlib, Spark SQL, RDDs, DataFrames)
  • Using AWS Elastic MapReduce
  • Using Hadoop YARN

Deep Learning:

  • intro to TensorFlow
  • review of background and multiple types of Neural Networks

ML/DS topics:

  • Regression analysis
  • K-Means Clustering
  • Principal Component Analysis
  • Train/Test and cross validation
  • Bayesian Methods
  • Decision Trees and Random Forests
  • Multivariate Regression
  • Multi-Level Models
  • Support Vector Machines (SVM)
  • Reinforcement Learning
  • Collaborative Filtering
  • K-Nearest Neighbor
  • Bias/Variance Tradeoff
  • Ensemble Learning
  • Term Frequency / Inverse Document Frequency
  • Experimental Design and A/B Tests

Courses taken so far:

Udemy:

  • Data science and machine learning with Python - hands on!
  • Taming big data with Apache Spark and Python - hands on!
  • Python for Data Science and Machine Learning bootcamp (in progress)
  • Python for data structures, algorithms, and interviews!
  • Mastering Python

Others:

  • Machine Learning - Andrew Ng (Coursera/Stanford) (in progress)
  • Data Science, by Bill Howe (Coursera/U-Washington) (in progress)



Neural Networks

Neural networks are loosely modeled after biological neural networks and attempt to allow computers to learn in a manner similar to humans.

Neural networks attempt to solve problems that would normally be easy for humans, but hard for computers.

Use cases:

  • Pattern Recognition
  • Time series predictions
  • Signal processing
  • Anomaly detection
  • Control

Convolutional Neural Networks (CNN)

Inspired by the multilayered perceptron (MLP) and the animal visual cortex. "Designed to use minimal amounts of preprocessing. They have wide applications in image and video recognition, recommender systems and natural language processing" wiki

Multilayered Perceptron (MLP) "a type of feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron (or processing element) with a nonlinear activation function. MLP utilizes a supervised learning technique called backpropagation for training the network. MLP is a modification of the standard linear perceptron and can distinguish data that are not linearly separable." from Wikipedia

"A feedforward neural network is an artificial neural network wherein connections between the units do not form a cycle. As such, it is different from recurrent neural networks." wiki

"Backpropagation, an abbreviation for "backward propagation of errors", is a common method of training artificial neural networks used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of a loss function with respect to all the weights in the network, so that the gradient is fed to the optimization method which in turn uses it to update the weights, in an attempt to minimize the loss function.

Backpropagation requires a known, desired output for each input value in order to calculate the loss function gradient. It is therefore usually considered to be a supervised learning method, although it is also used in some unsupervised networks such as autoencoders. It is a generalization of the delta rule to multi-layered feedforward networks, made possible by using the chain rule to iteratively compute gradients for each layer. Backpropagation requires that the activation function used by the artificial neurons (or "nodes") be differentiable." wiki
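The backward-propagation-of-errors idea above can be sketched in plain NumPy on a tiny two-layer network. The XOR task, layer sizes, and learning rate here are arbitrary illustrative choices, not anything prescribed by the quoted text:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR inputs and targets (a classic not-linearly-separable problem)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    # differentiable activation, as backpropagation requires
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(size=(2, 4))  # input -> hidden weights
W2 = rng.normal(size=(4, 1))  # hidden -> output weights
lr = 1.0

def loss():
    out = sigmoid(sigmoid(X @ W1) @ W2)
    return float(((out - y) ** 2).mean())

loss_before = loss()
for _ in range(5000):
    # forward pass
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)
    # backward pass: chain rule through the sigmoid of each layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient descent step on each weight matrix
    W2 -= lr * h.T @ d_out
    W1 -= lr * X.T @ d_h
loss_after = loss()
```

Frameworks like TensorFlow compute these gradients automatically, but the chain-rule structure is the same.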

Radial Basis Function

"In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including function approximation, time series prediction, classification, and system control." wiki

In Python you could use neupy or scipy.interpolate.Rbf.
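A minimal RBF-network sketch in plain NumPy, fitting sin(x): the output is a linear combination of Gaussian basis functions, so the weights fall out of a linear solve. The kernel width, number of centers, and target function are all illustrative choices; scipy.interpolate.Rbf wraps this kind of machinery:

```python
import numpy as np

# training data: samples of sin(x); one basis center per training point
x = np.linspace(0.0, 2 * np.pi, 12)
y = np.sin(x)
centers = x
gamma = 2.0  # controls the width of each Gaussian basis function

def design_matrix(pts):
    # Phi[i, j] = exp(-gamma * (pts[i] - centers[j])^2)
    return np.exp(-gamma * (pts[:, None] - centers[None, :]) ** 2)

# network output = linear combination of the basis functions,
# so the output weights come from solving one linear system
weights = np.linalg.solve(design_matrix(x), y)

def rbf_net(pts):
    return design_matrix(pts) @ weights

fit_error = np.max(np.abs(rbf_net(x) - y))
```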

Random Forests

Can use 'from sklearn import tree', and pandas to organize the data going into the trees. Graphviz can be used to visualize the resulting trees.

  • Decision trees are very susceptible to overfitting
    • Construct many trees in a 'forest' and have them all 'vote' on the outcome classification
      • MUST randomly sample the data used to build each tree!
      • Also, randomize the attributes each tree considers at each split.
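A hedged sketch of the above with scikit-learn, using a synthetic dataset as a stand-in for real data (all hyperparameter values are illustrative): RandomForestClassifier does both randomizations for you, bootstrap-sampling the rows for each tree and trying a random subset of features at each split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# synthetic classification data stands in for a real dataset
X, y = make_classification(n_samples=300, n_features=8, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

forest = RandomForestClassifier(
    n_estimators=100,      # number of trees that 'vote'
    bootstrap=True,        # randomly sample the rows for each tree
    max_features="sqrt",   # randomize the attributes tried at each split
    random_state=42,
)
forest.fit(X_tr, y_tr)
accuracy = forest.score(X_te, y_te)
```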

Bayesian Methods: Concepts

Bayes' Theorem (not covering it here as it is in my PhD and MSc theses)

One good real-world application is a spam filter. Naive Bayes can be used to build a model that discriminates normal email (ham) from garbage (spam). There are lots of ways to improve it, but it works fairly well even in a basic form.

Spam Classifier/Filter with Naive Bayes

Supervised learning.

Steps

  • Read in emails and their ham/spam classification (the bulk of the code)
  • Vectorize emails into numbers representing each word
  • Get a Multinomial Naive Bayes object from sklearn
  • Fit vectorized emails
  • Check it worked with test cases
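The steps above can be sketched with scikit-learn; the toy emails, labels, and test cases are made up for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# toy corpus with labels: 1 = spam, 0 = ham
emails = [
    "win a free prize now", "free money click now",
    "limited offer win cash", "meeting at noon tomorrow",
    "project status report attached", "lunch with the team today",
]
labels = [1, 1, 1, 0, 0, 0]

vec = CountVectorizer()        # vectorize emails into word counts
X = vec.fit_transform(emails)

clf = MultinomialNB()
clf.fit(X, labels)             # fit the vectorized emails

# check it worked with test cases
pred = clf.predict(vec.transform(["free cash prize", "see you at the meeting"]))
```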

Visualizing data

Visualization of the data can enable one to discover much more than summary statistics like the mean, median, or mode; Anscombe's Quartet is the classic illustration of this.

  • An interesting option is seaborn, a library built on top of matplotlib
  • An entire article discusses the 7 major available Python-based visualization tools



Common models and tools in ML/AI:



Neural Network Models:


Types:

Feedforward (acyclic graphs)

  • Autoencoders
  • Denoising autoencoders
  • Restricted Boltzmann machines (stacked, they form deep-belief networks)

Convolutional

Deep convolutional networks are SOTA for images. There are many well-known architectures, including AlexNet and VGGNet.

Convolutional networks usually involve a combination of convolutional layers along with subsampling (pooling) and fully connected feedforward layers.

Recurrent

These handle time series data especially well. They can be combined with convolutional networks to generate captions for images.

Recursive

These handle natural language especially well.


  • MLP Multi-layer perceptron
  • CNN Convolutional Neural Network
  • RNN Recurrent Neural Network
  • RvNN Recursive Neural Network
  • LSTM Long Short Term Memory
  • FRN Fully recurrent network
  • HN Hopfield network
  • EN Elman network
  • JN Jordan network
  • ESN Echo state network
  • BRNN Bi-directional RNN

Stochastic gradient descent

These optimizers minimize the cost function while training the model, to give the highest accuracy.

  • Momentum SGD
  • AdaGrad
  • RMSprop
  • AdaDelta
  • Adam
  • Nesterov's Accelerated Gradient descent
  • Graves' RMSProp
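A minimal sketch of SGD with momentum, minimizing a one-dimensional quadratic; the learning rate, momentum coefficient, and step count are arbitrary illustrative choices:

```python
# minimize f(w) = (w - 3)^2; its gradient is 2 * (w - 3)
w = 0.0
v = 0.0      # velocity accumulated by the momentum term
lr = 0.1     # learning rate
beta = 0.9   # momentum coefficient

for _ in range(200):
    grad = 2.0 * (w - 3.0)
    v = beta * v - lr * grad   # momentum: blend past steps into the new one
    w = w + v                  # w converges toward the minimum at 3.0
```

Nesterov's variant evaluates the gradient at the look-ahead point w + beta * v instead; AdaGrad, RMSprop, AdaDelta, and Adam additionally adapt a per-parameter step size.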

Activation function

Outputs of perceptrons/neurons/nodes are generated by passing the weighted sum of inputs through an 'activation function'.

  • ReLU (simple rectifier; returns max(x, 0))
  • Sigmoid
  • Softmax
  • Maxout
  • Tanh
  • Identity
  • Leaky ReLU
  • Clipped ReLU
  • Exponential Linear Unit
  • Log Softmax
  • Softplus
  • Parametric ReLU
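A few of the activations above, sketched in plain NumPy (the alpha default for Leaky ReLU is an illustrative choice):

```python
import numpy as np

def relu(x):
    # simple rectifier: max(x, 0) elementwise
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.01):
    # like ReLU, but lets a small gradient through for negative inputs
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    # squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # turns a vector of scores into a probability distribution;
    # subtracting the max is for numerical stability only
    e = np.exp(x - np.max(x))
    return e / e.sum()
```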

Pre-training

Unsupervised training to estimate good initial weights for the network's nodes before supervised fine-tuning.

  • Denoising Auto-Encoder
  • Auto-encoder
  • Deep Boltzmann Machine
  • Restricted Boltzmann Machine Gibbs sampling
  • Restricted Boltzmann Machine Contrastive Divergence
  • Deep Belief Network
  • Gaussian unit
  • RELU unit

Data Normalization

  • standardization
  • PCA whitening
  • ZCA whitening
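A sketch of centering plus ZCA whitening in plain NumPy: after whitening, the covariance of the data is the identity. The synthetic correlated data is an illustrative stand-in. PCA whitening would use D^{-1/2} U^T, which whitens but also rotates; ZCA's extra factor of U rotates back so the whitened data stays close to the original.

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic correlated 2-D data
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 1.2], [0.0, 0.5]])

Xc = X - X.mean(axis=0)              # center each feature
cov = Xc.T @ Xc / len(Xc)

s, U = np.linalg.eigh(cov)           # eigendecomposition of the covariance
# ZCA whitening matrix: U D^{-1/2} U^T (epsilon avoids division by ~0)
W_zca = U @ np.diag(1.0 / np.sqrt(s + 1e-8)) @ U.T
Xw = Xc @ W_zca

cov_w = Xw.T @ Xw / len(Xw)          # should be ~identity after whitening
```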



Big Data Solutions/Distributed computing notes

Spark

  • Alternative to tools like MapReduce.
  • More flexible and can work with other file systems: Cassandra, AWS S3...
  • Keeps most data in memory to be faster, BUT RAM can overflow
  • MapReduce writes back to disk after each step, so it is slower...
  • MLlib provides powerful, easy-to-use ML & data mining tools
  • RDD Resilient Distributed Datasets (the OLD way; the latest version uses DataFrames like those in SQL and Pandas)



Misc Others:

The autoregressive integrated moving average (ARIMA) model is a generalization of the autoregressive moving average (ARMA) model. Both of these models are fitted to time series data either to better understand the data or to predict future points in the series (forecasting).
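The 'AR' piece can be illustrated with a toy AR(1) process in plain NumPy; the coefficient value and series length are arbitrary. A full ARIMA fit, adding the differencing ('I') and moving-average ('MA') terms, would typically use a library such as statsmodels.

```python
import numpy as np

rng = np.random.default_rng(1)
phi = 0.7                      # true AR(1) coefficient
y = np.zeros(2000)
for t in range(1, len(y)):
    # each value depends on the previous value plus noise
    y[t] = phi * y[t - 1] + rng.normal()

# least-squares estimate of phi: regress y[t] on y[t-1]
phi_hat = float(y[:-1] @ y[1:]) / float(y[:-1] @ y[:-1])
```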

