r-squared: 0 = bad, 1 = perfect (all of the variance is captured by the model)
Make distribution more complicated to see if scikit-learn can fit it
With a high-degree polynomial, the model overfits: it fits the data it was trained on well but is unlikely to hold up to future testing.
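A minimal sketch of this (the synthetic data and degree choices are assumptions): fit polynomials of two different degrees and compare r-squared on the data used for fitting versus held-out data.

```python
import numpy as np
from sklearn.metrics import r2_score

np.random.seed(2)
x = np.random.normal(3.0, 1.0, 100)
y = x + np.random.normal(0, 0.5, 100)          # roughly linear relationship

train_x, test_x = x[:80], x[80:]
train_y, test_y = y[:80], y[80:]

for degree in (1, 8):
    p = np.poly1d(np.polyfit(train_x, train_y, degree))
    print(degree,
          r2_score(train_y, p(train_x)),       # r-squared on data it has seen
          r2_score(test_y, p(test_x)))         # high-degree fit degrades here
```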
Just regression above with more than one variable being fit.
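A hedged sketch of that multivariate case with scikit-learn; the two made-up predictors stand in for whatever variables are being fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                  # two predictors instead of one
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 0.1, 100)

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)           # one coefficient per variable
```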
Multi-Level Models attempt to model these interdependencies
Commonly applied in healthcare.
Not covered in more detail beyond the general discussion in lecture 22; a [book] is instead recommended for further reading.
One good real-world application is a spam filter. Naive Bayes can be used to develop a model that discriminates normal (ham) emails from garbage (spam). There are lots of ways to improve it, but it works fairly well even in a basic form.
Supervised learning.
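A minimal sketch of the spam-filter idea with scikit-learn; the toy emails and labels are assumptions, and a real filter would train on a large labeled corpus.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["Free money now!!!", "Hi Bob, how about lunch tomorrow?",
          "You are a winner, claim your prize", "Meeting notes attached"]
labels = ["spam", "ham", "spam", "ham"]

vectorizer = CountVectorizer()                 # word counts as features
counts = vectorizer.fit_transform(emails)

classifier = MultinomialNB().fit(counts, labels)
print(classifier.predict(vectorizer.transform(["Claim your free prize now"])))
```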
Attempts to split data into K groups that are closest to K centroids.
(1) Centroids are adjusted to the center of the points that were closest to them.
(2) Points are then reassigned to whichever centroid they are now closest to.
Repeat (1) and (2) until the error (or the distance the centroids move) converges.
Choosing K: try increasing values of K until the error stops decreasing significantly.
Use different randomly chosen initial centroids to avoid local minima.
Still need to determine meaningful labels for the clusters that are found.
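A minimal scikit-learn sketch of the loop above (the blob data and K=4 are assumptions); n_init covers the multiple-random-initialization point.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# n_init restarts with different random initial centroids to avoid local minima
model = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
print(model.cluster_centers_)                  # final centroid positions
print(model.labels_[:10])                      # cluster index per point, not a named label
```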
Multiple models work together to make a prediction
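As one concrete example (a random forest, assumed here for illustration): many decision trees each make a prediction and the majority wins.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict(X[:3]))                   # majority vote across 100 trees
```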
Ex. Identify types of iris flower by length and width of sepal.
With a simple linear kernel.
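A minimal sketch of that iris example with a linear-kernel SVM, restricted to the two sepal features mentioned above:

```python
from sklearn import datasets, svm

iris = datasets.load_iris()
X = iris.data[:, :2]                           # sepal length and sepal width only
y = iris.target

clf = svm.SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.predict([[5.0, 3.5]]))               # predicted species index for a new flower
```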
Resolves some of the problems, mentioned above, that arise from using people's actions to make recommendations.
Supervised learning
When data has too many dimensions, extract sets of basis data that can be combined to reproduce the high-dimensional data sufficiently. Put another way: find a way to represent the data with minimal dimensions that sufficiently preserves its variance.
Ex. Identify types of iris flower by length and width of sepal. Data comes with scikit-learn.
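A minimal PCA sketch on that bundled iris data: project the 4 feature dimensions down to 2 and check how much variance survives.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
pca = PCA(n_components=2)
X_2d = pca.fit_transform(iris.data)            # 4 features -> 2 components

print(pca.explained_variance_ratio_)           # fraction of variance preserved per component
print(X_2d[:3])                                # the reduced-dimension points
```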
The more 'traditional' approach.
Transformed data is loaded into the warehouse.
BUT step 2, the transform, can be a big problem with "big data".
Push the intensive transformation step to the end, where it can be better optimized. This approach is now much more scalable than ETL.
One example is Pac-Man.
Implementation of reinforcement learning.
Have: a set of environmental states s, a set of possible actions a in those states, and a value Q(s, a) for each state/action pair.
Start all Q's at 0
Introduce intelligent randomness into exploration (e.g., an epsilon term that occasionally takes a random action) to increase the learning efficiency. This framework of states, actions, and rewards is a Markov Decision Process (MDP).
Use this in tandem with Q-learning to build up a table of all possible states and the reward values (Q values) for every available action in each state. In some cases or terms this can be considered dynamic programming or memoization.
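A hedged, toy Q-learning sketch; the 1-D corridor world, rewards, and hyperparameters are all assumptions (Pac-Man would use the same update rule over a much larger state space).

```python
import random

n_states = 5                                   # positions 0..4; the goal is state 4
actions = [-1, +1]                             # move left or right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}  # start all Q's at 0

alpha, gamma, epsilon = 0.5, 0.9, 0.2          # learning rate, discount, exploration

for episode in range(200):
    s = 0
    while s != n_states - 1:
        # epsilon-greedy: mostly exploit the best-known action, sometimes explore
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 10.0 if s2 == n_states - 1 else -1.0
        # Q-learning update: nudge Q toward reward plus discounted future value
        best_next = max(Q[(s2, a2)] for a2 in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

print(max(actions, key=lambda act: Q[(0, act)]))   # learned best first move: +1 (right)
```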
Lectures 48-53 are based on issues that arise when applying the course fundamentals to real-world data.
Using MLlib to essentially do things like the K-Means Clustering and Decision Trees reviewed in pure Python before, but in a way that can be run locally OR on a Hadoop cluster with Amazon Web Services (AWS).
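A hedged sketch of K-Means through MLlib's RDD-based API; it assumes a local Spark installation, and the same script could be submitted to a Hadoop cluster (e.g., on AWS EMR).

```python
import numpy as np
from pyspark import SparkConf, SparkContext
from pyspark.mllib.clustering import KMeans

conf = SparkConf().setMaster("local").setAppName("KMeansSketch")
sc = SparkContext(conf=conf)

# a tiny toy dataset distributed as an RDD
points = sc.parallelize([np.array([1.0, 1.0]), np.array([1.5, 2.0]),
                         np.array([8.0, 8.0]), np.array([8.5, 9.0])])
model = KMeans.train(points, k=2, maxIterations=10, initializationMode="random")

print(model.clusterCenters)                    # learned centroids
print(model.predict(np.array([1.2, 1.3])))     # cluster assignment for a new point
sc.stop()
```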
Deep convolutional networks are SOTA for images. There are many well known architectures, including AlexNet and VGGNet.
Convolutional networks usually involve a combination of convolutional layers as well as subsampling and fully connected feedforward layers.
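A minimal Keras sketch of that pattern; the layer sizes, the 28x28 single-channel input (e.g., digit images), and the 10 output classes are all assumptions.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),  # convolutional layer
    MaxPooling2D((2, 2)),                       # subsampling layer
    Flatten(),
    Dense(128, activation="relu"),              # fully connected feedforward layer
    Dense(10, activation="softmax"),            # one output per class
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```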
These handle time series data especially well. They can be combined with convolutional networks to generate captions for images.
These handle natural language especially well
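A hedged recurrent-network sketch in Keras (the vocabulary size and layer dimensions are assumptions): an LSTM consumes a sequence of word indices, e.g. for sentiment on text.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

model = Sequential([
    Embedding(input_dim=10000, output_dim=64),  # word index -> dense vector
    LSTM(64),                                   # recurrent layer reads the sequence
    Dense(1, activation="sigmoid"),             # e.g., positive vs. negative
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

dummy = np.random.randint(0, 10000, size=(2, 20))  # 2 toy sequences of 20 word indices
print(model.predict(dummy).shape)               # (2, 1): one score per sequence
```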
Optimize the cost function while training the model to give the highest accuracy.
Outputs of perceptrons/neurons/nodes are generated by passing weighted inputs through an 'activation function'.
Training estimates the best weights for the inputs to each node.
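A tiny numpy sketch tying these last notes together (the AND-style toy data and learning rate are assumptions): weighted inputs pass through an activation function, and training adjusts the weights to reduce the cost.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))            # a classic activation function

# toy AND-like problem: output 1 only when both inputs are 1
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])
w, b = np.zeros(2), 0.0                        # weights and bias to be trained

for _ in range(5000):                          # gradient descent on squared error
    out = sigmoid(X @ w + b)                   # weighted inputs -> activation
    grad = (out - y) * out * (1.0 - out)       # chain rule through the sigmoid
    w -= 0.5 * X.T @ grad                      # adjust weights toward lower cost
    b -= 0.5 * grad.sum()

print(np.round(sigmoid(X @ w + b), 2))         # outputs move toward [0, 0, 0, 1]
```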