1) Convolutional neural networks (CNNs, or ConvNets)
2) If you feed them images of faces, the first layer learns simple features: edges, dots, bright spots, and dark spots
3) Multi-layer NN: the second layer learns parts such as eyes and noses, and the third layer identifies whole faces
4) CNNs can learn to play video games by learning patterns
5) CNNs can also be used to learn from videos
Let's take a two-dimensional array of pixels, like a checkerboard where each square is light or dark. The task is to decide whether it is a picture of an X or an O.
Trickier cases: translation, scaling, rotation, weight (line thickness)
CNN: matches parts of the image rather than the whole thing
Example features: 3x3 mini-images (parts of the image) that are matched against patches of the input. This matching is called filtering.
The math behind this match is:
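A minimal sketch of one common way to score the match, assuming light pixels are stored as +1 and dark pixels as -1 (an assumption, the notes do not fix an encoding): multiply each feature pixel by the pixel underneath it and average the products. The names `match_score` and `diagonal` are hypothetical.

```python
import numpy as np

def match_score(feature, patch):
    """Multiply corresponding pixels and average the products."""
    feature = np.asarray(feature, dtype=float)
    patch = np.asarray(patch, dtype=float)
    return float(np.mean(feature * patch))

# Hypothetical 3x3 feature: a diagonal stroke, like one arm of an X.
# Assumed encoding: +1 = light pixel, -1 = dark pixel.
diagonal = np.array([[ 1, -1, -1],
                     [-1,  1, -1],
                     [-1, -1,  1]])

# A patch identical to the feature gives the highest possible score.
perfect = match_score(diagonal, diagonal)

# A patch with every pixel inverted gives the lowest possible score.
opposite = match_score(diagonal, -diagonal)

# A patch with a single wrong pixel scores somewhere in between.
one_off = diagonal.copy()
one_off[0, 2] = 1
partial = match_score(diagonal, one_off)
```

With this encoding the score ranges from -1 (every pixel disagrees) to 1 (every pixel agrees), so a higher score means a better match.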
Input image -> Convolution layer -> ReLU -> Pooling (this stack can be repeated multiple times)
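The pipeline above can be sketched in a few lines of NumPy. This is an illustration under assumptions, not a definitive implementation: pixels are encoded as +1/-1, there is a single feature, pooling uses a 2x2 window with stride 2, and windows that do not fit are dropped. The function names `convolve`, `relu`, and `max_pool` are hypothetical.

```python
import numpy as np

def convolve(image, feature):
    """Slide the feature over every position and record the match score."""
    fh, fw = feature.shape
    out_h = image.shape[0] - fh + 1
    out_w = image.shape[1] - fw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            patch = image[r:r + fh, c:c + fw]
            out[r, c] = np.mean(patch * feature)
    return out

def relu(x):
    """Replace every negative value with zero."""
    return np.maximum(x, 0.0)

def max_pool(x, window=2, stride=2):
    """Keep only the largest value in each window (incomplete windows are dropped)."""
    out_h = (x.shape[0] - window) // stride + 1
    out_w = (x.shape[1] - window) // stride + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            rs, cs = r * stride, c * stride
            out[r, c] = x[rs:rs + window, cs:cs + window].max()
    return out

# Hypothetical 6x6 image: dark background (-1) with a light diagonal stroke (+1).
image = -np.ones((6, 6))
np.fill_diagonal(image, 1.0)

# Hypothetical 3x3 feature: the same diagonal stroke.
feature = -np.ones((3, 3))
np.fill_diagonal(feature, 1.0)

filtered = convolve(image, feature)   # shape (4, 4)
rectified = relu(filtered)            # same shape, no negative values
pooled = max_pool(rectified)          # shape (2, 2)
```

The pooled map is smaller than the input but still shows where the stroke was found, which is the point of stacking these layers.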
Where do all the magic numbers come from?
a) Features in convolutional layers
b) Voting weights in fully connected layers
A: Backpropagation (the error in the final answer is used to determine how much the network adjusts and changes)
Gradient descent: each feature pixel and voting weight is adjusted up and down a little to see how the error changes. How much to adjust is based on how big the error is, like a ball sliding left and right until it settles at the lowest point.
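The "adjust up and down and see how the error changes" idea can be sketched numerically. This is a toy illustration under assumptions: a single output that is a weighted vote of three inputs, a squared error, and hypothetical values for `delta` and `rate`. Real backpropagation computes the same slopes analytically instead of nudging each weight.

```python
import numpy as np

def error(weights, inputs, target):
    """Squared error of a single weighted vote against the right answer."""
    return float((np.dot(weights, inputs) - target) ** 2)

def nudge_step(weights, inputs, target, delta=1e-4, rate=0.1):
    """Nudge each weight up and down, then move it downhill on the error."""
    new_weights = weights.copy()
    for i in range(len(weights)):
        up = weights.copy()
        up[i] += delta
        down = weights.copy()
        down[i] -= delta
        # Slope of the error with respect to this one weight.
        slope = (error(up, inputs, target) - error(down, inputs, target)) / (2 * delta)
        # A steeper slope (bigger error) produces a bigger adjustment.
        new_weights[i] -= rate * slope
    return new_weights

# Hypothetical inputs, target, and starting weights.
inputs = np.array([1.0, 0.5, -0.5])
target = 1.0
weights = np.zeros(3)

start_error = error(weights, inputs, target)
for _ in range(50):
    weights = nudge_step(weights, inputs, target)
final_error = error(weights, inputs, target)
```

As the error shrinks, so do the slopes, and the adjustments get smaller on their own, like the ball slowing down as it nears the bottom.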
Convolution:
Pooling:
Fully Connected:
a) How many of each type of layer?
b) In what order?
c) Can we design a new type of layer?
ConvNets are not limited to 2D or 3D images. They can be applied to any structured data in which things that are close together are closely related:
- Images
- Sound (timesteps close to each other are closely related)
- Text - Position in the sentence is the column, and the word in the dictionary is the row (take a filter that spans top to bottom and slide it left to right)
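The text layout in the list above can be sketched as follows. This is a minimal illustration, assuming a one-hot encoding (a 1 in the row of the word that appears at that position) and a toy five-word dictionary. The names `one_hot_columns` and `slide_filter`, the vocabulary, and the hand-built filter are all hypothetical.

```python
import numpy as np

# Hypothetical toy dictionary and sentence.
vocab = ["the", "cat", "sat", "on", "mat"]
sentence = ["the", "cat", "sat", "on", "the", "mat"]

def one_hot_columns(words, vocab):
    """Build a grid: one row per dictionary word, one column per sentence position."""
    grid = np.zeros((len(vocab), len(words)))
    for col, word in enumerate(words):
        grid[vocab.index(word), col] = 1.0
    return grid

def slide_filter(grid, filt):
    """Slide a filter that spans all rows from left to right across the columns."""
    width = filt.shape[1]
    steps = grid.shape[1] - width + 1
    return np.array([np.sum(grid[:, c:c + width] * filt) for c in range(steps)])

grid = one_hot_columns(sentence, vocab)   # shape (5, 6)

# Hand-built filter, two positions wide, that responds to "cat" followed by "sat".
filt = np.zeros((len(vocab), 2))
filt[vocab.index("cat"), 0] = 1.0
filt[vocab.index("sat"), 1] = 1.0

responses = slide_filter(grid, filt)
best_position = int(np.argmax(responses))  # column where the filter matches best
```

In a trained network the filter values would be learned rather than written by hand. The hand-built filter here only shows how the sliding works.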
ConvNets are great at finding patterns and using them to classify images