images from: http://neuralnetworksanddeeplearning.com/chap1.html
Here's where things start to get interesting. Before, we mapped our inputs directly onto our outputs through a single linear transformation.
This model is perfectly adequate when the underlying relationship between our data points and labels is approximately linear. When our data points and targets are characterized by a more complex relationship, a linear model will produce poor results. We can model a more general class of functions by incorporating one or more hidden layers.
Here, each layer will require its own set of parameters. To keep things simple, we'll assume two hidden layers of computation.
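One detail worth making explicit: stacking layers only adds expressive power if there's a nonlinearity between them; otherwise two linear maps compose into a single linear map. Below is a minimal NumPy sketch of a forward pass through two hidden layers. The layer sizes, ReLU activation, and variable names are illustrative assumptions, not anything prescribed by the source.

```python
import numpy as np

def relu(x):
    # Elementwise nonlinearity between layers; without it the
    # stacked layers would collapse into one linear transformation.
    return np.maximum(0, x)

rng = np.random.default_rng(0)
n_in, n_h1, n_h2, n_out = 4, 8, 8, 3  # assumed layer sizes

# Each layer gets its own set of parameters (weights and biases)
W1, b1 = rng.normal(size=(n_in, n_h1)), np.zeros(n_h1)
W2, b2 = rng.normal(size=(n_h1, n_h2)), np.zeros(n_h2)
W3, b3 = rng.normal(size=(n_h2, n_out)), np.zeros(n_out)

def forward(x):
    h1 = relu(x @ W1 + b1)   # first hidden layer
    h2 = relu(h1 @ W2 + b2)  # second hidden layer
    return h2 @ W3 + b3      # linear output layer

x = rng.normal(size=(2, n_in))  # a batch of 2 example inputs
print(forward(x).shape)         # (2, 3): one output vector per example
```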
Word Embedding turns text into numbers
ML algorithms don't understand text; they require inputs to be vectors of continuous values.
Benefits:
- Dense, low-dimensional vectors are far more compact than one-hot encodings over a large vocabulary.
- Similar words end up with similar vectors, so the representation captures semantic relationships.
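As a toy illustration, here's a minimal NumPy sketch of an embedding lookup: a table that maps each vocabulary word to a dense vector. The vocabulary, embedding size, and random initialization are all assumptions for illustration; in practice the table is learned, e.g. by word2vec as in the Gluon notebook linked below.

```python
import numpy as np

vocab = {"cat": 0, "dog": 1, "car": 2}  # assumed toy vocabulary
embed_dim = 4                           # assumed embedding size
rng = np.random.default_rng(0)

# Embedding matrix: one row per word. Randomly initialized here;
# a real model would learn these values during training.
E = rng.normal(size=(len(vocab), embed_dim))

def embed(sentence):
    # Map each word to its row index, then gather the rows
    ids = [vocab[w] for w in sentence.split()]
    return E[ids]

print(embed("cat dog"))  # shape (2, 4): one continuous vector per word
```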
Resources:
- https://github.com/saurabh3949/Word2Vec-MXNet/blob/master/Word2vec%2Bwith%2BGluon.ipynb
- https://ronxin.github.io/wevi/
- https://medium.com/towards-data-science/deep-learning-4-embedding-layers-f9a02d55ac12