Text
Feature extraction
Binary feature vector: (0/1) for presence or absence of a word
Limitation: cannot capture the importance of a word (a word that appears once is treated the same as one that appears many times)
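A minimal sketch of a binary feature vector, assuming whitespace tokenization and a two-sentence toy corpus. The corpus and the names `docs`, `vocab`, and `binary_vector` are illustrative and not from the original notebook.

```python
# Hypothetical toy corpus; the sentences are illustrative only.
docs = ["the cat sat on the mat", "the dog chased the cat"]

# Vocabulary: every distinct whitespace-separated token, sorted for a stable order.
vocab = sorted({word for doc in docs for word in doc.split()})

def binary_vector(doc, vocab):
    """Return 1 for each vocabulary word present in doc, else 0."""
    words = set(doc.split())
    return [1 if word in words else 0 for word in vocab]

binary_vectors = [binary_vector(doc, vocab) for doc in docs]

for doc, vector in zip(docs, binary_vectors):
    print(doc, vector)
```

Note that "the" occurs twice in each sentence but is still encoded as 1, which is the limitation described above.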
Term frequency
Limitation: cannot discount stopwords (she, it, the, ...), which occur often but carry little meaning
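A minimal sketch of term-frequency vectors, assuming raw counts (no length normalization) and whitespace tokenization. The corpus and the names `docs`, `vocab`, and `term_frequency` are illustrative and not from the original notebook.

```python
from collections import Counter

# Hypothetical toy corpus; the sentences are illustrative only.
docs = ["the cat sat on the mat", "the dog chased the cat"]

# Vocabulary: every distinct whitespace-separated token, sorted for a stable order.
vocab = sorted({word for doc in docs for word in doc.split()})

def term_frequency(doc, vocab):
    """Return the raw count of each vocabulary word in doc."""
    counts = Counter(doc.split())
    return [counts[word] for word in vocab]

tf_vectors = [term_frequency(doc, vocab) for doc in docs]

for doc, vector in zip(docs, tf_vectors):
    print(doc, vector)
```

In both sentences the largest count belongs to "the", which shows how a stopword can dominate a raw term-frequency vector.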
Term frequency-inverse document frequency (TF-IDF)
Example
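A minimal sketch of TF-IDF, assuming one common unsmoothed variant: the weight of a word in a document is its raw count multiplied by log(N / df), where N is the number of documents and df is the number of documents containing the word. Libraries typically add smoothing and normalization, so their numbers will differ. The corpus and the function name `tf_idf` are illustrative and not from the original notebook.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the sentences are illustrative only.
docs = ["the cat sat on the mat", "the dog chased the cat"]

def tf_idf(docs):
    """Return (vocab, vectors) using raw counts times log(N / df)."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    # Assumed IDF variant: unsmoothed natural log of N / df.
    idf = {word: math.log(n_docs / df[word]) for word in vocab}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] * idf[word] for word in vocab])
    return vocab, vectors

vocab, vectors = tf_idf(docs)
for word, weight in zip(vocab, vectors[0]):
    print(f"{word}: {weight:.3f}")
```

Words that appear in every document, such as "the" and "cat" here, get a weight of zero, while words unique to one document keep a positive weight.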
Distance Measures