Included stop words


In [1]:
from stemgraphic.stopwords import EN, FR, ES, ALT_EN

ALT_EN is a very short list of English stop words.


In [2]:
len(ALT_EN)


Out[2]:
27

In [3]:
print(ALT_EN)


['a', 'am', 'an', 'and', 'are', 'as', 'at', 'been', 'for', 'from', 'in', 'is', 'of', 'on', 'or', 'out', 'so', 'such', 'that', 'the', 'these', 'this', 'those', 'to', 'upon', 'was', 'were']
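
As a rough illustration of how a list like this is typically used, the sketch below drops stop words from a whitespace-tokenized sentence. The list is copied from the output above so the snippet runs without stemgraphic installed; `remove_stop_words` and the sample sentence are hypothetical and not part of stemgraphic.

```python
# Copy of the ALT_EN list printed above, inlined so this runs standalone.
ALT_EN = ['a', 'am', 'an', 'and', 'are', 'as', 'at', 'been', 'for', 'from',
          'in', 'is', 'of', 'on', 'or', 'out', 'so', 'such', 'that', 'the',
          'these', 'this', 'those', 'to', 'upon', 'was', 'were']


def remove_stop_words(text, stop_words):
    """Hypothetical helper: lowercase, split on whitespace, drop stop words."""
    stop_set = set(stop_words)  # set lookup avoids scanning the list per word
    return [word for word in text.lower().split() if word not in stop_set]


sample = "The quick brown fox was in the garden and out of sight"
filtered = remove_stop_words(sample, ALT_EN)
print(filtered)  # ['quick', 'brown', 'fox', 'garden', 'sight']
```

With the real package, the same helper should work with the imported `ALT_EN`, `EN`, `FR`, or `ES` in place of the inlined copy, assuming they are plain lists of lowercase strings as the output above suggests.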

The French and Spanish stop word lists cover similar words, but Spanish has several gender-specific forms (e.g. French quelque vs. Spanish algun, algunos, algunas), so its list is larger.


In [4]:
len(FR)


Out[4]:
127

In [5]:
len(ES)


Out[5]:
183

The main English stop word list, EN, is considerably larger than ALT_EN, FR, and ES.


In [6]:
len(EN)


Out[6]:
316
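
To put the four lists side by side, the sketch below takes the lengths reported in the outputs above (hard-coded here so it runs without stemgraphic) and expresses each as a multiple of the shortest list. The `sizes` and `ratios` names are illustrative only.

```python
# Lengths as reported by len(...) in the cells above.
sizes = {"ALT_EN": 27, "FR": 127, "ES": 183, "EN": 316}

smallest = min(sizes.values())

# Size of each list relative to the shortest one.
ratios = {name: size / smallest for name, size in sizes.items()}

# Print from shortest to longest.
for name, size in sorted(sizes.items(), key=lambda item: item[1]):
    print(f"{name:<6} {size:>3} words  ({ratios[name]:.1f}x the shortest)")
```

With the package installed, `sizes` could instead be built from `len(ALT_EN)`, `len(FR)`, `len(ES)`, and `len(EN)`.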
