Some sentiment analysis results

I've only run some of the models on the sentiment corpora so far. Performance is not great: 60-70% accuracy, whereas SOTA is around 90%.


In [1]:
%cd ~/NetBeansProjects/ExpLosion/
from notebooks.common_imports import *
from gui.output_utils import *

# Make seaborn use my_bootstrap (provided by the star imports above) for its confidence intervals
sns.timeseries.algo.bootstrap = my_bootstrap
sns.categorical.bootstrap = my_bootstrap


/Users/miroslavbatchkarov/NetBeansProjects/ExpLosion

In [2]:
ids = Experiment.objects.filter(labelled__in=['movie-reviews-tagged', 'aclImdb-tagged'],
                                clusters__isnull=False).values_list('id', flat=True)
print(ids)
df = dataframe_from_exp_ids(ids, {'id':'id',
                                  'labelled': 'labelled',
                                  'algo': 'clusters__vectors__algorithm',
                                  'unlab': 'clusters__vectors__unlabelled',
                                  'num_cl': 'clusters__num_clusters'}).convert_objects(convert_numeric=True)
performance_table(df)


[385, 386, 387, 388, 389]
folds has 2500 values
Accuracy has 2500 values
id has 2500 values
unlab has 2500 values
num_cl has 2500 values
algo has 2500 values
labelled has 2500 values
keeping {'unlab', 'num_cl', 'algo', 'labelled'}
Out[2]:
                                      mean       ci_width
algo   labelled        num_cl  unlab
glove  aclImdb-tagged  100     wiki   62.794289  1.326575
w2v    aclImdb-tagged  100     cwiki  62.226928  1.303602
                               wiki   62.255558  1.229679
                       500     wiki   66.200386  1.206691
                       2000    wiki   68.837472  1.041709

MR is too small: the CI is almost 12% wide!
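
To illustrate why a small corpus gives such a wide interval, here is a minimal sketch of a percentile bootstrap on simulated per-document correctness scores. It is not the `my_bootstrap` used above (its implementation is not shown here); `bootstrap_ci`, `simulate_correctness`, the corpus sizes and the 65% accuracy are all hypothetical, chosen only to show the effect of sample size.

```python
import random


def bootstrap_ci(values, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap: return (mean, ci_width) for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of every resample.
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return sum(values) / n, upper - lower


def simulate_correctness(n_docs, accuracy, seed):
    """Hypothetical data: 100.0 for a correct prediction, 0.0 otherwise."""
    rng = random.Random(seed)
    return [100.0 if rng.random() < accuracy else 0.0 for _ in range(n_docs)]


# Assumed sizes, for illustration only (not the real corpus sizes).
small = simulate_correctness(n_docs=100, accuracy=0.65, seed=1)
large = simulate_correctness(n_docs=2000, accuracy=0.65, seed=2)

small_mean, small_width = bootstrap_ci(small)
large_mean, large_width = bootstrap_ci(large)

print("small corpus: mean %.2f, ci_width %.2f" % (small_mean, small_width))
print("large corpus: mean %.2f, ci_width %.2f" % (large_mean, large_width))
```

The width shrinks roughly with the square root of the number of documents, so a corpus 20 times larger should give an interval about 4 to 5 times narrower at the same accuracy.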

