Now, we're going to analyze the results of our large experiment. In this experiment, we used 30 iterations for each experiment configuration and unigram frequency as our feature. Unfortunately, the experiment script using readability measures as features is still running, so those results are not available yet.

Let's import our tools, NumPy and Pandas, and make matplotlib plot inline.


In [1]:
import pandas as pd
import numpy as np
%matplotlib inline

Read the experiment result and display it.


In [2]:
df = pd.read_hdf('../reports/large-exp-unigram-feats.h5', 'df')

In [3]:
df


Out[3]:
num_norm 10 ... 80
num_oot 1 ... 8
num_top 1 3 5 ... 5
result base perf base perf base ... base perf
k 0 1 0 1 0 1 0 1 0 1 ... 2 3 4 5 0 1 2 3 4 5
method feature metric norm_dir oot_dir
txt_comp_dist unigram euclidean bbs152930 bbs57549 0.909091 0.090909 0.266667 0.733333 0.727273 0.272727 0.066667 0.933333 0.545455 0.454545 ... 0.058722 0.004517 0.000143 0.000001 0.000000 0.100000 0.300000 0.400000 0.200000 0.000000
mus10142 0.909091 0.090909 0.000000 1.000000 0.727273 0.272727 0.066667 0.933333 0.545455 0.454545 ... 0.058722 0.004517 0.000143 0.000001 0.000000 0.000000 0.000000 0.100000 0.900000 0.000000
phy40008 0.909091 0.090909 0.200000 0.800000 0.727273 0.272727 0.100000 0.900000 0.545455 0.454545 ... 0.058722 0.004517 0.000143 0.000001 0.000000 0.266667 0.100000 0.433333 0.200000 0.000000
phy17301 bbs57549 0.909091 0.090909 0.066667 0.933333 0.727273 0.272727 0.000000 1.000000 0.545455 0.454545 ... 0.058722 0.004517 0.000143 0.000001 0.000000 0.166667 0.266667 0.500000 0.066667 0.000000
mus10142 0.909091 0.090909 0.000000 1.000000 0.727273 0.272727 0.000000 1.000000 0.545455 0.454545 ... 0.058722 0.004517 0.000143 0.000001 0.000000 0.000000 0.066667 0.400000 0.466667 0.066667
phy40008 0.909091 0.090909 0.366667 0.633333 0.727273 0.272727 0.066667 0.933333 0.545455 0.454545 ... 0.058722 0.004517 0.000143 0.000001 0.133333 0.166667 0.500000 0.166667 0.033333 0.000000
mus1139 bbs57549 0.909091 0.090909 0.066667 0.933333 0.727273 0.272727 0.166667 0.833333 0.545455 0.454545 ... 0.058722 0.004517 0.000143 0.000001 0.233333 0.333333 0.366667 0.066667 0.000000 0.000000
mus10142 0.909091 0.090909 0.466667 0.533333 0.727273 0.272727 0.300000 0.700000 0.545455 0.454545 ... 0.058722 0.004517 0.000143 0.000001 0.400000 0.166667 0.300000 0.133333 0.000000 0.000000
phy40008 0.909091 0.090909 0.300000 0.700000 0.727273 0.272727 0.133333 0.866667 0.545455 0.454545 ... 0.058722 0.004517 0.000143 0.000001 0.366667 0.300000 0.333333 0.000000 0.000000 0.000000

9 rows × 116 columns

Let's compute the average performance over all threads.


In [4]:
df2 = df.groupby(level=['method', 'feature', 'metric']).mean()
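To see what this groupby does, here is a minimal sketch on a toy two-level frame (the names `t1`, `t2`, and the scores are made up): grouping by the outer level and taking the mean averages away the inner level, just as In [4] averages over the `norm_dir` and `oot_dir` thread levels.

```python
import pandas as pd

# toy frame with a two-level index: (metric, thread)
df_toy = pd.DataFrame(
    {"score": [0.2, 0.4, 0.9]},
    index=pd.MultiIndex.from_tuples(
        [("euclidean", "t1"), ("euclidean", "t2"), ("cosine", "t1")],
        names=["metric", "thread"],
    ),
)
# averaging over threads keeps only the 'metric' level
avg = df_toy.groupby(level="metric").mean()
print(avg)  # euclidean -> 0.3, cosine -> 0.9
```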

In [5]:
df2


Out[5]:
num_norm 10 ... 80
num_oot 1 ... 8
num_top 1 3 5 ... 5
result base perf base perf base ... base perf
k 0 1 0 1 0 1 0 1 0 1 ... 2 3 4 5 0 1 2 3 4 5
method feature metric
txt_comp_dist unigram euclidean 0.909091 0.090909 0.192593 0.807407 0.727273 0.272727 0.1 0.9 0.545455 0.454545 ... 0.058722 0.004517 0.000143 0.000001 0.125926 0.166667 0.248148 0.244444 0.207407 0.007407

1 rows × 116 columns

It is easier to scroll vertically, so let's display its transpose instead.


In [6]:
df2.T


Out[6]:
method txt_comp_dist
feature unigram
metric euclidean
num_norm num_oot num_top result k
10 1 1 base 0 0.909091
1 0.090909
perf 0 0.192593
1 0.807407
3 base 0 0.727273
1 0.272727
perf 0 0.100000
1 0.900000
5 base 0 0.545455
1 0.454545
perf 0 0.051852
1 0.948148
4 1 base 0 0.714286
1 0.285714
perf 0 0.133333
1 0.866667
3 base 0 0.329670
1 0.494505
2 0.164835
3 0.010989
perf 0 0.092593
1 0.229630
2 0.255556
3 0.422222
5 base 0 0.125874
1 0.419580
2 0.359640
3 0.089910
4 0.004995
perf 0 0.011111
... ... ... ... ... ...
80 4 5 base 4 0.000003
perf 0 0.196296
1 0.229630
2 0.285185
3 0.214815
4 0.074074
8 1 base 0 0.909091
1 0.090909
perf 0 0.537037
1 0.462963
3 base 0 0.748706
1 0.230371
2 0.020413
3 0.000510
perf 0 0.133333
1 0.270370
2 0.507407
3 0.088889
5 base 0 0.613645
1 0.322971
2 0.058722
3 0.004517
4 0.000143
5 0.000001
perf 0 0.125926
1 0.166667
2 0.248148
3 0.244444
4 0.207407
5 0.007407

116 rows × 1 columns

Let's put the baseline and performance side by side so we can compare them more easily.
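`unstack` pivots one index level into the columns; a minimal sketch on a toy Series (the labels `a` and `b` are made up) showing how the `result` level becomes `base` and `perf` columns:

```python
import pandas as pd

# toy Series indexed by (config, result)
s = pd.Series(
    [0.9, 0.1, 0.7, 0.3],
    index=pd.MultiIndex.from_tuples(
        [("a", "base"), ("a", "perf"), ("b", "base"), ("b", "perf")],
        names=["config", "result"],
    ),
)
# pivot the 'result' level into columns: one 'base' and one 'perf' column
side_by_side = s.unstack(level="result")
print(side_by_side)
```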


In [7]:
df3 = df2.T.unstack(level='result')

In [8]:
df3


Out[8]:
method txt_comp_dist
feature unigram
metric euclidean
result base perf
num_norm num_oot num_top k
10 1 1 0 0.909091 0.192593
1 0.090909 0.807407
3 0 0.727273 0.100000
1 0.272727 0.900000
5 0 0.545455 0.051852
1 0.454545 0.948148
4 1 0 0.714286 0.133333
1 0.285714 0.866667
3 0 0.329670 0.092593
1 0.494505 0.229630
2 0.164835 0.255556
3 0.010989 0.422222
5 0 0.125874 0.011111
1 0.419580 0.137037
2 0.359640 0.225926
3 0.089910 0.300000
4 0.004995 0.325926
8 1 0 0.555556 0.100000
1 0.444444 0.900000
3 0 0.147059 0.033333
1 0.441176 0.200000
2 0.343137 0.333333
3 0.068627 0.433333
5 0 0.029412 0.011111
1 0.196078 0.088889
2 0.392157 0.162963
3 0.294118 0.333333
4 0.081699 0.259259
5 0.006536 0.144444
80 1 1 0 0.987654 0.844444
1 0.012346 0.155556
3 0 0.962963 0.507407
1 0.037037 0.492593
5 0 0.938272 0.470370
1 0.061728 0.529630
4 1 0 0.952381 0.637037
1 0.047619 0.362963
3 0 0.862264 0.240741
1 0.132656 0.411111
2 0.005038 0.325926
3 0.000042 0.022222
5 0 0.778699 0.196296
1 0.204921 0.229630
2 0.015968 0.285185
3 0.000409 0.214815
4 0.000003 0.074074
8 1 0 0.909091 0.537037
1 0.090909 0.462963
3 0 0.748706 0.133333
1 0.230371 0.270370
2 0.020413 0.507407
3 0.000510 0.088889
5 0 0.613645 0.125926
1 0.322971 0.166667
2 0.058722 0.248148
3 0.004517 0.244444
4 0.000143 0.207407
5 0.000001 0.007407

A little explanation: k here denotes the number of OOT posts found in the top list.
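The `base` column appears to be the chance-level (hypergeometric) distribution of picking `num_top` posts uniformly at random from the `num_norm + num_oot` posts: P(k) = C(num_oot, k) C(num_norm, num_top - k) / C(num_norm + num_oot, num_top). A quick check with a helper of my own (not from the experiment code) against the values in the table:

```python
from math import comb

def baseline_pmf(k, num_norm, num_oot, num_top):
    # probability of exactly k OOT posts among num_top posts drawn at random
    return comb(num_oot, k) * comb(num_norm, num_top - k) / comb(num_norm + num_oot, num_top)

print(baseline_pmf(0, 10, 1, 1))  # ≈ 0.909091, matching base at num_norm=10, num_oot=1, num_top=1
print(baseline_pmf(0, 10, 4, 3))  # ≈ 0.329670, matching base at num_norm=10, num_oot=4, num_top=3
```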

The table looks very neat now. However, it is still difficult to compare the baseline and performance distributions this way. So let's plot them to see the shape of each distribution more clearly.

First, group them by the number of normal posts, OOT posts, and posts in the top list. Each (num_norm, num_oot, num_top) configuration represents a different random event, so we have to plot each of them separately.


In [9]:
grouped = df3.groupby(level=['num_norm', 'num_oot', 'num_top'])

Now, simply plot each group. In the plots, blue and green denote the baseline and the actual performance respectively.


In [10]:
for name, group in grouped:
    group.plot(kind='bar', legend=False, use_index=False, title='num_norm={}, num_oot={}, num_top={}'.format(*name))


Great! Now we can see that our method, in this case txt_comp_dist, performs better than the baseline most of the time, since its distribution puts higher probability on large values of k than the baseline does.

We have now plotted the distributions and can draw some conclusions, but wouldn't it be nice if we could represent each distribution with a single numerical value and compare those values to determine the superior one? That's what we're going to do now. Let's represent each distribution with its expected value. (why?)
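As a sanity check on what we are about to compute, the expected value of a pmf p over support k is E[X] = Σ k·p(k). A minimal sketch using the (num_norm=10, num_oot=1, num_top=1) values from the table above:

```python
# pmf values copied from the num_norm=10, num_oot=1, num_top=1 rows above
pmf_base = {0: 0.909091, 1: 0.090909}
pmf_perf = {0: 0.192593, 1: 0.807407}

def expected_value(pmf):
    # E[X] = sum over k of k * p(k)
    return sum(k * p for k, p in pmf.items())

print(expected_value(pmf_base))  # ≈ 0.090909
print(expected_value(pmf_perf))  # ≈ 0.807407
```

These match the first row of the array computed in the next cell.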


In [11]:
ngroup = len(grouped)
data = np.empty((ngroup, 2))   # one row per configuration: E[X] for base and perf
index = []
for i, (name, _) in enumerate(grouped):
    tmp = df3.loc[name]
    prod = tmp.T * np.array(tmp.index)   # multiply pmf by support: k * p(k)
    prod = prod.unstack(level='result')
    expval = prod.sum(axis=1, level='result').values.ravel()   # sum over k for each result
    data[i, :] = expval
    index.append(name)

In [12]:
data


Out[12]:
array([[ 0.09090909,  0.80740741],
       [ 0.27272727,  0.9       ],
       [ 0.45454545,  0.94814815],
       [ 0.28571429,  0.86666667],
       [ 0.85714286,  2.00740741],
       [ 1.42857143,  2.79259259],
       [ 0.44444444,  0.9       ],
       [ 1.33333333,  2.16666667],
       [ 2.22222222,  3.17407407],
       [ 0.01234568,  0.15555556],
       [ 0.03703704,  0.49259259],
       [ 0.0617284 ,  0.52962963],
       [ 0.04761905,  0.36296296],
       [ 0.14285714,  1.12962963],
       [ 0.23809524,  1.74074074],
       [ 0.09090909,  0.46296296],
       [ 0.27272727,  1.55185185],
       [ 0.45454545,  2.26296296]])

Doesn't look very nice, eh? Let's create a DataFrame so it can be displayed nicely.


In [13]:
index = pd.MultiIndex.from_tuples(index, names=['num_norm', 'num_oot', 'num_top'])
columns = pd.MultiIndex.from_tuples([('E[X]', 'base'), ('E[X]', 'perf')])

In [14]:
result = pd.DataFrame(data, index=index, columns=columns)

In [15]:
result


Out[15]:
E[X]
base perf
num_norm num_oot num_top
10 1 1 0.090909 0.807407
3 0.272727 0.900000
5 0.454545 0.948148
4 1 0.285714 0.866667
3 0.857143 2.007407
5 1.428571 2.792593
8 1 0.444444 0.900000
3 1.333333 2.166667
5 2.222222 3.174074
80 1 1 0.012346 0.155556
3 0.037037 0.492593
5 0.061728 0.529630
4 1 0.047619 0.362963
3 0.142857 1.129630
5 0.238095 1.740741
8 1 0.090909 0.462963
3 0.272727 1.551852
5 0.454545 2.262963

We're done! Remember that in this experiment, we used the txt_comp_dist anomalous text detection method with unigram frequency as features and the euclidean distance metric.