http://ipg.idsia.ch/tutorials/2016/bayesian-tests-ml/
A nonparametric procedure is a statistical procedure that has certain desirable properties that hold under relatively mild assumptions regarding the underlying populations from which the data are obtained.
Hollander, Myles, Douglas A. Wolfe, and Eric Chicken. Nonparametric Statistical Methods. John Wiley and Sons, 2013.
The term nonparametric is imprecise. The related term distribution-free has a more precise meaning... The distribution-free property enables one to obtain the distribution of the statistic under the null hypothesis without specifying the underlying distribution of the data.
Hollander, Myles, Douglas A. Wolfe, and Eric Chicken. Nonparametric Statistical Methods. John Wiley and Sons, 2013.
The typical procedure is to: 1) compute, for each dataset, the mean accuracy of each classifier (averaged over the cross-validation runs); 2) perform NHST to establish whether the two classifiers have different performance based on these mean differences of accuracy.
Suggested (Nonparametric) Tests: the Wilcoxon signed-rank test and the sign test (both are used below).
Why? Because they are distribution-free: no normality assumption on the per-dataset accuracy differences is required (see the quotes above).
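Before turning to the real data, here is a minimal sketch of the whole two-step procedure on synthetic accuracies (the numbers below are made up purely for illustration; the real data are loaded next):
using HypothesisTests
# Step 1 (synthetic stand-in): mean accuracy of two classifiers on 10 datasets
accA = 0.80 .+ 0.05 .* randn(10)
accB = 0.82 .+ 0.05 .* randn(10)
# Step 2: nonparametric NHST on the paired per-dataset differences
println(pvalue(SignedRankTest(accA, accB)))   # Wilcoxon signed-rank
println(pvalue(SignTest(accA, accB)))         # sign test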
In [1]:
using Distributions, Gadfly, Compose, DataFrames, Fontconfig, Cairo
using DelimitedFiles, Statistics   # readdlm and mean on Julia >= 0.7
include("/home/benavoli/Data/Work_Julia/Julia/Plots/plot_data1.jl")
include("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Tests/Bsigntest.jl")
# Load classifier/dataset identifiers and names, and the raw accuracies
ClassID = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/ClassifierID.dat", ',')
ClassNames = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/ClassifierNames.dat", ',')
DatasetID = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/DatasetID.dat", ',');
DatasetNames = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/DatasetNames.dat", ',');
Percent_correct = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/Percent_correct.dat", ',');
cl1a = 1 # nbc
cl2a = 2 # aode
println("Comparison of ", ClassNames[cl1a,1], " vs. ", ClassNames[cl2a,1])
println()
# Compute the mean accuracy of each classifier on each dataset
indi = findall(x -> x == cl1a, vec(ClassID))   # rows for classifier cl1a
indj = findall(x -> x == cl2a, vec(ClassID))   # rows for classifier cl2a
accNbc = Float64[]
accAode = Float64[]
for d = 1:Int(maximum(DatasetID))
    indd = findall(x -> x == d, vec(DatasetID))   # rows for dataset d
    indid = intersect(indi, indd)
    indjd = intersect(indj, indd)
    push!(accNbc, mean(Percent_correct[indid]) / 100)    # accuracy in [0, 1]
    push!(accAode, mean(Percent_correct[indjd]) / 100)
end
The Wilcoxon signed-rank test is typically used as follows.
Assumptions: the paired differences $z_i$ (here, the per-dataset differences of mean accuracy) are i.i.d. and symmetrically distributed around their median.
The test statistic is
$$
t = \sum_{1 \leq i \leq j \leq q} t^+_{ij},
$$
where
$$
t^+_{ij} = \left\{\begin{array}{ll} 1 & \text{if } z_i \geq -z_j,\\ 0 & \text{otherwise,} \end{array}\right.
$$
and $z_1,\dots,z_q$ are the paired differences. Example: for $z=\{-2,-1,4,5\}$ the statistic is $t=7$; for $z=\{-1,4,5\}$ it is $t=5$.
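As a sanity check, here is a minimal Julia sketch of this statistic (the function name is ours, not part of any library); it reproduces the two values above:
# Count the Walsh pairs (i <= j) with z_i >= -z_j, i.e. (z_i + z_j)/2 >= 0
function signed_rank_statistic(z)
    q = length(z)
    return sum(z[i] >= -z[j] for i in 1:q for j in i:q)
end
signed_rank_statistic([-2, -1, 4, 5])   # returns 7
signed_rank_statistic([-1, 4, 5])       # returns 5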
The $p$-value of the statistic is computed under the null hypothesis of zero median difference.
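For small $q$, the null distribution of $t$ can be obtained exactly by enumerating all $2^q$ sign assignments of $|z_1|,\dots,|z_q|$. The sketch below (our own helper, not the HypothesisTests routine used in the next cell) computes a two-sided $p$-value this way:
# Exact two-sided p-value under the zero-median null: enumerate every
# sign assignment of |z_1|,...,|z_q| (feasible only for small q)
function exact_signed_rank_pvalue(z)
    q = length(z)
    mid = q * (q + 1) / 4                  # mean of t under the null
    tobs = signed_rank_statistic(z)
    hits = 0
    for s in 0:(2^q - 1)                   # each bit pattern fixes the signs
        zs = [(s >> (k - 1)) & 1 == 1 ? -abs(z[k]) : abs(z[k]) for k in 1:q]
        hits += abs(signed_rank_statistic(zs) - mid) >= abs(tobs - mid)
    end
    return hits / 2^q
end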
In [8]:
using HypothesisTests, Printf   # Printf provides @printf on Julia >= 0.7
# Plot the per-dataset accuracy differences, then test the null of zero median
p = plot_data1(cl1a, cl2a, "all datasets", accNbc - accAode, -0.15, 0.1)
display(p)
p1 = pvalue(SignedRankTest(accNbc, accAode))   # Wilcoxon signed-rank test
p2 = pvalue(SignTest(accNbc, accAode))         # sign test
@printf "p-value SignedRank=%0.8f " p1
@printf "p-value Sign=%0.8f\n" p2
In [3]:
# Reload the data and repeat the comparison for nbc vs. hnb
ClassID = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/ClassifierID.dat", ',')
ClassNames = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/ClassifierNames.dat", ',')
DatasetID = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/DatasetID.dat", ',');
DatasetNames = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/DatasetNames.dat", ',');
Percent_correct = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/Percent_correct.dat", ',');
cl1b = 1 # nbc
cl2b = 3 # hnb
println("Comparison of ", ClassNames[cl1b,1], " vs. ", ClassNames[cl2b,1])
println()
# Compute the mean accuracy of each classifier on each dataset
indi = findall(x -> x == cl1b, vec(ClassID))
indj = findall(x -> x == cl2b, vec(ClassID))
accNbc = Float64[]
accHnb = Float64[]
for d = 1:Int(maximum(DatasetID))
    indd = findall(x -> x == d, vec(DatasetID))
    indid = intersect(indi, indd)
    indjd = intersect(indj, indd)
    push!(accNbc, mean(Percent_correct[indid]) / 100)
    push!(accHnb, mean(Percent_correct[indjd]) / 100)
end
In [4]:
using HypothesisTests
p1 = pvalue(SignedRankTest(accNbc, accHnb))   # Wilcoxon signed-rank test
p2 = pvalue(SignTest(accNbc, accHnb))         # sign test
p = plot_data1(cl1b, cl2b, "all datasets", accNbc - accHnb, -0.15, 0.1)
display(p)
@printf "p-value SignedRank=%0.4f " p1
@printf "p-value Sign=%0.4f\n" p2
In [5]:
# Reload the data and repeat the comparison for j48 vs. j48gr
ClassID = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/ClassifierID.dat", ',')
ClassNames = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/ClassifierNames.dat", ',')
DatasetID = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/DatasetID.dat", ',');
DatasetNames = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/DatasetNames.dat", ',');
Percent_correct = readdlm("/home/benavoli/Data/Work_Julia/Tutorial/Julia/Data/Percent_correct.dat", ',');
cl1b = 4 # j48
cl2b = 5 # j48gr
println("Comparison of ", ClassNames[cl1b,1], " vs. ", ClassNames[cl2b,1])
println()
# Compute the mean accuracy of each classifier on each dataset
indi = findall(x -> x == cl1b, vec(ClassID))
indj = findall(x -> x == cl2b, vec(ClassID))
accJ48 = Float64[]
accJ48gr = Float64[]
for d = 1:Int(maximum(DatasetID))
    indd = findall(x -> x == d, vec(DatasetID))
    indid = intersect(indi, indd)
    indjd = intersect(indj, indd)
    push!(accJ48, mean(Percent_correct[indid]) / 100)
    push!(accJ48gr, mean(Percent_correct[indjd]) / 100)
end
The same tests are now run twice for this pair: first on a subset of the datasets (numbers 20 to 50), then on all of them; compare the two pairs of p-values.
In [6]:
using HypothesisTests
p1 = pvalue(SignedRankTest(accJ48[20:50], accJ48gr[20:50]))   # subset only
p2 = pvalue(SignTest(accJ48[20:50], accJ48gr[20:50]))         # subset only
p = plot_data1(cl1b, cl2b, "datasets 20 to 50", accJ48[20:50] - accJ48gr[20:50], -0.013, 0.011)
display(p)
@printf "p-value SignedRank=%0.4f " p1
@printf "p-value Sign=%0.4f\n" p2
In [7]:
using HypothesisTests
p1=pvalue(SignedRankTest(accJ48,accJ48gr))
p2=pvalue(SignTest(accJ48,accJ48gr))
p=plot_data1(cl1b,cl2b,"all datasets",accJ48-accJ48gr,-0.013,0.011)
display(p)
@printf "p-value SignedRank=%0.4f " p1
@printf "p-value Sign=%0.4f\n" p2