A nonparametric procedure is a statistical procedure that has certain desirable properties that hold under relatively mild assumptions regarding the underlying populations from which the data are obtained.
Hollander, Myles, Douglas A. Wolfe, and Eric Chicken. Nonparametric Statistical Methods. John Wiley & Sons, 2013.
The term nonparametric is imprecise. The related term distribution-free has a more precise meaning. ... The distribution-free property enables one to obtain the distribution of the statistic under the null hypothesis without specifying the underlying distribution of the data.
Hollander, Myles, Douglas A. Wolfe, and Eric Chicken. Nonparametric Statistical Methods. John Wiley & Sons, 2013.
Bayesian nonparametrics
Probability Space: Let $\mathbb{X}$ be a standard Borel space with Borel $\sigma$-field $\mathcal{B}_\mathbb{X}$ and $\mathbb{P}$ be the space of probability measures on $(\mathbb{X},\mathcal{B}_{\mathbb{X}})$.
Nonparametric Prior: Let $\mathbb{M}$ be the class of all probability measures on $(\mathbb{P},\mathcal{B}_\mathbb{P})$. We call the elements $\mu \in \mathbb{M}$ nonparametric priors.
An element $\mu \in \mathbb{M}$ is called a Dirichlet process $\mathcal{D}p(s,\alpha)$ on $\mathbb{X}$, with concentration parameter $s>0$ and base measure $\alpha$,
if for every finite measurable partition $B_1,\dots,B_m$ of $\mathbb{X}$,
the vector $(P(B_1),\dots,P(B_m))$
has a Dirichlet distribution with parameters $(s\alpha(B_1),\dots,s\alpha(B_m))$
The Dirichlet parameters involve only the product $s\,\alpha(\cdot)$: this single base measure $\tilde{\alpha}(\cdot)=s\,\alpha(\cdot)$ has total mass $\tilde{\alpha}(\mathbb{X})=s\in \mathbb{R}$, which is finite, and thus we can always normalize it!
Normalization: $$ \tilde{\alpha}(\cdot)=s\,\alpha(\cdot) \quad \text{with} \quad \alpha(\mathbb{X})=1 $$
Note that $\alpha(\cdot)$ is now a probability measure.
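As a quick numerical illustration of the finite-dimensional definition above, one can draw $(P(B_1),\dots,P(B_m))$ for a toy partition with Distributions.jl. The partition size and the values of $s$ and $\alpha(B_i)$ below are illustrative choices, not taken from the text:

```julia
using Distributions

# Illustrative choices: concentration s and the base probability
# measure alpha evaluated on a partition B_1,...,B_4 of X.
s = 4.0
alphaB = [0.25, 0.25, 0.25, 0.25]        # alpha(B_1),...,alpha(B_4)

# By definition, (P(B_1),...,P(B_4)) ~ Dirichlet(s*alpha(B_1),...,s*alpha(B_4)).
d  = Dirichlet(s .* alphaB)
PB = rand(d)

println(sum(PB))    # a random probability vector, so it sums to 1
println(mean(d))    # E[P(B_i)] = alpha(B_i): alpha is the prior mean of P
```

Note that the prior expectation of each cell probability is $\alpha(B_i)$, which is why $\alpha$ plays the role of a prior guess of $P$.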
In [8]:
using Gadfly
using Compose

# Two vertical bars of height 0.5 at x = -1 and x = 1
# (the atoms of the prior base measure).
myplot = plot(x=[0.0], y=[0.0],
    Guide.annotation(compose(context(),
        line([(-1, 0), (-1, 0.5)]),
        line([(1, 0), (1, 0.5)]),
        stroke(colorant"blue"), linewidth(0.4))),
    Stat.xticks(ticks=[-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]),
    Stat.yticks(ticks=[0.0, 0.5, 1.0]),
    xintercept=[0.0], Geom.line,
    Guide.ylabel("Probability"),
    Theme(major_label_font_size=15pt, minor_label_font_size=13pt))
draw(PNG("plots/real.png", 6inch, 3inch), myplot)
In [2]:
xv = 0:0.01:1
# Prior density of P(B_1): Beta(2,2), i.e. 6 x (1-x)
yv = 6 .* xv .* (1 .- xv)
densplot = plot(x=xv, y=yv, Geom.line,
    Guide.xlabel("P(B<sub>1</sub>)"), Guide.ylabel("Density"),
    Theme(panel_fill=colorant"black", default_color=colorant"orange",
          major_label_font_size=15pt, minor_label_font_size=13pt))
draw(PNG("plots/densplot.png", 6inch, 3inch), densplot)
One of the most remarkable properties of DP priors is conjugacy: the posterior of $P$ is again a DP. Let $P \sim \mathcal{D}p(s,\alpha)$ and let $x_1,\dots,x_n$ be i.i.d. observations from $P$. Then the posterior distribution of $P$ is
$$
P|x_1,\dots,x_n \sim \mathcal{D}p\left(s+n, \frac{s}{s+n} \alpha+ \frac{1}{s+n} \sum\limits_{i=1}^n \delta_{x_i}\right),
$$
where $\delta_{x_i}$ is an atomic probability measure centered at $x_i$.
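The conjugate update can be checked on any finite partition: by the formula above, $(P(B_1),\dots,P(B_m)) \mid x_1,\dots,x_n$ is Dirichlet with parameters $s\,\alpha(B_k) + \#\{i : x_i \in B_k\}$. The sketch below uses an illustrative three-cell partition and made-up data; none of these values come from the text:

```julia
using Distributions

# Illustrative partition of R: B1 = (-Inf,0], B2 = (0,1], B3 = (1,Inf),
# with a uniform base probability measure over the three cells.
s      = 2.0
alphaB = [1/3, 1/3, 1/3]            # alpha(B_1), alpha(B_2), alpha(B_3)
x      = [-0.4, 0.3, 0.7, 1.2]      # made-up observations
n      = length(x)

# sum_i delta_{x_i}(B_k): number of observations falling in each cell
counts = [count(xi -> xi <= 0, x),
          count(xi -> 0 < xi <= 1, x),
          count(xi -> xi > 1, x)]

# Posterior: (P(B_1),P(B_2),P(B_3)) | x ~ Dirichlet(s*alpha(B_k) + counts[k]),
# matching Dp(s+n, (s/(s+n))*alpha + (1/(s+n))*sum_i delta_{x_i}).
post = Dirichlet(s .* alphaB .+ counts)

println(mean(post))                  # posterior mean of the cell probabilities
println(sum(s .* alphaB .+ counts))  # total concentration: s + n
```

The total concentration of the posterior Dirichlet is $s+n$, exactly the updated concentration parameter in the DP posterior.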
In [3]:
using Gadfly
using Compose

# Posterior: the prior atoms (blue, shrunk to height 0.4) plus an atom
# of height 0.1 at each of six further points (orange).
myplot = plot(x=[0.0], y=[0.0],
    Guide.annotation(compose(context(),
        line([(-1, 0), (-1, 0.4)]),
        line([(1, 0), (1, 0.4)]),
        stroke(colorant"blue"), linewidth(0.4))),
    Guide.annotation(compose(context(),
        line([(-1.5, 0), (-1.5, 0.1)]),
        line([(-0.5, 0), (-0.5, 0.1)]),
        line([(0.25, 0), (0.25, 0.1)]),
        line([(0.75, 0), (0.75, 0.1)]),
        line([(1.25, 0), (1.25, 0.1)]),
        line([(1.75, 0), (1.75, 0.1)]),
        stroke(colorant"orange"), linewidth(0.4))),
    Stat.xticks(ticks=[-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]),
    Stat.yticks(ticks=[0.0, 0.5, 1.0]),
    xintercept=[0.0], Geom.line,
    Guide.ylabel("Probability"),
    Theme(major_label_font_size=15pt, minor_label_font_size=13pt))
draw(PNG("plots/realpost.png", 6inch, 3inch), myplot)
In [4]:
xv = 0:0.01:1
# Posterior density of P(B_1): Beta(4,6), i.e. 504 x^3 (1-x)^5
yv = 504 .* xv.^3 .* (1 .- xv).^5
densplot = plot(x=xv, y=yv, Geom.line,
    Guide.xlabel("P(B<sub>1</sub>)"), Guide.ylabel("Density"),
    Theme(panel_fill=colorant"black", default_color=colorant"orange",
          major_label_font_size=15pt, minor_label_font_size=13pt))
draw(PNG("plots/densplotpost.png", 6inch, 3inch), densplot)
In [5]:
using Gadfly
using Compose

# Seven atoms of equal height 1/7: the prior atom at -1.5 (blue) and
# the six observations (orange).
myplot = plot(x=[0.0], y=[0.0],
    Guide.annotation(compose(context(),
        line([(-1.5, 0), (-1.5, 1/7)]),
        stroke(colorant"blue"), linewidth(0.4))),
    Guide.annotation(compose(context(),
        line([(-1, 0), (-1, 1/7)]),
        line([(-0.5, 0), (-0.5, 1/7)]),
        line([(0.25, 0), (0.25, 1/7)]),
        line([(0.75, 0), (0.75, 1/7)]),
        line([(1.25, 0), (1.25, 1/7)]),
        line([(1.75, 0), (1.75, 1/7)]),
        stroke(colorant"orange"), linewidth(0.4))),
    Stat.xticks(ticks=[-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]),
    Stat.yticks(ticks=[0.0, 0.5, 1.0]),
    xintercept=[0.0], Geom.line,
    Guide.ylabel("Probability"),
    Theme(major_label_font_size=15pt, minor_label_font_size=13pt))
draw(PNG("plots/realpost1.png", 6inch, 3inch), myplot)
It actually suffices to consider the partition
$B_1=(-\infty,-1.5]$, $~B_2=(-1.5,-1]$, $~B_3=(-1,-0.5]$, $\dots$, $B_7=(1.25,\infty)$, so that each cell contains exactly one atom.
$$
\begin{array}{rcl}
(P(B_1),\dots,P(B_{7}))&\sim& Dir(s_n\alpha_n(B_1),\dots,s_n\alpha_n(B_7))\\
&=& Dir(s,1,1,1,1,1,1)
\end{array}
$$
Note that the support of $P$ is atomic, hence
$$
P=w_0 \delta_{-1.5}+ w_1 \delta_{-1}+ w_2 \delta_{-0.5}+\dots+ w_6 \delta_{1.75},
$$
with
$(w_0,w_1,\dots,w_6)\sim Dir(s,1,\dots,1)$
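A draw from this atomic posterior is easy to simulate: sample the weight vector and attach it to the seven atoms. The value $s=1$ below is an illustrative choice, since the text leaves $s$ unspecified:

```julia
using Distributions

# The seven atoms: the prior atom at -1.5 plus the six observations.
atoms = [-1.5, -1.0, -0.5, 0.25, 0.75, 1.25, 1.75]
s = 1.0                                  # illustrative concentration

# (w_0, w_1, ..., w_6) ~ Dir(s, 1, 1, 1, 1, 1, 1)
w = rand(Dirichlet([s; ones(6)]))

# P = sum_k w_k * delta_{atoms[k]}; for instance, P((0, Inf)) is the
# total weight of the atoms lying above zero.
P_pos = sum(w[atoms .> 0])
println(P_pos)
```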
So we have a hierarchical model: $P \sim \mathcal{D}p(s,\alpha)$ and $x_1,\dots,x_n \mid P \overset{\text{iid}}{\sim} P$.
Goal: is the median of some distribution $P$ greater than zero?
Known $P$: just check whether $P(0,\infty)>P(-\infty,0]$, or equivalently $P(0,\infty)>0.5$.
Unknown $P$: use observations from $P$ to estimate it and query the posterior (the Bayesian way).
Our estimate of $P$ is $$ P=w_0 P_0+ \sum_{i=1}^n w_i \delta_{x_i}, $$ with $(w_0,w_1,\dots,w_n)\sim Dir(s,1,\dots,1)$, where $P_0$ is the prior probability measure.
The goal is to compute
$$ \mathcal{P}(P(0,\infty)>0.5) =\mathcal{P}(w_3 +w_4+w_5+w_6 >0.5), $$
where $\mathcal{P}$ is computed w.r.t. $(w_0,w_1,\dots,w_n)\sim Dir(s,1,\dots,1)$. Here the prior weight $w_0$ is excluded from $P(0,\infty)$, i.e., the prior mass $w_0 P_0$ is placed on $(-\infty,0]$.
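Since `Bsigntest1.jl` is an external file whose code is not shown in the notebook, here is a self-contained Monte Carlo sketch of the same probability, assuming $s=1$ (the text does not fix $s$):

```julia
using Distributions, Statistics

s = 1.0                         # illustrative concentration parameter
N = 100_000                     # number of Monte Carlo samples

# (w_0, w_1, ..., w_6) ~ Dir(s, 1, ..., 1); the atoms above zero are
# 0.25, 0.75, 1.25, 1.75, whose weights are w_3, ..., w_6
# (rows 4 to 7 of the sample matrix below).
d = Dirichlet([s; ones(6)])
W = rand(d, N)                  # 7 x N matrix, one weight vector per column

prob = mean(sum(W[4:7, :], dims=1) .> 0.5)
println("Monte Carlo estimate of P(P(0,∞) > 0.5): $prob")
```

For $s=1$ all seven Dirichlet parameters equal one, so $w_3+w_4+w_5+w_6 \sim \mathrm{Beta}(4,3)$ by aggregation, and the estimate should be close to $P(\mathrm{Beta}(4,3) > 0.5) = 42/64 \approx 0.656$.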
In [6]:
using Distributions, Gadfly, Compose, DataFrames
include("plots/Bsigntest1.jl")   # defines Bsigntest (not shown here)

# Bayesian sign test
observ = [-1; -0.5; 0.25; 0.75; 1.25; 1.75]
data = Bsigntest(observ, zeros(length(observ), 1), 0, [1, 0])

# Histogram of the posterior samples of P(0,∞)
df0 = DataFrame(x=data[2, :][:])
p0 = plot(df0, x=:x, Geom.histogram,
    Theme(major_label_font_size=13pt, minor_label_font_size=12pt,
          key_label_font_size=11pt),
    Guide.xlabel("P(0,∞)"))
set_default_plot_size(15cm, 8cm)
display(p0)

# Fraction of samples with P(0,∞) > 0.5
prob = count(z -> z > 0.5, data[2, :][:]) / length(data[2, :][:])
println("Probability that P(0,\u221e)>0.5 is $prob")
The goal is now to compute
$$ \mathcal{P}(P(0,\infty)>0.5) =\mathcal{P}(w_0 +w_3 +w_4+w_5+w_6 >0.5), $$
where $\mathcal{P}$ is computed w.r.t. $(w_0,w_1,\dots,w_n)\sim Dir(s,1,\dots,1)$. Here the prior weight $w_0$ is included in $P(0,\infty)$, i.e., the prior mass $w_0 P_0$ is placed on $(0,\infty)$.
In [7]:
using Distributions, Gadfly, Compose, DataFrames
include("plots/Bsigntest1.jl")   # defines Bsigntest (not shown here)

# Bayesian sign test
observ = [-1; -0.5; 0.25; 0.75; 1.25; 1.75]
data = Bsigntest(observ, zeros(length(observ), 1), 0, [0, 1])

# Histogram of the posterior samples of P(0,∞)
df0 = DataFrame(x=data[2, :][:])
p0 = plot(df0, x=:x, Geom.histogram,
    Theme(major_label_font_size=13pt, minor_label_font_size=12pt,
          key_label_font_size=11pt),
    Guide.xlabel("P(0,∞)"))
set_default_plot_size(15cm, 8cm)
display(p0)

# Fraction of samples with P(0,∞) > 0.5
prob = count(z -> z > 0.5, data[2, :][:]) / length(data[2, :][:])
println("Probability that P(0,\u221e)>0.5 is $prob")