For the input


In [4]:
x = [1, 1, 0, 1, 0, 0, 0, 1];

If choose the relative error to be $\epsilon=0.1$ then

$k = [\frac{1}{\epsilon}] = 10$

and therefore $k/2$ is an integer

According to the Invariant 2 (p.4)

If $C_j = 2^r$ is the size of $j$-th bucket then we are guaranteed that there are at least $\frac{k}{2}$ buckets of sizes $1,\dots,2^{r-1}$ with the indexes less than $j$

To cover all active $1$'s we need no more than

$m<=(\frac{k}{2} +1)(\log (\frac{2N}{k}+1)+1)$

buckets.

In case of $k=10$, it is 6 * ((log(2N/10) +1)+1)

This all follows from the exponential ordering in sizes

Invariant 2 is important.

.. if $2^h$ is the size of the last bucket then for every bucket size other that the size of the last bucket there are at most $k/2+1$ and at least $k/2$ buckets of that size. Why???


In [17]:
function hist_int(v::Array{Int,1})
    a=unique(v)
    hist::Array{Int,1} = zeros(maximum(a))
    for e in a
        hist[e]+=1
    end 
    hist
end 
a=rand(1:10,10)
hist_int(a)


Out[17]:
10-element Array{Int64,1}:
 0
 1
 1
 0
 1
 0
 1
 0
 1
 1

In [ ]: