In [100]:

    
#Larson Hogstrom
#18.337
#PSET1
#9/28/2015

1a. What is a nice formula for the broken line in the parallel prefix curve?:

1b. Can you roughly explain the formula? (Not a formal proof)

2.The exclusive prefix computes the "sum" of every number but yourself. ($y_i = \sum_{j\ne i}x_j$)
Here sum is general, and there is no concept of subtraction. Write a recursive function similar to the one in lecture 3, that computes the exclusive prefix. Give an example why this would be preferred over subtraction.

3.Write a parallel-style code that computes the carry lookahead addition of Babbage's using only matrix multiply prefix and embarassingly parallel operations.



In [24]:

    
using PyPlot

Problem 1a



In [ ]:

    
### plot curve ### 
#parameters for step down
step_list = float([2^m for m=[1:6]])
append!(step_list,float([3*2^m for m=[1:4]]))
step_rate = m*2.5
step_down = float(0)

## line parameters
m=9/80 # slope
b=1 #intercept
x=float([1:80])

#apply function
y = zeros(Float64,80)
for (ij,j) in enumerate(x)
    if ij in step_list
        step_down = step_down + step_rate
    else
        null = 1
    end
    y[ij] = m*j + b - step_down
end



In [131]:

    
#make plot
ax = axes()
ax[:set_ylim]([0,8])
p = plot(x,y,".")
xlabel("Number of processors, p")
ylabel("Speedup, r")
grid("on")

Problem 1b

The formula can be expressed as follows:

$$ \textit{speedup} = 1 + mp - r * \begin{cases} 1 & \mbox{if } p \in \{2^k\} \\ 0 & \text{otherwise} \\ \end{cases} -r* \begin{cases} 1 & \mbox{if } p \in \{3(2^k)\} \\ 0 & \text{otherwise} \\ \end{cases} $$

where $m$ is the linear term for increasing speed with additional processors. The variable $r$ is a step down term associated with increased overhead for adding new processors. This penalty is most noticable whenever the number of processors is not able to efficently divide the matrix multiply work (most efficent at $2^k$ and $3(2^k)$).

Problem 2

https://en.wikipedia.org/wiki/Prefix_sum



In [98]:

    
# basic prefix sum
function prefix_serial!(y, ⋅)
    for i=2:length(y)
        y[i] = y[i-1] ⋅ y[i]
    end
    y
end









    Out[98]:





prefix_serial! (generic function with 1 method)



In [99]:

    
# define prefix exclusive sume - naive 
function prefix_exclusive_serial(y, +)
    m = zeros(Int64,length(y))
    for i=1:length(y)
        if i == 1
            m[i] = sum(y[i+1:end])
        else
            m[i] = sum(y[1:i-1]) + sum(y[i+1:end])
        end
    end
    m
end









    Out[99]:





prefix_exclusive_serial (generic function with 1 method)

why might exclusive prefix sum be prefered over subtraction?

One reason would be that subtraction could introduce large floating point errors in the during the sumation.

$$ d.ddd ... d \text{ x } \beta^e $$

Where $ d.ddd ... d $ is the significant and $\beta$ is the significand. One scenario would be if the ith entry of the input matrix where much bigger than all other entries. In such a case, the smaller values might get washed out when added to the largest floating point value. In such a case $\sum_{j\ne i}x_j \ne (\sum_{j}x_j) - x_i$.



In [6]:

    
### example showing that subtraction can introduce floating point errors ### 
x_with = [.3 4 10^44 1]
x_without = [.3 4 1]
(sum(x_with) - 10^44) == sum(x_without)









    Out[6]:





false



In [34]:

    
#prefix recursive, power of 2 only

function prefix_recursive(x,⋅)
    n = length(x)
    if n>1 
        
        #w=x[1:2:n]+x[2:2:n] #should this be the . operator? 
        w=x[1:2:n]⋅x[2:2:n] 
     
        w=prefix_recursive(w,⋅)
      
        x[2:2:n] = w
        x[1:2:n] += [0, w[1:end-1]] # update odds (try update evens, or try exclusive)
   end
   x
end









    Out[34]:





prefix_recursive (generic function with 1 method)



In [103]:

    
#Take an input array, pop value for the ith entry, and perform the usual prefix_recursive
#This only works on arrays of size 2^n+1
function prefix_exclusive_serial(y, +)
    n = length(x)
    m = zeros(Int64,length(y))
    x_small = zeros(Int64,length(y)-1)
    if ~ispow2(n) #base case
        for i=1:length(y)
            if i == 1
                x_small = y[i+1:end]
            else
                x_small = y[1:i-1]
                append!(x_small,y[i+1:end])
            end
            m[i] = sum(x_small)
        end
    end
    m
    #x_small
end









    Out[103]:





prefix_exclusive_serial (generic function with 1 method)



In [109]:

    
y=[1:9]
print(sum(y))
prefix_exclusive_serial(y,+)
#prefix_recursive([1:8],+)
#sqrt(8)
#isqrt(16)
ispow2(8)









    



45





    Out[109]:





true



In [102]:

    
?ispow2









    



Base.ispow2(n) -> Bool

   Test whether "n" is a power of two

Problem 3

Write a parallel-style code that computes the carry lookahead addition of Babbage's using only matrix multiply prefix and embarassingly parallel operations.



In [318]:

    
#start by defining a function to convert an integer a 7-element array of bits
function int_to_bit_array(x)
    #convert intiger to 7 bit array
    m = bits(x)
    g = m[58:end]
    y = [int64(string(i)) for i=g] 
    y
end









    Out[318]:





int_to_bit_array (generic function with 1 method)

For the cary-lookahead adder, the result of the 1 bit full adder depend on three values: the input bits (a and b) and the carry value (c).

The resulting sum for a 1 bit adder can be defined as follows:

$$s = a ⊕ b ⊕ c_{in} $$

In addition to yielding a sum value, the adder also outputs a carry value $c_{out}$ that is passed to the next adder in the sequence:

$$ c_{out} = ab + ac_{in} ⊕ bc_{in} $$$$ = ab + c_{in}(a ⊕ b) $$

To keep track of many carry results and passes, this formulation can be adapted to matrix notation:

$$ \begin{pmatrix} c_i \\ 1 \end{pmatrix} = \begin{pmatrix} a_i⊕b_i & a_i b_i \\ 0 & 1 \end{pmatrix} \begin{pmatrix} c_{i-1} \\ 1 \end{pmatrix} $$

Given the starting carry, the ith cary can be calulated as follows

$$\begin{pmatrix} c_{i} \\ 1 \end{pmatrix} = M_i M_{i-1} ... M_1 \begin{pmatrix} c_{1} \\ 1 \end{pmatrix} $$

Where the matrix M is defined as follows

$$ M_i = \begin{pmatrix} a_i⊕b_i & a_i b_i \\ 0 & 1 \end{pmatrix} $$



In [272]:

    
function mtrx_build(a,b)
    M = [a$b a*b; 0 1]
    M
end









    Out[272]:





mtrx_build (generic function with 1 method)



In [311]:

    
n1 = 10
n2 = 2
x1 = int_to_bit_array(n1)
x2 = int_to_bit_array(n2)
reverse_index = [8-i for i in 1:7]
c = zeros(Int64,8)
Mm = [1 0; 0 1]
for (ix,i) in enumerate(reverse_index)
    M = mtrx_build(x1[i],x2[i]) 
    Mm = M * Mm
    Cv = [c[i+1] 1]'
    product1 = Mm * Cv
    c[i] = product1[1]
end
s = x1 $ x2 $ c[2:8]
# check that reported bit sum matches bits of true sum
s == int_to_bit_array(n1+n2)









    Out[311]:





true

Calculate bit addition using embarassingly parallel operations



In [283]:

    
killallworkers()=rmprocs(procs()[2:end])









    Out[283]:





killallworkers (generic function with 1 method)



In [315]:

    
n1 = 8
n2 = 5
x1 = int_to_bit_array(n1)
x2 = int_to_bit_array(n2)
reverse_index = [8-i for i in 1:7]
c = zeros(Int64,8)
Mm = [1 0; 0 1]

killallworkers()
addprocs(4)
@parallel for (ix,i) in enumerate(reverse_index)
    println(ix)
    M = mtrx_build(x1[i],x2[i]) 
    Mm = M * Mm
    Cv = [c[i+1] 1]'
    product1 = Mm * Cv
    c[i] = product1[1]
end

s = x1 $ x2 $ c[2:8]
# check that reported bit sum matches bits of true sum
s == int_to_bit_array(n1+n2)









    Out[315]:





true






    



exception on 82: exception on 85: exception on 83: exception on 84: ERROR: `getindex` has no method matching getindex(::Enumerate{Array{Int64,1}}, ::UnitRange{Int64})
 in anonymous at no file:1446
 in anonymous at multi.jl:1279
 in anonymous at multi.jl:848
 in run_work_thunk at multi.jl:621
 in run_work_thunk at multi.jl:630
 in anonymous at task.jl:6
ERROR: `getindex` has no method matching getindex(::Enumerate{Array{Int64,1}}, ::UnitRange{Int64})
 in anonymous at no file:1446
 in anonymous at multi.jl:1279
 in anonymous at multi.jl:848
 in run_work_thunk at multi.jl:621
 in run_work_thunk at multi.jl:630
 in anonymous at task.jl:6
ERROR: `getindex` has no method matching getindex(::Enumerate{Array{Int64,1}}, ::UnitRange{Int64})
 in anonymous at no file:1446
 in anonymous at multi.jl:1279
 in anonymous at multi.jl:848
 in run_work_thunk at multi.jl:621
 in run_work_thunk at multi.jl:630
 in anonymous at task.jl:6
ERROR: `getindex` has no method matching getindex(::Enumerate{Array{Int64,1}}, ::UnitRange{Int64})
 in anonymous at no file:1446
 in anonymous at multi.jl:1279
 in anonymous at multi.jl:848
 in run_work_thunk at multi.jl:621
 in run_work_thunk at multi.jl:630
 in anonymous at task.jl:6

*Collaboration acknowlegments:

I discussed stratagies for solving Problem 2 with A. Albaiz and Problem 3 with M. Sindi.