In [100]:
#Larson Hogstrom
#18.337
#PSET1
#9/28/2015
1a. What is a nice formula for the broken line in the parallel prefix curve?:
1b. Can you roughly explain the formula? (Not a formal proof)
2.The exclusive prefix computes the "sum" of every number but yourself. ($y_i = \sum_{j\ne i}x_j$)
Here sum is general, and there is no concept of subtraction. Write a recursive function similar to the
one in lecture 3, that computes the exclusive prefix. Give an example why this would be preferred over
subtraction.
3.Write a parallel-style code that computes the carry lookahead addition of Babbage's using only matrix multiply prefix and embarassingly parallel operations.
In [24]:
using PyPlot
In [ ]:
### plot curve ###
#parameters for step down
step_list = float([2^m for m=[1:6]])
append!(step_list,float([3*2^m for m=[1:4]]))
step_rate = m*2.5
step_down = float(0)
## line parameters
m=9/80 # slope
b=1 #intercept
x=float([1:80])
#apply function
y = zeros(Float64,80)
for (ij,j) in enumerate(x)
if ij in step_list
step_down = step_down + step_rate
else
null = 1
end
y[ij] = m*j + b - step_down
end
In [131]:
#make plot
ax = axes()
ax[:set_ylim]([0,8])
p = plot(x,y,".")
xlabel("Number of processors, p")
ylabel("Speedup, r")
grid("on")
The formula can be expressed as follows:
$$ \textit{speedup} = 1 + mp - r * \begin{cases} 1 & \mbox{if } p \in \{2^k\} \\ 0 & \text{otherwise} \\ \end{cases} -r* \begin{cases} 1 & \mbox{if } p \in \{3(2^k)\} \\ 0 & \text{otherwise} \\ \end{cases} $$where $m$ is the linear term for increasing speed with additional processors. The variable $r$ is a step down term associated with increased overhead for adding new processors. This penalty is most noticable whenever the number of processors is not able to efficently divide the matrix multiply work (most efficent at $2^k$ and $3(2^k)$).
2.The exclusive prefix computes the "sum" of every number but yourself. ($y_i = \sum_{j\ne i}x_j$)
Here sum is general, and there is no concept of subtraction. Write a recursive function similar to the
one in lecture 3, that computes the exclusive prefix. Give an example why this would be preferred over
subtraction.
In [98]:
# basic prefix sum
function prefix_serial!(y, ⋅)
for i=2:length(y)
y[i] = y[i-1] ⋅ y[i]
end
y
end
Out[98]:
In [99]:
# define prefix exclusive sume - naive
function prefix_exclusive_serial(y, +)
m = zeros(Int64,length(y))
for i=1:length(y)
if i == 1
m[i] = sum(y[i+1:end])
else
m[i] = sum(y[1:i-1]) + sum(y[i+1:end])
end
end
m
end
Out[99]:
One reason would be that subtraction could introduce large floating point errors in the during the sumation.
$$ d.ddd ... d \text{ x } \beta^e $$Where $ d.ddd ... d $ is the significant and $\beta$ is the significand. One scenario would be if the ith entry of the input matrix where much bigger than all other entries. In such a case, the smaller values might get washed out when added to the largest floating point value. In such a case $\sum_{j\ne i}x_j \ne (\sum_{j}x_j) - x_i$.
In [6]:
### example showing that subtraction can introduce floating point errors ###
x_with = [.3 4 10^44 1]
x_without = [.3 4 1]
(sum(x_with) - 10^44) == sum(x_without)
Out[6]:
In [34]:
#prefix recursive, power of 2 only
function prefix_recursive(x,⋅)
n = length(x)
if n>1
#w=x[1:2:n]+x[2:2:n] #should this be the . operator?
w=x[1:2:n]⋅x[2:2:n]
w=prefix_recursive(w,⋅)
x[2:2:n] = w
x[1:2:n] += [0, w[1:end-1]] # update odds (try update evens, or try exclusive)
end
x
end
Out[34]:
In [103]:
#Take an input array, pop value for the ith entry, and perform the usual prefix_recursive
#This only works on arrays of size 2^n+1
function prefix_exclusive_serial(y, +)
n = length(x)
m = zeros(Int64,length(y))
x_small = zeros(Int64,length(y)-1)
if ~ispow2(n) #base case
for i=1:length(y)
if i == 1
x_small = y[i+1:end]
else
x_small = y[1:i-1]
append!(x_small,y[i+1:end])
end
m[i] = sum(x_small)
end
end
m
#x_small
end
Out[103]:
In [109]:
y=[1:9]
print(sum(y))
prefix_exclusive_serial(y,+)
#prefix_recursive([1:8],+)
#sqrt(8)
#isqrt(16)
ispow2(8)
Out[109]:
In [102]:
?ispow2
In [318]:
#start by defining a function to convert an integer a 7-element array of bits
function int_to_bit_array(x)
#convert intiger to 7 bit array
m = bits(x)
g = m[58:end]
y = [int64(string(i)) for i=g]
y
end
Out[318]:
For the cary-lookahead adder, the result of the 1 bit full adder depend on three values: the input bits (a and b) and the carry value (c).
The resulting sum for a 1 bit adder can be defined as follows:
$$s = a ⊕ b ⊕ c_{in} $$In addition to yielding a sum value, the adder also outputs a carry value $c_{out}$ that is passed to the next adder in the sequence:
$$ c_{out} = ab + ac_{in} ⊕ bc_{in} $$$$ = ab + c_{in}(a ⊕ b) $$To keep track of many carry results and passes, this formulation can be adapted to matrix notation:
$$ \begin{pmatrix} c_i \\ 1 \end{pmatrix} = \begin{pmatrix} a_i⊕b_i & a_i b_i \\ 0 & 1 \end{pmatrix} \begin{pmatrix} c_{i-1} \\ 1 \end{pmatrix} $$Given the starting carry, the ith cary can be calulated as follows
$$\begin{pmatrix} c_{i} \\ 1 \end{pmatrix} = M_i M_{i-1} ... M_1 \begin{pmatrix} c_{1} \\ 1 \end{pmatrix} $$Where the matrix M is defined as follows
$$ M_i = \begin{pmatrix} a_i⊕b_i & a_i b_i \\ 0 & 1 \end{pmatrix} $$
In [272]:
function mtrx_build(a,b)
M = [a$b a*b; 0 1]
M
end
Out[272]:
In [311]:
n1 = 10
n2 = 2
x1 = int_to_bit_array(n1)
x2 = int_to_bit_array(n2)
reverse_index = [8-i for i in 1:7]
c = zeros(Int64,8)
Mm = [1 0; 0 1]
for (ix,i) in enumerate(reverse_index)
M = mtrx_build(x1[i],x2[i])
Mm = M * Mm
Cv = [c[i+1] 1]'
product1 = Mm * Cv
c[i] = product1[1]
end
s = x1 $ x2 $ c[2:8]
# check that reported bit sum matches bits of true sum
s == int_to_bit_array(n1+n2)
Out[311]:
In [283]:
killallworkers()=rmprocs(procs()[2:end])
Out[283]:
In [315]:
n1 = 8
n2 = 5
x1 = int_to_bit_array(n1)
x2 = int_to_bit_array(n2)
reverse_index = [8-i for i in 1:7]
c = zeros(Int64,8)
Mm = [1 0; 0 1]
killallworkers()
addprocs(4)
@parallel for (ix,i) in enumerate(reverse_index)
println(ix)
M = mtrx_build(x1[i],x2[i])
Mm = M * Mm
Cv = [c[i+1] 1]'
product1 = Mm * Cv
c[i] = product1[1]
end
s = x1 $ x2 $ c[2:8]
# check that reported bit sum matches bits of true sum
s == int_to_bit_array(n1+n2)
Out[315]:
*Collaboration acknowlegments:
I discussed stratagies for solving Problem 2 with A. Albaiz and Problem 3 with M. Sindi.