In [2]:
### display the smallest positive normalized 32-bit floating-point value
xMin = realmin(Float32)
xMin
# show max number
xMax = realmax(Float32)
g = xMax + 2
Out[2]:
In [1]:
### 64 bit
xMax = realmax(Float64)
xMax + 1
g_int = -1.79e308
#g_underflow = -1.79e309
-xMax
xMax_mantissa = xMax * 10.0^(-308)
# make lowest representation
-(xMax_mantissa * 10.0^(308))
-(xMax_mantissa * 10.0^(309)) # this causes overflow
# add very small value to mantissa, show that it causes overflow
xMax_mantissa_plus = xMax_mantissa + 2.0*10.0^(-16)
-(xMax_mantissa_plus * 10.0^(308)) # why is this still
Out[1]:
In [1]:
b =1
c = logspace(-20,-1,50)
length(c)
x = b - sqrt(b^2 - c)
# multiply by b^2
set_bigfloat_precision(256)
xbig = b - sqrt(b^2 - big(c))
err = float64(abs(x-xbig) ./ abs(xbig))
using PyPlot
loglog(c, err, "bo-")
xlabel(L"c")
ylabel(L"relative error in $x$")
title(L"standard quadratic formula $x = b - \sqrt{b^2 - c}$")
Out[1]:
The poor relative error comes from cancellation: because b² ≫ c, most of c's significant digits are lost when c is subtracted from b², and the subsequent subtraction b - sqrt(b^2 - c) cancels the remaining leading digits.
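A minimal single-value illustration of this digit loss (the value $c = 10^{-20}$ is chosen arbitrarily): in double precision $b^2 - c$ rounds back to $b^2$, so the naive formula returns exactly 0, while a BigFloat reference gives a root near $c/2$.
# demonstrate the cancellation for one small c (illustrative value)
b = 1
c = 1e-20
b^2 - c == b^2                            # true: c is completely lost in the subtraction
x_naive = b - sqrt(b^2 - c)               # evaluates to 0.0
x_ref = float64(b - sqrt(b^2 - big(c)))   # ≈ 5.0e-21, i.e. close to c/2
x_naive, x_ref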
In [18]:
b =1
c = logspace(-20,-1,50)
length(c)
#x = b - sqrt(b^2 - c)
x = c ./ (b + sqrt(b^2 - c))
#x = b - (b^2 - c) ./ sqrt(b^2 - c)
# multiply by b^2
set_bigfloat_precision(256)
xbig = b - sqrt(b^2 - big(c))
err = float64(abs(x-xbig) ./ abs(xbig))
using PyPlot
loglog(c, err, "bo-")
xlabel(L"c")
ylabel(L"relative error in $x$")
title(L"Alternative calculation of root: $\frac{c}{b + \sqrt{b^2 - c}}$")
Out[18]:
In [4]:
# individual values
# when c is very small (~1e-20), the difference (b^2 - c) is rounded to 1.0:
# subtracting such a small c from 1.0 falls below the spacing of
# representable Float64 values near 1.0 (about 2.2e-16), so the result
# is reported as exactly 1.0
c = 1e-20
b =1
length(c)
#x = b - sqrt(b^2 - c)
x = c ./ (-b + sqrt(b^2 - c))
set_bigfloat_precision(256)
xbig = b - sqrt(b^2 - big(c))
err = float64(abs(x-xbig) ./ abs(xbig))
x,xbig,err
Out[4]:
In [15]:
-b + (b^2 - c) ./ sqrt(b^2 - c)
Out[15]:
Part (b): to resolve the relative error, try nudging the computed difference by ε_machine when it rounds to 1.0.
In [39]:
c=1e-20
sqr_diff = b^2 - c
if c != b^2 && sqr_diff == 1.0
    sqr_diff = 1 - eps()
end
x = b - sqrt(sqr_diff)
set_bigfloat_precision(256)
xbig = b - sqrt(b^2 - big(c))
err = float64(abs(x-xbig) ./ abs(xbig))
x,xbig,err
Out[39]:
In [1]:
$x^2 + 2x + c = 0$
$x(x+2) + c = 0$
$x(x+2) = -c$
$(x - \sqrt{c})(x - \sqrt{c})$
"Catastrophic cancellation occurs if two numbers are close and are the result of previous calculations, so have inaccuracies in the lower digits. Most of the accurate digits may cancel as a result of subtraction, leaving behind cruft in the lower order digits. On the other hand, if the operands are exactly known rather than calculated their difference is guaranteed to have a very small relative error, which is referred to as benign cancellation" http://accu.org/index.php/journals/1558
The last equation represents an iteration of a Newton-like root-estimation scheme.
If an iteration of our Newtonish method takes the form
$$ x_{n+1} = x_n - \frac{2f(x_n) f'(x_n)}{2f'(x_n)^2 - f''(x_n)(f(x_n))} \\ $$Then the estimate at a given Newtonish step will have an error associated with it, $x_n = x(1 + δ_n)$, and so will the estimate on the next iteration, $x_{n+1} = x(1 + δ_{n+1})$. We can plug these in:
$$ x(1 + δ_{n+1}) = x(1 + δ_n) - \frac{2f(x(1 + δ_n)) f'(x(1 + δ_n))}{2f'(x(1 + δ_n))^2 - f''(x(1 + δ_n))(f(x(1 + δ_n)))} \\ $$Then solve for $δ_{n+1}$
$$ δ_{n+1} = δ_n - \frac{2f(x(1 + δ_n)) f'(x(1 + δ_n))}{2xf'(x(1 + δ_n))^2 - xf''(x(1 + δ_n))(f(x(1 + δ_n)))} \\ $$
Modify the Julia Newton's-method notebook from class to implement your method to compute a root of $f(x) = x^3 - 1$. In particular start with $x_1 = 2$, so that your scheme should(!) converge to $x = 1$, and look at the error $x_n - 1$. Demonstrate that it agrees with your predicted convergence rate from the previous part. [You should use arbitrary precision as in the notebook from class, so that you can watch the convergence rate for many digits. An approximate number of accurate digits is given by $-\log_{10}(x_n - 1)$.]
In [1]:
Because the true root is $x=1$, the error recurrence takes the following form:
$$ δ_{n+1} = δ_n - \frac{2f((1 + δ_n)) f'((1 + δ_n))}{2f'((1 + δ_n))^2 - f''((1 + δ_n))(f((1 + δ_n)))} \\ $$$$ = δ_n - \frac{2((1 + δ_n)^3 -1 ) (3(1 + δ_n)^2)}{2(3(1 + δ_n)^2)^2 - 6(1 + δ_n)\left((1 + δ_n)^3 -1\right)} \\ $$
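As a sanity check on this recurrence (a sketch only, reusing the notebook's BigFloat tools; the choice of 5 iterations is arbitrary), iterating it directly from $δ_1 = 1$ (i.e. $x_1 = 2$) should show the number of accurate digits roughly tripling each step, i.e. cubic convergence:
# iterate the error recurrence for f(x) = x^3 - 1 in high precision
set_bigfloat_precision(256)
d = big(1.0)            # δ₁ = 1 corresponds to x₁ = 2
for i = 1:5
    num = 2*((1+d)^3 - 1) * (3*(1+d)^2)
    den = 2*(3*(1+d)^2)^2 - 6*(1+d)*((1+d)^3 - 1)
    d = d - num/den
    println(i, "  δ = ", float64(d), "  digits ≈ ", float64(-log10(abs(d))))
end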
In [105]:
function iterate_newton_basic(n,x)
    # find root for x^3 - 1
    # n is number of iterations
    # x is initial guess
    for i = 1:n
        x = x - (x^3 - 1)/(3x^2)
    end
    return x
end
Out[105]:
In [106]:
function iterate_newton_like(n,x)
    # find root for x^3 - 1
    # n is number of iterations
    # x is initial guess
    for i = 1:n
        x = x - ( (6*x^5) - (6*x^2) )/( (12*x^4) + 6*x)
        #x = x - ( 2*(x^3 -1) * (3x^2) )/( 2* (3*x^2)^2 - 6x*(x^3 - 1))
    end
    return x
end
Out[106]:
In [101]:
x1 = 2
n = 2
xn_basic = iterate_newton_basic(n,x1)
xn_like = iterate_newton_like(n,x1)
err = xn_like-1
xn_like-xn_basic
### To do:
# represent with arbitrary precision
# approximate number of digits with:
-log10(xn_like - 1)
#make table
Out[101]:
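A minimal sketch of the arbitrary-precision check listed in the TODO above, assuming the iterate_newton_like function defined earlier: rerun the iteration with a BigFloat starting guess and tabulate the approximate number of accurate digits, $-\log_{10}(x_n - 1)$, after each step (the precision of 1024 bits and the 6 steps are arbitrary choices).
# watch the Newton-like iteration converge in arbitrary precision
set_bigfloat_precision(1024)
x = big(2.0)                        # x₁ = 2
for n = 1:6
    x = iterate_newton_like(1, x)   # take one step at a time
    println(n, "  x_n - 1 = ", float64(x - 1),
            "  accurate digits ≈ ", float64(-log10(abs(x - 1))))
end
With cubic convergence the digits column should roughly triple from one row to the next.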
In [102]:
xn_basic,xn_basic^3
Out[102]:
In [103]:
xn_like,xn_like^3
Out[103]:
Show that the algorithm $\widetilde{f}(x)$ for computing $ f(x) = \sum_{i=1}^{n} x_i $ by sequential summation satisfies: $$ \widetilde{f}(x) = \sum_{i=1}^{n} x_i \prod_{k=i}^{n} (1+\varepsilon_k ) \\ \text{where } \varepsilon_1 = 0 \text{ and } |\varepsilon_k| \le \varepsilon_{machine} $$
Start with an inductive definition of the sum: $s_0 = 0 $ and $s_k = s_{k-1} + x_k $ for $ 1 \le k \le n$
If $ f(x) = s_n $ then $ \tilde{f}(x) = \tilde{s}_n $ where $$ \tilde{s}_k = \tilde{s}_{k-1} \oplus x_k $$ $$ = (\tilde{s}_{k-1} + x_k) (1 + \epsilon_k)$$
Expanding this form:
$ \tilde{f}(x) = \tilde{s}_n = (\tilde{s}_{n-1} + x_n) (1 + \epsilon_n)$
$= (( \tilde{s}_{n-2} \oplus x_{n-1}) + x_n)(1+\epsilon_n)$
$ = ((( \tilde{s}_{n-3} \oplus x_{n-2}) + x_{n-1})(1+ \epsilon_{n-1}) + x_n)(1+\epsilon_n) $
$ = (((( \tilde{s}_{n-4} \oplus x_{n-3}) + x_{n-2})(1 +\epsilon_{n-2}) + x_{n-1})(1+ \epsilon_{n-1}) + x_n)(1+\epsilon_n) $
Continuing in this form, breaking each $\tilde{s}_k$ down to the base case where we can solve for $\tilde{s}_2$:
$ \tilde{s}_2 = \tilde{s}_{1} \oplus x_2 = (x_1(1 + \epsilon_1) + x_2)(1 + \epsilon_2) $
The general form can be rearranged
$ \tilde{f}(x) = ( \tilde{s}_{n-4} \oplus x_{n-3})(1+ \epsilon_{n-2})(1+ \epsilon_{n-1})(1+ \epsilon_{n}) + x_{n-2}(1 + \epsilon_{n-2})(1+ \epsilon_{n-1})(1+ \epsilon_{n}) + x_{n-1}(1+ \epsilon_{n-1})(1 + \epsilon_{n}) + x_n(1+\epsilon_n)$
This is equivalent to:
$ = x_1(1+ \epsilon_{1})(1+ \epsilon_{2})\dots (1+ \epsilon_{n}) + x_2(1+ \epsilon_{2})\dots (1+ \epsilon_{n}) + \dots + x_{n-1}(1+ \epsilon_{n-1})(1 + \epsilon_{n}) + x_n(1+\epsilon_n)$
From here we obtain the final form: $$ \widetilde{f}(x) = \sum_{i=1}^{n} x_i \prod_{k=i}^{n} (1+\varepsilon_k ) \\ \text{where } \varepsilon_1 = 0 \text{ and } |\varepsilon_k| \le \varepsilon_{machine} $$
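A small numerical check of this product form (a sketch: a length-5 Float32 example, treating Float64 arithmetic as effectively exact at these magnitudes, written in the same older Julia dialect as the rest of the notebook): recover the rounding factors $(1+\varepsilon_k)$ from the actual partial sums and confirm that the product formula reproduces the computed sum.
# recover the rounding factors (1+ε_k) of a short Float32 sequential sum
# and check that the product form reproduces the computed result
x = rand(Float32, 5)
n = length(x)
s = x[1]
eps_k = zeros(n)                            # ε₁ = 0: the first partial sum is exact
for k = 2:n
    exact = float64(s) + float64(x[k])      # the un-rounded partial sum
    s = s + x[k]                            # the rounded Float32 partial sum
    eps_k[k] = float64(s)/exact - 1         # so that s = exact*(1+ε_k)
end
reconstructed = sum([float64(x[i]) * prod(1 + eps_k[i:n]) for i = 1:n])
float64(s), reconstructed                   # should agree to roughly double precision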
In [2]:
Show that the error can be bounded by $$ | \widetilde{f}(x) - f(x) | \le n \varepsilon_{machine} \sum_{i=1}^{n} |x_i| + O(\varepsilon_{mach}^2) $$
$$ \text{where } \varepsilon_1 = 0 \text{ , } |\varepsilon_k| \le \varepsilon_{machine} \text{ and } \widetilde{f}(x) = \sum_{i=1}^{n} x_i \prod_{k=i}^{n} (1+\varepsilon_k ) $$
Start by expanding this sum to examine its structure:
$$ \widetilde{f}(x) = x_1 (1 + \varepsilon_1)(1 + \varepsilon_2)\cdots(1 + \varepsilon_n) + x_2 (1 + \varepsilon_2)\cdots(1 + \varepsilon_n) + \dots + x_n (1 + \varepsilon_n) $$
Each $x_i$ is multiplied by at most $n$ factors of the form $(1+\varepsilon_k)$ with $|\varepsilon_k| \le \varepsilon_{machine}$, so every product satisfies
$$ \left| \prod_{k=i}^{n} (1+\varepsilon_k) - 1 \right| \le (1+\varepsilon_{mach})^{n} - 1 $$
By the binomial expansion, $ (1+\varepsilon_{mach})^{n} = 1 + n\varepsilon_{mach} + O(\varepsilon_{mach}^2) $, so $ (1+\varepsilon_{mach})^{n} - 1 = n\varepsilon_{mach} + O(\varepsilon_{mach}^2) $.
Returning to the bounded error:
$$ | \widetilde{f}(x) - f(x) | = \left| \sum_{i=1}^{n} x_i \left( \prod_{k=i}^{n} (1+\varepsilon_k) - 1 \right) \right| \le \sum_{i=1}^{n} |x_i| \left( (1+\varepsilon_{mach})^{n} - 1 \right) \\ = n \varepsilon_{mach} \sum_{i=1}^{n} |x_i| + O(\varepsilon_{mach}^2) $$
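As a quick numerical sanity check of this bound (a sketch; the length $10^6$ is arbitrary, and the double-precision sum is used as the "exact" reference as elsewhere in this notebook), the observed error of a single-precision sequential sum of random values should sit well below $n\,\varepsilon_{machine} \sum_i |x_i|$:
# compare the observed sequential-sum error against the worst-case bound
n = 10^6
x = rand(Float32, n)
xdouble = float64(x)
s = zero(Float32)
for i = 1:n
    s += x[i]               # left-to-right summation in single precision
end
observed = abs(s - sum(xdouble))
bound = n * eps(Float32) * sum(abs(xdouble))
observed, bound, observed <= bound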
In [3]:
# test summation error
t = 1/10
# .1 not computed exactly
.1 == 1//10
### does adding .1 many times yield error
g = 0
exponent = 7
for i = 1:10^exponent
    g = g + .1
end
# calculate relative error
exact = 10^(exponent-1)
rel_error = abs(exact-g)/exact
Out[3]:
The main result for the random walk is that the expected number of equalizations $g_{2m}$ is proportional to the square root of the length of the walk $2m$:
$$ g_{2m} \sim \sqrt{\frac{2}{\pi}} \sqrt{2m} $$
For errors accumulating during sequential summation, the number of additions $n$ is analogous to the length $2m$ of the random walk. This explains the mean error for randomly distributed values of $\varepsilon_{k}$: $\sqrt{n}$ represents the average distance from 0 that a sum of $n$ random errors drifts, which yields the estimate
$$ | \tilde{f}(x) - f(x) | = O\left( \sqrt{n}\, \varepsilon_{machine} \sum_{i=1}^{n} |x_i| \right) $$
With randomly distributed $\varepsilon_{k}$, positive and negative errors partially cancel, so the expected error grows more slowly than the worst-case $n\varepsilon_{machine}$ bound.
In [1]:
function my_cumsum(x)
    y = similar(x) # allocate an array of the same type and size as x
    y[1] = x[1]
    for i = 2:length(x)
        y[i] = y[i-1] + x[i]
    end
    return y
end
x = rand(Float32, 10^7) # 10^7 single-precision values uniform in [0,1)
@time y = my_cumsum(x)
yexact = my_cumsum(float64(x)) # same thing in double precision
err = abs(y - yexact) ./ abs(yexact) # relative error in y
using PyPlot
n = 1:10:length(err) # downsample by 10 for plotting speed
loglog(n, err[n])
ylabel("relative error")
xlabel("# summands")
# plot a √n line for comparison
loglog([1,length(err)], sqrt([1,length(err)]) * 1e-7, "k--")
gca()[:annotate]("~ √n", xy=(1e3,1e-5), xytext=(1e3,1e-5))
title("naive cumsum implementation")
Out[1]:
For simplicity, assume $n$ is a power of 2 (so that the set of numbers to add divides evenly in two at each stage of the recursion). With an analysis similar to that of problem 2, prove that $| \tilde{f}(x) - f(x)| \le \varepsilon_{machine} \log_2 (n) \sum_{i=1}^{n} |x_i| + O(\varepsilon_{machine}^2)$. That is, show that the worst-case error bound grows logarithmically rather than linearly with $n$!
$$ \tilde{f}(x) = \left\{\begin{matrix} 0 & \text{ if } n=0 \\ x_1 & \text{ if } n = 1\\ \tilde{f}(x_{1: \lfloor n/2 \rfloor }) \oplus \tilde{f}(x_{ \lfloor n/2 \rfloor +1 :n }) & \text{ if } n > 1 \end{matrix}\right. $$Assume $n$ is of the form $n=2^m$ where $m = \log_{2}{n}$. In this case $\lfloor n/2 \rfloor = 2^{m-1}$. Therefore, for $n>1$:
$$ \tilde{f}(x) = \tilde{f}(x_{1:2^{m-1}}) \oplus \tilde{f}(x_{2^{m-1}+1 : 2^m}) $$In sequential summation the number of additions an entry participates in decreases from $n-1$ for $x_1$ down to a single addition for $x_n$. This is not the case for recursive summation, where every entry takes part in exactly $\log_{2}{n}$ additions. This leads to a form similar to that of Problem 4:
$$ \tilde{f}(x) = \sum_{i=1}^{n} x_i \prod_{k=1}^{\log_{2}{n}} (1+\varepsilon_{k,i}) \qquad \text{where each } |\varepsilon_{k,i}| \le \varepsilon_{machine} $$Following the same path as Problem 4, every $x_i$ is multiplied by at most $\log_{2}{n}$ factors of $(1+\varepsilon)$, so
$$ | \tilde{f}(x) - f(x) | = \left| \sum_{i=1}^{n} x_i \left( \prod_{k=1}^{\log_{2}{n}} (1+\varepsilon_{k,i}) - 1 \right) \right| \le \sum_{i=1}^{n} |x_i| \left( (1+\varepsilon_{mach})^{\log_{2}{n}} - 1 \right) $$Because $ (1+\varepsilon_{mach})^{\log_{2}{n}} = 1 + \log_{2}(n)\,\varepsilon_{mach} + O(\varepsilon_{mach}^2) $, we obtain the bounded error:
$$ | \widetilde{f}(x) - f(x) | \le \log_{2}(n)\, \varepsilon_{mach} \sum_{i=1}^{n} |x_i| + O(\varepsilon_{mach}^2) $$
Similar to Part 4c, the total accumulated error should be proportional to the square root of the number of additions performed. For the sequential addition scheme there were $n$ additions, so the error was proportional to $\sqrt{n}$. In the recursive scheme each entry takes part in $\log_{2}{n}$ additions, so the overall error is proportional to $\sqrt{\log_{2}{n}}$:
$$ | \tilde{f}(x) - f(x) | = O\left( \sqrt{\log_{2}{n}}\; \varepsilon_{machine} \sum_{i=1}^{n} |x_i| \right) $$
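To compare this prediction with the worst-case bound numerically (a sketch: it relies on the div2sum function defined in the next cells, $n = 2^{20}$ is an arbitrary size, and the $\sqrt{\log_2 n}$ quantity is only a scaling estimate, not a rigorous bound):
# observed pairwise-summation error vs. the worst-case and mean-error scalings
n = 2^20
x = rand(Float32, n)
xdouble = float64(x)
observed = abs(div2sum(x) - sum(xdouble))
worst_case = eps(Float32) * log2(n) * sum(abs(xdouble))
mean_scale = eps(Float32) * sqrt(log2(n)) * sum(abs(xdouble))
observed, worst_case, mean_scale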
In [ ]:
In the pset 1 Julia notebook, there is a function “div2sum” that computes $\tilde{f}(x) = $ div2sum(x) in single precision by the above algorithm. Modify it to not be horrendously slow via your suggestion in (c), and then plot its errors for random inputs as a function of $n$ with the help of the example code in the Julia notebook (but with a larger range of lengths $n$). Are your results consistent with your error estimates above?
In [19]:
# Sum x[first:last]. This function works, but is a little slower than we would like.
function div2sum(x, first=1, last=length(x))
    n = last - first + 1
    if n < 2
        s = zero(eltype(x))
        for i = first:last
            s += x[i]
        end
        return s
    else
        mid = div(first + last, 2) # find middle as (first+last)/2, rounding down
        return div2sum(x, first, mid) + div2sum(x, mid+1, last)
    end
end

# Iterative pairwise version: repeatedly sum adjacent pairs, halving the array
# on each pass instead of recursing. Assumes the length of x is a power of 2.
function div2sum_performance(x, first=1, last=length(x))
    n = last - first + 1
    while n > 1
        xsum = zeros(eltype(x), int(n/2))
        for i = 1:int(n/2)
            odd_val = 2i-1
            xsum[i] = x[odd_val] + x[odd_val+1]
        end
        x = xsum
        n = int(n/2)
    end
    return x[1]
end
# check its accuracy for a set of logarithmically spaced n's. Since div2sum is slow,
# we won't go to very large n or use too many points
#N = round(Int, logspace(1,7,50)) # 50 points from 10¹ to 10⁷
n = 16
N = zeros(n)
for i = 1:n
    N[i] = int(2^i)
end
err = Float64[]
err_perform = Float64[]
for n in N
    x = rand(Float32, int(n))
    xdouble = float64(x)
    push!(err, abs(div2sum(x) - sum(xdouble)) / abs(sum(xdouble)))
    push!(err_perform, abs(div2sum_performance(x) - sum(xdouble)) / abs(sum(xdouble)))
end
In [20]:
### make plots of divsum vs divsum_performance
using PyPlot
loglog(N, err, "bo-", label="div2sum")
loglog(N, err_perform, "ro--", label="div2sum_performance")
title("simple div2sum vs. performance")
xlabel("number of summands")
ylabel("relative error")
legend(loc="upper right",fancybox="true")
grid()
In [10]:
### plot extended number of summands & plot upper bound
n = 25
N = zeros(n)
for i = 1:n
    N[i] = int(2^i)
end
err_perform = Float64[]
err_bound = zeros(length(N))
for i = 1:length(N)
    n = N[i]
    x = rand(Float32, int(n))
    xdouble = float64(x)
    ## slope for error bound
    #push!(err_bound, (eps(Float32)*float32(Base.log2(n))*sum(abs(x))))
    err_bound[i] = (eps(Float32)*float32(Base.log2(n))*sum(abs(x)))
    # push!(err_perform, abs(div2sum_performance(x) - sum(xdouble)) / abs(sum(xdouble)))
    push!(err_perform, abs(sum(x) - sum(xdouble)) / abs(sum(xdouble)))
end
### make plots of divsum vs divsum_performance
using PyPlot
loglog(N, err_perform, "ro--",label="observed error")
loglog(N, err_bound, "bo-",label="error bound")
title("div2sum performance")
xlabel("number of summands")
ylabel("relative error")
legend(loc="upper right",fancybox="true")
grid()
In [14]:
x = rand(Float32, int(n))
sum(x),div2sum(x),div2sum_performance(x)
Out[14]:
In [21]:
## compare speed
x = rand(Float32, 2^25)
@time div2sum(x)
@time div2sum_performance(x)
@time sum(x)
div2sum(x), div2sum_performance(x), sum(x)
Out[21]:
In [9]:
n = 30
N = zeros(n)
for i = 1:n
    N[i] = int(2^i)
end
x = rand(Float32, int(n))
eps(Float32)*float32(Base.log2(n))*sum(abs(x))
Out[9]:
In [7]:
float32(Base.log2(n))
Out[7]:
In [2]:
n
Out[2]:
In [ ]: