Manipulating numbers in Python

Disclaimer: Much of this section has been transcribed from https://pymotw.com/2/math/

Most modern computers represent real numbers using the IEEE 754 floating point standard. The math module implements many of the IEEE functions that would normally be found in the native platform C libraries for more involved mathematical operations on floating point values, including logarithms and trigonometric operations.

The fundamental information about number representation is available in the sys module:


In [1]:
import sys 

sys.float_info


Out[1]:
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)

From here we can learn, for instance:


In [2]:
sys.float_info.max


Out[2]:
1.7976931348623157e+308
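As a small illustration of what two of these fields mean in practice, the sketch below checks the behaviour of epsilon and max. The variable names eps and largest are only chosen for this example.

```python
import sys

eps = sys.float_info.epsilon
largest = sys.float_info.max

# epsilon is the gap between 1.0 and the next representable float
print(1.0 + eps > 1.0)        # the sum is distinguishable from 1.0
print(1.0 + eps / 2 == 1.0)   # half of epsilon is rounded away

# doubling the largest finite float overflows to infinity
print(largest * 2 == float("inf"))
```

All three checks are expected to print True on a platform that uses IEEE 754 doubles.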

Similarly, we can list the limits of IEEE 754 double precision, each followed by its 64-bit pattern in hexadecimal:

Largest Real = 1.79769e+308, 7fefffffffffffff // -Largest Real = -1.79769e+308, ffefffffffffffff

Smallest Real = 2.22507e-308, 0010000000000000 // -Smallest Real = -2.22507e-308, 8010000000000000

Zero = 0, 0000000000000000 // -Zero = -0, 8000000000000000

eps = 2.22045e-16, 3cb0000000000000 // -eps = -2.22045e-16, bcb0000000000000
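The hexadecimal patterns listed above can be reproduced from Python. The following is a minimal sketch that assumes the platform stores floats as IEEE 754 doubles; float_to_hex is a helper name made up for this example.

```python
import struct
import sys

def float_to_hex(x):
    """Return the 64-bit IEEE 754 pattern of x as 16 hex digits."""
    # '>d' packs a double in big-endian order, so the sign bit comes first
    return struct.pack(">d", x).hex()

for label, value in [
    ("largest", sys.float_info.max),
    ("smallest normal", sys.float_info.min),
    ("zero", 0.0),
    ("negative zero", -0.0),
    ("eps", sys.float_info.epsilon),
]:
    print("{:16} {}".format(label, float_to_hex(value)))
```

The sign bit is the leading bit of the pattern, which is why negating a value only changes the first hex digit.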

Interestingly, one can define a value larger than any of these (more about this below):


In [3]:
infinity = float("inf")
infinity


Out[3]:
inf

In [4]:
infinity/10000


Out[4]:
inf

Special constants

Many math operations depend on special constants. math includes values for $\pi$ and $e$.


In [26]:
import math

print('π: %.30f' % math.pi)
print('e: %.30f' % math.e)
print('nan: {:.30f}'.format(math.nan))
print('inf: {:.30f}'.format(math.inf))


π: 3.141592653589793115997963468544
e: 2.718281828459045090795598298428
nan: nan
inf: inf

Both π and e are limited in precision only by the platform’s floating point C library.

Testing for exceptional values

Floating point calculations can result in two types of exceptional values: INF (“infinity”) and NaN (“not a number”). INF appears when the double used to hold a floating point value overflows, that is, when the result has an absolute value too large to represent. Both kinds of value are encoded with reserved bit patterns that have all ones in the exponent field. If the exponent is all ones and the fraction is zero, the number is infinite; if the fraction is nonzero, the value is NaN.

Typical bit patterns under the IEEE standard are:

Inf = Inf, 7ff0000000000000 // -Inf = -Inf, fff0000000000000

NaN = NaN, fff8000000000000 // -NaN = NaN, 7ff8000000000000
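The rule about the exponent and fraction fields can be checked directly. This is a sketch under the assumption that floats are IEEE 754 doubles; classify is a hypothetical helper, and in everyday code math.isinf() and math.isnan() are the tools to use.

```python
import struct

def classify(x):
    """Classify a float by inspecting its IEEE 754 double bit fields."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    exponent = (bits >> 52) & 0x7FF        # 11 exponent bits
    fraction = bits & ((1 << 52) - 1)      # 52 fraction bits
    if exponent == 0x7FF:
        # All ones in the exponent: infinity if the fraction is zero, else NaN
        return "inf" if fraction == 0 else "nan"
    return "finite"

print(classify(float("inf")))
print(classify(float("-inf")))
print(classify(float("inf") - float("inf")))
print(classify(1.5))
```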


In [7]:
float("inf")-float("inf")


Out[7]:
nan

In [27]:
import math

print('{:^3} {:6} {:6} {:6}'.format(
    'e', 'x', 'x**2', 'isinf'))
print('{:-^3} {:-^6} {:-^6} {:-^6}'.format(
    '', '', '', ''))

for e in range(0, 201, 20):
    x = 10.0 ** e
    y = x * x
    print('{:3d} {:<6g} {:<6g} {!s:6}'.format(
        e, x, y, math.isinf(y),))


 e  x      x**2   isinf 
--- ------ ------ ------
  0 1      1      False 
 20 1e+20  1e+40  False 
 40 1e+40  1e+80  False 
 60 1e+60  1e+120 False 
 80 1e+80  1e+160 False 
100 1e+100 1e+200 False 
120 1e+120 1e+240 False 
140 1e+140 1e+280 False 
160 1e+160 inf    True  
180 1e+180 inf    True  
200 1e+200 inf    True  

When the exponent in this example grows large enough, the square of x no longer fits inside a double, and the value is recorded as infinite. Not all floating point overflows result in INF values, however. Calculating an exponent with floating point values, in particular, raises OverflowError instead of preserving the INF result.


In [28]:
x = 10.0 ** 200

print('x    =', x)
print('x*x  =', x*x)
try:
    print('x**2 =', x**2)
except OverflowError as err:
    print(err)


x    = 1e+200
x*x  = inf
(34, 'Result too large')

This discrepancy is caused by an implementation difference in the library used by CPython.

Operations that combine infinite values do not always have a defined result. Dividing a finite number by infinity gives 0.0, but dividing infinity by infinity is undefined, and the result is NaN (“not a number”).


In [12]:
import math

x = (10.0 ** 200) * (10.0 ** 200)
y = x/x

print('x =', x)
print('isnan(x) =', math.isnan(x))
print('y = x / x =', x/x)
print('y == nan =', y == float('nan'))
print('isnan(y) =', math.isnan(y))


x = inf
isnan(x) = False
y = x / x = nan
y == nan = False
isnan(y) = True
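To separate the cases that produce NaN from those that do not, here is a short sketch of a few operations involving infinity. The variable name inf is chosen only for this example.

```python
import math

inf = float("inf")

print(1.0 / inf)               # a finite number divided by infinity is 0.0
print(inf / 1e308)             # infinity divided by a finite number stays inf
print(math.isnan(inf / inf))   # inf / inf has no defined value
print(math.isnan(inf - inf))   # neither does inf - inf
print(math.isnan(0.0 * inf))   # nor does 0 * inf
```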

Comparing

Comparisons for floating point values can be error prone, with each step of the computation potentially introducing errors due to the numerical representation. The isclose() function uses a stable algorithm to minimize these errors and provide a way for relative as well as absolute comparisons. The formula used is equivalent to

abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

By default, isclose() uses relative comparison with the tolerance set to 1e-09, meaning that the difference between the values must be less than or equal to 1e-09 times the larger absolute value between a and b. Passing a keyword argument rel_tol to isclose() changes the tolerance. In this example, the values must be within 10% of each other.

The comparison between 0.1 and 0.09 fails because of the error representing 0.1.


In [29]:
import math

INPUTS = [
    (1000, 900, 0.1),
    (100, 90, 0.1),
    (10, 9, 0.1),
    (1, 0.9, 0.1),
    (0.1, 0.09, 0.1),
]

print('{:^8} {:^8} {:^8} {:^8} {:^8} {:^8}'.format(
    'a', 'b', 'rel_tol', 'abs(a-b)', 'tolerance', 'close')
)
print('{:-^8} {:-^8} {:-^8} {:-^8} {:-^8} {:-^8}'.format(
    '-', '-', '-', '-', '-', '-'),
)

fmt = '{:8.2f} {:8.2f} {:8.2f} {:8.2f} {:8.2f} {!s:>8}'

for a, b, rel_tol in INPUTS:
    close = math.isclose(a, b, rel_tol=rel_tol)
    tolerance = rel_tol * max(abs(a), abs(b))
    abs_diff = abs(a - b)
    print(fmt.format(a, b, rel_tol, abs_diff, tolerance, close))


   a        b     rel_tol  abs(a-b) tolerance  close  
-------- -------- -------- -------- -------- --------
 1000.00   900.00     0.10   100.00   100.00     True
  100.00    90.00     0.10    10.00    10.00     True
   10.00     9.00     0.10     1.00     1.00     True
    1.00     0.90     0.10     0.10     0.10     True
    0.10     0.09     0.10     0.01     0.01    False
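The formula quoted earlier can be written out by hand to see why the last row fails. This is only a sketch for illustration, not the code that math.isclose() actually runs; isclose_sketch is a hypothetical name, and the explicit handling of equal and infinite inputs is an assumption made so that it agrees with the special cases.

```python
import math

def isclose_sketch(a, b, rel_tol=1e-09, abs_tol=0.0):
    """Hand-written version of the comparison formula quoted in the text."""
    if a == b:
        # Also covers inf == inf and -inf == -inf
        return True
    if math.isinf(a) or math.isinf(b):
        # An infinity is not close to anything but itself
        return False
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

# Last row of the table: print the unrounded difference and tolerance
print(repr(abs(0.1 - 0.09)), repr(0.1 * 0.1))
print(isclose_sketch(0.1, 0.09, rel_tol=0.1))
```

Printing the two values with repr() shows digits that the two-decimal table hides: the difference is slightly larger than the tolerance.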

To use a fixed or "absolute" tolerance, pass abs_tol instead of rel_tol.

For an absolute tolerance, the difference between the input values must be no greater than the tolerance given.


In [30]:
import math

INPUTS = [
    (1.0, 1.0 + 1e-07, 1e-08),
    (1.0, 1.0 + 1e-08, 1e-08),
    (1.0, 1.0 + 1e-09, 1e-08),
]

print('{:^8} {:^11} {:^8} {:^10} {:^8}'.format(
    'a', 'b', 'abs_tol', 'abs(a-b)', 'close')
)
print('{:-^8} {:-^11} {:-^8} {:-^10} {:-^8}'.format(
    '-', '-', '-', '-', '-'),
)

for a, b, abs_tol in INPUTS:
    close = math.isclose(a, b, abs_tol=abs_tol)
    abs_diff = abs(a - b)
    print('{:8.2f} {:11} {:8} {:0.9f} {!s:>8}'.format(
        a, b, abs_tol, abs_diff, close))


   a          b      abs_tol   abs(a-b)   close  
-------- ----------- -------- ---------- --------
    1.00   1.0000001    1e-08 0.000000100    False
    1.00  1.00000001    1e-08 0.000000010     True
    1.00 1.000000001    1e-08 0.000000001     True

nan and inf are special cases. nan is never close to another value, including itself. inf is only close to itself.


In [36]:
import math

print('nan, nan:', math.isclose(math.nan, math.nan))
print('nan, 1.0:', math.isclose(math.nan, 1.0))
print('inf, inf:', math.isclose(math.inf, math.inf))
print('inf, 1.0:', math.isclose(math.inf, 1.0))


nan, nan: False
nan, 1.0: False
inf, inf: True
inf, 1.0: False

Converting to Integers

The math module includes three functions for converting floating point values to whole numbers. Each takes a different approach, and will be useful in different circumstances.

The simplest is trunc(), which drops the digits following the decimal point, leaving only the whole number portion of the value. floor() converts its input to the largest integer that is less than or equal to it, and ceil() (ceiling) produces the smallest integer that is greater than or equal to the input value.


In [13]:
import math

print('{:^5}  {:^5}  {:^5}  {:^5}  {:^5}'.format('i', 'int', 'trunc', 'floor', 'ceil'))
print('{:-^5}  {:-^5}  {:-^5}  {:-^5}  {:-^5}'.format('', '', '', '', ''))

fmt = '  '.join(['{:5.1f}'] * 5)

for i in [ -1.5, -0.8, -0.5, -0.2, 0, 0.2, 0.5, 0.8, 1 ]:
    print(fmt.format(i, int(i), math.trunc(i), math.floor(i), math.ceil(i)))


  i     int   trunc  floor  ceil 
-----  -----  -----  -----  -----
 -1.5   -1.0   -1.0   -2.0   -1.0
 -0.8    0.0    0.0   -1.0    0.0
 -0.5    0.0    0.0   -1.0    0.0
 -0.2    0.0    0.0   -1.0    0.0
  0.0    0.0    0.0    0.0    0.0
  0.2    0.0    0.0    0.0    1.0
  0.5    0.0    0.0    0.0    1.0
  0.8    0.0    0.0    0.0    1.0
  1.0    1.0    1.0    1.0    1.0
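One way to read the table is that truncation rounds toward zero, so it agrees with floor() for non-negative inputs and with ceil() for negative ones. The sketch below expresses that relationship; trunc_sketch is a hypothetical helper written only for this comparison.

```python
import math

def trunc_sketch(x):
    """Round toward zero using floor() and ceil()."""
    # floor() rounds toward negative infinity, ceil() toward positive
    # infinity; truncation picks whichever one points toward zero.
    return math.floor(x) if x >= 0 else math.ceil(x)

for x in [-1.5, -0.2, 0.0, 0.2, 1.5]:
    print(x, math.trunc(x), trunc_sketch(x))
```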

Alternate Representations

modf() takes a single floating point number and returns a tuple containing the fractional and whole number parts of the input value.


In [14]:
import math

for i in range(6):
    print('{}/2 = {}'.format(i, math.modf(i/2.0)))


0/2 = (0.0, 0.0)
1/2 = (0.5, 0.0)
2/2 = (0.0, 1.0)
3/2 = (0.5, 1.0)
4/2 = (0.0, 2.0)
5/2 = (0.5, 2.0)
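The example above only uses non-negative inputs. As a brief supplementary sketch, the following shows that both parts returned by modf() carry the sign of the input, so adding them recovers the original value.

```python
import math

for x in [2.5, -2.5]:
    frac, whole = math.modf(x)
    # Both parts carry the sign of the input, and they add back up to x
    print(x, frac, whole, frac + whole == x)
```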

frexp() returns the mantissa and exponent of a floating point number, and can be used to create a more portable representation of the value. It uses the formula x = m * 2 ** e, and returns the values m and e.


In [15]:
import math

print('{:^7}  {:^7}  {:^7}'.format('x', 'm', 'e'))
print('{:-^7}  {:-^7}  {:-^7}'.format('', '', ''))

for x in [ 0.1, 0.5, 4.0 ]:
    m, e = math.frexp(x)
    print('{:7.2f}  {:7.2f}  {:7d}'.format(x, m, e))


   x        m        e   
-------  -------  -------
   0.10     0.80       -3
   0.50     0.50        0
   4.00     0.50        3

ldexp() is the inverse of frexp(). Using the same formula as frexp(), ldexp() takes the mantissa and exponent values as arguments and returns a floating point number.


In [16]:
import math

print('{:^7}  {:^7}  {:^7}'.format('m', 'e', 'x'))
print('{:-^7}  {:-^7}  {:-^7}'.format('', '', ''))

for m, e in [ (0.8, -3),
              (0.5,  0),
              (0.5,  3),
              ]:
    x = math.ldexp(m, e)
    print('{:7.2f}  {:7d}  {:7.2f}'.format(m, e, x))


   m        e        x   
-------  -------  -------
   0.80       -3     0.10
   0.50        0     0.50
   0.50        3     4.00
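Because scaling by a power of two does not disturb the mantissa bits, splitting a value with frexp() and rebuilding it with ldexp() should return exactly the original number. The sketch below checks this for a few sample values; roundtrip is a hypothetical helper name.

```python
import math

def roundtrip(x):
    """Split x into mantissa and exponent, then put it back together."""
    m, e = math.frexp(x)
    return math.ldexp(m, e)

for x in [0.1, 0.5, 4.0, -3.75, 1e-300]:
    m, e = math.frexp(x)
    # For nonzero x the mantissa lies in the range 0.5 <= abs(m) < 1
    print(x, m, e, roundtrip(x) == x)
```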

Positive and Negative Signs

The absolute value of a number is its value without a sign. Use fabs() to calculate the absolute value of a floating point number.


In [17]:
import math

print(math.fabs(-1.1))
print(math.fabs(-0.0))
print(math.fabs(0.0))
print(math.fabs(1.1))


1.1
0.0
0.0
1.1

To determine the sign of a value, either to give a set of values the same sign or simply for comparison, use copysign() to transfer that sign onto a known good value such as 1. An extra function like copysign() is needed because comparing NaN and -NaN directly with other values does not work.


In [18]:
import math

print('{:^5}  {:^5}  {:^5}  {:^5}  {:^5}'.format('f', 's', '< 0', '> 0', '= 0'))
print('{:-^5}  {:-^5}  {:-^5}  {:-^5}  {:-^5}'.format('', '', '', '', ''))

for f in [ -1.0,
            0.0,
            1.0,
            float('-inf'),
            float('inf'),
            float('-nan'),
            float('nan'),
            ]:
    s = int(math.copysign(1, f))
    print('{:5.1f}  {:5d}  {!s:5}  {!s:5}  {!s:5}'.format(f, s, f < 0, f > 0, f==0))


  f      s     < 0    > 0    = 0 
-----  -----  -----  -----  -----
 -1.0     -1  True   False  False
  0.0      1  False  False  True 
  1.0      1  False  True   False
 -inf     -1  True   False  False
  inf      1  False  True   False
  nan     -1  False  False  False
  nan      1  False  False  False
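Negative zero is another value whose sign cannot be detected by comparison. The following sketch uses copysign() to read the sign bit; is_negative is a hypothetical helper introduced only for this example.

```python
import math

def is_negative(x):
    """Report whether the sign bit of x is set, including for -0.0."""
    return math.copysign(1.0, x) < 0

# The comparison misses the sign of -0.0, copysign() sees it
print(-0.0 < 0, is_negative(-0.0))
# The two zeros compare equal even though their sign bits differ
print(0.0 == -0.0, is_negative(0.0))
```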

Commonly Used Calculations

Representing precise values in binary floating point memory is challenging. Some values cannot be represented exactly, and the more often a value is manipulated through repeated calculations, the more likely a representation error will be introduced. math includes a function for computing the sum of a series of floating point numbers using an efficient algorithm that minimizes such errors.


In [19]:
import math

values = [ 0.1 ] * 10

print('Input values:', values)

print('sum()       : {:.20f}'.format(sum(values)))

s = 0.0
for i in values:
    s += i
print('for-loop    : {:.20f}'.format(s))
    
print('math.fsum() : {:.20f}'.format(math.fsum(values)))


Input values: [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
sum()       : 0.99999999999999988898
for-loop    : 0.99999999999999988898
math.fsum() : 1.00000000000000000000

Given a sequence of ten values each equal to 0.1, the expected value for the sum of the sequence is 1.0. Since 0.1 cannot be represented exactly as a floating point value, however, errors are introduced into the sum unless it is calculated with fsum().
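The effect is more dramatic when large and small values are mixed. The sketch below accumulates the terms in an explicit loop so that the rounding at each step is visible. The loop is used instead of the built-in sum() because newer CPython releases are reported to use a compensated algorithm inside sum(), which may hide the error.

```python
import math

values = [1e16, 1.0, -1e16]

# Naive left-to-right accumulation: 1e16 + 1.0 rounds back to 1e16,
# so the 1.0 is lost before the two large terms cancel.
naive = 0.0
for v in values:
    naive += v

print(naive)              # 0.0
print(math.fsum(values))  # 1.0
```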

factorial() is commonly used to calculate the number of permutations and combinations of a series of objects. The factorial of a positive integer n, expressed n!, is defined recursively as (n-1)! * n and stops with 0! == 1. factorial() only works with whole numbers. In the Python version used for the output below it also accepts float arguments as long as they can be converted to an integer without losing value; more recent Python versions deprecate and then reject float arguments, so passing integers is the safer choice.


In [37]:
import math

for i in [ 0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.1 ]:
    try:
        print('{:2.0f}  {:6.0f}'.format(i, math.factorial(i)))
    except ValueError as err:
        print('Error computing factorial(%s):' % i, err)


 0       1
 1       1
 2       2
 3       6
 4      24
 5     120
Error computing factorial(6.1): factorial() only accepts integral values
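As an illustration of the permutation and combination counts mentioned above, here is a minimal sketch built on factorial() with integer arguments. The helper names combinations_count and permutations_count are hypothetical.

```python
import math

def combinations_count(n, k):
    """Number of ways to choose k items out of n, ignoring order."""
    # Integer division is exact here because the quotient is always whole
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

def permutations_count(n, k):
    """Number of ordered arrangements of k items taken from n."""
    return math.factorial(n) // math.factorial(n - k)

print(combinations_count(5, 2))   # 10
print(permutations_count(5, 2))   # 20
```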

The modulo operator (%) computes the remainder of a division expression (e.g., 5 % 2 = 1). The operator built into the language works well with integers but, as with so many other floating point operations, intermediate calculations cause representational issues that result in a loss of data. fmod() provides a more accurate implementation for floating point values.


In [21]:
import math

print('{:^4}  {:^4}  {:^5}  {:^5}'.format('x', 'y', '%', 'fmod'))
print('----  ----  -----  -----')

for x, y in [ (5, 2),
              (5, -2),
              (-5, 2),
              ]:
    print('{:4.1f}  {:4.1f}  {:5.2f}  {:5.2f}'.format(x, y, x % y, math.fmod(x, y)))


 x     y      %    fmod 
----  ----  -----  -----
 5.0   2.0   1.00   1.00
 5.0  -2.0  -1.00   1.00
-5.0   2.0   1.00  -1.00

A potentially more frequent source of confusion is the fact that the algorithm used by fmod() for computing modulo is also different from that used by %, so the sign of the result is different for mixed-sign inputs.
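The accuracy difference shows up when the two operands differ greatly in magnitude. The sketch below uses the pair of values discussed in the Python math documentation; the names x and y are only placeholders.

```python
import math

x, y = -1e-100, 1e100

# % returns a result with the sign of y; the exact answer 1e100 - 1e-100
# cannot be represented as a float and rounds to 1e100
print(x % y)

# fmod() returns a result with the sign of x, which is exactly representable
print(math.fmod(x, y))
```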

Exponents and Logarithms

Exponential growth curves appear in economics, physics, and other sciences. Python has a built-in exponentiation operator (“**”), but pow() can be useful when you need to pass a callable function as an argument.


In [22]:
import math

for x, y in [
    # Typical uses
    (2, 3),
    (2.1, 3.2),

    # Always 1
    (1.0, 5),
    (2.0, 0),

    # Not-a-number
    (2, float('nan')),

    # Roots
    (9.0, 0.5),
    (27.0, 1.0/3),
    ]:
    print('{:5.1f} ** {:5.3f} = {:6.3f}'.format(x, y, math.pow(x, y)))


  2.0 ** 3.000 =  8.000
  2.1 ** 3.200 = 10.742
  1.0 ** 5.000 =  1.000
  2.0 ** 0.000 =  1.000
  2.0 **   nan =    nan
  9.0 ** 0.500 =  3.000
 27.0 ** 0.333 =  3.000

Raising 1 to any power always returns 1.0, as does raising any value to a power of 0.0. Most operations on the not-a-number value nan return nan. If the exponent is a fraction such as 1/2 or 1/3, pow() computes a root.
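The case where a callable is needed can be sketched with map(), which expects a function object rather than an operator. The lists bases and exponents are made-up sample data.

```python
import math

bases = [2, 3, 4]
exponents = [3, 2, 0.5]

# map() needs a function object, which the ** operator cannot provide directly
results = list(map(math.pow, bases, exponents))
print(results)

# math.pow() always works in floats, while ** keeps integers as integers
print(type(2 ** 3).__name__, type(math.pow(2, 3)).__name__)
```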

Since square roots (exponent of 1/2) are used so frequently, there is a separate function for computing them.


In [40]:
import math

print(math.sqrt(9.0))
print(math.sqrt(3))
try:
    print(math.sqrt(-1))
except ValueError as err:
    print('Cannot compute sqrt(-1):', err)


3.0
1.7320508075688772
Cannot compute sqrt(-1): math domain error

Computing the square roots of negative numbers requires complex numbers, which are not handled by math. Any attempt to calculate a square root of a negative value results in a ValueError.
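When a complex result is actually wanted, the standard library's cmath module is one option. A minimal sketch:

```python
import cmath

root = cmath.sqrt(-1)
print(root)          # 1j
print(root * root)   # (-1+0j)
```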

There are two variations of log(). Given floating point representation and rounding errors, the computed value produced by log(x, b) has limited accuracy, especially for some bases. log10() computes log(x, 10) using a more accurate algorithm than log().


In [41]:
import math

print('{:2}  {:^12}  {:^20}  {:^20}  {:8}'.format('i', 'x', 'accurate', 'inaccurate', 'mismatch'))
print('{:-^2}  {:-^12}  {:-^20}  {:-^20}  {:-^8}'.format('', '', '', '', ''))

for i in range(0, 10):
    x = math.pow(10, i)
    accurate = math.log10(x)
    inaccurate = math.log(x, 10)
    match = '' if int(inaccurate) == i else '*'
    print('{:2d}  {:12.1f}  {:20.18f}  {:20.18f}  {:^5}'.format(i, x, accurate, inaccurate, match))


i        x              accurate             inaccurate       mismatch
--  ------------  --------------------  --------------------  --------
 0           1.0  0.000000000000000000  0.000000000000000000       
 1          10.0  1.000000000000000000  1.000000000000000000       
 2         100.0  2.000000000000000000  2.000000000000000000       
 3        1000.0  3.000000000000000000  2.999999999999999556    *  
 4       10000.0  4.000000000000000000  4.000000000000000000       
 5      100000.0  5.000000000000000000  5.000000000000000000       
 6     1000000.0  6.000000000000000000  5.999999999999999112    *  
 7    10000000.0  7.000000000000000000  7.000000000000000000       
 8   100000000.0  8.000000000000000000  8.000000000000000000       
 9  1000000000.0  9.000000000000000000  8.999999999999998224    *  

The lines in the output with trailing * highlight the inaccurate values.
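The Python documentation describes the two-argument form log(x, base) as being calculated as log(x)/log(base), which is where the extra rounding comes from. The sketch below reproduces that quotient by hand and compares it with log10(); the variable name quotient is chosen only for this example.

```python
import math

x = 1000.0

# The quotient that log(x, 10) is documented to compute
quotient = math.log(x) / math.log(10)

print(repr(quotient))
print(repr(math.log10(x)))

# A tolerance-based comparison treats the two results as the same value
print(math.isclose(quotient, math.log10(x)))
```

If an integer exponent is needed from such a result, rounding it with round() is more robust than truncating it with int().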

As with other special-case functions, the function exp() uses an algorithm that produces more accurate results than the general-purpose equivalent math.pow(math.e, x).


In [42]:
import math

x = 2

fmt = '%.20f'
print(fmt % (math.e ** x))
print(fmt % math.pow(math.e, x))
print(fmt % math.exp(x))


7.38905609893064951876
7.38905609893064951876
7.38905609893065040694

For more information about other mathematical functions, including trigonometric ones, we refer to https://pymotw.com/2/math/

The Python reference documentation can be found at https://docs.python.org/2/library/math.html