An IPython Notebook Documenting an Awesome Thing I Realized
by Daniel P. Gettings
In Python and other interpreted languages (e.g. IDL), one should attempt to do as little looping (at the interpreted level) as possible to maximize speed. This is what the vectorized array routines in Numpy are for -- they do most of the looping in C or C++ (which is much faster at looping). Some of the coolest and most efficient uses of Numpy arrays come from "array broadcasting". This set of notes give a short demo showing how broadcasting can dramatically increase the efficiency of the spherical distance calculations that are common in Astronomy.
In [1]:
from kapteyn.maputils import dist_on_sphere as kap_dist
import numpy as np
# Nicer printing for a notebook
np.set_printoptions(formatter={'float':lambda x: '{0:0.3f}'.format(x)})
"""
Using the angular distance function from the Kapteyn
package (http://www.astro.rug.nl/software/kapteyn),
which uses the formula from Vincenty (1975), which
is accurate over the whole sphere.
"""
Out[1]:
Task:
Given two sets of astronomical coordinates (cat1 and cat2):
In [10]:
# Generate Random (RA,Dec) Points
ra1 = np.random.rand(6) - 0.5
dec1 = np.random.rand(6) - 0.5
ra2 = np.random.rand(4) + 180-0.5
dec2 = np.random.rand(4) - 0.5
Method 1 -- Looping
Loop over every source in cat1, calculating the distance to every source in cat2. This is a pretty standard "vectorized" method.
In [11]:
for ind in xrange(ra1.size):
# Calculate the distance to every cat2 source, one cat1 source at a time
dist = kap_dist(ra1[ind].copy(), dec1[ind].copy(), ra2.copy(), dec2.copy())
print 'Separations:',repr(dist),
print '| Min Sep Value: {0:0.3f}'.format(dist.min()), '| Cat2 Min Index:', dist.argmin()
Method 2 -- Broadcasting
Broadcasting kind of works like the "Outer Product" in matrix multiplication. You will notice that the results are the same!
In [12]:
# Reshaping one of the sets of coordinates tells Numpy to "broadcast"
ra1_vert = ra1.reshape((-1,1))
dec1_vert = dec1.reshape((-1,1))
# The Numpy Functions in the Distance Calculation propagate the broadcasting
dist_broadcast = kap_dist(ra1_vert.copy(), dec1_vert.copy(), ra2.copy(), dec2.copy())
print 'Separations:'
print ' ',repr(dist_broadcast)
print 'Min Sep Values:'
print ' ',repr(dist_broadcast.min(axis=1))
print 'Cat2 Min Indices:'
print ' ',repr(dist_broadcast.argmin(axis=1))
In [5]:
# Small cat1, large cat2
# -----------------------
# Generate Points, Normal and Broadcasted
ra1 = np.random.rand(20) - 0.5
dec1 = np.random.rand(20) - 0.5
ra2 = np.random.rand(100) + 180-0.5
dec2 = np.random.rand(100) - 0.5
ra1_vert = ra1.reshape((-1,1))
dec1_vert = dec1.reshape((-1,1))
# Do Time Testing
print 'Test 1'
print ' cat1: {0:d} points, cat2: {1:d} points'.format(ra1.size, ra2.size)
print '--------------------------------------'
print '>> Looping Method: '
%timeit for ind in xrange(ra1.size): dist = kap_dist(ra1[ind].copy(), dec1[ind].copy(), ra2.copy(), dec2.copy())
print '>> Broadcasting Method: '
%timeit dist_broadcast = kap_dist(ra1_vert.copy(), dec1_vert.copy(), ra2.copy(), dec2.copy())
In [6]:
# Large cat1, small cat2
# -----------------------
# Generate Points, Normal and Broadcasted
ra1 = np.random.rand(100) - 0.5
dec1 = np.random.rand(100) - 0.5
ra2 = np.random.rand(20) + 180-0.5
dec2 = np.random.rand(20) - 0.5
ra1_vert = ra1.reshape((-1,1))
dec1_vert = dec1.reshape((-1,1))
# Do Time Testing
print 'Test 2'
print ' cat1: {0:d} points, cat2: {1:d} points'.format(ra1.size, ra2.size)
print '--------------------------------------'
print '>> Looping Method: '
%timeit for ind in xrange(ra1.size): dist = kap_dist(ra1[ind].copy(), dec1[ind].copy(), ra2.copy(), dec2.copy())
print '>> Broadcasting Method: '
%timeit dist_broadcast = kap_dist(ra1_vert.copy(), dec1_vert.copy(), ra2.copy(), dec2.copy())
In [7]:
# Large cat1, Large cat2
# -----------------------
# Generate Points, Normal and Broadcasted
ra1 = np.random.rand(100) - 0.5
dec1 = np.random.rand(100) - 0.5
ra2 = np.random.rand(100) + 180-0.5
dec2 = np.random.rand(100) - 0.5
ra1_vert = ra1.reshape((-1,1))
dec1_vert = dec1.reshape((-1,1))
# Do Time Testing
print 'Test 3'
print ' cat1: {0:d} points, cat2: {1:d} points'.format(ra1.size, ra2.size)
print '--------------------------------------'
print '>> Looping Method: '
%timeit for ind in xrange(ra1.size): dist = kap_dist(ra1[ind].copy(), dec1[ind].copy(), ra2.copy(), dec2.copy())
print '>> Broadcasting Method: '
%timeit dist_broadcast = kap_dist(ra1_vert.copy(), dec1_vert.copy(), ra2.copy(), dec2.copy())
Here we compare the worst-case runtime scenarios for each method.
For broadcasting, the worst-case is a very small number of sources in cat1 relative to cat2. In this case, looping gains its greatest level of optimization from Numpy's vectorized array operations, putting nearly all the actual looping in C/C++. But it still performs fewer loop iterations in C/C++ than the broadcasted calculation.
The worse-case for looping is a very large number of cat1 sources relative to cat2. Only a tiny fraction of the actual looping gets done in C/C++, while the vast majority of it is performed in (slow) Python. For broadcasting, the calculation looks nearly identical to its worst-case, putting all the looping in C/C++.
In [8]:
# ----------------------------------
# Broadcasting Worst Case Scenario
# --> Tiny cat1, huge cat2
# ----------------------------------
# Generate Points, Normal and Broadcasted
ra1 = np.random.rand(5) - 0.5
dec1 = np.random.rand(5) - 0.5
ra2 = np.random.rand(1e4) + 180-0.5
dec2 = np.random.rand(1e4) - 0.5
ra1_vert = ra1.reshape((-1,1))
dec1_vert = dec1.reshape((-1,1))
# Do Time Testing
print 'Broadcasting Worst Case Scenario'
print ' cat1: {0:d} points, cat2: {1:d} points'.format(ra1.size, ra2.size)
print '--------------------------------------'
print '>> Looping Method: '
%timeit for ind in xrange(ra1.size): dist = kap_dist(ra1[ind].copy(), dec1[ind].copy(), ra2.copy(), dec2.copy())
print '>> Broadcasting Method: '
%timeit dist_broadcast = kap_dist(ra1_vert.copy(), dec1_vert.copy(), ra2.copy(), dec2.copy())
In [9]:
# ----------------------------------
# Looping Worst Case Scenario
# --> Huge cat1, tiny cat2
# ----------------------------------
# Generate Points, Normal and Broadcasted
ra1 = np.random.rand(1e4) - 0.5
dec1 = np.random.rand(1e4) - 0.5
ra2 = np.random.rand(5) + 180-0.5
dec2 = np.random.rand(5) - 0.5
ra1_vert = ra1.reshape((-1,1))
dec1_vert = dec1.reshape((-1,1))
# Do Time Testing
print 'Looping Worst Case Scenario'
print ' cat1: {0:d} points, cat2: {1:d} points'.format(ra1.size, ra2.size)
print '--------------------------------------'
print '>> Looping Method: '
%timeit for ind in xrange(ra1.size): dist = kap_dist(ra1[ind].copy(), dec1[ind].copy(), ra2.copy(), dec2.copy())
print '>> Broadcasting Method: '
%timeit dist_broadcast = kap_dist(ra1_vert.copy(), dec1_vert.copy(), ra2.copy(), dec2.copy())