Using numpy and KD-trees with netCDF data

There is now a Unidata Developer's Blog entry accompanying this IPython notebook.

The goal is to demonstrate how to quickly access netCDF data based on geospatial coordinates instead of array indices.

  • First we show a naive and slow way to do this, in which we also have to worry about longitude anomalies
  • Then we speed up access with numpy arrays
  • Next, we demonstrate how to eliminate longitude anomalies
  • Finally, we use a kd-tree data structure to significantly speed up access by coordinates for large problems

Getting data by coordinates from a netCDF file

Let's look at a netCDF file from the Atlantic Real-Time Ocean Forecast System. If you have cloned the Unidata 2013 TDS-Python Workshop, this data is already available in 'data/rtofs_glo_3dz_f006_6hrly_reg3.nc'. Otherwise you can get it from rtofs_glo_3dz_f006_6hrly_reg3.nc.

Looking at netCDF metadata from Python

In IPython, we could invoke the ncdump utility like this:

filename = 'data/rtofs_glo_3dz_f006_6hrly_reg3.nc'
!ncdump -h $filename

provided that a recent version of ncdump is installed, one that can read compressed data from netCDF-4 classic model files.

Alternatively, we'll use the netCDF4-python package to show information about the file in a form that's somewhat less familiar, but that contains the information we need for the subsequent examples. This works for any netCDF file format:


In [1]:
import netCDF4
filename = 'data/rtofs_glo_3dz_f006_6hrly_reg3.nc'
ncfile = netCDF4.Dataset(filename, 'r')
print ncfile                # shows global attributes, dimensions, and variables
ncvars = ncfile.variables   # a dictionary of variables
# print information about specific variables, including type, shape, and attributes
for varname in ['temperature', 'salinity', 'Latitude', 'Longitude']:
    print ncvars[varname]


<type 'netCDF4.Dataset'>
root group (NETCDF4_CLASSIC file format):
    Conventions: CF-1.0
    title: HYCOM ATLb2.00
    institution: National Centers for Environmental Prediction
    source: HYCOM archive file
    experiment: 90.9
    history: archv2ncdf3z
    dimensions = (u'MT', u'Y', u'X', u'Depth')
    variables = (u'MT', u'Date', u'Depth', u'Y', u'X', u'Latitude', u'Longitude', u'u', u'v', u'temperature', u'salinity')
    groups = ()

<type 'netCDF4.Variable'>
float32 temperature(u'MT', u'Depth', u'Y', u'X')
    coordinates: Longitude Latitude Date
    standard_name: sea_water_potential_temperature
    units: degC
    _FillValue: 1.26765e+30
    valid_range: [ -5.07860279  11.14989948]
    long_name:   temp [90.9H]
unlimited dimensions = (u'MT',)
current size = (1, 10, 850, 712)

<type 'netCDF4.Variable'>
float32 salinity(u'MT', u'Depth', u'Y', u'X')
    coordinates: Longitude Latitude Date
    standard_name: sea_water_salinity
    units: psu
    _FillValue: 1.26765e+30
    valid_range: [ 11.61832619  35.04695129]
    long_name:  salinity [90.9H]
unlimited dimensions = (u'MT',)
current size = (1, 10, 850, 712)

<type 'netCDF4.Variable'>
float32 Latitude(u'Y', u'X')
    standard_name: latitude
    units: degrees_north
unlimited dimensions = ()
current size = (850, 712)

<type 'netCDF4.Variable'>
float32 Longitude(u'Y', u'X')
    standard_name: longitude
    units: degrees_east
unlimited dimensions = ()
current size = (850, 712)

Here's a sparse picture (every 25th point on each axis) of what the grid looks like on which Latitude, Longitude, Temperature, Salinity, and other variables are defined:
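
The plot itself isn't reproduced in this text version, but a minimal sketch along these lines (assuming matplotlib is available, and reusing the filename variable from above) regenerates it by subsampling the coordinate arrays:

import matplotlib.pyplot as plt
import netCDF4

ncfile = netCDF4.Dataset(filename, 'r')
lats = ncfile.variables['Latitude'][::25, ::25]   # every 25th point on each axis
lons = ncfile.variables['Longitude'][::25, ::25]
plt.plot(lons.ravel(), lats.ravel(), 'b.')        # one dot per subsampled grid point
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.show()
ncfile.close()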

Example query: sea surface temperature and salinity at 50N, 140W?

  • So Longitude and Latitude are 2D netCDF variables of shape 850 x 712, indexed by Y and X dimensions
  • That's 605,200 values for each
  • There's no direct way in this file (and many netCDF files) to compute grid indexes from coordinates via a coordinate system and projection parameters. Instead, we have to rely on the latitude and longitude auxiliary coordinate variables, as required by the CF conventions for data not on a simple lat,lon grid.
  • To get the temperature at 50N, 140W, we need to find Y and X indexes iy and ix such that (Latitude[iy, ix], Longitude[iy, ix]) is "close" to (50.0, -140.0).

Naive, slow way using nested loops

  • Initially, for simplicity, we just use the squared Euclidean distance, as if the Earth were flat, latitude and longitude were $x$- and $y$-coordinates, and the squared distance between points $(lat_1, lon_1)$ and $(lat_0, lon_0)$ were $(lat_1 - lat_0)^2 + (lon_1 - lon_0)^2$.
  • Note: these assumptions are wrong near the poles and for points on opposite sides of the longitude boundary discontinuity.
  • So, keeping things simple, we want to find iy and ix to minimize

    (Latitude[iy, ix] - lat0)**2 + (Longitude[iy, ix] - lon0)**2

Reading netCDF data into numpy arrays

To access netCDF data, rather than just metadata, we will also need NumPy:

  • A Python library for scientific programming.
  • Supports n-dimensional array-based calculations similar to Fortran and IDL.
  • Includes fast mathematical functions to act on scalars and arrays.

With the Python netCDF4 package, using "[ ... ]" to index a netCDF variable object reads or writes a numpy array from the associated netCDF file.

The code below reads latitude and longitude values into 2D numpy arrays named latvals and lonvals.
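
For example, a minimal sketch of just that read, reusing the ncvars dictionary from In [1]:

latvals = ncvars['Latitude'][:]   # all latitude values, as a 2D numpy array
lonvals = ncvars['Longitude'][:]  # all longitude values, as a 2D numpy array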

First version: slow and spatially challenged

Here's a function that uses simple nested loops to find the indices that minimize the distance to the desired coordinates, written in the style of Fortran or C rather than idiomatic Python. We'll call this function in the cell following its definition ...


In [2]:
import numpy as np
import netCDF4

def naive_slow(latvar,lonvar,lat0,lon0):
    '''
    Find "closest" point in a set of (lat,lon) points to specified point
    latvar - 2D latitude variable from an open netCDF dataset
    lonvar - 2D longitude variable from an open netCDF dataset
    lat0,lon0 - query point
    Returns iy,ix such that 
     (lonval[iy,ix] - lon0)**2 + (latval[iy,ix] - lat0)**2
    is minimum.  This "closeness" measure works badly near poles and
    longitude boundaries.
    '''
    # Read from file into numpy arrays
    latvals = latvar[:]
    lonvals = lonvar[:]
    ny,nx = latvals.shape
    dist_sq_min = 1.0e30
    for iy in range(ny):
        for ix in range(nx):
            latval = latvals[iy, ix]
            lonval = lonvals[iy, ix]
            dist_sq = (latval - lat0)**2 + (lonval - lon0)**2
            if dist_sq < dist_sq_min:
                iy_min, ix_min, dist_sq_min = iy, ix, dist_sq
    return iy_min,ix_min

When we call the function above, it takes several seconds to run, because it calculates the distance one point at a time, for each of the 605,200 $(lat, lon)$ points. Note that once the indices for the point nearest to (50, -140) are found, they can be used to access temperature, salinity, and any other netCDF variables that use the same dimensions.


In [3]:
ncfile = netCDF4.Dataset(filename, 'r')
latvar = ncfile.variables['Latitude']
lonvar = ncfile.variables['Longitude']
iy,ix = naive_slow(latvar, lonvar, 50.0, -140.0)
print 'Closest lat lon:', latvar[iy,ix], lonvar[iy,ix]
tempvar = ncfile.variables['temperature']
salvar = ncfile.variables['salinity']
print 'temperature:', tempvar[0, 0, iy, ix], tempvar.units
print 'salinity:', salvar[0, 0, iy, ix], salvar.units
ncfile.close()


Closest lat lon: 49.9867 -139.982
temperature: 6.46312 degC
salinity: 32.6572 psu

NumPy arrays instead of loops: fast, but still assumes flat earth

The above function is slow, because it doesn't make good use of NumPy arrays. It's much faster to use whole array operations to eliminate loops and element-at-a-time computation. NumPy functions that help eliminate loops include:

  • The argmin() method, which returns the 1D (flattened) index of the minimum element of a NumPy array
  • The unravel_index() function, which converts a 1D index back into a multidimensional index

In [4]:
import numpy as np
import netCDF4

def naive_fast(latvar,lonvar,lat0,lon0):
    # Read latitude and longitude from file into numpy arrays
    latvals = latvar[:]
    lonvals = lonvar[:]
    ny,nx = latvals.shape
    dist_sq = (latvals-lat0)**2 + (lonvals-lon0)**2
    minindex_flattened = dist_sq.argmin()  # 1D index of min element
    iy_min,ix_min = np.unravel_index(minindex_flattened, latvals.shape)
    return iy_min,ix_min

ncfile = netCDF4.Dataset(filename, 'r')
latvar = ncfile.variables['Latitude']
lonvar = ncfile.variables['Longitude']
iy,ix = naive_fast(latvar, lonvar, 50.0, -140.0)
print 'Closest lat lon:', latvar[iy,ix], lonvar[iy,ix]
ncfile.close()


Closest lat lon: 49.9867 -139.982

Spherical Earth with tunnel distance: fast and correct

Though assuming a flat Earth may work OK for this example, we'd rather not worry about whether longitudes run from 0 to 360 or -180 to 180, or whether points are close to the poles. The code below fixes both problems by minimizing the square of the "tunnel distance" (the straight-line chord through the Earth) between (lat, lon) points. This version is both fast and correct (for a spherical Earth).
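
Concretely, each $(lat, lon)$ pair is mapped to Cartesian coordinates on a unit sphere,

$x = \cos(lat)\cos(lon), \quad y = \cos(lat)\sin(lon), \quad z = \sin(lat)$

and the squared tunnel distance between two points is then $(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2$, which is exactly the quantity the code below minimizes.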


In [5]:
import numpy as np
import netCDF4
from math import pi
from numpy import cos, sin

def tunnel_fast(latvar,lonvar,lat0,lon0):
    '''
    Find closest point in a set of (lat,lon) points to specified point
    latvar - 2D latitude variable from an open netCDF dataset
    lonvar - 2D longitude variable from an open netCDF dataset
    lat0,lon0 - query point
    Returns iy,ix such that the square of the tunnel distance
    between (latval[iy,ix],lonval[iy,ix]) and (lat0,lon0)
    is minimum.
    '''
    rad_factor = pi/180.0 # for trigonometry, need angles in radians
    # Read latitude and longitude from file into numpy arrays
    latvals = latvar[:] * rad_factor
    lonvals = lonvar[:] * rad_factor
    ny,nx = latvals.shape
    lat0_rad = lat0 * rad_factor
    lon0_rad = lon0 * rad_factor
    # Compute numpy arrays for all values, no loops
    clat,clon = cos(latvals),cos(lonvals)
    slat,slon = sin(latvals),sin(lonvals)
    delX = cos(lat0_rad)*cos(lon0_rad) - clat*clon
    delY = cos(lat0_rad)*sin(lon0_rad) - clat*slon
    delZ = sin(lat0_rad) - slat
    dist_sq = delX**2 + delY**2 + delZ**2
    minindex_1d = dist_sq.argmin()  # 1D index of minimum element
    iy_min,ix_min = np.unravel_index(minindex_1d, latvals.shape)
    return iy_min,ix_min

ncfile = netCDF4.Dataset(filename, 'r')
latvar = ncfile.variables['Latitude']
lonvar = ncfile.variables['Longitude']
iy,ix = tunnel_fast(latvar, lonvar, 50.0, -140.0)
print 'Closest lat lon:', latvar[iy,ix], lonvar[iy,ix]
ncfile.close()


Closest lat lon: 49.9867 -139.982

KD-Trees: faster data structure for lots of queries

We can still do better, by using a data structure designed to support efficient nearest-neighbor queries: the KD-tree. It works like a multidimensional binary tree, so finding the point nearest to a query point is much faster than computing all the distances to find the minimum. It takes some setup time to load all the points into the data structure, but that only has to be done once for a given set of points.

For a single point query, it's still more than twice as fast as the naive slow version above, but building the KD-tree for 605,200 points takes more time than the fast numpy search through all the points, so in this case using the KD-tree for a single point query is sort of pointless ...
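
As a minimal illustration of the scipy.spatial.cKDTree API on its own (toy random data, not from our dataset): the tree is built once from an (n, k) array of points, and each query() call returns the distance to, and the flat index of, the nearest stored point.

import numpy as np
from scipy.spatial import cKDTree

points = np.random.rand(1000, 3)           # 1000 random points in 3D
kdt = cKDTree(points)                      # one-time setup cost
dist, index = kdt.query([0.5, 0.5, 0.5])   # nearest neighbor of the query point
print 'nearest point:', points[index], 'at distance', dist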


In [6]:
import numpy as np
import netCDF4
from math import pi
from numpy import cos, sin
from scipy.spatial import cKDTree

def kdtree_fast(latvar,lonvar,lat0,lon0):
    rad_factor = pi/180.0 # for trigonometry, need angles in radians
    # Read latitude and longitude from file into numpy arrays
    latvals = latvar[:] * rad_factor
    lonvals = lonvar[:] * rad_factor
    ny,nx = latvals.shape
    clat,clon = cos(latvals),cos(lonvals)
    slat,slon = sin(latvals),sin(lonvals)
    # Build kd-tree from big arrays of 3D coordinates
    triples = zip(np.ravel(clat*clon), np.ravel(clat*slon), np.ravel(slat))
    kdt = cKDTree(triples)
    lat0_rad = lat0 * rad_factor
    lon0_rad = lon0 * rad_factor
    clat0,clon0 = cos(lat0_rad),cos(lon0_rad)
    slat0,slon0 = sin(lat0_rad),sin(lon0_rad)
    dist_sq_min, minindex_1d = kdt.query([clat0*clon0, clat0*slon0, slat0])
    iy_min, ix_min = np.unravel_index(minindex_1d, latvals.shape)
    return iy_min,ix_min
                
ncfile = netCDF4.Dataset(filename, 'r')
latvar = ncfile.variables['Latitude']
lonvar = ncfile.variables['Longitude']
iy,ix = kdtree_fast(latvar, lonvar, 50.0, -140.0)
print 'Closest lat lon:', latvar[iy,ix], lonvar[iy,ix]
ncfile.close()


Closest lat lon: 49.9867 -139.982

Timing the functions

If you're curious about actual times for the versions above, the IPython "%%timeit" cell magic gets accurate timings of all of them. Below, we time just a single query point, in this case (50.0, -140.0). To get accurate timings, "%%timeit" lets us do untimed setup first, on the same line, before running the function call in a loop.


In [7]:
%%timeit ncfile = netCDF4.Dataset(filename,'r');latvar = ncfile.variables['Latitude'];lonvar = ncfile.variables['Longitude']
naive_slow(latvar, lonvar, 50.0, -140.0)


1 loops, best of 3: 7.54 s per loop

In [8]:
%%timeit ncfile = netCDF4.Dataset(filename,'r');latvar = ncfile.variables['Latitude'];lonvar = ncfile.variables['Longitude']
naive_fast(latvar, lonvar, 50.0, -140.0)


100 loops, best of 3: 7.65 ms per loop

In [9]:
%%timeit ncfile = netCDF4.Dataset(filename,'r');latvar = ncfile.variables['Latitude'];lonvar = ncfile.variables['Longitude']
tunnel_fast(latvar, lonvar, 50.0, -140.0)


10 loops, best of 3: 35.8 ms per loop

In [10]:
%%timeit ncfile = netCDF4.Dataset(filename,'r');latvar = ncfile.variables['Latitude'];lonvar = ncfile.variables['Longitude']
kdtree_fast(latvar, lonvar, 50.0, -140.0)


1 loops, best of 3: 2.54 s per loop

In [11]:
ncfile.close()

Separating setup from query

The way we used the KD-tree above is not how it's meant to be used. Instead, it should be initialized once with all the k-dimensional data for which nearest neighbors are desired, then queried repeatedly, amortizing the setup work over all the subsequent queries. By timing the setup and the per-query cost separately, we can determine the number of queries beyond which the KD-tree is faster.

That's exactly what we'll do now. We split each algorithm into two functions, a setup function and a query function. The times per query go from seconds (the naive version) to milliseconds (the array-oriented numpy version) to microseconds (the turbo-charged KD-tree, once it's built).

Rather than just using functions, we define a class for each algorithm, do the setup in the class constructor, and provide a query method.


In [12]:
# Split naive_slow into initialization and query, so we can time them separately
import numpy as np
import netCDF4

class Naive_slow(object):
    def __init__(self, ncfile, latvarname, lonvarname):
        self.ncfile = ncfile
        self.latvar = self.ncfile.variables[latvarname]
        self.lonvar = self.ncfile.variables[lonvarname]
        # Read latitude and longitude from file into numpy arrays
        self.latvals = self.latvar[:]
        self.lonvals = self.lonvar[:]
        self.shape = self.latvals.shape

    def query(self,lat0,lon0):
        ny,nx = self.shape
        dist_sq_min = 1.0e30
        for iy in range(ny):
            for ix in range(nx):
                latval = self.latvals[iy, ix]
                lonval = self.lonvals[iy, ix]
                dist_sq = (latval - lat0)**2 + (lonval - lon0)**2
                if dist_sq < dist_sq_min:
                    iy_min, ix_min, dist_sq_min = iy, ix, dist_sq
        return iy_min,ix_min

ncfile = netCDF4.Dataset(filename, 'r')
ns = Naive_slow(ncfile,'Latitude','Longitude')
iy,ix = ns.query(50.0, -140.0)
print 'Closest lat lon:', ns.latvar[iy,ix], ns.lonvar[iy,ix]
ncfile.close()


Closest lat lon: 49.9867 -139.982

In [13]:
# Split naive_fast into initialization and query, so we can time them separately
import numpy as np
import netCDF4

class Naive_fast(object):
    def __init__(self, ncfile, latvarname, lonvarname):
        self.ncfile = ncfile
        self.latvar = self.ncfile.variables[latvarname]
        self.lonvar = self.ncfile.variables[lonvarname]        
        # Read latitude and longitude from file into numpy arrays
        self.latvals = self.latvar[:]
        self.lonvals = self.lonvar[:]
        self.shape = self.latvals.shape
        

    def query(self,lat0,lon0):
        dist_sq = (self.latvals-lat0)**2 + (self.lonvals-lon0)**2
        minindex_flattened = dist_sq.argmin()                             # 1D index
        iy_min, ix_min = np.unravel_index(minindex_flattened, self.shape) # 2D indexes
        return iy_min,ix_min

ncfile = netCDF4.Dataset(filename, 'r')
ns = Naive_fast(ncfile,'Latitude','Longitude')
iy,ix = ns.query(50.0, -140.0)
print 'Closest lat lon:', ns.latvar[iy,ix], ns.lonvar[iy,ix]
ncfile.close()


Closest lat lon: 49.9867 -139.982

In [14]:
# Split tunnel_fast into initialization and query, so we can time them separately
import numpy as np
import netCDF4
from math import pi
from numpy import cos, sin

class Tunnel_fast(object):
    def __init__(self, ncfile, latvarname, lonvarname):
        self.ncfile = ncfile
        self.latvar = self.ncfile.variables[latvarname]
        self.lonvar = self.ncfile.variables[lonvarname]        
        # Read latitude and longitude from file into numpy arrays
        rad_factor = pi/180.0 # for trigonometry, need angles in radians
        self.latvals = self.latvar[:] * rad_factor
        self.lonvals = self.lonvar[:] * rad_factor
        self.shape = self.latvals.shape
        clat, clon = cos(self.latvals), cos(self.lonvals)
        slat, slon = sin(self.latvals), sin(self.lonvals)
        self.clat_clon = clat*clon
        self.clat_slon = clat*slon
        self.slat = slat
 
    def query(self,lat0,lon0):
        # for trigonometry, need angles in radians
        rad_factor = pi/180.0
        lat0_rad = lat0 * rad_factor
        lon0_rad = lon0 * rad_factor
        delX = cos(lat0_rad)*cos(lon0_rad) - self.clat_clon
        delY = cos(lat0_rad)*sin(lon0_rad) - self.clat_slon
        delZ = sin(lat0_rad) - self.slat
        dist_sq = delX**2 + delY**2 + delZ**2
        minindex_1d = dist_sq.argmin()                              # 1D index 
        iy_min, ix_min = np.unravel_index(minindex_1d, self.shape)  # 2D indexes
        return iy_min,ix_min

ncfile = netCDF4.Dataset(filename, 'r')
ns = Tunnel_fast(ncfile,'Latitude','Longitude')
iy,ix = ns.query(50.0, -140.0)
print 'Closest lat lon:', ns.latvar[iy,ix], ns.lonvar[iy,ix]
ncfile.close()


Closest lat lon: 49.9867 -139.982

In [15]:
# Split kdtree_fast into initialization and query, so we can time them separately
import numpy as np
import netCDF4
from math import pi
from numpy import cos, sin
from scipy.spatial import cKDTree

class Kdtree_fast(object):
    def __init__(self, ncfile, latvarname, lonvarname):
        self.ncfile = ncfile
        self.latvar = self.ncfile.variables[latvarname]
        self.lonvar = self.ncfile.variables[lonvarname]        
        # Read latitude and longitude from file into numpy arrays
        rad_factor = pi/180.0 # for trigonometry, need angles in radians
        self.latvals = self.latvar[:] * rad_factor
        self.lonvals = self.lonvar[:] * rad_factor
        self.shape = self.latvals.shape
        clat,clon = cos(self.latvals),cos(self.lonvals)
        slat,slon = sin(self.latvals),sin(self.lonvals)
        clat_clon = clat*clon
        clat_slon = clat*slon
        triples = zip(np.ravel(clat_clon), np.ravel(clat_slon), np.ravel(slat))
        self.kdt = cKDTree(triples)

    def query(self,lat0,lon0):
        rad_factor = pi/180.0 
        lat0_rad = lat0 * rad_factor
        lon0_rad = lon0 * rad_factor
        clat0,clon0 = cos(lat0_rad),cos(lon0_rad)
        slat0,slon0 = sin(lat0_rad),sin(lon0_rad)
        dist_sq_min, minindex_1d = self.kdt.query([clat0*clon0,clat0*slon0,slat0])
        iy_min, ix_min = np.unravel_index(minindex_1d, self.shape)
        return iy_min,ix_min

ncfile = netCDF4.Dataset(filename, 'r')
ns = Kdtree_fast(ncfile,'Latitude','Longitude')
iy,ix = ns.query(50.0, -140.0)
print 'Closest lat lon:', ns.latvar[iy,ix], ns.lonvar[iy,ix]
ncfile.close()


Closest lat lon: 49.9867 -139.982

Setup times for the four algorithms


In [16]:
%%timeit ncfile = netCDF4.Dataset(filename, 'r')
ns = Naive_slow(ncfile,'Latitude','Longitude')


100 loops, best of 3: 3.76 ms per loop

In [17]:
%%timeit ncfile = netCDF4.Dataset(filename, 'r')
ns = Naive_fast(ncfile,'Latitude','Longitude')


100 loops, best of 3: 3.8 ms per loop

In [18]:
%%timeit ncfile = netCDF4.Dataset(filename, 'r')
ns = Tunnel_fast(ncfile,'Latitude','Longitude')


10 loops, best of 3: 27.4 ms per loop

In [19]:
%%timeit ncfile = netCDF4.Dataset(filename, 'r')
ns = Kdtree_fast(ncfile,'Latitude','Longitude')


1 loops, best of 3: 2.52 s per loop

Query times for the four algorithms


In [20]:
%%timeit ncfile = netCDF4.Dataset(filename, 'r'); ns = Naive_slow(ncfile,'Latitude','Longitude')
iy,ix = ns.query(50.0, -140.0)


1 loops, best of 3: 7.79 s per loop

In [21]:
%%timeit ncfile = netCDF4.Dataset(filename, 'r'); ns = Naive_fast(ncfile,'Latitude','Longitude')
iy,ix = ns.query(50.0, -140.0)


100 loops, best of 3: 2.46 ms per loop

In [22]:
%%timeit ncfile = netCDF4.Dataset(filename, 'r'); ns = Tunnel_fast(ncfile,'Latitude','Longitude')
iy,ix = ns.query(50.0, -140.0)


100 loops, best of 3: 5.14 ms per loop

In [23]:
%%timeit ncfile = netCDF4.Dataset(filename, 'r'); ns = Kdtree_fast(ncfile,'Latitude','Longitude')
iy,ix = ns.query(50.0, -140.0)


10000 loops, best of 3: 73.8 µs per loop

In the next cell, we copy the results of the %%timeit runs into Python variables. (Is there a way to capture %%timeit output, so we don't have to do this manually?)
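
One answer, at least in newer IPython versions than the one used here: the -o flag makes %timeit return a TimeitResult object whose best attribute holds the fastest time in seconds. Something like the following sketch (hypothetical for this notebook, and version-dependent) would capture a query time programmatically:

result = %timeit -o ns.query(50.0, -140.0)
kdq = result.best * 1000.0   # fastest time per query, converted to msec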


In [26]:
ns0,nf0,tf0,kd0 = 3.76, 3.8, 27.4, 2520  # setup times in msec
nsq,nfq,tfq,kdq = 7790, 2.46, 5.14, .0738 # query times in msec

Summary of timings

The naive_slow method is always slower than all the other methods. The naive_fast method would only be worth considering if the Earth's non-flatness is irrelevant, for example in a relatively small region that is not close to the poles and does not cross a longitude discontinuity.

Total time for running initialization followed by N queries is:

  • naive_slow: $ns0 + nsq * N$
  • naive_fast: $nf0 + nfq * N$
  • tunnel_fast: $tf0 + tfq * N$
  • kdtree_fast: $kd0 + kdq * N$
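
The break-even point between two methods follows from setting these totals equal: for kdtree_fast versus naive_fast, $kd0 + kdq * N = nf0 + nfq * N$ gives $N = (kd0 - nf0) / (nfq - kdq)$, and similarly with $tf0$, $tfq$ for the tunnel_fast comparison. That's what the last two print statements below compute.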

In [27]:
N = 10000
print N, "queries using naive_slow:", round((ns0 + nsq*N)/1000,1), "seconds"
print N, "queries using naive_fast:", round((nf0 + nfq*N)/1000,1), "seconds"
print N, "queries using tunnel_fast:", round((tf0 + tfq*N)/1000,1), "seconds"
print N, "queries using kdtree_fast:", round((kd0 + kdq*N)/1000,1), "seconds"
print
print "kd_tree_fast outperforms naive_fast above:", int((kd0-nf0)/(nfq-kdq)), "queries"
print "kd_tree_fast outperforms tunnel_fast above:", int((kd0-tf0)/(tfq-kdq)), "queries"


10000 queries using naive_slow: 77900.0 seconds
10000 queries using naive_fast: 24.6 seconds
10000 queries using tunnel_fast: 51.4 seconds
10000 queries using kdtree_fast: 3.3 seconds

kd_tree_fast outperforms naive_fast above: 1054 queries
kd_tree_fast outperforms tunnel_fast above: 492 queries

The advantage of KD-trees grows with the size of the search set: a KD-tree query takes O(log N) time on average in the number of search-set points, while the other algorithms take O(N), the same difference as between binary search and linear search.
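
To see that scaling difference directly, here's a small standalone sketch (synthetic random points, with sizes chosen arbitrarily) comparing a full numpy argmin scan against a cKDTree query as the search set grows; only the query is timed, not the tree construction:

import time
import numpy as np
from scipy.spatial import cKDTree

query = np.array([0.5, 0.5, 0.5])
for n in [10**4, 10**5, 10**6]:
    points = np.random.rand(n, 3)
    t0 = time.time()
    i_scan = ((points - query)**2).sum(axis=1).argmin()  # O(n) linear scan
    t_scan = time.time() - t0
    kdt = cKDTree(points)              # setup, excluded from the query timing
    t0 = time.time()
    dist, i_tree = kdt.query(query)    # O(log n) on average
    t_tree = time.time() - t0
    print n, 'points: scan', t_scan, 's, kd-tree query', t_tree, 's'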