K-Nearest Neighbors
In [1]:
from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1], [3, 2]])
# Build an unsupervised nearest-neighbors index backed by a ball tree
nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree').fit(X)
# kneighbors returns the distances to, and indices of, the k nearest neighbors
distances, indices = nbrs.kneighbors(X)
indices
Out[1]:
array([[0, 1],
       [1, 0],
       [2, 3],
       [3, 2],
       [4, 3]])
In [3]:
distances
Out[3]:
array([[0.        , 1.        ],
       [0.        , 1.        ],
       [0.        , 1.        ],
       [0.        , 1.        ],
       [0.        , 1.41421356]])
Alternatively, we can use the KDTree and BallTree classes directly to find the nearest neighbors.
In [4]:
from sklearn.neighbors import KDTree
import numpy as np
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1], [3, 2]])
# Build a KD-tree index over X
kdt = KDTree(X, leaf_size=30, metric='euclidean')
# Return only the indices of the 2 nearest neighbors of each point
kdt.query(X, k=2, return_distance=False)
Out[4]:
array([[0, 1],
       [1, 0],
       [2, 3],
       [3, 2],
       [4, 3]])
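BallTree exposes the same constructor and query interface as KDTree, so the ball-tree version is a one-line swap; a minimal sketch, not part of the original notebook run:
from sklearn.neighbors import BallTree
import numpy as np
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1], [3, 2]])
# BallTree mirrors KDTree's API; only the underlying space partitioning differs
bt = BallTree(X, leaf_size=30, metric='euclidean')
bt.query(X, k=2, return_distance=False)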
scikit-learn implements two different nearest neighbors classifiers: KNeighborsClassifier, which implements learning based on the k nearest neighbors of each query point, where k is an integer specified by the user, and RadiusNeighborsClassifier, which implements learning based on the number of neighbors within a fixed radius r of each training point.
The neighbor search itself can be carried out with several strategies (compared in the sketch below):
Brute force: compute the distance between all pairs of points in the dataset; for $N$ samples in $D$ dimensions, this approach scales as $O(DN^2)$.
Tree-based structures: reduce the required number of distance calculations by efficiently encoding aggregate distance information for the sample.
Ball tree: divide the data into nodes defined by a centroid $C$ and radius $r$, such that each point in a node lies within the hyper-sphere defined by $r$ and $C$.
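The same query can be routed through any of these strategies via the algorithm parameter of NearestNeighbors; a minimal sketch reusing the points above, as a quick check that they agree:
from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1], [3, 2]])
# All three strategies return the same neighbors; they differ only in cost
for algo in ['brute', 'kd_tree', 'ball_tree']:
    nbrs = NearestNeighbors(n_neighbors=2, algorithm=algo).fit(X)
    print(algo, nbrs.kneighbors(X, return_distance=False)[0])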