In [4]:
#importing the scikit-learn datasets module and the KNeighborsRegressor class
from sklearn import datasets
from sklearn.neighbors import KNeighborsRegressor
#reading the iris dataset from the datasets package
iris = datasets.load_iris()
iris.data.shape,iris.target.shape
#instantiating a KNeighborsRegressor with its default parameters
m=KNeighborsRegressor()
#printing the labels, target names, feature names and the dataset description
print(iris.target,iris.target_names,iris.feature_names,iris.DESCR)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2] ['setosa' 'versicolor' 'virginica'] ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)'] Iris Plants Database
====================
Notes
-----
Data Set Characteristics:
:Number of Instances: 150 (50 in each of three classes)
:Number of Attributes: 4 numeric, predictive attributes and the class
:Attribute Information:
- sepal length in cm
- sepal width in cm
- petal length in cm
- petal width in cm
- class:
- Iris-Setosa
- Iris-Versicolour
- Iris-Virginica
:Summary Statistics:
============== ==== ==== ======= ===== ====================
Min Max Mean SD Class Correlation
============== ==== ==== ======= ===== ====================
sepal length: 4.3 7.9 5.84 0.83 0.7826
sepal width: 2.0 4.4 3.05 0.43 -0.4194
petal length: 1.0 6.9 3.76 1.76 0.9490 (high!)
petal width: 0.1 2.5 1.20 0.76 0.9565 (high!)
============== ==== ==== ======= ===== ====================
:Missing Attribute Values: None
:Class Distribution: 33.3% for each of 3 classes.
:Creator: R.A. Fisher
:Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
:Date: July, 1988
This is a copy of UCI ML iris datasets.
http://archive.ics.uci.edu/ml/datasets/Iris
The famous Iris database, first used by Sir R.A Fisher
This is perhaps the best known database to be found in the
pattern recognition literature. Fisher's paper is a classic in the field and
is referenced frequently to this day. (See Duda & Hart, for example.) The
data set contains 3 classes of 50 instances each, where each class refers to a
type of iris plant. One class is linearly separable from the other 2; the
latter are NOT linearly separable from each other.
References
----------
- Fisher,R.A. "The use of multiple measurements in taxonomic problems"
Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
Mathematical Statistics" (John Wiley, NY, 1950).
- Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis.
(Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
- Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
Structure and Classification Rule for Recognition in Partially Exposed
Environments". IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. PAMI-2, No. 1, 67-71.
- Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule". IEEE Transactions
on Information Theory, May 1972, 431-433.
- See also: 1988 MLC Proceedings, 54-64. Cheeseman et al"s AUTOCLASS II
conceptual clustering system finds 3 classes in the data.
- Many, many more ...
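The cells below fit and score the model on the full dataset, which gives an optimistic estimate. As a sketch of the more usual workflow, a held-out split can be used instead (the `test_size` and `random_state` values here are arbitrary choices, not from the notebook):

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

iris = datasets.load_iris()
# hold out 30% of the rows for evaluation; random_state fixes the shuffle
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0)
m = KNeighborsRegressor()
m.fit(X_train, y_train)
# R^2 on data the model has not seen during fitting
print(m.score(X_test, y_test))
```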
In [5]:
#viewing the dataset
print(iris.data)
[[ 5.1 3.5 1.4 0.2]
[ 4.9 3. 1.4 0.2]
[ 4.7 3.2 1.3 0.2]
[ 4.6 3.1 1.5 0.2]
[ 5. 3.6 1.4 0.2]
[ 5.4 3.9 1.7 0.4]
[ 4.6 3.4 1.4 0.3]
[ 5. 3.4 1.5 0.2]
[ 4.4 2.9 1.4 0.2]
[ 4.9 3.1 1.5 0.1]
[ 5.4 3.7 1.5 0.2]
[ 4.8 3.4 1.6 0.2]
[ 4.8 3. 1.4 0.1]
[ 4.3 3. 1.1 0.1]
[ 5.8 4. 1.2 0.2]
[ 5.7 4.4 1.5 0.4]
[ 5.4 3.9 1.3 0.4]
[ 5.1 3.5 1.4 0.3]
[ 5.7 3.8 1.7 0.3]
[ 5.1 3.8 1.5 0.3]
[ 5.4 3.4 1.7 0.2]
[ 5.1 3.7 1.5 0.4]
[ 4.6 3.6 1. 0.2]
[ 5.1 3.3 1.7 0.5]
[ 4.8 3.4 1.9 0.2]
[ 5. 3. 1.6 0.2]
[ 5. 3.4 1.6 0.4]
[ 5.2 3.5 1.5 0.2]
[ 5.2 3.4 1.4 0.2]
[ 4.7 3.2 1.6 0.2]
[ 4.8 3.1 1.6 0.2]
[ 5.4 3.4 1.5 0.4]
[ 5.2 4.1 1.5 0.1]
[ 5.5 4.2 1.4 0.2]
[ 4.9 3.1 1.5 0.1]
[ 5. 3.2 1.2 0.2]
[ 5.5 3.5 1.3 0.2]
[ 4.9 3.1 1.5 0.1]
[ 4.4 3. 1.3 0.2]
[ 5.1 3.4 1.5 0.2]
[ 5. 3.5 1.3 0.3]
[ 4.5 2.3 1.3 0.3]
[ 4.4 3.2 1.3 0.2]
[ 5. 3.5 1.6 0.6]
[ 5.1 3.8 1.9 0.4]
[ 4.8 3. 1.4 0.3]
[ 5.1 3.8 1.6 0.2]
[ 4.6 3.2 1.4 0.2]
[ 5.3 3.7 1.5 0.2]
[ 5. 3.3 1.4 0.2]
[ 7. 3.2 4.7 1.4]
[ 6.4 3.2 4.5 1.5]
[ 6.9 3.1 4.9 1.5]
[ 5.5 2.3 4. 1.3]
[ 6.5 2.8 4.6 1.5]
[ 5.7 2.8 4.5 1.3]
[ 6.3 3.3 4.7 1.6]
[ 4.9 2.4 3.3 1. ]
[ 6.6 2.9 4.6 1.3]
[ 5.2 2.7 3.9 1.4]
[ 5. 2. 3.5 1. ]
[ 5.9 3. 4.2 1.5]
[ 6. 2.2 4. 1. ]
[ 6.1 2.9 4.7 1.4]
[ 5.6 2.9 3.6 1.3]
[ 6.7 3.1 4.4 1.4]
[ 5.6 3. 4.5 1.5]
[ 5.8 2.7 4.1 1. ]
[ 6.2 2.2 4.5 1.5]
[ 5.6 2.5 3.9 1.1]
[ 5.9 3.2 4.8 1.8]
[ 6.1 2.8 4. 1.3]
[ 6.3 2.5 4.9 1.5]
[ 6.1 2.8 4.7 1.2]
[ 6.4 2.9 4.3 1.3]
[ 6.6 3. 4.4 1.4]
[ 6.8 2.8 4.8 1.4]
[ 6.7 3. 5. 1.7]
[ 6. 2.9 4.5 1.5]
[ 5.7 2.6 3.5 1. ]
[ 5.5 2.4 3.8 1.1]
[ 5.5 2.4 3.7 1. ]
[ 5.8 2.7 3.9 1.2]
[ 6. 2.7 5.1 1.6]
[ 5.4 3. 4.5 1.5]
[ 6. 3.4 4.5 1.6]
[ 6.7 3.1 4.7 1.5]
[ 6.3 2.3 4.4 1.3]
[ 5.6 3. 4.1 1.3]
[ 5.5 2.5 4. 1.3]
[ 5.5 2.6 4.4 1.2]
[ 6.1 3. 4.6 1.4]
[ 5.8 2.6 4. 1.2]
[ 5. 2.3 3.3 1. ]
[ 5.6 2.7 4.2 1.3]
[ 5.7 3. 4.2 1.2]
[ 5.7 2.9 4.2 1.3]
[ 6.2 2.9 4.3 1.3]
[ 5.1 2.5 3. 1.1]
[ 5.7 2.8 4.1 1.3]
[ 6.3 3.3 6. 2.5]
[ 5.8 2.7 5.1 1.9]
[ 7.1 3. 5.9 2.1]
[ 6.3 2.9 5.6 1.8]
[ 6.5 3. 5.8 2.2]
[ 7.6 3. 6.6 2.1]
[ 4.9 2.5 4.5 1.7]
[ 7.3 2.9 6.3 1.8]
[ 6.7 2.5 5.8 1.8]
[ 7.2 3.6 6.1 2.5]
[ 6.5 3.2 5.1 2. ]
[ 6.4 2.7 5.3 1.9]
[ 6.8 3. 5.5 2.1]
[ 5.7 2.5 5. 2. ]
[ 5.8 2.8 5.1 2.4]
[ 6.4 3.2 5.3 2.3]
[ 6.5 3. 5.5 1.8]
[ 7.7 3.8 6.7 2.2]
[ 7.7 2.6 6.9 2.3]
[ 6. 2.2 5. 1.5]
[ 6.9 3.2 5.7 2.3]
[ 5.6 2.8 4.9 2. ]
[ 7.7 2.8 6.7 2. ]
[ 6.3 2.7 4.9 1.8]
[ 6.7 3.3 5.7 2.1]
[ 7.2 3.2 6. 1.8]
[ 6.2 2.8 4.8 1.8]
[ 6.1 3. 4.9 1.8]
[ 6.4 2.8 5.6 2.1]
[ 7.2 3. 5.8 1.6]
[ 7.4 2.8 6.1 1.9]
[ 7.9 3.8 6.4 2. ]
[ 6.4 2.8 5.6 2.2]
[ 6.3 2.8 5.1 1.5]
[ 6.1 2.6 5.6 1.4]
[ 7.7 3. 6.1 2.3]
[ 6.3 3.4 5.6 2.4]
[ 6.4 3.1 5.5 1.8]
[ 6. 3. 4.8 1.8]
[ 6.9 3.1 5.4 2.1]
[ 6.7 3.1 5.6 2.4]
[ 6.9 3.1 5.1 2.3]
[ 5.8 2.7 5.1 1.9]
[ 6.8 3.2 5.9 2.3]
[ 6.7 3.3 5.7 2.5]
[ 6.7 3. 5.2 2.3]
[ 6.3 2.5 5. 1.9]
[ 6.5 3. 5.2 2. ]
[ 6.2 3.4 5.4 2.3]
[ 5.9 3. 5.1 1.8]]
In [6]:
#fit - fit the model according to the given training data
m.fit(iris.data,iris.target)
Out[6]:
KNeighborsRegressor(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=1, n_neighbors=5, p=2,
weights='uniform')
In [7]:
#listing the parameters of the KNeighborsRegressor estimator
m.get_params()
Out[7]:
{'algorithm': 'auto',
'leaf_size': 30,
'metric': 'minkowski',
'metric_params': None,
'n_jobs': 1,
'n_neighbors': 5,
'p': 2,
'weights': 'uniform'}
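The counterpart of `get_params` is `set_params`, which updates hyperparameters in place and returns the estimator. A minimal sketch (the values 3 and 'distance' are arbitrary choices for illustration):

```python
from sklearn.neighbors import KNeighborsRegressor

m = KNeighborsRegressor()
# set_params mutates the estimator and returns it, so calls can be chained
m.set_params(n_neighbors=3, weights='distance')
print(m.get_params()['n_neighbors'], m.get_params()['weights'])
```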
In [8]:
#kneighbors - returns the distances to and the indices of the k nearest
#neighbors of each query point (distances first, then indices)
m.kneighbors(iris.data,n_neighbors=2,return_distance=True)
Out[8]:
(array([[ 0. , 0.1 ],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.33166248],
[ 0. , 0.2236068 ],
[ 0. , 0.1 ],
[ 0. , 0.14142136],
[ 0. , 0. ],
[ 0. , 0.1 ],
[ 0. , 0.2236068 ],
[ 0. , 0.14142136],
[ 0. , 0.24494897],
[ 0. , 0.41231056],
[ 0. , 0.36055513],
[ 0. , 0.34641016],
[ 0. , 0.1 ],
[ 0. , 0.33166248],
[ 0. , 0.14142136],
[ 0. , 0.28284271],
[ 0. , 0.14142136],
[ 0. , 0.45825757],
[ 0. , 0.2 ],
[ 0. , 0.3 ],
[ 0. , 0.2 ],
[ 0. , 0.2 ],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.28284271],
[ 0. , 0.34641016],
[ 0. , 0.34641016],
[ 0. , 0. ],
[ 0. , 0.2236068 ],
[ 0. , 0.3 ],
[ 0. , 0. ],
[ 0. , 0.14142136],
[ 0. , 0.1 ],
[ 0. , 0.14142136],
[ 0. , 0.6244998 ],
[ 0. , 0.2 ],
[ 0. , 0.2236068 ],
[ 0. , 0.36055513],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.1 ],
[ 0. , 0.14142136],
[ 0. , 0.26457513],
[ 0. , 0.26457513],
[ 0. , 0.26457513],
[ 0. , 0.2 ],
[ 0. , 0.24494897],
[ 0. , 0.3 ],
[ 0. , 0.26457513],
[ 0. , 0.14142136],
[ 0. , 0.24494897],
[ 0. , 0.38729833],
[ 0. , 0.36055513],
[ 0. , 0.3 ],
[ 0. , 0.48989795],
[ 0. , 0.14142136],
[ 0. , 0.42426407],
[ 0. , 0.14142136],
[ 0. , 0.2 ],
[ 0. , 0.24494897],
[ 0. , 0.26457513],
[ 0. , 0.17320508],
[ 0. , 0.2236068 ],
[ 0. , 0.33166248],
[ 0. , 0.36055513],
[ 0. , 0.2236068 ],
[ 0. , 0.2 ],
[ 0. , 0.14142136],
[ 0. , 0.31622777],
[ 0. , 0.31622777],
[ 0. , 0.2 ],
[ 0. , 0.34641016],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.33166248],
[ 0. , 0.2 ],
[ 0. , 0.37416574],
[ 0. , 0.28284271],
[ 0. , 0.26457513],
[ 0. , 0.17320508],
[ 0. , 0.2 ],
[ 0. , 0.26457513],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.17320508],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.2 ],
[ 0. , 0.38729833],
[ 0. , 0.14142136],
[ 0. , 0.42426407],
[ 0. , 0. ],
[ 0. , 0.38729833],
[ 0. , 0.24494897],
[ 0. , 0.3 ],
[ 0. , 0.26457513],
[ 0. , 0.73484692],
[ 0. , 0.26457513],
[ 0. , 0.55677644],
[ 0. , 0.63245553],
[ 0. , 0.2236068 ],
[ 0. , 0.34641016],
[ 0. , 0.17320508],
[ 0. , 0.26457513],
[ 0. , 0.48989795],
[ 0. , 0.3 ],
[ 0. , 0.14142136],
[ 0. , 0.41231056],
[ 0. , 0.41231056],
[ 0. , 0.43588989],
[ 0. , 0.2236068 ],
[ 0. , 0.31622777],
[ 0. , 0.26457513],
[ 0. , 0.17320508],
[ 0. , 0.3 ],
[ 0. , 0.34641016],
[ 0. , 0.17320508],
[ 0. , 0.14142136],
[ 0. , 0.1 ],
[ 0. , 0.34641016],
[ 0. , 0.26457513],
[ 0. , 0.41231056],
[ 0. , 0.1 ],
[ 0. , 0.33166248],
[ 0. , 0.53851648],
[ 0. , 0.53851648],
[ 0. , 0.24494897],
[ 0. , 0.14142136],
[ 0. , 0.14142136],
[ 0. , 0.17320508],
[ 0. , 0.24494897],
[ 0. , 0.24494897],
[ 0. , 0. ],
[ 0. , 0.2236068 ],
[ 0. , 0.24494897],
[ 0. , 0.24494897],
[ 0. , 0.24494897],
[ 0. , 0.2236068 ],
[ 0. , 0.24494897],
[ 0. , 0.28284271]]), array([[ 0, 17],
[ 1, 45],
[ 2, 47],
[ 3, 47],
[ 4, 0],
[ 5, 18],
[ 6, 47],
[ 7, 39],
[ 8, 38],
[ 37, 34],
[ 10, 48],
[ 11, 29],
[ 12, 1],
[ 13, 38],
[ 14, 33],
[ 15, 33],
[ 16, 10],
[ 17, 0],
[ 18, 5],
[ 19, 21],
[ 20, 31],
[ 21, 19],
[ 22, 6],
[ 23, 26],
[ 24, 11],
[ 25, 34],
[ 26, 23],
[ 27, 28],
[ 28, 27],
[ 29, 30],
[ 30, 29],
[ 31, 20],
[ 32, 33],
[ 33, 32],
[ 37, 34],
[ 35, 49],
[ 36, 10],
[ 37, 34],
[ 38, 8],
[ 39, 7],
[ 40, 17],
[ 41, 8],
[ 42, 38],
[ 43, 26],
[ 44, 46],
[ 45, 1],
[ 46, 19],
[ 47, 3],
[ 48, 10],
[ 49, 7],
[ 50, 52],
[ 51, 56],
[ 52, 50],
[ 53, 89],
[ 54, 58],
[ 55, 66],
[ 56, 51],
[ 57, 93],
[ 58, 75],
[ 59, 89],
[ 60, 93],
[ 61, 96],
[ 62, 92],
[ 63, 91],
[ 64, 82],
[ 65, 75],
[ 66, 84],
[ 67, 92],
[ 68, 87],
[ 69, 80],
[ 70, 138],
[ 71, 97],
[ 72, 133],
[ 73, 63],
[ 74, 97],
[ 75, 65],
[ 76, 58],
[ 77, 52],
[ 78, 91],
[ 79, 81],
[ 80, 81],
[ 81, 80],
[ 82, 92],
[ 83, 133],
[ 84, 66],
[ 85, 56],
[ 86, 52],
[ 87, 68],
[ 88, 96],
[ 89, 53],
[ 90, 94],
[ 91, 63],
[ 92, 82],
[ 93, 57],
[ 94, 99],
[ 95, 96],
[ 96, 95],
[ 97, 74],
[ 98, 57],
[ 99, 96],
[100, 136],
[142, 101],
[102, 125],
[103, 116],
[104, 132],
[105, 122],
[106, 84],
[107, 130],
[108, 128],
[109, 143],
[110, 147],
[111, 147],
[112, 139],
[113, 101],
[114, 121],
[115, 148],
[116, 137],
[117, 131],
[118, 122],
[119, 72],
[120, 143],
[121, 101],
[122, 105],
[123, 126],
[124, 120],
[125, 129],
[126, 123],
[127, 138],
[128, 132],
[129, 125],
[130, 107],
[131, 117],
[132, 128],
[133, 83],
[134, 103],
[135, 130],
[136, 148],
[137, 116],
[138, 127],
[139, 112],
[140, 144],
[141, 145],
[142, 101],
[143, 120],
[144, 140],
[145, 141],
[146, 123],
[147, 110],
[148, 136],
[149, 127]], dtype=int64))
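The first distance column above is all zeros because each query point is itself a training point, so it is its own nearest neighbor at distance 0. For a genuinely new point the nearest distance is nonzero. A sketch (the query point `[5.0, 3.5, 1.5, 0.25]` is a made-up sample, not a row of the dataset):

```python
import numpy as np
from sklearn import datasets
from sklearn.neighbors import KNeighborsRegressor

iris = datasets.load_iris()
m = KNeighborsRegressor().fit(iris.data, iris.target)
# query a point that is not in the training set
dist, ind = m.kneighbors(np.array([[5.0, 3.5, 1.5, 0.25]]), n_neighbors=2)
# one query row, two neighbors; the nearest distance is now > 0
print(dist, ind)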
In [10]:
#calling kneighbors_graph method to compute
#the graph of k-Neighbors for points in X
m.kneighbors_graph(iris.data,n_neighbors=2,mode='connectivity')
Out[10]:
<150x150 sparse matrix of type '<class 'numpy.float64'>'
with 300 stored elements in Compressed Sparse Row format>
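The sparse CSR matrix above can be densified with `toarray()` to inspect it; in connectivity mode each row holds a 1 for each of its k neighbors (including the point itself, since every training point is its own nearest neighbor), so every row sums to `n_neighbors`:

```python
from sklearn import datasets
from sklearn.neighbors import KNeighborsRegressor

iris = datasets.load_iris()
m = KNeighborsRegressor().fit(iris.data, iris.target)
g = m.kneighbors_graph(iris.data, n_neighbors=2, mode='connectivity')
dense = g.toarray()
# 150x150 adjacency matrix with exactly 2 ones per row
print(dense.shape, dense.sum(axis=1)[:5])
```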
In [12]:
#calling the predict method of the KNeighborsRegressor
#for a regressor, predict returns the average of the k nearest neighbors'
#target values, not a class label - hence fractional outputs such as 1.2
m.predict(iris.data)
Out[12]:
array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 1. , 1. , 1. , 1. , 1. ,
1. , 1.2, 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. ,
1. , 1. , 1.2, 1. , 1.6, 1. , 1.6, 1. , 1. , 1. , 1. ,
1.4, 1. , 1. , 1. , 1. , 1. , 1.8, 1. , 1. , 1. , 1. ,
1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. ,
1. , 2. , 2. , 2. , 2. , 2. , 2. , 1.2, 2. , 2. , 2. ,
1.8, 2. , 2. , 2. , 2. , 2. , 2. , 2. , 2. , 1.4, 2. ,
2. , 2. , 1.8, 2. , 2. , 2. , 1.8, 2. , 2. , 2. , 2. ,
2. , 1.6, 1.8, 2. , 2. , 2. , 1.8, 2. , 2. , 2. , 2. ,
2. , 2. , 2. , 1.8, 2. , 2. , 2. ])
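The fractional values above (1.2, 1.6, 1.8, ...) arise because, with the default uniform weights, each prediction is the plain mean of the 5 nearest neighbors' target values. This can be verified directly against `kneighbors`:

```python
import numpy as np
from sklearn import datasets
from sklearn.neighbors import KNeighborsRegressor

iris = datasets.load_iris()
m = KNeighborsRegressor().fit(iris.data, iris.target)
pred = m.predict(iris.data)
# recompute each prediction by hand: mean of the 5 nearest targets
_, ind = m.kneighbors(iris.data, n_neighbors=5)
manual = iris.target[ind].mean(axis=1)
print(np.allclose(pred, manual))
```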
In [14]:
#for a regressor, the score method returns the coefficient of
#determination R^2 on the given data and labels (not mean accuracy)
m.score(iris.data,iris.target)
Out[14]:
0.96999999999999997
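The 0.97 above is an R² value, not an accuracy, and (since the model is scored on its own training data) an optimistic one. R² = 1 - SS_res / SS_tot can be recomputed by hand to confirm what `score` returns:

```python
import numpy as np
from sklearn import datasets
from sklearn.neighbors import KNeighborsRegressor

iris = datasets.load_iris()
m = KNeighborsRegressor().fit(iris.data, iris.target)
pred = m.predict(iris.data)
# residual sum of squares vs. total sum of squares around the mean
ss_res = ((iris.target - pred) ** 2).sum()
ss_tot = ((iris.target - iris.target.mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot
print(r2)
```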
Content source: harishkrao/Machine-Learning