In [1]:
#importing scikit-learn's datasets module and the LinearRegression estimator
from sklearn import datasets
from sklearn.linear_model import LinearRegression
#reading the iris dataset from the datasets package
iris = datasets.load_iris()
iris.data.shape,iris.target.shape
#creating a LinearRegression model (ordinary least squares, not Logistic Regression)
m=LinearRegression()
#printing the labels, feature names and their description
print(iris.target,iris.target_names,iris.feature_names,iris.DESCR)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2] ['setosa' 'versicolor' 'virginica'] ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)'] Iris Plants Database
====================
Notes
-----
Data Set Characteristics:
:Number of Instances: 150 (50 in each of three classes)
:Number of Attributes: 4 numeric, predictive attributes and the class
:Attribute Information:
- sepal length in cm
- sepal width in cm
- petal length in cm
- petal width in cm
- class:
- Iris-Setosa
- Iris-Versicolour
- Iris-Virginica
:Summary Statistics:
============== ==== ==== ======= ===== ====================
                Min  Max   Mean    SD   Class Correlation
============== ==== ==== ======= ===== ====================
sepal length:   4.3  7.9   5.84   0.83    0.7826
sepal width:    2.0  4.4   3.05   0.43   -0.4194
petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)
============== ==== ==== ======= ===== ====================
:Missing Attribute Values: None
:Class Distribution: 33.3% for each of 3 classes.
:Creator: R.A. Fisher
:Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
:Date: July, 1988
This is a copy of UCI ML iris datasets.
http://archive.ics.uci.edu/ml/datasets/Iris
The famous Iris database, first used by Sir R.A Fisher
This is perhaps the best known database to be found in the
pattern recognition literature. Fisher's paper is a classic in the field and
is referenced frequently to this day. (See Duda & Hart, for example.) The
data set contains 3 classes of 50 instances each, where each class refers to a
type of iris plant. One class is linearly separable from the other 2; the
latter are NOT linearly separable from each other.
References
----------
- Fisher,R.A. "The use of multiple measurements in taxonomic problems"
Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
Mathematical Statistics" (John Wiley, NY, 1950).
- Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis.
(Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
- Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
Structure and Classification Rule for Recognition in Partially Exposed
Environments". IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. PAMI-2, No. 1, 67-71.
- Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule". IEEE Transactions
on Information Theory, May 1972, 431-433.
- See also: 1988 MLC Proceedings, 54-64. Cheeseman et al"s AUTOCLASS II
conceptual clustering system finds 3 classes in the data.
- Many, many more ...
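The iris target is a class label, so a classifier is a common alternative to the regressor created in this cell. The following is a minimal sketch, assuming `LogisticRegression` from the same `sklearn.linear_model` module; the `max_iter=1000` setting and the name `clf` are illustrative choices, not part of the original notebook.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()

# Assumption: max_iter=1000 is used only so the default solver has room to converge.
clf = LogisticRegression(max_iter=1000)
clf.fit(iris.data, iris.target)

# predict returns integer class labels (0, 1 or 2) rather than continuous values
labels = clf.predict(iris.data)
print(labels[:5])

# For a classifier, score reports mean accuracy on the given data
print(clf.score(iris.data, iris.target))
```

Unlike the regressor, this model's `score` is an accuracy rather than an R^2 value, so the two numbers are not directly comparable.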
In [2]:
#viewing the dataset
print(iris.data)
[[ 5.1 3.5 1.4 0.2]
[ 4.9 3. 1.4 0.2]
[ 4.7 3.2 1.3 0.2]
[ 4.6 3.1 1.5 0.2]
[ 5. 3.6 1.4 0.2]
[ 5.4 3.9 1.7 0.4]
[ 4.6 3.4 1.4 0.3]
[ 5. 3.4 1.5 0.2]
[ 4.4 2.9 1.4 0.2]
[ 4.9 3.1 1.5 0.1]
[ 5.4 3.7 1.5 0.2]
[ 4.8 3.4 1.6 0.2]
[ 4.8 3. 1.4 0.1]
[ 4.3 3. 1.1 0.1]
[ 5.8 4. 1.2 0.2]
[ 5.7 4.4 1.5 0.4]
[ 5.4 3.9 1.3 0.4]
[ 5.1 3.5 1.4 0.3]
[ 5.7 3.8 1.7 0.3]
[ 5.1 3.8 1.5 0.3]
[ 5.4 3.4 1.7 0.2]
[ 5.1 3.7 1.5 0.4]
[ 4.6 3.6 1. 0.2]
[ 5.1 3.3 1.7 0.5]
[ 4.8 3.4 1.9 0.2]
[ 5. 3. 1.6 0.2]
[ 5. 3.4 1.6 0.4]
[ 5.2 3.5 1.5 0.2]
[ 5.2 3.4 1.4 0.2]
[ 4.7 3.2 1.6 0.2]
[ 4.8 3.1 1.6 0.2]
[ 5.4 3.4 1.5 0.4]
[ 5.2 4.1 1.5 0.1]
[ 5.5 4.2 1.4 0.2]
[ 4.9 3.1 1.5 0.1]
[ 5. 3.2 1.2 0.2]
[ 5.5 3.5 1.3 0.2]
[ 4.9 3.1 1.5 0.1]
[ 4.4 3. 1.3 0.2]
[ 5.1 3.4 1.5 0.2]
[ 5. 3.5 1.3 0.3]
[ 4.5 2.3 1.3 0.3]
[ 4.4 3.2 1.3 0.2]
[ 5. 3.5 1.6 0.6]
[ 5.1 3.8 1.9 0.4]
[ 4.8 3. 1.4 0.3]
[ 5.1 3.8 1.6 0.2]
[ 4.6 3.2 1.4 0.2]
[ 5.3 3.7 1.5 0.2]
[ 5. 3.3 1.4 0.2]
[ 7. 3.2 4.7 1.4]
[ 6.4 3.2 4.5 1.5]
[ 6.9 3.1 4.9 1.5]
[ 5.5 2.3 4. 1.3]
[ 6.5 2.8 4.6 1.5]
[ 5.7 2.8 4.5 1.3]
[ 6.3 3.3 4.7 1.6]
[ 4.9 2.4 3.3 1. ]
[ 6.6 2.9 4.6 1.3]
[ 5.2 2.7 3.9 1.4]
[ 5. 2. 3.5 1. ]
[ 5.9 3. 4.2 1.5]
[ 6. 2.2 4. 1. ]
[ 6.1 2.9 4.7 1.4]
[ 5.6 2.9 3.6 1.3]
[ 6.7 3.1 4.4 1.4]
[ 5.6 3. 4.5 1.5]
[ 5.8 2.7 4.1 1. ]
[ 6.2 2.2 4.5 1.5]
[ 5.6 2.5 3.9 1.1]
[ 5.9 3.2 4.8 1.8]
[ 6.1 2.8 4. 1.3]
[ 6.3 2.5 4.9 1.5]
[ 6.1 2.8 4.7 1.2]
[ 6.4 2.9 4.3 1.3]
[ 6.6 3. 4.4 1.4]
[ 6.8 2.8 4.8 1.4]
[ 6.7 3. 5. 1.7]
[ 6. 2.9 4.5 1.5]
[ 5.7 2.6 3.5 1. ]
[ 5.5 2.4 3.8 1.1]
[ 5.5 2.4 3.7 1. ]
[ 5.8 2.7 3.9 1.2]
[ 6. 2.7 5.1 1.6]
[ 5.4 3. 4.5 1.5]
[ 6. 3.4 4.5 1.6]
[ 6.7 3.1 4.7 1.5]
[ 6.3 2.3 4.4 1.3]
[ 5.6 3. 4.1 1.3]
[ 5.5 2.5 4. 1.3]
[ 5.5 2.6 4.4 1.2]
[ 6.1 3. 4.6 1.4]
[ 5.8 2.6 4. 1.2]
[ 5. 2.3 3.3 1. ]
[ 5.6 2.7 4.2 1.3]
[ 5.7 3. 4.2 1.2]
[ 5.7 2.9 4.2 1.3]
[ 6.2 2.9 4.3 1.3]
[ 5.1 2.5 3. 1.1]
[ 5.7 2.8 4.1 1.3]
[ 6.3 3.3 6. 2.5]
[ 5.8 2.7 5.1 1.9]
[ 7.1 3. 5.9 2.1]
[ 6.3 2.9 5.6 1.8]
[ 6.5 3. 5.8 2.2]
[ 7.6 3. 6.6 2.1]
[ 4.9 2.5 4.5 1.7]
[ 7.3 2.9 6.3 1.8]
[ 6.7 2.5 5.8 1.8]
[ 7.2 3.6 6.1 2.5]
[ 6.5 3.2 5.1 2. ]
[ 6.4 2.7 5.3 1.9]
[ 6.8 3. 5.5 2.1]
[ 5.7 2.5 5. 2. ]
[ 5.8 2.8 5.1 2.4]
[ 6.4 3.2 5.3 2.3]
[ 6.5 3. 5.5 1.8]
[ 7.7 3.8 6.7 2.2]
[ 7.7 2.6 6.9 2.3]
[ 6. 2.2 5. 1.5]
[ 6.9 3.2 5.7 2.3]
[ 5.6 2.8 4.9 2. ]
[ 7.7 2.8 6.7 2. ]
[ 6.3 2.7 4.9 1.8]
[ 6.7 3.3 5.7 2.1]
[ 7.2 3.2 6. 1.8]
[ 6.2 2.8 4.8 1.8]
[ 6.1 3. 4.9 1.8]
[ 6.4 2.8 5.6 2.1]
[ 7.2 3. 5.8 1.6]
[ 7.4 2.8 6.1 1.9]
[ 7.9 3.8 6.4 2. ]
[ 6.4 2.8 5.6 2.2]
[ 6.3 2.8 5.1 1.5]
[ 6.1 2.6 5.6 1.4]
[ 7.7 3. 6.1 2.3]
[ 6.3 3.4 5.6 2.4]
[ 6.4 3.1 5.5 1.8]
[ 6. 3. 4.8 1.8]
[ 6.9 3.1 5.4 2.1]
[ 6.7 3.1 5.6 2.4]
[ 6.9 3.1 5.1 2.3]
[ 5.8 2.7 5.1 1.9]
[ 6.8 3.2 5.9 2.3]
[ 6.7 3.3 5.7 2.5]
[ 6.7 3. 5.2 2.3]
[ 6.3 2.5 5. 1.9]
[ 6.5 3. 5.2 2. ]
[ 6.2 3.4 5.4 2.3]
[ 5.9 3. 5.1 1.8]]
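Printing all 150 rows is hard to scan. As a rough sketch, the per-feature summary statistics listed in the dataset description can be recomputed with NumPy; the variable names are illustrative, and the sample standard deviation (`ddof=1`) is an assumption about how the description's SD column was calculated.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data

# Column-wise statistics: one value per feature
mins = X.min(axis=0)
maxs = X.max(axis=0)
means = X.mean(axis=0)
sds = X.std(axis=0, ddof=1)  # assumption: sample standard deviation

for name, lo, hi, mean, sd in zip(iris.feature_names, mins, maxs, means, sds):
    print(f"{name:20s} min={lo:.1f} max={hi:.1f} mean={mean:.2f} sd={sd:.2f}")
```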
In [3]:
#fit - fit the model according to the given training data
m.fit(iris.data,iris.target)
Out[3]:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
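After fitting, the learned weights are available on the estimator. A short sketch of inspecting them through the `coef_` and `intercept_` attributes (the loop and formatting are illustrative):

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression

iris = datasets.load_iris()
m = LinearRegression().fit(iris.data, iris.target)

# One coefficient per input feature, plus a single intercept term
for name, coef in zip(iris.feature_names, m.coef_):
    print(f"{name}: {coef:.4f}")
print("intercept:", float(m.intercept_))
```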
In [5]:
#decision_function - for LinearRegression this evaluates the fitted linear function on each sample, so it returns the same values as predict.
# It is deprecated and will be removed in 0.19 (see the DeprecationWarning below); use predict instead.
m.decision_function(iris.data)
C:\Users\priyu\Anaconda3\lib\site-packages\sklearn\utils\deprecation.py:70: DeprecationWarning: Function decision_function is deprecated; and will be removed in 0.19.
warnings.warn(msg, category=DeprecationWarning)
Out[5]:
array([ -8.26582725e-02, -3.85897565e-02, -4.81896914e-02,
1.26087761e-02, -7.61081708e-02, 5.68023484e-02,
3.76259158e-02, -4.45599433e-02, 2.07050198e-02,
-8.13030749e-02, -1.01728663e-01, 8.84875996e-05,
-8.86050221e-02, -1.01834705e-01, -2.26997797e-01,
-4.36405904e-02, -3.39982044e-02, -2.16688605e-02,
-3.26854579e-02, -1.22408563e-02, -4.30562522e-02,
5.31726003e-02, -1.23012138e-01, 1.77258467e-01,
6.81889023e-02, -4.16362637e-03, 1.00119019e-01,
-7.09322806e-02, -8.92083742e-02, 1.99107233e-02,
1.33606216e-02, 3.35222953e-02, -1.58465961e-01,
-1.57523171e-01, -8.13030749e-02, -1.03812269e-01,
-1.49254996e-01, -8.13030749e-02, -6.41916305e-03,
-5.55340896e-02, -3.33948524e-02, 7.45644153e-02,
-1.52672524e-02, 2.17673798e-01, 1.39549109e-01,
3.33738018e-02, -5.05301301e-02, -1.45154068e-02,
-9.07545163e-02, -6.28360368e-02, 1.20308259e+00,
1.28451660e+00, 1.32487047e+00, 1.18762080e+00,
1.31393877e+00, 1.25705298e+00, 1.39745639e+00,
9.07172433e-01, 1.17656176e+00, 1.24113634e+00,
9.59294742e-01, 1.28013501e+00, 9.54205881e-01,
1.31512204e+00, 1.05930184e+00, 1.17232866e+00,
1.38115786e+00, 9.76734088e-01, 1.35070534e+00,
1.02311961e+00, 1.59045598e+00, 1.09965570e+00,
1.41725961e+00, 1.19756726e+00, 1.13040963e+00,
1.18772685e+00, 1.26542720e+00, 1.49592176e+00,
1.34168532e+00, 8.55931450e-01, 1.01581766e+00,
9.32128108e-01, 1.05331264e+00, 1.54772365e+00,
1.40310615e+00, 1.38055451e+00, 1.30141848e+00,
1.19062819e+00, 1.16837848e+00, 1.17877271e+00,
1.20415981e+00, 1.28799785e+00, 1.08043682e+00,
9.00622332e-01, 1.20435076e+00, 1.11911506e+00,
1.18452852e+00, 1.15235793e+00, 8.73689093e-01,
1.16625243e+00, 2.24146289e+00, 1.75264018e+00,
1.90028407e+00, 1.74143264e+00, 2.00441822e+00,
2.00431431e+00, 1.60207593e+00, 1.79059214e+00,
1.76063251e+00, 2.15212358e+00, 1.71469034e+00,
1.73219558e+00, 1.84240596e+00, 1.81075169e+00,
2.05316319e+00, 1.95403300e+00, 1.69236016e+00,
2.04163735e+00, 2.20111558e+00, 1.48615432e+00,
1.98996282e+00, 1.78575356e+00, 1.96389898e+00,
1.59137976e+00, 1.88550825e+00, 1.72019374e+00,
1.57522972e+00, 1.60005592e+00, 1.91785077e+00,
1.56166273e+00, 1.79963117e+00, 1.82960982e+00,
1.97884018e+00, 1.44938775e+00, 1.53269542e+00,
2.00181829e+00, 2.08524888e+00, 1.69891026e+00,
1.58832992e+00, 1.80430763e+00, 2.05462443e+00,
1.85818604e+00, 1.75264018e+00, 2.04633725e+00,
2.12946589e+00, 1.90725851e+00, 1.68391740e+00,
1.74623857e+00, 1.98983334e+00, 1.66740449e+00])
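Since `decision_function` is deprecated for this estimator, the same numbers can be obtained from `predict`, or reproduced by hand from the fitted weights. A minimal sketch, assuming the default `fit_intercept=True` model; the name `manual` is illustrative:

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression

iris = datasets.load_iris()
m = LinearRegression().fit(iris.data, iris.target)

# A linear model's output is the dot product of features and weights, plus the intercept
manual = iris.data @ m.coef_ + m.intercept_

# Compare against the estimator's own predictions
print(np.allclose(manual, m.predict(iris.data)))
```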
In [7]:
m.get_params()
Out[7]:
{'copy_X': True, 'fit_intercept': True, 'n_jobs': 1, 'normalize': False}
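The dictionary returned by `get_params` can be changed with `set_params`. A small sketch follows; note that the exact set of keys depends on the installed scikit-learn version, so only `fit_intercept` is touched here.

```python
from sklearn.linear_model import LinearRegression

m = LinearRegression()

# Hyperparameters as a plain dict; keys vary between scikit-learn versions
params = m.get_params()
print(params)

# set_params updates hyperparameters in place and returns the estimator itself
m.set_params(fit_intercept=False)
print(m.get_params()["fit_intercept"])
```

Changing a parameter does not refit the model; `fit` has to be called again for the new setting to take effect.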
In [8]:
#predict - returns continuous values rather than class labels; values near 0, 1 and 2 correspond to Setosa, Versicolour and Virginica
m.predict(iris.data)
Out[8]:
array([ -8.26582725e-02, -3.85897565e-02, -4.81896914e-02,
1.26087761e-02, -7.61081708e-02, 5.68023484e-02,
3.76259158e-02, -4.45599433e-02, 2.07050198e-02,
-8.13030749e-02, -1.01728663e-01, 8.84875996e-05,
-8.86050221e-02, -1.01834705e-01, -2.26997797e-01,
-4.36405904e-02, -3.39982044e-02, -2.16688605e-02,
-3.26854579e-02, -1.22408563e-02, -4.30562522e-02,
5.31726003e-02, -1.23012138e-01, 1.77258467e-01,
6.81889023e-02, -4.16362637e-03, 1.00119019e-01,
-7.09322806e-02, -8.92083742e-02, 1.99107233e-02,
1.33606216e-02, 3.35222953e-02, -1.58465961e-01,
-1.57523171e-01, -8.13030749e-02, -1.03812269e-01,
-1.49254996e-01, -8.13030749e-02, -6.41916305e-03,
-5.55340896e-02, -3.33948524e-02, 7.45644153e-02,
-1.52672524e-02, 2.17673798e-01, 1.39549109e-01,
3.33738018e-02, -5.05301301e-02, -1.45154068e-02,
-9.07545163e-02, -6.28360368e-02, 1.20308259e+00,
1.28451660e+00, 1.32487047e+00, 1.18762080e+00,
1.31393877e+00, 1.25705298e+00, 1.39745639e+00,
9.07172433e-01, 1.17656176e+00, 1.24113634e+00,
9.59294742e-01, 1.28013501e+00, 9.54205881e-01,
1.31512204e+00, 1.05930184e+00, 1.17232866e+00,
1.38115786e+00, 9.76734088e-01, 1.35070534e+00,
1.02311961e+00, 1.59045598e+00, 1.09965570e+00,
1.41725961e+00, 1.19756726e+00, 1.13040963e+00,
1.18772685e+00, 1.26542720e+00, 1.49592176e+00,
1.34168532e+00, 8.55931450e-01, 1.01581766e+00,
9.32128108e-01, 1.05331264e+00, 1.54772365e+00,
1.40310615e+00, 1.38055451e+00, 1.30141848e+00,
1.19062819e+00, 1.16837848e+00, 1.17877271e+00,
1.20415981e+00, 1.28799785e+00, 1.08043682e+00,
9.00622332e-01, 1.20435076e+00, 1.11911506e+00,
1.18452852e+00, 1.15235793e+00, 8.73689093e-01,
1.16625243e+00, 2.24146289e+00, 1.75264018e+00,
1.90028407e+00, 1.74143264e+00, 2.00441822e+00,
2.00431431e+00, 1.60207593e+00, 1.79059214e+00,
1.76063251e+00, 2.15212358e+00, 1.71469034e+00,
1.73219558e+00, 1.84240596e+00, 1.81075169e+00,
2.05316319e+00, 1.95403300e+00, 1.69236016e+00,
2.04163735e+00, 2.20111558e+00, 1.48615432e+00,
1.98996282e+00, 1.78575356e+00, 1.96389898e+00,
1.59137976e+00, 1.88550825e+00, 1.72019374e+00,
1.57522972e+00, 1.60005592e+00, 1.91785077e+00,
1.56166273e+00, 1.79963117e+00, 1.82960982e+00,
1.97884018e+00, 1.44938775e+00, 1.53269542e+00,
2.00181829e+00, 2.08524888e+00, 1.69891026e+00,
1.58832992e+00, 1.80430763e+00, 2.05462443e+00,
1.85818604e+00, 1.75264018e+00, 2.04633725e+00,
2.12946589e+00, 1.90725851e+00, 1.68391740e+00,
1.74623857e+00, 1.98983334e+00, 1.66740449e+00])
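The regression output is continuous, so it needs an extra step before it can be read as a class. One simple option, sketched here as an assumption rather than a recommended classifier, is to round each prediction to the nearest integer and clip it to the valid label range:

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression

iris = datasets.load_iris()
m = LinearRegression().fit(iris.data, iris.target)

raw = m.predict(iris.data)

# Round to the nearest integer and keep the result inside the label range 0..2
labels = np.clip(np.rint(raw), 0, 2).astype(int)

# Fraction of training samples whose rounded prediction matches the true label
accuracy = np.mean(labels == iris.target)
print(iris.target_names[labels[:5]], accuracy)
```

This treats the classes as evenly spaced points on a number line, which is an artifact of the label encoding and not a property of the species.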
In [11]:
#returns the coefficient of determination R^2 of the prediction
m.score(iris.data,iris.target)
Out[11]:
0.93042236753315966
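The score above is R^2 = 1 - SS_res / SS_tot, where SS_res is the residual sum of squares and SS_tot is the total sum of squares around the mean of the target. A brief sketch of computing it by hand (variable names are illustrative):

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression

iris = datasets.load_iris()
m = LinearRegression().fit(iris.data, iris.target)

pred = m.predict(iris.data)

# Residual sum of squares: squared error of the model's predictions
ss_res = np.sum((iris.target - pred) ** 2)
# Total sum of squares: squared error of always predicting the mean target
ss_tot = np.sum((iris.target - iris.target.mean()) ** 2)

r2 = 1 - ss_res / ss_tot
print(r2, m.score(iris.data, iris.target))
```

Because the model is scored on the same data it was fitted on, this value describes goodness of fit and not performance on unseen samples.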
Content source: harishkrao/Machine-Learning