In [1]:

    
# Import the scikit-learn datasets package and the LinearRegression estimator
from sklearn import datasets
from sklearn.linear_model import LinearRegression

# Load the iris dataset from the datasets package
iris = datasets.load_iris()
iris.data.shape, iris.target.shape  # (150, 4) and (150,); not displayed, since it is not the last expression in the cell

# Create a LinearRegression model (ordinary least squares, not logistic regression)
m = LinearRegression()

# Print the labels, class names, feature names, and the dataset description
print(iris.target, iris.target_names, iris.feature_names, iris.DESCR)

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2] ['setosa' 'versicolor' 'virginica'] ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)'] Iris Plants Database
====================

Notes
-----
Data Set Characteristics:
    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
    :Summary Statistics:

    ============== ==== ==== ======= ===== ====================
                    Min  Max   Mean    SD   Class Correlation
    ============== ==== ==== ======= ===== ====================
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20  0.76     0.9565  (high!)
    ============== ==== ==== ======= ===== ====================

    :Missing Attribute Values: None
    :Class Distribution: 33.3% for each of 3 classes.
    :Creator: R.A. Fisher
    :Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
    :Date: July, 1988

This is a copy of UCI ML iris datasets.
http://archive.ics.uci.edu/ml/datasets/Iris

The famous Iris database, first used by Sir R.A Fisher

This is perhaps the best known database to be found in the
pattern recognition literature.  Fisher's paper is a classic in the field and
is referenced frequently to this day.  (See Duda & Hart, for example.)  The
data set contains 3 classes of 50 instances each, where each class refers to a
type of iris plant.  One class is linearly separable from the other 2; the
latter are NOT linearly separable from each other.

References
----------
   - Fisher,R.A. "The use of multiple measurements in taxonomic problems"
     Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
     Mathematical Statistics" (John Wiley, NY, 1950).
   - Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis.
     (Q327.D83) John Wiley & Sons.  ISBN 0-471-22361-1.  See page 218.
   - Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
     Structure and Classification Rule for Recognition in Partially Exposed
     Environments".  IEEE Transactions on Pattern Analysis and Machine
     Intelligence, Vol. PAMI-2, No. 1, 67-71.
   - Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule".  IEEE Transactions
     on Information Theory, May 1972, 431-433.
   - See also: 1988 MLC Proceedings, 54-64.  Cheeseman et al"s AUTOCLASS II
     conceptual clustering system finds 3 classes in the data.
   - Many, many more ...



In [2]:

    
# View the feature matrix: 150 samples by 4 features
print(iris.data)

[[ 5.1  3.5  1.4  0.2]
 [ 4.9  3.   1.4  0.2]
 [ 4.7  3.2  1.3  0.2]
 [ 4.6  3.1  1.5  0.2]
 [ 5.   3.6  1.4  0.2]
 [ 5.4  3.9  1.7  0.4]
 [ 4.6  3.4  1.4  0.3]
 [ 5.   3.4  1.5  0.2]
 [ 4.4  2.9  1.4  0.2]
 [ 4.9  3.1  1.5  0.1]
 [ 5.4  3.7  1.5  0.2]
 [ 4.8  3.4  1.6  0.2]
 [ 4.8  3.   1.4  0.1]
 [ 4.3  3.   1.1  0.1]
 [ 5.8  4.   1.2  0.2]
 [ 5.7  4.4  1.5  0.4]
 [ 5.4  3.9  1.3  0.4]
 [ 5.1  3.5  1.4  0.3]
 [ 5.7  3.8  1.7  0.3]
 [ 5.1  3.8  1.5  0.3]
 [ 5.4  3.4  1.7  0.2]
 [ 5.1  3.7  1.5  0.4]
 [ 4.6  3.6  1.   0.2]
 [ 5.1  3.3  1.7  0.5]
 [ 4.8  3.4  1.9  0.2]
 [ 5.   3.   1.6  0.2]
 [ 5.   3.4  1.6  0.4]
 [ 5.2  3.5  1.5  0.2]
 [ 5.2  3.4  1.4  0.2]
 [ 4.7  3.2  1.6  0.2]
 [ 4.8  3.1  1.6  0.2]
 [ 5.4  3.4  1.5  0.4]
 [ 5.2  4.1  1.5  0.1]
 [ 5.5  4.2  1.4  0.2]
 [ 4.9  3.1  1.5  0.1]
 [ 5.   3.2  1.2  0.2]
 [ 5.5  3.5  1.3  0.2]
 [ 4.9  3.1  1.5  0.1]
 [ 4.4  3.   1.3  0.2]
 [ 5.1  3.4  1.5  0.2]
 [ 5.   3.5  1.3  0.3]
 [ 4.5  2.3  1.3  0.3]
 [ 4.4  3.2  1.3  0.2]
 [ 5.   3.5  1.6  0.6]
 [ 5.1  3.8  1.9  0.4]
 [ 4.8  3.   1.4  0.3]
 [ 5.1  3.8  1.6  0.2]
 [ 4.6  3.2  1.4  0.2]
 [ 5.3  3.7  1.5  0.2]
 [ 5.   3.3  1.4  0.2]
 [ 7.   3.2  4.7  1.4]
 [ 6.4  3.2  4.5  1.5]
 [ 6.9  3.1  4.9  1.5]
 [ 5.5  2.3  4.   1.3]
 [ 6.5  2.8  4.6  1.5]
 [ 5.7  2.8  4.5  1.3]
 [ 6.3  3.3  4.7  1.6]
 [ 4.9  2.4  3.3  1. ]
 [ 6.6  2.9  4.6  1.3]
 [ 5.2  2.7  3.9  1.4]
 [ 5.   2.   3.5  1. ]
 [ 5.9  3.   4.2  1.5]
 [ 6.   2.2  4.   1. ]
 [ 6.1  2.9  4.7  1.4]
 [ 5.6  2.9  3.6  1.3]
 [ 6.7  3.1  4.4  1.4]
 [ 5.6  3.   4.5  1.5]
 [ 5.8  2.7  4.1  1. ]
 [ 6.2  2.2  4.5  1.5]
 [ 5.6  2.5  3.9  1.1]
 [ 5.9  3.2  4.8  1.8]
 [ 6.1  2.8  4.   1.3]
 [ 6.3  2.5  4.9  1.5]
 [ 6.1  2.8  4.7  1.2]
 [ 6.4  2.9  4.3  1.3]
 [ 6.6  3.   4.4  1.4]
 [ 6.8  2.8  4.8  1.4]
 [ 6.7  3.   5.   1.7]
 [ 6.   2.9  4.5  1.5]
 [ 5.7  2.6  3.5  1. ]
 [ 5.5  2.4  3.8  1.1]
 [ 5.5  2.4  3.7  1. ]
 [ 5.8  2.7  3.9  1.2]
 [ 6.   2.7  5.1  1.6]
 [ 5.4  3.   4.5  1.5]
 [ 6.   3.4  4.5  1.6]
 [ 6.7  3.1  4.7  1.5]
 [ 6.3  2.3  4.4  1.3]
 [ 5.6  3.   4.1  1.3]
 [ 5.5  2.5  4.   1.3]
 [ 5.5  2.6  4.4  1.2]
 [ 6.1  3.   4.6  1.4]
 [ 5.8  2.6  4.   1.2]
 [ 5.   2.3  3.3  1. ]
 [ 5.6  2.7  4.2  1.3]
 [ 5.7  3.   4.2  1.2]
 [ 5.7  2.9  4.2  1.3]
 [ 6.2  2.9  4.3  1.3]
 [ 5.1  2.5  3.   1.1]
 [ 5.7  2.8  4.1  1.3]
 [ 6.3  3.3  6.   2.5]
 [ 5.8  2.7  5.1  1.9]
 [ 7.1  3.   5.9  2.1]
 [ 6.3  2.9  5.6  1.8]
 [ 6.5  3.   5.8  2.2]
 [ 7.6  3.   6.6  2.1]
 [ 4.9  2.5  4.5  1.7]
 [ 7.3  2.9  6.3  1.8]
 [ 6.7  2.5  5.8  1.8]
 [ 7.2  3.6  6.1  2.5]
 [ 6.5  3.2  5.1  2. ]
 [ 6.4  2.7  5.3  1.9]
 [ 6.8  3.   5.5  2.1]
 [ 5.7  2.5  5.   2. ]
 [ 5.8  2.8  5.1  2.4]
 [ 6.4  3.2  5.3  2.3]
 [ 6.5  3.   5.5  1.8]
 [ 7.7  3.8  6.7  2.2]
 [ 7.7  2.6  6.9  2.3]
 [ 6.   2.2  5.   1.5]
 [ 6.9  3.2  5.7  2.3]
 [ 5.6  2.8  4.9  2. ]
 [ 7.7  2.8  6.7  2. ]
 [ 6.3  2.7  4.9  1.8]
 [ 6.7  3.3  5.7  2.1]
 [ 7.2  3.2  6.   1.8]
 [ 6.2  2.8  4.8  1.8]
 [ 6.1  3.   4.9  1.8]
 [ 6.4  2.8  5.6  2.1]
 [ 7.2  3.   5.8  1.6]
 [ 7.4  2.8  6.1  1.9]
 [ 7.9  3.8  6.4  2. ]
 [ 6.4  2.8  5.6  2.2]
 [ 6.3  2.8  5.1  1.5]
 [ 6.1  2.6  5.6  1.4]
 [ 7.7  3.   6.1  2.3]
 [ 6.3  3.4  5.6  2.4]
 [ 6.4  3.1  5.5  1.8]
 [ 6.   3.   4.8  1.8]
 [ 6.9  3.1  5.4  2.1]
 [ 6.7  3.1  5.6  2.4]
 [ 6.9  3.1  5.1  2.3]
 [ 5.8  2.7  5.1  1.9]
 [ 6.8  3.2  5.9  2.3]
 [ 6.7  3.3  5.7  2.5]
 [ 6.7  3.   5.2  2.3]
 [ 6.3  2.5  5.   1.9]
 [ 6.5  3.   5.2  2. ]
 [ 6.2  3.4  5.4  2.3]
 [ 5.9  3.   5.1  1.8]]



In [3]:

    
# fit - fit the model to the training data (features iris.data, targets iris.target)
m.fit(iris.data, iris.target)

Out[3]:

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)



In [5]:

    
# decision_function - for LinearRegression this returns the predicted values,
# the same array that predict() returns in Out[8]; it is not a signed distance to a hyperplane.
# DeprecationWarning: decision_function is deprecated and will be removed in 0.19; use predict() instead.
m.decision_function(iris.data)

C:\Users\priyu\Anaconda3\lib\site-packages\sklearn\utils\deprecation.py:70: DeprecationWarning: Function decision_function is deprecated;  and will be removed in 0.19.
  warnings.warn(msg, category=DeprecationWarning)

Out[5]:

array([ -8.26582725e-02,  -3.85897565e-02,  -4.81896914e-02,
         1.26087761e-02,  -7.61081708e-02,   5.68023484e-02,
         3.76259158e-02,  -4.45599433e-02,   2.07050198e-02,
        -8.13030749e-02,  -1.01728663e-01,   8.84875996e-05,
        -8.86050221e-02,  -1.01834705e-01,  -2.26997797e-01,
        -4.36405904e-02,  -3.39982044e-02,  -2.16688605e-02,
        -3.26854579e-02,  -1.22408563e-02,  -4.30562522e-02,
         5.31726003e-02,  -1.23012138e-01,   1.77258467e-01,
         6.81889023e-02,  -4.16362637e-03,   1.00119019e-01,
        -7.09322806e-02,  -8.92083742e-02,   1.99107233e-02,
         1.33606216e-02,   3.35222953e-02,  -1.58465961e-01,
        -1.57523171e-01,  -8.13030749e-02,  -1.03812269e-01,
        -1.49254996e-01,  -8.13030749e-02,  -6.41916305e-03,
        -5.55340896e-02,  -3.33948524e-02,   7.45644153e-02,
        -1.52672524e-02,   2.17673798e-01,   1.39549109e-01,
         3.33738018e-02,  -5.05301301e-02,  -1.45154068e-02,
        -9.07545163e-02,  -6.28360368e-02,   1.20308259e+00,
         1.28451660e+00,   1.32487047e+00,   1.18762080e+00,
         1.31393877e+00,   1.25705298e+00,   1.39745639e+00,
         9.07172433e-01,   1.17656176e+00,   1.24113634e+00,
         9.59294742e-01,   1.28013501e+00,   9.54205881e-01,
         1.31512204e+00,   1.05930184e+00,   1.17232866e+00,
         1.38115786e+00,   9.76734088e-01,   1.35070534e+00,
         1.02311961e+00,   1.59045598e+00,   1.09965570e+00,
         1.41725961e+00,   1.19756726e+00,   1.13040963e+00,
         1.18772685e+00,   1.26542720e+00,   1.49592176e+00,
         1.34168532e+00,   8.55931450e-01,   1.01581766e+00,
         9.32128108e-01,   1.05331264e+00,   1.54772365e+00,
         1.40310615e+00,   1.38055451e+00,   1.30141848e+00,
         1.19062819e+00,   1.16837848e+00,   1.17877271e+00,
         1.20415981e+00,   1.28799785e+00,   1.08043682e+00,
         9.00622332e-01,   1.20435076e+00,   1.11911506e+00,
         1.18452852e+00,   1.15235793e+00,   8.73689093e-01,
         1.16625243e+00,   2.24146289e+00,   1.75264018e+00,
         1.90028407e+00,   1.74143264e+00,   2.00441822e+00,
         2.00431431e+00,   1.60207593e+00,   1.79059214e+00,
         1.76063251e+00,   2.15212358e+00,   1.71469034e+00,
         1.73219558e+00,   1.84240596e+00,   1.81075169e+00,
         2.05316319e+00,   1.95403300e+00,   1.69236016e+00,
         2.04163735e+00,   2.20111558e+00,   1.48615432e+00,
         1.98996282e+00,   1.78575356e+00,   1.96389898e+00,
         1.59137976e+00,   1.88550825e+00,   1.72019374e+00,
         1.57522972e+00,   1.60005592e+00,   1.91785077e+00,
         1.56166273e+00,   1.79963117e+00,   1.82960982e+00,
         1.97884018e+00,   1.44938775e+00,   1.53269542e+00,
         2.00181829e+00,   2.08524888e+00,   1.69891026e+00,
         1.58832992e+00,   1.80430763e+00,   2.05462443e+00,
         1.85818604e+00,   1.75264018e+00,   2.04633725e+00,
         2.12946589e+00,   1.90725851e+00,   1.68391740e+00,
         1.74623857e+00,   1.98983334e+00,   1.66740449e+00])
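
The warning above says decision_function is on its way out, and Out[5] matches the array that predict returns in Out[8]. A minimal sketch of the same step written with predict only, assuming a scikit-learn release where the deprecated method may no longer exist. The exact numbers can differ slightly from the arrays in this notebook, depending on the installed version and its copy of the iris data.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression

iris = datasets.load_iris()
m = LinearRegression().fit(iris.data, iris.target)

# predict() gives the fitted values that decision_function() returned in Out[5].
preds = m.predict(iris.data)

# Check whether the deprecated method is still present in the installed release.
print(hasattr(m, "decision_function"))
print(preds[:3])
```

The hasattr check is only there to make the version difference visible; the rest of the notebook needs nothing beyond predict.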



In [7]:

    
# get_params - return the estimator's hyperparameters as a dict
m.get_params()

Out[7]:

{'copy_X': True, 'fit_intercept': True, 'n_jobs': 1, 'normalize': False}



In [8]:

    
# predict - LinearRegression returns continuous values, not class labels.
# Rounding to the nearest integer gives the class: 0-Setosa, 1-Versicolour, 2-Virginica
m.predict(iris.data)

Out[8]:

array([ -8.26582725e-02,  -3.85897565e-02,  -4.81896914e-02,
         1.26087761e-02,  -7.61081708e-02,   5.68023484e-02,
         3.76259158e-02,  -4.45599433e-02,   2.07050198e-02,
        -8.13030749e-02,  -1.01728663e-01,   8.84875996e-05,
        -8.86050221e-02,  -1.01834705e-01,  -2.26997797e-01,
        -4.36405904e-02,  -3.39982044e-02,  -2.16688605e-02,
        -3.26854579e-02,  -1.22408563e-02,  -4.30562522e-02,
         5.31726003e-02,  -1.23012138e-01,   1.77258467e-01,
         6.81889023e-02,  -4.16362637e-03,   1.00119019e-01,
        -7.09322806e-02,  -8.92083742e-02,   1.99107233e-02,
         1.33606216e-02,   3.35222953e-02,  -1.58465961e-01,
        -1.57523171e-01,  -8.13030749e-02,  -1.03812269e-01,
        -1.49254996e-01,  -8.13030749e-02,  -6.41916305e-03,
        -5.55340896e-02,  -3.33948524e-02,   7.45644153e-02,
        -1.52672524e-02,   2.17673798e-01,   1.39549109e-01,
         3.33738018e-02,  -5.05301301e-02,  -1.45154068e-02,
        -9.07545163e-02,  -6.28360368e-02,   1.20308259e+00,
         1.28451660e+00,   1.32487047e+00,   1.18762080e+00,
         1.31393877e+00,   1.25705298e+00,   1.39745639e+00,
         9.07172433e-01,   1.17656176e+00,   1.24113634e+00,
         9.59294742e-01,   1.28013501e+00,   9.54205881e-01,
         1.31512204e+00,   1.05930184e+00,   1.17232866e+00,
         1.38115786e+00,   9.76734088e-01,   1.35070534e+00,
         1.02311961e+00,   1.59045598e+00,   1.09965570e+00,
         1.41725961e+00,   1.19756726e+00,   1.13040963e+00,
         1.18772685e+00,   1.26542720e+00,   1.49592176e+00,
         1.34168532e+00,   8.55931450e-01,   1.01581766e+00,
         9.32128108e-01,   1.05331264e+00,   1.54772365e+00,
         1.40310615e+00,   1.38055451e+00,   1.30141848e+00,
         1.19062819e+00,   1.16837848e+00,   1.17877271e+00,
         1.20415981e+00,   1.28799785e+00,   1.08043682e+00,
         9.00622332e-01,   1.20435076e+00,   1.11911506e+00,
         1.18452852e+00,   1.15235793e+00,   8.73689093e-01,
         1.16625243e+00,   2.24146289e+00,   1.75264018e+00,
         1.90028407e+00,   1.74143264e+00,   2.00441822e+00,
         2.00431431e+00,   1.60207593e+00,   1.79059214e+00,
         1.76063251e+00,   2.15212358e+00,   1.71469034e+00,
         1.73219558e+00,   1.84240596e+00,   1.81075169e+00,
         2.05316319e+00,   1.95403300e+00,   1.69236016e+00,
         2.04163735e+00,   2.20111558e+00,   1.48615432e+00,
         1.98996282e+00,   1.78575356e+00,   1.96389898e+00,
         1.59137976e+00,   1.88550825e+00,   1.72019374e+00,
         1.57522972e+00,   1.60005592e+00,   1.91785077e+00,
         1.56166273e+00,   1.79963117e+00,   1.82960982e+00,
         1.97884018e+00,   1.44938775e+00,   1.53269542e+00,
         2.00181829e+00,   2.08524888e+00,   1.69891026e+00,
         1.58832992e+00,   1.80430763e+00,   2.05462443e+00,
         1.85818604e+00,   1.75264018e+00,   2.04633725e+00,
         2.12946589e+00,   1.90725851e+00,   1.68391740e+00,
         1.74623857e+00,   1.98983334e+00,   1.66740449e+00])
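
The values in Out[8] are continuous, so they have to be mapped to 0, 1, or 2 before they can be read as species. A minimal sketch of one way to do that, rounding to the nearest integer and clipping to the valid range. The helper to_class is hypothetical, not part of scikit-learn, and the four scores are copied from Out[8] (samples 0, 50, 70, and 100).

```python
# Scores copied from Out[8], paired with the true labels printed by In [1].
samples = [
    (-8.26582725e-02, 0),  # sample 0, setosa
    (1.20308259e+00, 1),   # sample 50, versicolor
    (1.59045598e+00, 1),   # sample 70, versicolor with a high score
    (2.24146289e+00, 2),   # sample 100, virginica
]

def to_class(score, n_classes=3):
    """Round a continuous score to the nearest valid class index (hypothetical helper)."""
    return min(max(int(round(score)), 0), n_classes - 1)

names = ["setosa", "versicolor", "virginica"]
for score, true_label in samples:
    predicted = to_class(score)
    print(f"{score:+.3f} -> {names[predicted]} (true: {names[true_label]})")
```

Sample 70 rounds to 2, so this rule labels a versicolor flower as virginica. A classifier such as LogisticRegression is usually a better fit for this kind of target.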



In [11]:

    
# score - returns the coefficient of determination R^2 of the prediction
m.score(iris.data, iris.target)

Out[11]:

0.93042236753315966
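
The score above is R^2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the targets from their mean. A minimal sketch of that formula in plain Python, using six fitted values copied from Out[8] (the first two samples of each class). The helper r_squared is hypothetical, and because this is only a subset, the result will not equal the 0.930 computed over all 150 samples.

```python
# True labels and fitted values for samples 0, 1, 50, 51, 100, 101 (from Out[8]).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [-8.26582725e-02, -3.85897565e-02,
          1.20308259e+00, 1.28451660e+00,
          2.24146289e+00, 1.75264018e+00]

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

print(r_squared(y_true, y_pred))
```

R^2 measures how closely the continuous predictions track the numeric labels. It is not a classification accuracy.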