Basic Metrics

When we think about summarizing data, what are the metrics that we look at?

In this notebook, we will look at the car dataset

To read how the data was acquired, please read this repo to get more information



In [2]:

    
#Import the required libraries
import numpy as np
import pandas as pd
from datetime import datetime as dt
from scipy import stats

Read the dataset



In [3]:

    
cars = pd.read_csv("cars_v1.csv", encoding = "ISO-8859-1")

Warm up



In [4]:

    
cars.head()









    Out[4]:






  
    
      
      Make
      Model
      Price
      Type
      ABS
      BootSpace
      GearType
      AirBag
      Engine
      FuelCapacity
      Mileage
    
  
  
    
      0
      Ashok Leyland Stile
      Ashok Leyland Stile LE 8-STR (Diesel)
      750
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      1
      Ashok Leyland Stile
      Ashok Leyland Stile LS 8-STR (Diesel)
      800
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      2
      Ashok Leyland Stile
      Ashok Leyland Stile LX 8-STR (Diesel)
      830
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      3
      Ashok Leyland Stile
      Ashok Leyland Stile LS 7-STR (Diesel)
      850
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      4
      Ashok Leyland Stile
      Ashok Leyland Stile LS 7-STR Alloy (Diesel)
      880
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7

Exercise



In [5]:

    
#Display the first 10 records
cars.head(10)









    Out[5]:






  
    
      
      Make
      Model
      Price
      Type
      ABS
      BootSpace
      GearType
      AirBag
      Engine
      FuelCapacity
      Mileage
    
  
  
    
      0
      Ashok Leyland Stile
      Ashok Leyland Stile LE 8-STR (Diesel)
      750
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      1
      Ashok Leyland Stile
      Ashok Leyland Stile LS 8-STR (Diesel)
      800
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      2
      Ashok Leyland Stile
      Ashok Leyland Stile LX 8-STR (Diesel)
      830
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      3
      Ashok Leyland Stile
      Ashok Leyland Stile LS 7-STR (Diesel)
      850
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      4
      Ashok Leyland Stile
      Ashok Leyland Stile LS 7-STR Alloy (Diesel)
      880
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      5
      Ashok Leyland Stile
      Ashok Leyland Stile LX 7-STR (Diesel)
      900
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      6
      Ashok Leyland Stile
      Ashok Leyland Stile LX 7-STR Alloy (Diesel)
      930
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      7
      Aston Martin Rapide
      Aston Martin Rapide LUXE (Petrol)
      35000
      Sedan
      Yes
      300.0
      Automatic
      Yes
      5935.0
      90.5
      7.0
    
    
      8
      Aston Martin Rapide S
      Aston Martin Rapide S (Petrol)
      44000
      Sedan
      Yes
      NaN
      Automatic
      Yes
      5935.0
      90.0
      11.9
    
    
      9
      Aston Martin V12 Vantage
      Aston Martin V12 Vantage Coupe (Petrol)
      35000
      Coupe
      Yes
      300.0
      Automatic
      Yes
      5935.0
      80.0
      9.0



In [6]:

    
#Display the last 5 records
cars.tail()









    Out[6]:






  
    
      
      Make
      Model
      Price
      Type
      ABS
      BootSpace
      GearType
      AirBag
      Engine
      FuelCapacity
      Mileage
    
  
  
    
      828
      Volvo V40
      Volvo V40 D3 R-Design (Diesel)
      2770
      Hatchback
      Yes
      324.0
      Automatic
      Yes
      1984.0
      62.0
      27.8
    
    
      829
      Volvo V40 Cross Country
      Volvo V40 D3 Cross Country (Diesel)
      3180
      Hatchback
      Yes
      335.0
      Automatic
      Yes
      1984.0
      60.0
      16.8
    
    
      830
      Volvo V40 Cross Country
      Volvo V40 T4 Cross Country (Petrol)
      3240
      Hatchback
      Yes
      324.0
      Automatic
      Yes
      1596.0
      50.0
      16.0
    
    
      831
      Volvo XC90
      Volvo XC 90 D5 Momentum(Diesel)
      6490
      SUV
      Yes
      NaN
      Automatic
      Yes
      1969.0
      68.0
      11.1
    
    
      832
      Volvo XC90
      Volvo XC 90 D5 Inscription(Diesel)
      7790
      SUV
      Yes
      NaN
      Automatic
      Yes
      1969.0
      68.0
      11.1



In [7]:

    
#Find the number of rows and columns in the dataset
cars.shape









    Out[7]:





(833, 11)



In [8]:

    
#What are the column names in the dataset?
cars.columns









    Out[8]:





Index(['Make', 'Model', 'Price', 'Type', 'ABS', 'BootSpace', 'GearType',
       'AirBag', 'Engine', 'FuelCapacity', 'Mileage'],
      dtype='object')



In [9]:

    
#What are the types of those columns ? 
cars.dtypes









    Out[9]:





Make             object
Model            object
Price             int64
Type             object
ABS              object
BootSpace       float64
GearType         object
AirBag           object
Engine          float64
FuelCapacity    float64
Mileage         float64
dtype: object



In [ ]:

    
cars.head()



In [11]:

    
#How to check if there are null values in any of the columns?

#Hint: use the isnull() function  (how about using sum or values/any with it?)
cars.isnull().sum()









    Out[11]:





Make              0
Model             0
Price             0
Type              0
ABS              16
BootSpace       179
GearType         16
AirBag           21
Engine            7
FuelCapacity      0
Mileage         171
dtype: int64



In [20]:









    Out[20]:





Make              0
Model             0
Price             0
Type              0
ABS              16
BootSpace       179
GearType         16
AirBag           21
Engine            7
FuelCapacity      0
Mileage         171
dtype: int64

How to handle missing values?



In [15]:

    
#fillna function

Mean, Median, Variance, Standard Deviation

Mean

arithmetic average of a range of values or quantities, computed by dividing the total of all values by the number of values.



In [17]:

    
#Find mean of price
cars.Price.mean()









    Out[17]:





3159.4957983193276



In [18]:

    
#Find mean of Mileage
cars.Mileage.mean()









    Out[18]:





17.480407854984893

Let's do something fancier. Let's find mean mileage of every make.

Hint: need to use groupby



In [23]:

    
#cars.groupby('Make') : Finish the code
cars.groupby('Make').Mileage.mean().reset_index()









    Out[23]:






  
    
      
      Make
      Mileage
    
  
  
    
      0
      Ashok Leyland Stile
      20.700000
    
    
      1
      Aston Martin Rapide
      7.000000
    
    
      2
      Aston Martin Rapide S
      11.900000
    
    
      3
      Aston Martin V12 Vantage
      9.000000
    
    
      4
      Aston Martin V8 Vantage
      5.000000
    
    
      5
      Aston Martin Vanquish
      8.000000
    
    
      6
      Audi A3 Cabriolet
      19.186667
    
    
      7
      Audi A4
      14.804000
    
    
      8
      Audi A6
      15.260000
    
    
      9
      Audi A7
      14.400000
    
    
      10
      Audi A8 L
      12.575000
    
    
      11
      Audi Q3
      16.260000
    
    
      12
      Audi Q5
      13.063333
    
    
      13
      Audi Q7
      12.466667
    
    
      14
      Audi R8
      9.515000
    
    
      15
      Audi RS5
      9.000000
    
    
      16
      Audi RS6
      10.410000
    
    
      17
      Audi RS7
      13.300000
    
    
      18
      Audi S6
      10.300000
    
    
      19
      Audi TT
      8.300000
    
    
      20
      BMW 1 Series
      20.580000
    
    
      21
      BMW 3 Series
      17.857500
    
    
      22
      BMW 3 Series GT
      19.590000
    
    
      23
      BMW 5 Series
      13.982500
    
    
      24
      BMW 5 Series GT
      NaN
    
    
      25
      BMW 6 Series
      17.540000
    
    
      26
      BMW 7 Series
      11.962500
    
    
      27
      BMW M3
      10.750000
    
    
      28
      BMW M5
      13.250000
    
    
      29
      BMW M6 Gran Coupe
      10.100000
    
    
      ...
      ...
      ...
    
    
      173
      Tata Movus
      NaN
    
    
      174
      Tata Nano
      24.083333
    
    
      175
      Tata Safari DICOR
      13.930000
    
    
      176
      Tata Safari Storme
      13.875000
    
    
      177
      Tata Sumo Gold
      14.612500
    
    
      178
      Tata Sumo Grande
      13.550000
    
    
      179
      Tata Venture
      15.420000
    
    
      180
      Tata Winger
      NaN
    
    
      181
      Tata Xenon
      12.745000
    
    
      182
      Tata Zest
      20.600000
    
    
      183
      Toyota Corolla Altis 2014
      21.322500
    
    
      184
      Toyota Etios
      20.185000
    
    
      185
      Toyota Etios Cross
      13.300000
    
    
      186
      Toyota Etios Liva
      NaN
    
    
      187
      Toyota Etios Xclusive
      16.780000
    
    
      188
      Toyota Fortuner
      12.550000
    
    
      189
      Volkswagen Cross Polo
      19.285000
    
    
      190
      Volkswagen New JETTA
      NaN
    
    
      191
      Volkswagen New Phaeton
      12.800000
    
    
      192
      Volkswagen Passat
      17.286667
    
    
      193
      Volkswagen Polo
      18.305000
    
    
      194
      Volkswagen Polo GT
      18.450000
    
    
      195
      Volkswagen Taigun
      NaN
    
    
      196
      Volkswagen Up
      21.000000
    
    
      197
      Volkswagen Vento
      18.957000
    
    
      198
      Volvo S60
      19.133333
    
    
      199
      Volvo S80
      16.660000
    
    
      200
      Volvo V40
      27.800000
    
    
      201
      Volvo V40 Cross Country
      16.400000
    
    
      202
      Volvo XC90
      11.100000
    
  

203 rows × 2 columns

Exercise

How about finding the average mileage for every Type-GearType combination?



In [46]:









    Out[46]:






  
    
      
      Type
      GearType
      Mileage
    
  
  
    
      0
      Convertible
      Automatic
      12.147143
    
    
      1
      Coupe
      Automatic
      9.746071
    
    
      2
      Hatchback
      Automatic
      19.446111
    
    
      3
      Hatchback
      Manual
      20.804878
    
    
      4
      Hatchback
      No
      21.748000
    
    
      5
      MPV
      Automatic
      20.434286
    
    
      6
      MPV
      Manual
      18.825862
    
    
      7
      MUV
      Automatic
      12.466667
    
    
      8
      MUV
      Manual
      15.634091
    
    
      9
      MUV
      No
      13.912000
    
    
      10
      SUV
      Automatic
      14.603176
    
    
      11
      SUV
      Manual
      15.772308
    
    
      12
      SUV
      No
      11.973333
    
    
      13
      Sedan
      Automatic
      16.670000
    
    
      14
      Sedan
      Manual
      20.143626
    
    
      15
      Sedan
      No
      19.500000

Median

Denotes value or quantity lying at the midpoint of a frequency distribution of observed values or quantities, such that there is an equal probability of falling above or below it. Simply put, it is the middle value in the list of numbers.

If count is odd, the median is the value at (n+1)/2,

else it is the average of n/2 and (n+1)/2

Find median of mileage



In [24]:

    
cars.Mileage.median()









    Out[24]:





17.985

Mode

It is the number which appears most often in a set of numbers.

Find the mode of Type of cars



In [55]:

    
#Let's first find count of each of the car Types
#Hint: use value_counts



In [26]:

    
cars.Type.value_counts()









    Out[26]:





Sedan          294
Hatchback      222
SUV            186
MPV             47
MUV             40
Coupe           33
Convertible     11
Name: Type, dtype: int64



In [56]:

    
#Mode of cars



In [30]:

    
cars.Type









    Out[30]:





0              MPV
1              MPV
2              MPV
3              MPV
4              MPV
5              MPV
6              MPV
7            Sedan
8            Sedan
9            Coupe
10           Coupe
11           Coupe
12           Coupe
13           Sedan
14           Sedan
15           Sedan
16           Sedan
17           Sedan
18           Sedan
19     Convertible
20           Sedan
21           Sedan
22           Sedan
23           Sedan
24           Sedan
25           Sedan
26           Sedan
27           Sedan
28           Sedan
29           Sedan
          ...     
803      Hatchback
804      Hatchback
805      Hatchback
806      Hatchback
807      Hatchback
808      Hatchback
809            SUV
810      Hatchback
811          Sedan
812          Sedan
813          Sedan
814          Sedan
815          Sedan
816          Sedan
817          Sedan
818          Sedan
819          Sedan
820          Sedan
821          Sedan
822          Sedan
823          Sedan
824          Sedan
825          Sedan
826          Sedan
827      Hatchback
828      Hatchback
829      Hatchback
830      Hatchback
831            SUV
832            SUV
Name: Type, dtype: object



In [28]:

    
cars.Type.mode()









    Out[28]:





0    Sedan
dtype: object



In [29]:

    
cars.head()









    Out[29]:






  
    
      
      Make
      Model
      Price
      Type
      ABS
      BootSpace
      GearType
      AirBag
      Engine
      FuelCapacity
      Mileage
    
  
  
    
      0
      Ashok Leyland Stile
      Ashok Leyland Stile LE 8-STR (Diesel)
      750
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      1
      Ashok Leyland Stile
      Ashok Leyland Stile LS 8-STR (Diesel)
      800
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      2
      Ashok Leyland Stile
      Ashok Leyland Stile LX 8-STR (Diesel)
      830
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      3
      Ashok Leyland Stile
      Ashok Leyland Stile LS 7-STR (Diesel)
      850
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      4
      Ashok Leyland Stile
      Ashok Leyland Stile LS 7-STR Alloy (Diesel)
      880
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7

Variance

Once two statistician of height 4 feet and 5 feet have to cross a river of AVERAGE depth 3 feet. Meanwhile, a third person comes and said, "what are you waiting for? You can easily cross the river"

It's the average distance of the data values from the mean

Find variance of mileage



In [31]:

    
cars.Mileage.var()









    Out[31]:





21.018811179847432

Standard Deviation

It is the square root of variance. This will have the same units as the data and mean.

Find standard deviation of mileage



In [32]:

    
cars.Mileage.std()









    Out[32]:





4.5846277035161132

Using Pandas built-in function



In [33]:

    
cars.describe()









    



/Users/amitkaps/miniconda3/lib/python3.5/site-packages/numpy/lib/function_base.py:3834: RuntimeWarning: Invalid value encountered in percentile
  RuntimeWarning)






    Out[33]:






  
    
      
      Price
      BootSpace
      Engine
      FuelCapacity
      Mileage
    
  
  
    
      count
      833.000000
      654.000000
      826.000000
      833.000000
      662.000000
    
    
      mean
      3159.495798
      409.536697
      1900.533898
      51.341693
      17.480408
    
    
      std
      6272.519112
      175.351525
      1065.149758
      17.643132
      4.584628
    
    
      min
      199.000000
      80.000000
      624.000000
      0.000000
      5.000000
    
    
      25%
      619.000000
      NaN
      NaN
      42.000000
      NaN
    
    
      50%
      880.000000
      NaN
      NaN
      50.000000
      NaN
    
    
      75%
      2329.000000
      NaN
      NaN
      64.000000
      NaN
    
    
      max
      50000.000000
      1000.000000
      6592.000000
      120.000000
      30.000000

Co-variance

covariance as a measure of the (average) co-variation between two variables, say x and y. Covariance describes both how far the variables are spread out, and the nature of their relationship, Covariance is a measure of how much two variables change together. Compare this to Variance, which is just the range over which one measure (or variable) varies.

Co-variance of mileage of Automatic and Manual Gear Type



In [34]:

    
pd.unique(cars.GearType)









    Out[34]:





array([' Manual', ' Automatic', nan, ' No'], dtype=object)



In [36]:

    
cars_Automatic = cars[cars.GearType==' Automatic'].copy().reset_index()



In [37]:

    
cars_Manual = cars[cars.GearType==' Manual'].copy().reset_index()



In [38]:

    
cars_Automatic.head()









    Out[38]:






  
    
      
      index
      Make
      Model
      Price
      Type
      ABS
      BootSpace
      GearType
      AirBag
      Engine
      FuelCapacity
      Mileage
    
  
  
    
      0
      7
      Aston Martin Rapide
      Aston Martin Rapide LUXE (Petrol)
      35000
      Sedan
      Yes
      300.0
      Automatic
      Yes
      5935.0
      90.5
      7.0
    
    
      1
      8
      Aston Martin Rapide S
      Aston Martin Rapide S (Petrol)
      44000
      Sedan
      Yes
      NaN
      Automatic
      Yes
      5935.0
      90.0
      11.9
    
    
      2
      9
      Aston Martin V12 Vantage
      Aston Martin V12 Vantage Coupe (Petrol)
      35000
      Coupe
      Yes
      300.0
      Automatic
      Yes
      5935.0
      80.0
      9.0
    
    
      3
      10
      Aston Martin V8 Vantage
      Aston Martin V8 Vantage Coupe (Petrol)
      13500
      Coupe
      Yes
      300.0
      Automatic
      Yes
      4735.0
      80.0
      5.0
    
    
      4
      11
      Aston Martin V8 Vantage
      Aston Martin V8 Vantage S Coupe (Petrol)
      25500
      Coupe
      Yes
      300.0
      Automatic
      Yes
      4735.0
      80.0
      5.0



In [85]:

    
cars_Manual.head()









    Out[85]:






  
    
      
      index
      Make
      Model
      Price
      Type
      ABS
      BootSpace
      GearType
      AirBag
      Engine
      FuelCapacity
      Mileage
    
  
  
    
      0
      0
      Ashok Leyland Stile
      Ashok Leyland Stile LE 8-STR (Diesel)
      750
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      1
      1
      Ashok Leyland Stile
      Ashok Leyland Stile LS 8-STR (Diesel)
      800
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      2
      2
      Ashok Leyland Stile
      Ashok Leyland Stile LX 8-STR (Diesel)
      830
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      3
      3
      Ashok Leyland Stile
      Ashok Leyland Stile LS 7-STR (Diesel)
      850
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7
    
    
      4
      4
      Ashok Leyland Stile
      Ashok Leyland Stile LS 7-STR Alloy (Diesel)
      880
      MPV
      No
      500.0
      Manual
      No
      1461.0
      50.0
      20.7



In [86]:

    
cars_Manual.shape









    Out[86]:





(421, 12)



In [87]:

    
cars_Automatic.shape









    Out[87]:





(372, 12)

The number of observations have to be same. For the current exercise, let's take the first 300 observations in both the datasets



In [39]:

    
cars_Automatic = cars_Automatic.ix[:299,:]
cars_Manual = cars_Manual.ix[:299,:]



In [92]:

    
cars_Automatic.shape









    Out[92]:





(300, 12)



In [93]:

    
cars_Manual.shape









    Out[93]:





(300, 12)



In [40]:

    
cars_manual_automatic = pd.DataFrame([cars_Automatic.Mileage, cars_Manual.Mileage])



In [97]:

    
cars_manual_automatic









    Out[97]:






  
    
      
      0
      1
      2
      3
      4
      5
      6
      7
      8
      9
      ...
      290
      291
      292
      293
      294
      295
      296
      297
      298
      299
    
  
  
    
      Mileage
      7.0
      11.9
      9.0
      5.0
      5.0
      5.0
      8.0
      16.60
      17.0
      16.55
      ...
      13.1
      13.10
      17.2
      13.7
      17.20
      14.80
      NaN
      NaN
      NaN
      19.30
    
    
      Mileage
      20.7
      20.7
      20.7
      20.7
      20.7
      20.7
      20.7
      17.32
      13.7
      13.70
      ...
      20.5
      20.45
      20.5
      19.0
      19.01
      19.01
      19.01
      13.05
      19.87
      19.87
    
  

2 rows × 300 columns



In [41]:

    
cars_manual_automatic = cars_manual_automatic.T



In [101]:

    
cars_manual_automatic.head()



In [42]:

    
cars_manual_automatic.columns = ['Mileage_Automatic', 'Mileage_Manual']



In [104]:

    
cars_manual_automatic.head()









    Out[104]:






  
    
      
      Mileage_Automatic
      Mileage_Manual
    
  
  
    
      0
      7.0
      20.7
    
    
      1
      11.9
      20.7
    
    
      2
      9.0
      20.7
    
    
      3
      5.0
      20.7
    
    
      4
      5.0
      20.7



In [44]:

    
#Co-variance matrix between the mileages of automatic and manual:
cars_manual_automatic.cov()









    Out[44]:






  
    
      
      Mileage_Automatic
      Mileage_Manual
    
  
  
    
      Mileage_Automatic
      22.375515
      0.446292
    
    
      Mileage_Manual
      0.446292
      12.776373



In [ ]:

Correlation

Extent to which two or more variables fluctuate together. A positive correlation indicates the extent to which those variables increase or decrease in parallel; a negative correlation indicates the extent to which one variable increases as the other decreases.



In [106]:

    
#### Find the correlation between the mileages of automatic and manual in the above dataset



In [45]:

    
cars_manual_automatic.corr()









    Out[45]:






  
    
      
      Mileage_Automatic
      Mileage_Manual
    
  
  
    
      Mileage_Automatic
      1.000000
      0.026011
    
    
      Mileage_Manual
      0.026011
      1.000000



In [46]:

    
cars_manual_automatic.corrwith?

Correlation != Causation

correlation between two variables does not necessarily imply that one causes the other.



In [ ]:

	Make	Model	Price	Type	ABS	BootSpace	GearType	AirBag	Engine	FuelCapacity	Mileage
0	Ashok Leyland Stile	Ashok Leyland Stile LE 8-STR (Diesel)	750	MPV	No	500.0	Manual	No	1461.0	50.0	20.7
1	Ashok Leyland Stile	Ashok Leyland Stile LS 8-STR (Diesel)	800	MPV	No	500.0	Manual	No	1461.0	50.0	20.7
2	Ashok Leyland Stile	Ashok Leyland Stile LX 8-STR (Diesel)	830	MPV	No	500.0	Manual	No	1461.0	50.0	20.7
3	Ashok Leyland Stile	Ashok Leyland Stile LS 7-STR (Diesel)	850	MPV	No	500.0	Manual	No	1461.0	50.0	20.7
4	Ashok Leyland Stile	Ashok Leyland Stile LS 7-STR Alloy (Diesel)	880	MPV	No	500.0	Manual	No	1461.0	50.0	20.7

	Make	Model	Price	Type	ABS	BootSpace	GearType	AirBag	Engine	FuelCapacity	Mileage
828	Volvo V40	Volvo V40 D3 R-Design (Diesel)	2770	Hatchback	Yes	324.0	Automatic	Yes	1984.0	62.0	27.8
829	Volvo V40 Cross Country	Volvo V40 D3 Cross Country (Diesel)	3180	Hatchback	Yes	335.0	Automatic	Yes	1984.0	60.0	16.8
830	Volvo V40 Cross Country	Volvo V40 T4 Cross Country (Petrol)	3240	Hatchback	Yes	324.0	Automatic	Yes	1596.0	50.0	16.0
831	Volvo XC90	Volvo XC 90 D5 Momentum(Diesel)	6490	SUV	Yes	NaN	Automatic	Yes	1969.0	68.0	11.1
832	Volvo XC90	Volvo XC 90 D5 Inscription(Diesel)	7790	SUV	Yes	NaN	Automatic	Yes	1969.0	68.0	11.1

	Make	Mileage
0	Ashok Leyland Stile	20.700000
1	Aston Martin Rapide	7.000000
2	Aston Martin Rapide S	11.900000
3	Aston Martin V12 Vantage	9.000000
4	Aston Martin V8 Vantage	5.000000
5	Aston Martin Vanquish	8.000000
6	Audi A3 Cabriolet	19.186667
7	Audi A4	14.804000
8	Audi A6	15.260000
9	Audi A7	14.400000
10	Audi A8 L	12.575000
11	Audi Q3	16.260000
12	Audi Q5	13.063333
13	Audi Q7	12.466667
14	Audi R8	9.515000
15	Audi RS5	9.000000
16	Audi RS6	10.410000
17	Audi RS7	13.300000
18	Audi S6	10.300000
19	Audi TT	8.300000
20	BMW 1 Series	20.580000
21	BMW 3 Series	17.857500
22	BMW 3 Series GT	19.590000
23	BMW 5 Series	13.982500
24	BMW 5 Series GT	NaN
25	BMW 6 Series	17.540000
26	BMW 7 Series	11.962500
27	BMW M3	10.750000
28	BMW M5	13.250000
29	BMW M6 Gran Coupe	10.100000
...	...	...
173	Tata Movus	NaN
174	Tata Nano	24.083333
175	Tata Safari DICOR	13.930000
176	Tata Safari Storme	13.875000
177	Tata Sumo Gold	14.612500
178	Tata Sumo Grande	13.550000
179	Tata Venture	15.420000
180	Tata Winger	NaN
181	Tata Xenon	12.745000
182	Tata Zest	20.600000
183	Toyota Corolla Altis 2014	21.322500
184	Toyota Etios	20.185000
185	Toyota Etios Cross	13.300000
186	Toyota Etios Liva	NaN
187	Toyota Etios Xclusive	16.780000
188	Toyota Fortuner	12.550000
189	Volkswagen Cross Polo	19.285000
190	Volkswagen New JETTA	NaN
191	Volkswagen New Phaeton	12.800000
192	Volkswagen Passat	17.286667
193	Volkswagen Polo	18.305000
194	Volkswagen Polo GT	18.450000
195	Volkswagen Taigun	NaN
196	Volkswagen Up	21.000000
197	Volkswagen Vento	18.957000
198	Volvo S60	19.133333
199	Volvo S80	16.660000
200	Volvo V40	27.800000
201	Volvo V40 Cross Country	16.400000
202	Volvo XC90	11.100000

	Type	GearType	Mileage
0	Convertible	Automatic	12.147143
1	Coupe	Automatic	9.746071
2	Hatchback	Automatic	19.446111
3	Hatchback	Manual	20.804878
4	Hatchback	No	21.748000
5	MPV	Automatic	20.434286
6	MPV	Manual	18.825862
7	MUV	Automatic	12.466667
8	MUV	Manual	15.634091
9	MUV	No	13.912000
10	SUV	Automatic	14.603176
11	SUV	Manual	15.772308
12	SUV	No	11.973333
13	Sedan	Automatic	16.670000
14	Sedan	Manual	20.143626
15	Sedan	No	19.500000

	Price	BootSpace	Engine	FuelCapacity	Mileage
count	833.000000	654.000000	826.000000	833.000000	662.000000
mean	3159.495798	409.536697	1900.533898	51.341693	17.480408
std	6272.519112	175.351525	1065.149758	17.643132	4.584628
min	199.000000	80.000000	624.000000	0.000000	5.000000
25%	619.000000	NaN	NaN	42.000000	NaN
50%	880.000000	NaN	NaN	50.000000	NaN
75%	2329.000000	NaN	NaN	64.000000	NaN
max	50000.000000	1000.000000	6592.000000	120.000000	30.000000

	index	Make	Model	Price	Type	ABS	BootSpace	GearType	AirBag	Engine	FuelCapacity	Mileage
0	7	Aston Martin Rapide	Aston Martin Rapide LUXE (Petrol)	35000	Sedan	Yes	300.0	Automatic	Yes	5935.0	90.5	7.0
1	8	Aston Martin Rapide S	Aston Martin Rapide S (Petrol)	44000	Sedan	Yes	NaN	Automatic	Yes	5935.0	90.0	11.9
2	9	Aston Martin V12 Vantage	Aston Martin V12 Vantage Coupe (Petrol)	35000	Coupe	Yes	300.0	Automatic	Yes	5935.0	80.0	9.0
3	10	Aston Martin V8 Vantage	Aston Martin V8 Vantage Coupe (Petrol)	13500	Coupe	Yes	300.0	Automatic	Yes	4735.0	80.0	5.0
4	11	Aston Martin V8 Vantage	Aston Martin V8 Vantage S Coupe (Petrol)	25500	Coupe	Yes	300.0	Automatic	Yes	4735.0	80.0	5.0

	0	1	2	3	4	5	6	7	8	9	...	290	291	292	293	294	295	296	297	298	299
Mileage	7.0	11.9	9.0	5.0	5.0	5.0	8.0	16.60	17.0	16.55	...	13.1	13.10	17.2	13.7	17.20	14.80	NaN	NaN	NaN	19.30
Mileage	20.7	20.7	20.7	20.7	20.7	20.7	20.7	17.32	13.7	13.70	...	20.5	20.45	20.5	19.0	19.01	19.01	19.01	13.05	19.87	19.87

	Mileage_Automatic	Mileage_Manual
Mileage_Automatic	22.375515	0.446292
Mileage_Manual	0.446292	12.776373

	Mileage_Automatic	Mileage_Manual
Mileage_Automatic	1.000000	0.026011
Mileage_Manual	0.026011	1.000000