Import pandas and numpy



In [1]:

    
import pandas as pd
import numpy as np
from ggplot import *
import matplotlib
matplotlib.style.use('ggplot')

Test some python easter egg



In [2]:

    
import __hello__









    



Hello world...



In [3]:

    
import this









    



The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Data Preparation with Pandas

Data taken from Github: https://github.com/veekun/pokedex



In [6]:

    
path = "../pokedex/pokedex/data/csv/"

Load data



In [7]:

    
pokemons = pd.read_csv(path + "pokemon.csv",sep=",")
pokemon_species_names = pd.read_csv(path + "pokemon_species_names.csv",sep=",")
pokemon_species = pd.read_csv(path + "pokemon_species.csv",sep=",")
generation_names = pd.read_csv(path + "generation_names.csv", sep=",")
habitat = pd.read_csv(path + "pokemon_habitats.csv")

Series and DataFrames



In [8]:

    
type(pokemons)









    Out[8]:





pandas.core.frame.DataFrame



In [9]:

    
pokemons.head(2)









    Out[9]:






  
    
      
      id
      identifier
      species_id
      height
      weight
      base_experience
      order
      is_default
    
  
  
    
      0
      1
      bulbasaur
      1
      7
      69
      64
      1
      1
    
    
      1
      2
      ivysaur
      2
      10
      130
      142
      2
      1



In [11]:

    
pokemons.columns









    Out[11]:





Index([u'id', u'identifier', u'species_id', u'height', u'weight',
       u'base_experience', u'order', u'is_default'],
      dtype='object')



In [12]:

    
pokemons.dtypes









    Out[12]:





id                  int64
identifier         object
species_id          int64
height              int64
weight              int64
base_experience     int64
order               int64
is_default          int64
dtype: object



In [13]:

    
type(pokemons['identifier']), type(pokemons.identifier)









    Out[13]:





(pandas.core.series.Series, pandas.core.series.Series)



In [14]:

    
pokemons.identifier.head(5)









    Out[14]:





0     bulbasaur
1       ivysaur
2      venusaur
3    charmander
4    charmeleon
Name: identifier, dtype: object

Selecting Internal Elements



In [15]:

    
pokemons.ix[0]









    Out[15]:





id                         1
identifier         bulbasaur
species_id                 1
height                     7
weight                    69
base_experience           64
order                      1
is_default                 1
Name: 0, dtype: object



In [16]:

    
pokemons['identifier'].head(5)









    Out[16]:





0     bulbasaur
1       ivysaur
2      venusaur
3    charmander
4    charmeleon
Name: identifier, dtype: object



In [17]:

    
pokemons.ix[0,'weight']









    Out[17]:





69

Assigning Values



In [22]:

    
pokemons['new_column'] = 'MGA POGING POKEMON'



In [23]:

    
pokemons.head(2)









    Out[23]:






  
    
      
      id
      identifier
      species_id
      height
      weight
      base_experience
      order
      is_default
      new_column
    
  
  
    
      0
      1
      bulbasaur
      1
      7
      69
      64
      1
      1
      MGA POGING POKEMON
    
    
      1
      2
      ivysaur
      2
      10
      130
      142
      2
      1
      MGA POGING POKEMON



In [24]:

    
pokemons.ix[1,'new_column'] = 'AY HINDI PALA POGI, HALAMAN PALA SILA.'



In [25]:

    
pokemons.head(2)









    Out[25]:






  
    
      
      id
      identifier
      species_id
      height
      weight
      base_experience
      order
      is_default
      new_column
    
  
  
    
      0
      1
      bulbasaur
      1
      7
      69
      64
      1
      1
      MGA POGING POKEMON
    
    
      1
      2
      ivysaur
      2
      10
      130
      142
      2
      1
      AY HINDI PALA POGI, HALAMAN PALA SILA.

Evaluating values

Discover values of new_column



In [26]:

    
pd.unique(pokemons['new_column'])









    Out[26]:





array(['MGA POGING POKEMON', 'AY HINDI PALA POGI, HALAMAN PALA SILA.'], dtype=object)

Basic Stats



In [27]:

    
pokemons.describe()









    Out[27]:






  
    
      
      id
      species_id
      height
      weight
      base_experience
      order
      is_default
    
  
  
    
      count
      811.000000
      811.000000
      811.000000
      811.000000
      811.000000
      811.000000
      811.000000
    
    
      mean
      1435.728730
      360.880395
      12.516646
      636.096178
      151.742293
      406.000000
      0.889026
    
    
      std
      3050.119235
      209.331096
      12.266613
      1077.658710
      79.150608
      234.259827
      0.314294
    
    
      min
      1.000000
      1.000000
      1.000000
      1.000000
      36.000000
      1.000000
      0.000000
    
    
      25%
      203.500000
      181.500000
      6.000000
      94.500000
      67.000000
      203.500000
      1.000000
    
    
      50%
      406.000000
      362.000000
      10.000000
      295.000000
      158.000000
      406.000000
      1.000000
    
    
      75%
      608.500000
      538.500000
      15.000000
      650.000000
      189.000000
      608.500000
      1.000000
    
    
      max
      10090.000000
      721.000000
      145.000000
      9997.000000
      608.000000
      811.000000
      1.000000

Get a more relevant stats for a more relevant columns



In [28]:

    
pokemons[['height','weight','base_experience']].describe()









    Out[28]:






  
    
      
      height
      weight
      base_experience
    
  
  
    
      count
      811.000000
      811.000000
      811.000000
    
    
      mean
      12.516646
      636.096178
      151.742293
    
    
      std
      12.266613
      1077.658710
      79.150608
    
    
      min
      1.000000
      1.000000
      36.000000
    
    
      25%
      6.000000
      94.500000
      67.000000
    
    
      50%
      10.000000
      295.000000
      158.000000
    
    
      75%
      15.000000
      650.000000
      189.000000
    
    
      max
      145.000000
      9997.000000
      608.000000



In [30]:

    
%matplotlib inline
pokemons[['height']].hist(bins=50,figsize=(15,5)), pokemons[['weight']].hist(bins=50,figsize=(15,5))









    Out[30]:





(array([[<matplotlib.axes._subplots.AxesSubplot object at 0x11a0f7cd0>]], dtype=object),
 array([[<matplotlib.axes._subplots.AxesSubplot object at 0x11a36ef10>]], dtype=object))



In [63]:

    
pokemons[['height']].plot(kind='density',figsize=(15,5))









    Out[63]:





<matplotlib.axes._subplots.AxesSubplot at 0x19e4ba9d0>

Filtering Values



In [31]:

    
filtered_pokemons = pokemons[pokemons['height']>=15]
filtered_pokemons.head(2)









    Out[31]:






  
    
      
      id
      identifier
      species_id
      height
      weight
      base_experience
      order
      is_default
      new_column
    
  
  
    
      2
      3
      venusaur
      3
      20
      1000
      236
      3
      1
      MGA POGING POKEMON
    
    
      5
      6
      charizard
      6
      17
      905
      240
      7
      1
      MGA POGING POKEMON



In [32]:

    
filtered_pokemons[['height']].hist(bins=20,figsize=(15,5))









    Out[32]:





array([[<matplotlib.axes._subplots.AxesSubplot object at 0x11c747f50>]], dtype=object)

Merge Data



In [33]:

    
pokemon_species.columns, pokemons.columns









    Out[33]:





(Index([u'id', u'identifier', u'generation_id', u'evolves_from_species_id',
        u'evolution_chain_id', u'color_id', u'shape_id', u'habitat_id',
        u'gender_rate', u'capture_rate', u'base_happiness', u'is_baby',
        u'hatch_counter', u'has_gender_differences', u'growth_rate_id',
        u'forms_switchable', u'order', u'conquest_order'],
       dtype='object'),
 Index([u'id', u'identifier', u'species_id', u'height', u'weight',
        u'base_experience', u'order', u'is_default', u'new_column'],
       dtype='object'))



In [34]:

    
pokemon_species.columns = [u'species_id', u'identifier', u'generation_id', u'evolves_from_species_id',\
        u'evolution_chain_id', u'color_id', u'shape_id', u'habitat_id',\
        u'gender_rate', u'capture_rate', u'base_happiness', u'is_baby',\
        u'hatch_counter', u'has_gender_differences', u'growth_rate_id',\
        u'forms_switchable', u'order', u'conquest_order']



In [35]:

    
pokemon_species_merged = pd.DataFrame()
pokemon_species_merged = pokemons.merge(pokemon_species,on=['species_id'],how='left')



In [36]:

    
pokemon_species_merged.head(2)









    Out[36]:






  
    
      
      id
      identifier_x
      species_id
      height
      weight
      base_experience
      order_x
      is_default
      new_column
      identifier_y
      ...
      gender_rate
      capture_rate
      base_happiness
      is_baby
      hatch_counter
      has_gender_differences
      growth_rate_id
      forms_switchable
      order_y
      conquest_order
    
  
  
    
      0
      1
      bulbasaur
      1
      7
      69
      64
      1
      1
      MGA POGING POKEMON
      bulbasaur
      ...
      1
      45
      70
      0
      20
      0
      4
      0
      1
      NaN
    
    
      1
      2
      ivysaur
      2
      10
      130
      142
      2
      1
      AY HINDI PALA POGI, HALAMAN PALA SILA.
      ivysaur
      ...
      1
      45
      70
      0
      20
      0
      4
      0
      2
      NaN
    
  

2 rows × 26 columns



In [37]:

    
pokemon_species_names_english = pokemon_species_names[pokemon_species_names.local_language_id==9]
pokemon_species_names_english.columns = [u'species_id', u'local_language_id', u'name', u'genus']
pokemon_species_merged_english = pokemon_species_merged.merge(pokemon_species_names_english,\
                                                             on=['species_id'],how='left')



In [38]:

    
pokemon_species_merged_english.columns









    Out[38]:





Index([                     u'id',            u'identifier_x',
                    u'species_id',                  u'height',
                        u'weight',         u'base_experience',
                       u'order_x',              u'is_default',
                    u'new_column',            u'identifier_y',
                 u'generation_id', u'evolves_from_species_id',
            u'evolution_chain_id',                u'color_id',
                      u'shape_id',              u'habitat_id',
                   u'gender_rate',            u'capture_rate',
                u'base_happiness',                 u'is_baby',
                 u'hatch_counter',  u'has_gender_differences',
                u'growth_rate_id',        u'forms_switchable',
                       u'order_y',          u'conquest_order',
             u'local_language_id',                    u'name',
                         u'genus'],
      dtype='object')

Aggregations



In [39]:

    
pokemon_species_merged_agg = pokemon_species_merged_english.groupby("genus").agg({'id': np.count_nonzero, \
                                             'height': np.mean,\
                                             'weight': np.median})
pokemon_species_merged_agg.tail(5)



In [40]:

    
pokemon_species_merged_english[pokemon_species_merged_english.genus=='Worm']









    Out[40]:






  
    
      
      id
      identifier_x
      species_id
      height
      weight
      base_experience
      order_x
      is_default
      new_column
      identifier_y
      ...
      is_baby
      hatch_counter
      has_gender_differences
      growth_rate_id
      forms_switchable
      order_y
      conquest_order
      local_language_id
      name
      genus
    
  
  
    
      9
      10
      caterpie
      10
      3
      29
      39
      14
      1
      MGA POGING POKEMON
      caterpie
      ...
      0
      15
      0
      2
      0
      10
      NaN
      9
      Caterpie
      Worm
    
    
      264
      265
      wurmple
      265
      3
      36
      39
      320
      1
      MGA POGING POKEMON
      wurmple
      ...
      0
      15
      0
      2
      0
      290
      NaN
      9
      Wurmple
      Worm
    
  

2 rows × 29 columns



In [41]:

    
pokemon_species_merged_english['id'][pokemon_species_merged_english.genus=='Worm'].count()









    Out[41]:





2

Transform DataFrames (rows as columns)



In [42]:

    
pokemon_species_merged_agg.T









    Out[42]:






  
    
      genus
      Abundance
      Acorn
      Alpha
      Angler
      Ant Pit
      Anteater
      Antenna
      Aqua Mouse
      Aqua Rabbit
      Arm Thrust
      ...
      Wind Chime
      Windveiled
      Wing Fish
      Wish
      Wood Gecko
      Wool
      Worm
      Wrestling
      Young Fowl
      Zen Charm
    
  
  
    
      id
      2.0
      1.0
      1.0
      1.0
      1.0
      1.0
      1.0
      1.0
      1.0
      1.0
      ...
      1.0
      1.0
      1.0
      1.0
      2.0
      2.0
      2.0
      1.0
      1.0
      1.0
    
    
      weight
      680.0
      40.0
      3200.0
      120.0
      150.0
      580.0
      22.0
      85.0
      285.0
      2538.0
      ...
      10.0
      66.0
      70.0
      11.0
      133.0
      105.5
      32.5
      215.0
      195.0
      375.0
    
    
      height
      14.0
      5.0
      32.0
      5.0
      7.0
      14.0
      2.0
      4.0
      8.0
      23.0
      ...
      6.0
      7.0
      4.0
      3.0
      7.0
      7.0
      3.0
      8.0
      9.0
      6.0
    
  

3 rows × 524 columns

Working with Scikit Learn



In [44]:

    
test_data = pd.read_csv("../maprdata/pyspark-churn-prediction/data/churn-bigml-20.csv",sep=",")
train_data = pd.read_csv("../maprdata/pyspark-churn-prediction/data/churn-bigml-80.csv",sep=",")



In [54]:

    
test_data.columns









    Out[54]:





Index([u'State', u'Account length', u'Area code', u'International plan',
       u'Voice mail plan', u'Number vmail messages', u'Total day minutes',
       u'Total day calls', u'Total day charge', u'Total eve minutes',
       u'Total eve calls', u'Total eve charge', u'Total night minutes',
       u'Total night calls', u'Total night charge', u'Total intl minutes',
       u'Total intl calls', u'Total intl charge', u'Customer service calls',
       u'Churn'],
      dtype='object')



In [64]:

    
# test_data_sample = test_data[[u'Total day calls', u'Total day charge', u'Total eve minutes',\
#                              u'State', u'International plan',\
#                              u'Churn']]
test_data_sample = test_data



In [65]:

    
#test_data_sample.describe().T
test_data.describe().T









    Out[65]:






  
    
      
      count
      mean
      std
      min
      25%
      50%
      75%
      max
    
  
  
    
      Account length
      667.0
      102.841079
      40.819480
      1.00
      76.00
      102.00
      128.000
      232.00
    
    
      Area code
      667.0
      436.157421
      41.783305
      408.00
      408.00
      415.00
      415.000
      510.00
    
    
      Number vmail messages
      667.0
      8.407796
      13.994480
      0.00
      0.00
      0.00
      20.000
      51.00
    
    
      Total day minutes
      667.0
      180.948126
      55.508628
      25.90
      146.25
      178.30
      220.700
      334.30
    
    
      Total day calls
      667.0
      100.937031
      20.396790
      30.00
      87.50
      101.00
      115.000
      165.00
    
    
      Total day charge
      667.0
      30.761769
      9.436463
      4.40
      24.86
      30.31
      37.520
      56.83
    
    
      Total eve minutes
      667.0
      203.355322
      49.719268
      48.10
      171.05
      203.70
      236.450
      361.80
    
    
      Total eve calls
      667.0
      100.476762
      18.948262
      37.00
      88.00
      101.00
      113.000
      168.00
    
    
      Total eve charge
      667.0
      17.285262
      4.226160
      4.09
      14.54
      17.31
      20.095
      30.75
    
    
      Total night minutes
      667.0
      199.685307
      49.759931
      23.20
      167.95
      201.60
      231.500
      367.70
    
    
      Total night calls
      667.0
      100.113943
      20.172505
      42.00
      86.00
      100.00
      113.500
      175.00
    
    
      Total night charge
      667.0
      8.985907
      2.239429
      1.04
      7.56
      9.07
      10.420
      16.55
    
    
      Total intl minutes
      667.0
      10.238381
      2.807850
      0.00
      8.60
      10.50
      12.050
      18.30
    
    
      Total intl calls
      667.0
      4.527736
      2.482442
      0.00
      3.00
      4.00
      6.000
      18.00
    
    
      Total intl charge
      667.0
      2.764948
      0.758167
      0.00
      2.32
      2.84
      3.255
      4.94
    
    
      Customer service calls
      667.0
      1.563718
      1.333357
      0.00
      1.00
      1.00
      2.000
      8.00



In [85]:

    
axs = pd.tools.plotting.scatter_matrix(test_data,figsize=(25,20))

# From http://stackoverflow.com/questions/23009509/how-to-modify-pandas-plotting-integration
def wrap(txt, width=8):
    '''helper function to wrap text for long labels'''
    import textwrap
    return '\n'.join(textwrap.wrap(txt, width))

for ax in axs[:,0]: # the left boundary
    ax.grid('off', axis='both')
    ax.set_ylabel(wrap(ax.get_ylabel()), rotation=0, va='center', labelpad=20)
    ax.set_yticks([])

for ax in axs[-1,:]: # the lower boundary
    ax.grid('off', axis='both')
    ax.set_xlabel(wrap(ax.get_xlabel()), rotation=90)
    ax.set_xticks([])



In [97]:

    
test_data.head(2)









    Out[97]:






  
    
      
      State
      Account length
      Area code
      International plan
      Voice mail plan
      Number vmail messages
      Total day minutes
      Total day calls
      Total day charge
      Total eve minutes
      Total eve calls
      Total eve charge
      Total night minutes
      Total night calls
      Total night charge
      Total intl minutes
      Total intl calls
      Total intl charge
      Customer service calls
      Churn
    
  
  
    
      0
      LA
      117
      408
      No
      No
      0
      184.5
      97
      31.37
      351.6
      80
      29.89
      215.8
      90
      9.71
      8.7
      4
      2.35
      1
      False
    
    
      1
      IN
      65
      415
      No
      No
      0
      129.1
      137
      21.95
      228.5
      83
      19.42
      208.8
      111
      9.40
      12.7
      6
      3.43
      4
      True



In [107]:

    
test_data_features = test_data.drop(axis=1,labels=['State', 'Account length', 'Area code', 'International plan',\
       'Total day minutes', 'Total eve minutes', 'Total night minutes', 'Total intl minutes',\
                             'Voice mail plan'])



In [108]:

    
test_data_features.columns









    Out[108]:





Index([u'Number vmail messages', u'Total day calls', u'Total day charge',
       u'Total eve calls', u'Total eve charge', u'Total night calls',
       u'Total night charge', u'Total intl calls', u'Total intl charge',
       u'Customer service calls', u'Churn'],
      dtype='object')

Train a Random Forest Regressor



In [110]:

    
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

rf = RandomForestRegressor()



In [111]:

    
test_predictors = test_data_features[[u'Number vmail messages', u'Total day calls', u'Total day charge',\
                                       u'Total eve calls', u'Total eve charge', u'Total night calls',\
                                      u'Total night charge', u'Total intl calls', u'Total intl charge',\
                                      u'Customer service calls']]
test_label = test_data_features[[u'Churn']]



In [117]:

    
rf.fit(test_predictors.values, np.ravel(test_label.values))









    Out[117]:





RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=None,
           max_features='auto', max_leaf_nodes=None, min_samples_leaf=1,
           min_samples_split=2, min_weight_fraction_leaf=0.0,
           n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
           verbose=0, warm_start=False)



In [126]:

    
rf.score(test_predictors.values,np.ravel(test_label.values))









    Out[126]:





0.88940614648509375



In [139]:

    
test_data_sample[['Churn','Total day calls']].groupby("Churn").count()









    Out[139]:






  
    
      
      Total day calls
    
    
      Churn
      
    
  
  
    
      False
      572
    
    
      True
      95

	id	identifier	species_id	height	weight	base_experience	order	is_default	new_column
0	1	bulbasaur	1	7	69	64	1	1	MGA POGING POKEMON
1	2	ivysaur	2	10	130	142	2	1	MGA POGING POKEMON

	id	species_id	height	weight	base_experience	order	is_default
count	811.000000	811.000000	811.000000	811.000000	811.000000	811.000000	811.000000
mean	1435.728730	360.880395	12.516646	636.096178	151.742293	406.000000	0.889026
std	3050.119235	209.331096	12.266613	1077.658710	79.150608	234.259827	0.314294
min	1.000000	1.000000	1.000000	1.000000	36.000000	1.000000	0.000000
25%	203.500000	181.500000	6.000000	94.500000	67.000000	203.500000	1.000000
50%	406.000000	362.000000	10.000000	295.000000	158.000000	406.000000	1.000000
75%	608.500000	538.500000	15.000000	650.000000	189.000000	608.500000	1.000000
max	10090.000000	721.000000	145.000000	9997.000000	608.000000	811.000000	1.000000

	id	identifier	species_id	height	weight	base_experience	order	is_default	new_column
2	3	venusaur	3	20	1000	236	3	1	MGA POGING POKEMON
5	6	charizard	6	17	905	240	7	1	MGA POGING POKEMON

	id	weight	height
genus
Wool	2	105.5	7.0
Worm	2	32.5	3.0
Wrestling	1	215.0	8.0
Young Fowl	1	195.0	9.0
Zen Charm	1	375.0	6.0

	id	identifier_x	species_id	height	weight	base_experience	order_x	is_default	new_column	identifier_y	...	is_baby	hatch_counter	has_gender_differences	growth_rate_id	forms_switchable	order_y	conquest_order	local_language_id	name	genus
9	10	caterpie	10	3	29	39	14	1	MGA POGING POKEMON	caterpie	...	0	15	0	2	0	10	NaN	9	Caterpie	Worm
264	265	wurmple	265	3	36	39	320	1	MGA POGING POKEMON	wurmple	...	0	15	0	2	0	290	NaN	9	Wurmple	Worm

genus	Abundance	Acorn	Alpha	Angler	Ant Pit	Anteater	Antenna	Aqua Mouse	Aqua Rabbit	Arm Thrust	...	Wind Chime	Windveiled	Wing Fish	Wish	Wood Gecko	Wool	Worm	Wrestling	Young Fowl	Zen Charm
id	2.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0	...	1.0	1.0	1.0	1.0	2.0	2.0	2.0	1.0	1.0	1.0
weight	680.0	40.0	3200.0	120.0	150.0	580.0	22.0	85.0	285.0	2538.0	...	10.0	66.0	70.0	11.0	133.0	105.5	32.5	215.0	195.0	375.0
height	14.0	5.0	32.0	5.0	7.0	14.0	2.0	4.0	8.0	23.0	...	6.0	7.0	4.0	3.0	7.0	7.0	3.0	8.0	9.0	6.0

	count	mean	std	min	25%	50%	75%	max
Account length	667.0	102.841079	40.819480	1.00	76.00	102.00	128.000	232.00
Area code	667.0	436.157421	41.783305	408.00	408.00	415.00	415.000	510.00
Number vmail messages	667.0	8.407796	13.994480	0.00	0.00	0.00	20.000	51.00
Total day minutes	667.0	180.948126	55.508628	25.90	146.25	178.30	220.700	334.30
Total day calls	667.0	100.937031	20.396790	30.00	87.50	101.00	115.000	165.00
Total day charge	667.0	30.761769	9.436463	4.40	24.86	30.31	37.520	56.83
Total eve minutes	667.0	203.355322	49.719268	48.10	171.05	203.70	236.450	361.80
Total eve calls	667.0	100.476762	18.948262	37.00	88.00	101.00	113.000	168.00
Total eve charge	667.0	17.285262	4.226160	4.09	14.54	17.31	20.095	30.75
Total night minutes	667.0	199.685307	49.759931	23.20	167.95	201.60	231.500	367.70
Total night calls	667.0	100.113943	20.172505	42.00	86.00	100.00	113.500	175.00
Total night charge	667.0	8.985907	2.239429	1.04	7.56	9.07	10.420	16.55
Total intl minutes	667.0	10.238381	2.807850	0.00	8.60	10.50	12.050	18.30
Total intl calls	667.0	4.527736	2.482442	0.00	3.00	4.00	6.000	18.00
Total intl charge	667.0	2.764948	0.758167	0.00	2.32	2.84	3.255	4.94
Customer service calls	667.0	1.563718	1.333357	0.00	1.00	1.00	2.000	8.00

	State	Account length	Area code	International plan	Voice mail plan	Number vmail messages	Total day minutes	Total day calls	Total day charge	Total eve minutes	Total eve calls	Total eve charge	Total night minutes	Total night calls	Total night charge	Total intl minutes	Total intl calls	Total intl charge	Customer service calls	Churn
0	LA	117	408	No	No	0	184.5	97	31.37	351.6	80	29.89	215.8	90	9.71	8.7	4	2.35	1	False
1	IN	65	415	No	No	0	129.1	137	21.95	228.5	83	19.42	208.8	111	9.40	12.7	6	3.43	4	True