Shelter Animal Outcomes 1

Data visualization



In [1]:

    
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns



In [2]:

    
df = pd.read_csv('train.csv')



In [3]:

    
df.head()









    Out[3]:






  
    
      
   AnimalID  Name     DateTime             OutcomeType      OutcomeSubtype  AnimalType  SexuponOutcome  AgeuponOutcome  Breed                        Color
0  A671945   Hambone  2014-02-12 18:22:00  Return_to_owner  NaN             Dog         Neutered Male   1 year          Shetland Sheepdog Mix        Brown/White
1  A656520   Emily    2013-10-13 12:44:00  Euthanasia       Suffering       Cat         Spayed Female   1 year          Domestic Shorthair Mix       Cream Tabby
2  A686464   Pearce   2015-01-31 12:28:00  Adoption         Foster          Dog         Neutered Male   2 years         Pit Bull Mix                 Blue/White
3  A683430   NaN      2014-07-11 19:09:00  Transfer         Partner         Cat         Intact Male     3 weeks         Domestic Shorthair Mix       Blue Cream
4  A667013   NaN      2013-11-15 12:52:00  Transfer         Partner         Dog         Neutered Male   2 years         Lhasa Apso/Miniature Poodle  Tan



In [4]:

    
df['AnimalType'].unique()









    Out[4]:





array(['Dog', 'Cat'], dtype=object)



In [5]:

    
df[df['AnimalType'] == 'Cat'].shape[0]









    Out[5]:





11134



In [6]:

    
df[df['AnimalType'] == 'Dog'].shape[0]









    Out[6]:





15595



In [7]:

    
df['OutcomeType'].unique()









    Out[7]:





array(['Return_to_owner', 'Euthanasia', 'Adoption', 'Transfer', 'Died'], dtype=object)



In [8]:

    
f, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 4))
sns.countplot(x="OutcomeType", data=df,  ax=ax1)
sns.countplot(x="AnimalType", hue="OutcomeType", data=df,  ax=ax2)









    Out[8]:





<matplotlib.axes._subplots.AxesSubplot at 0x7f9ad5cbdc50>

Overall, few animals died of natural causes.

Cats don't seem to have nine lives, unfortunately: they are more likely than dogs to be transferred. Dogs are returned to their owners more often, perhaps helped by the sad puppy face and a reputation for loyalty.
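
Because the data holds more dogs than cats, raw counts can hide how the outcome mix differs between the two. A minimal sketch of turning the counts into per-species shares with `pd.crosstab`. The small `sample` frame is only a stand-in built from the five rows shown by `df.head()`, so the snippet runs on its own; on the real data, `df` would take its place.

```python
import pandas as pd

# Stand-in for df, copied from the five rows shown by df.head().
sample = pd.DataFrame({
    "AnimalType": ["Dog", "Cat", "Dog", "Cat", "Dog"],
    "OutcomeType": ["Return_to_owner", "Euthanasia", "Adoption", "Transfer", "Transfer"],
})

# normalize="index" makes each row sum to 1, so cats and dogs share one scale.
outcome_share = pd.crosstab(sample["AnimalType"], sample["OutcomeType"], normalize="index")
print(outcome_share.round(2))
```

Reading across a row then gives the share of each outcome within one species, which is the comparison the right-hand plot is hinting at.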



In [9]:

    
sns.countplot(x="SexuponOutcome", hue="OutcomeType", data=df)









    Out[9]:





<matplotlib.axes._subplots.AxesSubplot at 0x7f9ad5f73ad0>

Overall, sex itself likely does not play a big role in the outcome. The spayed/neutered population is larger, though, and those animals are more likely to get adopted.
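
Since `SexuponOutcome` mixes two pieces of information, sex and spayed/neutered status, it may help to separate them before plotting. A minimal sketch, assuming the labels follow the pattern seen in `df.head()` ("Neutered Male", "Spayed Female", "Intact Male"); any other labels in the real column would need checking first. The names `is_fixed` and `sex_only` are made up for this example.

```python
import pandas as pd

# Example labels taken from df.head(), plus one missing value.
sex = pd.Series(["Neutered Male", "Spayed Female", "Intact Male", None], name="SexuponOutcome")

# True for spayed or neutered animals; missing labels count as False here.
is_fixed = sex.str.contains("Neutered|Spayed", na=False)

# The last word of the label is the sex itself; missing labels stay missing.
sex_only = sex.str.split().str[-1]

print(pd.DataFrame({"is_fixed": is_fixed, "sex_only": sex_only}))
```

Plotting outcome against `is_fixed` and `sex_only` separately would show which of the two actually drives the difference.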



In [10]:

    
dfCat = df[df['AnimalType'] == 'Cat']
dfDog = df[df['AnimalType'] == 'Dog']



In [11]:

    
f, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 4))
sns.countplot(x="SexuponOutcome", hue="OutcomeType", data=dfCat, ax=ax1)
sns.countplot(x="SexuponOutcome", hue="OutcomeType", data=dfDog, ax=ax2)









    Out[11]:





<matplotlib.axes._subplots.AxesSubplot at 0x7f9ad5924290>

Cats and dogs have different outcome distributions.



In [12]:

    
dfCat['Color'].describe()









    Out[12]:





count           11134
unique            146
top       Brown Tabby
freq             1635
Name: Color, dtype: object



In [13]:

    
dfDog['Color'].describe()









    Out[13]:





count           15595
unique            262
top       Black/White
freq             1730
Name: Color, dtype: object

As expected, there are too many colors to visualize properly without discarding most of them. On reflection, it is probably the combination of color and breed that makes a pet appealing, rather than color alone.
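
One way to still get a readable plot is to keep the most frequent colors and lump the rest into a single bucket. A minimal sketch under that assumption; the color list and the `top_n` cutoff are made up for illustration, and the `"Other"` label is just a convention.

```python
import pandas as pd

# Made-up sample; on the real data this would be dfCat['Color'] or dfDog['Color'].
colors = pd.Series(
    ["Brown Tabby", "Brown Tabby", "Black", "Black", "Blue Cream", "Cream Tabby"],
    name="Color",
)

top_n = 2  # hypothetical cutoff; a larger value such as 10 may suit the real data
top_colors = colors.value_counts().nlargest(top_n).index

# Keep the frequent colors as they are and replace everything else with "Other".
color_grouped = colors.where(colors.isin(top_colors), "Other")
print(color_grouped.value_counts())
```

The grouped column has at most `top_n + 1` categories, which is small enough for `sns.countplot`.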



In [14]:

    
df['AgeuponOutcome'].unique()









    Out[14]:





array(['1 year', '2 years', '3 weeks', '1 month', '5 months', '4 years',
       '3 months', '2 weeks', '2 months', '10 months', '6 months',
       '5 years', '7 years', '3 years', '4 months', '12 years', '9 years',
       '6 years', '1 weeks', '11 years', '4 weeks', '7 months', '8 years',
       '11 months', '4 days', '9 months', '8 months', '15 years',
       '10 years', '1 week', '0 years', '14 years', '3 days', '6 days',
       '5 days', '5 weeks', '2 days', '16 years', '1 day', '13 years', nan,
       '17 years', '18 years', '19 years', '20 years'], dtype=object)

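As expected, the animals span a wide range of ages. Note that the values are strings in mixed units (days, weeks, months, years), with inconsistencies such as both '1 week' and '1 weeks'. Age should play a major role in deciding the outcome.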
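
To plot or model age, the strings first need a common unit. A minimal sketch that converts them to approximate days; the helper `age_to_days` and the 30-day month and 365-day year are assumptions made for this example, not something the dataset defines.

```python
import pandas as pd

# Approximate days per unit; months and years are rounded conventions.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}

def age_to_days(age):
    """Convert strings like '3 weeks' or '1 year' to an approximate number of days."""
    if pd.isna(age):
        return float("nan")
    count, unit = age.split()
    # rstrip("s") maps both '1 week' and '1 weeks' to the same unit.
    return int(count) * DAYS_PER_UNIT[unit.rstrip("s")]

# A few values taken from the unique() output above, plus a missing one.
ages = pd.Series(["1 year", "3 weeks", "1 weeks", "5 months", "0 years", None])
age_days = ages.apply(age_to_days)
print(age_days)
```

On the real data this would be `df['AgeuponOutcome'].apply(age_to_days)`. The value '0 years' maps to 0 here; whether it should instead be treated as missing is a judgment call.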



In [15]:

    
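# Caution: isnull() is True when the name is MISSING, so despite the column name,
# NameIsPresent == True marks unnamed animals and False marks named ones.
df['NameIsPresent'] = df['Name'].isnull()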



In [16]:

    
sns.countplot(x="NameIsPresent", hue="OutcomeType", data=df)









    Out[16]:





<matplotlib.axes._subplots.AxesSubplot at 0x7f9ad5857a90>

As the graph above shows, animals without names, or whose names were lost (the True group, since the flag is built with isnull()), have a very different outcome distribution from named animals. Named animals seem to be more popular for adoption. A name may also mean the animal had a previous owner and possibly a history.



In [17]:

    
df[df['NameIsPresent'] == True].shape[0]









    Out[17]:





7691



In [18]:

    
df[df['NameIsPresent'] == False].shape[0]









    Out[18]:





19038

More than 2/3 of the animals in the training set had names (19038 of 26729, about 71%), and roughly half of those were adopted.
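
The "roughly half" figure is read off the plot; it can be checked directly by computing the adoption rate per group. A minimal sketch, again using a stand-in frame built from the five rows shown by `df.head()`, so the numbers it prints are not the real rates. The labels "named" and "unnamed" are made up for readability.

```python
import pandas as pd

# Stand-in for df, copied from the five rows shown by df.head().
sample = pd.DataFrame({
    "Name": ["Hambone", "Emily", "Pearce", None, None],
    "OutcomeType": ["Return_to_owner", "Euthanasia", "Adoption", "Transfer", "Transfer"],
})

# notnull() is True when a name exists, which avoids the inverted flag above.
name_status = sample["Name"].notnull().map({True: "named", False: "unnamed"})
named_share = (name_status == "named").mean()

# The mean of a boolean column is the fraction of True values, i.e. the adoption rate.
adopted = sample["OutcomeType"] == "Adoption"
adoption_rate = adopted.groupby(name_status).mean()

print(named_share)
print(adoption_rate)
```

Run against `df`, `adoption_rate["named"]` would give the exact share of named animals that were adopted.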



In [19]:

    
df['OutcomeSubtype'].unique()









    Out[19]:





array([nan, 'Suffering', 'Foster', 'Partner', 'Offsite', 'SCRP',
       'Aggressive', 'Behavior', 'Rabies Risk', 'Medical', 'In Kennel',
       'In Foster', 'Barn', 'Court/Investigation', 'Enroute', 'At Vet',
       'In Surgery'], dtype=object)



In [20]:

    
sns.set_context("poster")
sns.countplot(x="OutcomeSubtype", hue="AnimalType", data=df)









    Out[20]:





<matplotlib.axes._subplots.AxesSubplot at 0x7f9ad568c050>



In [25]:

    
df['DateTime']









    Out[25]:





0        2014-02-12 18:22:00
1        2013-10-13 12:44:00
2        2015-01-31 12:28:00
3        2014-07-11 19:09:00
4        2013-11-15 12:52:00
5        2014-04-25 13:04:00
6        2015-03-28 13:11:00
7        2015-04-30 17:02:00
8        2014-02-04 17:17:00
9        2014-05-03 07:48:00
10       2013-12-05 15:50:00
11       2013-11-04 14:48:00
12       2016-02-03 11:27:00
13       2015-06-08 16:30:00
14       2015-11-25 15:00:00
15       2014-07-12 12:10:00
16       2014-05-03 16:15:00
17       2014-06-07 12:54:00
18       2014-05-17 11:32:00
19       2014-07-30 17:34:00
20       2014-01-19 15:03:00
21       2015-09-18 15:19:00
22       2015-08-15 14:22:00
23       2013-10-28 16:32:00
24       2014-04-09 17:44:00
25       2015-10-03 15:44:00
26       2016-01-15 17:31:00
27       2015-03-25 18:50:00
28       2015-11-21 13:01:00
29       2015-07-30 14:30:00
                ...         
26699    2014-04-21 14:01:00
26700    2015-06-15 19:28:00
26701    2014-06-15 17:41:00
26702    2015-10-11 09:42:00
26703    2015-12-04 12:22:00
26704    2015-11-17 17:17:00
26705    2013-10-19 15:34:00
26706    2014-10-19 13:29:00
26707    2014-07-01 17:06:00
26708    2013-11-13 17:32:00
26709    2015-10-24 00:00:00
26710    2014-11-24 17:21:00
26711    2013-10-30 18:32:00
26712    2015-04-20 16:04:00
26713    2014-01-20 17:37:00
26714    2014-05-31 16:11:00
26715    2015-08-05 17:03:00
26716    2015-05-02 21:04:00
26717    2014-06-30 17:34:00
26718    2015-04-28 14:26:00
26719    2015-07-20 09:00:00
26720    2015-07-18 14:08:00
26721    2014-07-17 09:43:00
26722    2014-08-31 09:00:00
26723    2016-01-29 18:52:00
26724    2015-05-14 11:56:00
26725    2016-01-20 18:59:00
26726    2015-03-09 13:33:00
26727    2014-04-27 12:22:00
26728    2015-07-02 09:00:00
Name: DateTime, dtype: object
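
The `dtype: object` above means the timestamps are still plain strings. A minimal sketch of parsing them with `pd.to_datetime` and pulling out a few calendar features; the choice of features (`year`, `month`, `hour`, `weekday`) is only a suggestion, and the three sample values are copied from the output above.

```python
import pandas as pd

# First three timestamps from the output above.
stamps = pd.Series(
    ["2014-02-12 18:22:00", "2013-10-13 12:44:00", "2015-01-31 12:28:00"],
    name="DateTime",
)

# An explicit format avoids per-row guessing and fails loudly on malformed rows.
parsed = pd.to_datetime(stamps, format="%Y-%m-%d %H:%M:%S")

features = pd.DataFrame({
    "year": parsed.dt.year,
    "month": parsed.dt.month,
    "hour": parsed.dt.hour,
    "weekday": parsed.dt.day_name(),
})
print(features)
```

On the real data the same call would be `pd.to_datetime(df['DateTime'], format="%Y-%m-%d %H:%M:%S")`, after which outcome counts can be plotted by hour or weekday with `sns.countplot`.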
