Pandas Exercises

Time to test your new pandas skills! Use the two csv files in this folder to complete the tasks in bold below!

NOTE: ALL TASKS MUST BE DONE IN ONE LINE OF PANDAS CODE. GOT STUCK? NO PROBLEM! CHECK OUT THE SOLUTIONS LECTURE!

Import pandas and read in the banklist.csv file into a dataframe called banks.



In [1]:

    
import pandas as pd



In [2]:

    
df = pd.read_csv('./banklist.csv')

Show the head of the dataframe



In [3]:

    
# CODE HERE
df.head()









    Out[3]:







  
    
      
      Bank Name
      City
      ST
      CERT
      Acquiring Institution
      Closing Date
      Updated Date
    
  
  
    
      0
      Fayette County Bank
      Saint Elmo
      IL
      1802
      United Fidelity Bank, fsb
      26-May-17
      1-Jun-17
    
    
      1
      Guaranty Bank, (d/b/a BestBank in Georgia & Mi...
      Milwaukee
      WI
      30003
      First-Citizens Bank & Trust Company
      5-May-17
      1-Jun-17
    
    
      2
      First NBC Bank
      New Orleans
      LA
      58302
      Whitney Bank
      28-Apr-17
      23-May-17
    
    
      3
      Proficio Bank
      Cottonwood Heights
      UT
      35495
      Cache Valley Bank
      3-Mar-17
      18-May-17
    
    
      4
      Seaway Bank and Trust Company
      Chicago
      IL
      19328
      State Bank of Texas
      27-Jan-17
      18-May-17

What are the column names?



In [4]:

    
# CODE HERE
df.columns









    Out[4]:





Index(['Bank Name', 'City', 'ST', 'CERT', 'Acquiring Institution',
       'Closing Date', 'Updated Date'],
      dtype='object')

How many States (ST) are represented in this data set?



In [5]:

    
# CODE HERE
df['ST'].unique().shape[0]
# or
# len(df['ST'].unique())









    Out[5]:





44

Get a list or array of all the states in the data set.



In [6]:

    
# CODE HERE
df['ST'].unique()









    Out[6]:





array(['IL', 'WI', 'LA', 'UT', 'NJ', 'AR', 'GA', 'PA', 'TN', 'WA', 'CO',
       'PR', 'FL', 'MN', 'CA', 'MD', 'OK', 'OH', 'SC', 'VA', 'ID', 'TX',
       'CT', 'AZ', 'NV', 'NC', 'KY', 'MO', 'KS', 'AL', 'MI', 'IN', 'IA',
       'NE', 'MS', 'NM', 'OR', 'NY', 'MA', 'SD', 'WY', 'WV', 'NH', 'HI'], dtype=object)

What are the top 5 states with the most failed banks?



In [7]:

    
# CODE HERE
df['ST'].value_counts().head()









    Out[7]:





GA    93
FL    75
IL    67
CA    41
MN    23
Name: ST, dtype: int64

What are the top 5 acquiring institutions?



In [8]:

    
# CODE HERE
df['Acquiring Institution'].value_counts().head()









    Out[8]:





No Acquirer                            31
State Bank and Trust Company           12
First-Citizens Bank & Trust Company    11
Ameris Bank                            10
U.S. Bank N.A.                          9
Name: Acquiring Institution, dtype: int64

How many banks has the State Bank of Texas acquired? How many of them were actually in Texas?



In [9]:

    
# CODE HERE
df[df['Acquiring Institution'] == 'State Bank of Texas']









    Out[9]:







  
    
      
      Bank Name
      City
      ST
      CERT
      Acquiring Institution
      Closing Date
      Updated Date
    
  
  
    
      4
      Seaway Bank and Trust Company
      Chicago
      IL
      19328
      State Bank of Texas
      27-Jan-17
      18-May-17
    
    
      21
      The National Republic Bank of Chicago
      Chicago
      IL
      916
      State Bank of Texas
      24-Oct-14
      6-Jan-16
    
    
      450
      Millennium State Bank of Texas
      Dallas
      TX
      57667
      State Bank of Texas
      2-Jul-09
      26-Oct-12



In [10]:

    
len(df[df['Acquiring Institution'] == 'State Bank of Texas'])









    Out[10]:





3



In [11]:

    
len(df[(df['Acquiring Institution'] == 'State Bank of Texas') & (df['ST'] == 'TX')])









    Out[11]:





1

What is the most common city in California for a bank to fail in?



In [12]:

    
# CODE HERE
df[df['ST'] == 'CA']['City'].value_counts()[0]









    Out[12]:





4

How many failed banks don't have the word "Bank" in their name?



In [13]:

    
# CODE HERE
len(df[~df['Bank Name'].str.contains('Bank')]['Bank Name'].unique())









    Out[13]:





14

How many bank names start with the letter 's' ?



In [14]:

    
# CODE HERE
df['Bank Name'].apply(lambda name: name[0].lower() == 's').value_counts()









    Out[14]:





False    498
True      53
Name: Bank Name, dtype: int64

How many CERT values are above 20000 ?



In [15]:

    
# CODE HERE
sum(df['CERT'] > 20000)









    Out[15]:





417

How many bank names consist of just two words? (e.g. "First Bank" , "Bank Georgia" )



In [16]:

    
# CODE HERE
sum(df['Bank Name'].apply(lambda name: len(name.split()) == 2))









    Out[16]:





114

Bonus: How many banks closed in the year 2008? (this is hard because we technically haven't learned about time series with pandas yet! Feel free to skip this one!



In [17]:

    
# CODE HERE
sum(df['Closing Date'].apply(lambda date: date[-2:] == '08'))









    Out[17]:





25

	Bank Name	City	ST	CERT	Acquiring Institution	Closing Date	Updated Date
0	Fayette County Bank	Saint Elmo	IL	1802	United Fidelity Bank, fsb	26-May-17	1-Jun-17
1	Guaranty Bank, (d/b/a BestBank in Georgia & Mi...	Milwaukee	WI	30003	First-Citizens Bank & Trust Company	5-May-17	1-Jun-17
2	First NBC Bank	New Orleans	LA	58302	Whitney Bank	28-Apr-17	23-May-17
3	Proficio Bank	Cottonwood Heights	UT	35495	Cache Valley Bank	3-Mar-17	18-May-17
4	Seaway Bank and Trust Company	Chicago	IL	19328	State Bank of Texas	27-Jan-17	18-May-17

Pandas Exercises

GREAT JOB!