Pandas Apply Exercises

See also: DataFrame.apply()

See also: Series.map()



In [1]:
# 1. Import pandas as pd and read simple.csv with pd.read_csv(). Take the top 3 entries via .head() and assign them to df.

import pandas as pd
df = pd.read_csv('data/simple.csv').head(3)

In [2]:
# 2. Define a function that takes an input, prints its .name attribute, prints type(input), and prints the input itself.

def print_info(x):
    print(x.name)
    print(type(x))
    print(x)

print_info


Out[2]:
<function __main__.print_info>

In [5]:
# 3. Use the DataFrame.apply() method, passing the function itself (no parentheses) and axis=1.

output = df.apply(print_info, axis=1)


0
<class 'pandas.core.series.Series'>
Date                                12/1/2016
Count                                      11
Weird Date     \tThursday,  December 01, 2016
Weird Count                                11
Name: 0, dtype: object
1
<class 'pandas.core.series.Series'>
Date                           12/2/2016
Count                                NaN
Weird Date     Friday, December 02, 2016
Weird Count                          NaN
Name: 1, dtype: object
2
<class 'pandas.core.series.Series'>
Date                             12/3/2016
Count                                   49
Weird Date     Saturday, December 03, 2016
Weird Count                             49
Name: 2, dtype: object

In [6]:
# 4. Now do it with axis=0. Which axis applies the function by row? Which by column? What is the significance of .name?

df.apply(print_info, axis=0)


Date
<class 'pandas.core.series.Series'>
0    12/1/2016
1    12/2/2016
2    12/3/2016
Name: Date, dtype: object
Count
<class 'pandas.core.series.Series'>
0     11
1    NaN
2     49
Name: Count, dtype: object
Weird Date
<class 'pandas.core.series.Series'>
0    \tThursday,  December 01, 2016
1         Friday, December 02, 2016
2       Saturday, December 03, 2016
Name: Weird Date, dtype: object
Weird Count
<class 'pandas.core.series.Series'>
0     11
1    NaN
2     49
Name: Weird Count, dtype: object
Out[6]:
Date           None
Count          None
Weird Date     None
Weird Count    None
dtype: object
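
To answer the question in exercise 4: the outputs above show that axis=1 passes one row at a time (so .name is the row's index label), while axis=0 passes one column at a time (so .name is the column label). A minimal sketch of the same behavior, assuming a small hypothetical frame named toy in place of simple.csv:

```python
import pandas as pd

# Hypothetical stand-in for the first rows of simple.csv.
toy = pd.DataFrame({"Date": ["12/1/2016", "12/2/2016"], "Count": [11, 49]})

# axis=1: the function is called once per row; .name is the row's index label.
row_names = toy.apply(lambda row: row.name, axis=1)

# axis=0: the function is called once per column; .name is the column label.
col_names = toy.apply(lambda col: col.name, axis=0)

print(row_names.tolist())  # [0, 1]
print(col_names.tolist())  # ['Date', 'Count']
```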

In [7]:
# 5. Write a function for a row Series that gets the date and count and prints "The count was X on date Y."

def get_count(row):
    date = row['Date']
    count = row['Count']
    print("The count was {} on date {}.".format(count, date))

get_count


Out[7]:
<function __main__.get_count>
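
The cell above only defines get_count; it is never called. A sketch of applying it row by row, assuming a hypothetical two-row frame named toy rather than the real df. Because the function prints instead of returning, every entry in the result is None, just as in Out[6]:

```python
import pandas as pd

# Hypothetical stand-in for df; the real data comes from simple.csv.
toy = pd.DataFrame({"Date": ["12/1/2016", "12/3/2016"], "Count": [11, 49]})

def get_count(row):
    # Prints a message for the row and implicitly returns None.
    print("The count was {} on date {}.".format(row["Count"], row["Date"]))

# One call per row; the result holds each call's return value (None here).
result = toy.apply(get_count, axis=1)
print(result.isna().all())
```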

In [8]:
# 6. Write a function that returns a string. Apply this function to your dataframe. What is the result?

def get_count_string(row):
    date = row['Date']
    count = row['Count']
    return "The count was {} on date {}.".format(count, date)

print(df.apply(get_count_string, axis=1))

type(df.apply(get_count_string, axis=1))


0    The count was 11.0 on date 12/1/2016.
1     The count was nan on date 12/2/2016.
2    The count was 49.0 on date 12/3/2016.
dtype: object
Out[8]:
pandas.core.series.Series

In [10]:
# 7. Use your new function and assign the returned Series to a new column, df['New Column 1'].

df['New Column 1'] = df.apply(get_count_string, axis=1)
df


Out[10]:
        Date  Count                      Weird Date  Weird Count                           New Column 1
0  12/1/2016   11.0  \tThursday,  December 01, 2016           11  The count was 11.0 on date 12/1/2016.
1  12/2/2016    NaN       Friday, December 02, 2016          NaN   The count was nan on date 12/2/2016.
2  12/3/2016   49.0     Saturday, December 03, 2016           49  The count was 49.0 on date 12/3/2016.
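
The assignment in exercise 7 lines up because the Series returned by apply(..., axis=1) carries the same index as the DataFrame. A small sketch under assumed data (the frame toy, its non-default index, and the column name Label are all hypothetical):

```python
import pandas as pd

# Hypothetical frame with a non-default index to make the alignment visible.
toy = pd.DataFrame(
    {"Date": ["12/1/2016", "12/3/2016"], "Count": [11, 49]},
    index=[10, 20],
)

# One string per row; the resulting Series is indexed like toy.
labels = toy.apply(
    lambda row: "{} / {}".format(row["Count"], row["Date"]), axis=1
)
print(labels.index.equals(toy.index))

# Column assignment aligns on index labels, so each string lands on its own row.
toy["Label"] = labels
print(toy.loc[10, "Label"])
```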

In [11]:
# 8. Write a function that takes a count and adds 5. Use map() with your function on df['Count'].

def add_to_count(cell_value):
    new_value = cell_value + 5
    return new_value

df['Count'].map(add_to_count)


Out[11]:
0    16.0
1     NaN
2    54.0
Name: Count, dtype: float64
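
For plain arithmetic like this, map() is not strictly required: adding 5 to the whole Series gives the same values, with NaN passing through unchanged in both cases. A sketch comparing the two, assuming a hypothetical Series named counts that mirrors the Count column shown above:

```python
import numpy as np
import pandas as pd

# Hypothetical copy of the Count column from the output above.
counts = pd.Series([11, np.nan, 49], name="Count")

def add_to_count(cell_value):
    return cell_value + 5

mapped = counts.map(add_to_count)  # function called once per element
vectorized = counts + 5            # arithmetic on the whole Series at once

# equals() treats NaN values in matching positions as equal.
print(mapped.equals(vectorized))
```

map() remains the more general tool when the per-element logic cannot be written as a single arithmetic expression.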

In [13]:
# 9. This returns a Series with each value individually altered. Assign the result to df['New Column 2'].

df['New Column 2'] = df['Count'].map(add_to_count)
df


Out[13]:
        Date  Count                      Weird Date  Weird Count                           New Column 1  New Column 2
0  12/1/2016   11.0  \tThursday,  December 01, 2016           11  The count was 11.0 on date 12/1/2016.          16.0
1  12/2/2016    NaN       Friday, December 02, 2016          NaN   The count was nan on date 12/2/2016.           NaN
2  12/3/2016   49.0     Saturday, December 03, 2016           49  The count was 49.0 on date 12/3/2016.          54.0