Purpose of This Notebook

Show how to use apply on a pandas Series and DataFrame
Show a bit about how lambda functions work



In [1]:

    
# numpy and pandas related imports 

import numpy as np
from pandas import Series, DataFrame
import pandas as pd

Setup: create Series and DataFrames

Let's make two Series and a DataFrame to use for our example



In [2]:

    
# for example, using lower and uppercase English letters

import string
string.ascii_lowercase, string.ascii_uppercase









    Out[2]:





('abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')



In [3]:

    
# we can make a list composed of the individual lowercase letters 

list(string.ascii_lowercase)









    Out[3]:





['a',
 'b',
 'c',
 'd',
 'e',
 'f',
 'g',
 'h',
 'i',
 'j',
 'k',
 'l',
 'm',
 'n',
 'o',
 'p',
 'q',
 'r',
 's',
 't',
 'u',
 'v',
 'w',
 'x',
 'y',
 'z']



In [9]:

    
# create a pandas Series out of the list of lowercase letters

lower = Series(list(string.ascii_lowercase))
print(type(lower))
lower.head()









    



<class 'pandas.core.series.Series'>






    Out[9]:





0    a
1    b
2    c
3    d
4    e
dtype: object



In [13]:

    
# create a pandas Series out of the list of uppercase letters, and give it a name

upper = Series(list(string.ascii_uppercase), name='upper')



In [14]:

    
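# concatenate the two Series as columns, using axis=1
# (axis=0 would instead stack them end to end into a single 52-element Series)
# keys= sets the column names; the Series `lower` was created without a name,
# and later cells refer to the columns as 'lower' and 'upper'

df = pd.concat((lower, upper), axis=1, keys=['lower', 'upper'])
df.head()



    Out[14]:



  lower upper
0     a     A
1     b     B
2     c     C
3     d     D
4     e     E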

5 rows × 2 columns
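
To make the effect of the axis argument concrete, here is a minimal sketch that concatenates two short Series both ways. The names `a`, `b`, `stacked`, and `side_by_side` are made up for this illustration, and the three-letter Series stand in for the full alphabet.

```python
import pandas as pd

# two short, named Series (hypothetical stand-ins for the alphabet Series)
a = pd.Series(list('abc'), name='lower')
b = pd.Series(list('ABC'), name='upper')

# axis=0 stacks the values end to end: the result is one longer Series
stacked = pd.concat((a, b), axis=0)

# axis=1 places each Series in its own column: the result is a DataFrame
side_by_side = pd.concat((a, b), axis=1)

print(type(stacked).__name__, stacked.shape)
print(type(side_by_side).__name__, side_by_side.shape)
print(list(side_by_side.columns))
```

When every Series has a name, as here, those names become the column labels of the axis=1 result.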

Using apply

Series.apply

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.apply.html:

Series.apply(func, convert_dtype=True, args=(), **kwds)

Invoke function on values of Series.
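
The args and **kwds parts of the signature are easy to overlook. The sketch below shows one way they can be used to pass extra arguments to the applied function. The function `wrap` and the Series `letters` are hypothetical names invented for this example.

```python
import pandas as pd

letters = pd.Series(list('abc'))

def wrap(value, left, right='>'):
    # value is one element of the Series; left and right are extra arguments
    return left + value + right

# positional extras go in args (a tuple); keyword extras are passed through
wrapped = letters.apply(wrap, args=('<',), right=']')
print(wrapped.tolist())
```

Each element is passed as the first argument, followed by the contents of args and any keyword arguments.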



In [15]:

    
# Let's start by using Series.apply
# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.apply.html

# first of all, it's useful to find a way to use apply to return the exact same Series

def identity(s):
    return s

lower.apply(identity)









    Out[15]:





0     a
1     b
2     c
3     d
4     e
5     f
6     g
7     h
8     i
9     j
10    k
11    l
12    m
13    n
14    o
15    p
16    q
17    r
18    s
19    t
20    u
21    v
22    w
23    x
24    y
25    z
dtype: object



In [16]:

    
# show that identity yields the same Series -- first on element by element basis

lower.apply(identity) == lower









    Out[16]:





0     True
1     True
2     True
3     True
4     True
5     True
6     True
7     True
8     True
9     True
10    True
11    True
12    True
13    True
14    True
15    True
16    True
17    True
18    True
19    True
20    True
21    True
22    True
23    True
24    True
25    True
dtype: bool



In [17]:

    
# Check that the match holds for every element in the Series using numpy.all
# http://docs.scipy.org/doc/numpy/reference/generated/numpy.all.html

np.all(lower.apply(identity) == lower)









    Out[17]:





True
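
As a rough sketch of what this check does and does not catch, the example below repeats it on a short Series and then on a deliberately altered copy. The names `letters`, `unchanged`, and `changed` are assumptions made for the illustration. It uses the Series method `.all()`, which reduces a boolean Series to a single value much like numpy.all.

```python
import pandas as pd

letters = pd.Series(list('abc'))

def identity(x):
    return x

unchanged = letters.apply(identity)

# alter exactly one element so that the comparison has something to find
changed = letters.apply(lambda x: x.upper() if x == 'b' else x)

# (left == right) is a boolean Series; .all() is True only if every entry is
all_match = bool((unchanged == letters).all())
some_differ = bool((changed == letters).all())

# counting the mismatches shows how many elements were altered
n_mismatches = int((changed != letters).sum())

print(all_match, some_differ, n_mismatches)
```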

Let's use `lambda`

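Sometimes it's convenient to write functions using lambda, especially short functions that do a simple transformation of their parameters. A lambda body must be a single expression, so only functions that can be written as one expression (with no statements such as assignments or loops) can be rewritten with lambda.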



In [ ]:

    
def add_preface(s):
    return 'letter ' + s

lower.apply(add_preface)



In [ ]:

    
# rewrite with lambda

lower.apply(lambda s: 'letter ' + s)
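
To illustrate where lambda stops being a good fit, here is a small sketch with one transformation that fits in a single expression and one written with statements. The function `describe` and the Series `letters` are hypothetical names used only for this example.

```python
import pandas as pd

letters = pd.Series(list('abcde'))

# a conditional expression is still one expression, so it fits in a lambda
flagged = letters.apply(lambda s: s.upper() if s in 'aeiou' else s)

def describe(s):
    # assignments and if blocks are statements, so this form needs a def
    kind = 'consonant'
    if s in 'aeiou':
        kind = 'vowel'
    return s + ' is a ' + kind

described = letters.apply(describe)

print(flagged.tolist())
print(described.tolist())
```

Both forms are passed to apply in exactly the same way; the choice between them is about readability and about what the function body needs.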

Another illustration of apply

This example uses the built-in functions ord and chr.



In [ ]:

    
# ord: Given a string of length one, return an integer representing the Unicode code 
# point of the character when the argument is a unicode object, or the value of the 
# byte when the argument is an 8-bit string. 
# http://docs.python.org/2.7/library/functions.html#ord

ord('a')



In [ ]:

    
# chr: Return a string of one character whose ASCII code is the integer i.
# http://docs.python.org/2.7/library/functions.html#chr

chr(97)



In [ ]:

    
# show that for 'a', chr(ord('a')) returns what we started with: 'a'

chr(ord('a')) == 'a'



In [ ]:

    
# we can test whether chr reverses ord for all the lowercase letters
# note how we chain two apply calls together

np.all(lower.apply(ord).apply(chr) == lower)
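
It can help to look at the intermediate result of the chain. The sketch below splits the two apply calls on a short Series; the names `letters`, `codes`, and `round_trip` are invented for the illustration.

```python
import pandas as pd

letters = pd.Series(list('abc'))

# first apply: each one-character string becomes its integer code point
codes = letters.apply(ord)

# second apply: each integer is turned back into a one-character string
round_trip = codes.apply(chr)

print(codes.tolist())       # [97, 98, 99]
print(round_trip.tolist())  # ['a', 'b', 'c']
```

Each apply returns a new Series, which is why the calls can be chained.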

Note that we can read off a specific column of the DataFrame as a Series (here by attribute access; df.upper is the same column as df['upper'])



In [ ]:

    
type(df.upper)



In [ ]:

    
# transform each uppercase letter to lowercase
df.upper.apply(lambda s: s.lower())
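
As a side-by-side sketch, the example below shows the two ways of selecting a column and compares the element-wise apply with the vectorized string accessor `.str`. The small frame `df_small` is a hypothetical stand-in for the alphabet DataFrame.

```python
import pandas as pd

df_small = pd.DataFrame({'lower': list('abc'), 'upper': list('ABC')})

# attribute access and bracket access return the same column
by_attribute = df_small.upper
by_bracket = df_small['upper']

# element-wise apply versus the vectorized string accessor
via_apply = by_bracket.apply(lambda s: s.lower())
via_str = by_bracket.str.lower()

print(via_apply.tolist())
print(via_str.tolist())
```

Bracket access is the safer habit in general, because attribute access only works when the column name is a valid identifier that does not collide with an existing DataFrame attribute or method.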

DataFrame.apply

apply can also be called on a DataFrame

http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.apply.html

DataFrame.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)
Applies function along input axis of DataFrame.

Objects passed to functions are Series objects having index either the DataFrame’s index (axis=0) or the columns (axis=1). Return type depends on whether passed function aggregates, or the reduce argument if the DataFrame is empty.
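
The remark about the return type can be sketched with two functions: one that reduces each column to a single value and one that returns a Series of the same length. The frame `df_small` and the result names are assumptions made for this example.

```python
import pandas as pd

df_small = pd.DataFrame({'lower': list('abc'), 'upper': list('ABC')})

# aggregating function: each column collapses to one value,
# so the result is a Series indexed by the column names
lengths = df_small.apply(len)

# non-aggregating function: each column maps to a Series of the same
# length, so the result is a DataFrame with the same shape
swapped = df_small.apply(lambda col: col.str.swapcase())

print(type(lengths).__name__, lengths.to_dict())
print(type(swapped).__name__, swapped.shape)
```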



In [ ]:

    
# let's show that with the identity function, we get the same result whether we use apply on
# columns (axis=0) or rows (axis=1)

def identity(s):
    return s

np.all(df.apply(identity, axis=0) == df.apply(identity, axis=1))



In [ ]:

    
# for each column, first lower and then upper, return the index

def index(s):
    return s.index

df.apply(index, axis=0)



In [ ]:

    
# for each row (axis=1), first lower and then upper, return the index 
# (which are the column names)

def index(s):
    return s.index

df.apply(index, axis=1)



In [ ]:

    
# it might be easier to see the difference between axis=0 vs axis=1
# by using join

# Consider what you get with

"".join(df.lower)



In [ ]:

    
# Now compare (axis=0)

df.apply(lambda s: "".join(s), axis=0)



In [ ]:

    
# join with axis=1

df.apply(lambda s: "".join(s), axis=1)
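
On a three-row frame the difference between the two axes is small enough to read at a glance. This is a sketch on a hypothetical frame `df_small`, not on the full alphabet DataFrame.

```python
import pandas as pd

df_small = pd.DataFrame({'lower': list('abc'), 'upper': list('ABC')})

# axis=0: the function receives each column, so there is one string per column
by_column = df_small.apply(lambda s: "".join(s), axis=0)

# axis=1: the function receives each row, so there is one string per row
by_row = df_small.apply(lambda s: "".join(s), axis=1)

print(by_column.to_dict())  # {'lower': 'abc', 'upper': 'ABC'}
print(by_row.tolist())      # ['aA', 'bB', 'cC']
```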



In [ ]:

    
# note that the function passed to apply can look up values by index label
# (with axis=1, the labels are the column names)

df.apply(lambda s: s['upper'] + s['lower'], axis=1)
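
For a simple combination like this one, the same result can usually be had without apply by operating on whole columns. The sketch below compares the two approaches on a hypothetical frame `df_small`.

```python
import pandas as pd

df_small = pd.DataFrame({'lower': list('abc'), 'upper': list('ABC')})

# row-wise: the lambda receives one row at a time and looks up values by label
row_wise = df_small.apply(lambda s: s['upper'] + s['lower'], axis=1)

# column-wise: adding two string columns concatenates them element by element
vectorized = df_small['upper'] + df_small['lower']

print(row_wise.tolist())    # ['Aa', 'Bb', 'Cc']
print(vectorized.tolist())  # ['Aa', 'Bb', 'Cc']
```

The row-wise form is more flexible, since the function can contain arbitrary logic, while the column-wise form is typically shorter and faster when it applies.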