In [1]:
import pandas as pd
from numpy import nan as NaN  # the NaN alias was removed in NumPy 2.0; nan works on all versions

Create a sample DataFrame with some missing values.


In [4]:
df = pd.DataFrame({
    'colA': ['aaa', NaN, NaN, NaN, 'bbb', 'ccc'],
    'colB': ['xxx', 'yyy', NaN, 'zzz', NaN, 'www'], 
    #'colC': [NaN, 3, NaN, 1, 0, 9]
    })

In [5]:
df


Out[5]:
colA colB
0 aaa xxx
1 NaN yyy
2 NaN NaN
3 NaN zzz
4 bbb NaN
5 ccc www

Task: replace missing values in column colA with those in colB (if they exist).

First we define a boolean mask cond that selects the rows we want to fill: those where colA is missing and colB is not. In this case we could actually use the simpler condition cond = df.colA.isnull(), because it doesn't matter if the value in colB is also missing (we would just replace NaN with NaN), but for the sake of illustration let's use the slightly more complicated expression.
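To make the difference between the two conditions concrete, here is a minimal sketch that rebuilds the same sample data under hypothetical names (df_demo, simple, strict, only_simple are not part of the notebook) and shows which rows the simpler condition selects in addition. It only inspects the masks and does not modify anything.

```python
import numpy as np
import pandas as pd

# Same sample data as above, rebuilt so the snippet runs on its own.
df_demo = pd.DataFrame({
    'colA': ['aaa', np.nan, np.nan, np.nan, 'bbb', 'ccc'],
    'colB': ['xxx', 'yyy', np.nan, 'zzz', np.nan, 'www'],
})

simple = df_demo.colA.isnull()                             # colA missing
strict = df_demo.colA.isnull() & ~df_demo.colB.isnull()    # colA missing, colB present

# Rows picked up by the simpler mask only: colA and colB are both missing there,
# so filling from colB would just write NaN over NaN.
only_simple = simple & ~strict
print(df_demo[only_simple])
```

With this data the only such row is index 2, so both conditions lead to the same filled column.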


In [13]:
cond = df.colA.isnull() & ~df.colB.isnull()
cond


Out[13]:
0    False
1     True
2    False
3     True
4    False
5    False
dtype: bool

We can use this to extract the matching rows if we wish.


In [14]:
df[cond]


Out[14]:
colA colB
1 NaN yyy
3 NaN zzz

Now we can do the assignment. Note that we use the .loc indexer to avoid a warning about "trying to set values on a copy of a slice from a DataFrame", which we would get if we used chained indexing, for example the following expression (which would also leave df unchanged, because df[cond] returns a copy):

df[cond]['colA'] = df[cond]['colB']

(See http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy for details.)
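As a rough illustration of why the chained form is a problem, the sketch below runs it on a separate copy of the sample data (demo, mask and missing_after are hypothetical names) and counts the missing values in colA afterwards. The warning is silenced only to keep the output short; its exact class differs between pandas versions.

```python
import warnings

import numpy as np
import pandas as pd

demo = pd.DataFrame({
    'colA': ['aaa', np.nan, np.nan, np.nan, 'bbb', 'ccc'],
    'colB': ['xxx', 'yyy', np.nan, 'zzz', np.nan, 'www'],
})
mask = demo.colA.isnull() & ~demo.colB.isnull()

with warnings.catch_warnings():
    # Silence the chained-assignment warning for this demonstration only.
    warnings.simplefilter('ignore')
    # demo[mask] is a temporary copy, so this assignment never reaches demo.
    demo[mask]['colA'] = demo[mask]['colB']

missing_after = int(demo.colA.isnull().sum())
print(missing_after)  # colA still has all of its original missing values
```

The count stays at 3, which is why the single .loc call below is used instead.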


In [18]:
df.loc[cond, 'colA'] = df.loc[cond, 'colB']

The resulting DataFrame does indeed have the values yyy and zzz filled in for column colA. Row 2 stays NaN because colB is missing there as well.


In [19]:
df


Out[19]:
colA colB
0 aaa xxx
1 yyy yyy
2 NaN NaN
3 zzz zzz
4 bbb NaN
5 ccc www
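For comparison, the same result can also be obtained without an explicit mask. This is an alternative that the notebook does not use: Series.fillna accepts another Series and aligns it by index. A minimal sketch on a fresh copy of the sample data (df_alt is a hypothetical name):

```python
import numpy as np
import pandas as pd

df_alt = pd.DataFrame({
    'colA': ['aaa', np.nan, np.nan, np.nan, 'bbb', 'ccc'],
    'colB': ['xxx', 'yyy', np.nan, 'zzz', np.nan, 'www'],
})

# Fill each missing colA value with the colB value from the same row.
# Rows where colB is also missing simply stay NaN.
df_alt['colA'] = df_alt['colA'].fillna(df_alt['colB'])
print(df_alt)
```

The mask-based version is still worth knowing, since it generalizes to conditions that are not just "the value is missing".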