Pandas
Two files in CSV format, both from weather station 'USC00116760' in Petersburg, IL.
'Temp_116760.csv' stores temperature data; the index is day-of-year.
'Prcp_116760.csv' stores precipitation data; the index is date-time.
In [1]:
import numpy as np
import pandas as pd
df_temp = pd.read_csv('Temp_116760.csv', skiprows=1, index_col=0)
In [2]:
df_temp.tail()
Out[2]:
In [3]:
df_prcp = pd.read_csv('Prcp_116760.csv', index_col=0)
df_prcp.index = pd.to_datetime(df_prcp.index)
In [4]:
df_prcp.head()
Out[4]:
In [5]:
# Check that the index now holds datetimes rather than plain strings
df_prcp.index.dtype
Out[5]:
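The index conversion in cells 3 to 5 can be tried on a small in-memory file. This is only a sketch: the CSV text and the column names `DATE` and `PRCP` are made up for illustration and are not taken from 'Prcp_116760.csv'. It also shows `parse_dates=True`, a `read_csv` option that parses the index column while reading and could replace the separate `pd.to_datetime` step.

```python
import io
import pandas as pd

# Hypothetical stand-in for the precipitation file (invented values).
csv_text = "DATE,PRCP\n2015-01-01,0.0\n2015-01-02,0.3\n2015-01-03,0.1\n"

# Approach used above: read first, then convert the string index.
a = pd.read_csv(io.StringIO(csv_text), index_col=0)
a.index = pd.to_datetime(a.index)

# Alternative: let read_csv parse the index column as dates.
b = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# Datetime accessors such as dayofyear are available on a DatetimeIndex.
doy = list(a.index.dayofyear)
```

Either way the result has a `DatetimeIndex`, which is what makes `index.dayofyear` and `index.month` usable in the later cells.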
pandas.concat
In [6]:
pd.concat((df_prcp,
           pd.DataFrame(data=df_temp.values,
                        index=df_prcp.index,
                        columns=df_temp.columns)),
          axis=1).head()
Out[6]:
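One thing to keep in mind with the concat call above is that reusing `df_temp.values` under the date index pairs rows purely by position. The sketch below uses invented toy frames (`toy_temp`, `toy_prcp`) in which the dates deliberately start on day 2, to show what that pairing does when the two files do not line up row for row.

```python
import pandas as pd

# Hypothetical toy data: temperatures keyed by day-of-year ...
toy_temp = pd.DataFrame({'TMAX': [30, 28, 35]}, index=[1, 2, 3])
# ... and precipitation keyed by date, starting on day 2 rather than day 1.
dates = pd.date_range('2015-01-02', periods=3)
toy_prcp = pd.DataFrame({'PRCP': [0.3, 0.1, 0.0]}, index=dates)

# Same pattern as above: reuse the raw values under the date index.
relabelled = pd.DataFrame(data=toy_temp.values,
                          index=toy_prcp.index,
                          columns=toy_temp.columns)
combined = pd.concat((toy_prcp, relabelled), axis=1)

# Pairing is by position: 2015-01-02 (day 2) receives the day-1 value.
positional = combined['TMAX'].tolist()
```

So this approach is only safe when both frames have the same number of rows in the same day order.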
pandas.merge
merge might be the better approach? It matches each date to its day-of-year row in df_temp instead of relying on row order.
In [7]:
pd.merge(left=df_prcp,
         right=df_temp,
         left_on=df_prcp.index.dayofyear,
         right_index=True,
         how='left').head()
Out[7]:
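To see how the day-of-year merge behaves, here is a minimal sketch on invented toy frames (`toy_temp` and `toy_prcp` are not the real station data). The dates cover days 2 to 4 while the temperature table only has days 1 to 3, so one date has no match.

```python
import pandas as pd

# Hypothetical toy data (invented values).
toy_temp = pd.DataFrame({'TMAX': [30, 28, 35]}, index=[1, 2, 3])
dates = pd.date_range('2015-01-02', periods=3)   # days 2, 3, 4
toy_prcp = pd.DataFrame({'PRCP': [0.3, 0.1, 0.0]}, index=dates)

# Same call as above: the key for each left row is its day-of-year,
# looked up in the index of the right frame.
merged = pd.merge(left=toy_prcp,
                  right=toy_temp,
                  left_on=toy_prcp.index.dayofyear,
                  right_index=True,
                  how='left')

tmax = merged['TMAX']
```

With `how='left'` every precipitation row is kept: day 2 and day 3 pick up their own temperatures, and day 4, which has no row in `toy_temp`, gets NaN.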
In [8]:
df = pd.merge(left=df_prcp,
              right=df_temp,
              left_on=df_prcp.index.dayofyear,
              right_index=True,
              how='left')
df.pivot_table(values='TMAX',
               index=df.index.month,
               columns=df.SNOW.isnull(),
               aggfunc='count')
Out[8]:
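The pivot table in cell 8 counts TMAX values per month, split by whether SNOW is missing. A small sketch with invented values (the frame `toy` is hypothetical) makes the layout of the result easier to see.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two days in January, two in February.
dates = pd.to_datetime(['2015-01-05', '2015-01-20',
                        '2015-02-03', '2015-02-10'])
toy = pd.DataFrame({'TMAX': [30.0, 28.0, 35.0, 40.0],
                    'SNOW': [1.0, np.nan, np.nan, np.nan]},
                   index=dates)

# Rows: month number. Columns: False (SNOW present) / True (SNOW missing).
counts = toy.pivot_table(values='TMAX',
                         index=toy.index.month,
                         columns=toy.SNOW.isnull(),
                         aggfunc='count')
```

January has one row in each column; February has no SNOW readings at all, so its False cell is NaN rather than 0.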
In [9]:
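# Empty frame covering every day in the period; update() fills in the
# rows that exist in df, leaving absent days as all-NaN rows.
df3 = pd.DataFrame(index=pd.date_range('2015-01-01', '2015-06-30'),
                   columns=df.columns)
df3.update(df)
df3.pivot_table(values='TMAX',
                index=df3.index.month,
                columns=np.where(df3.isnull().all(axis=1),
                                 'Missing',
                                 df3.SNOW.isnull()),
                aggfunc=len)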
Out[9]:
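A possible alternative to building an empty frame and calling `update` is `DataFrame.reindex`, which expands the index to every calendar day in one step. The sketch below is not the method used above, just a variation on invented toy data; the label strings and the use of `pd.crosstab` for the row counts are choices made here for illustration.

```python
import numpy as np
import pandas as pd

all_days = pd.date_range('2015-01-01', '2015-01-04')

# Hypothetical toy data: only the 1st and the 3rd were recorded.
toy = pd.DataFrame({'TMAX': [30.0, 28.0],
                    'SNOW': [1.0, np.nan]},
                   index=all_days[[0, 2]])

# Expand to every calendar day; absent dates become all-NaN rows.
full = toy.reindex(all_days)

# Three-way label: day absent entirely, SNOW missing, or SNOW recorded.
label = np.where(full.isnull().all(axis=1), 'Missing',
                 np.where(full.SNOW.isnull(), 'SNOW is NaN',
                          'SNOW recorded'))

# Count rows per month and label.
counts = pd.crosstab(index=full.index.month, columns=label)
```

Counting rows rather than non-null TMAX values is what lets the fully missing days show up, which is the same reason the cell above uses `aggfunc=len` instead of `'count'`.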