This notebook combines all of the Excel files that you created in the Combining CSV's notebook into a single DataFrame. It also adds a reactor label to each row of the data.
In [2]:
import pandas as pd
import numpy as np
Import all of the reactor data you want to include in the final summary Excel file. It is useful to name each DataFrame after the reactor type it holds.
In [3]:
df_BWR = pd.read_excel('../../Jimmy/BWR_Data/BWR_Combined_notsorted.xlsx', 'Sheet1')
df_VVER = pd.read_excel('../../Jimmy/VVER_Data/VVER_Combined_notsorted.xlsx', 'Sheet1')
df_RBMK = pd.read_excel('../../Jimmy/RBMK_Data/RBMK_Combined_notsorted.xlsx', 'Sheet1')
Create an overall DataFrame that stacks each of the reactor DataFrames on top of one another. Make sure to include every DataFrame that you created while importing your Excel files.
In [4]:
# pd.concat stacks the frames in order; DataFrame.append was removed in pandas 2.0
df_tot = pd.concat([df_BWR, df_VVER, df_RBMK])
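Stacking frames this way keeps each frame's original row labels, so the index of the combined DataFrame repeats. The sketch below illustrates this with `pd.concat` on two small made-up frames (`df_a`, `df_b`, and the `burnup` column are hypothetical stand-ins, not the reactor data) and shows the optional `ignore_index=True` argument for renumbering the rows.

```python
import pandas as pd

# Hypothetical toy frames standing in for the reactor DataFrames.
df_a = pd.DataFrame({'burnup': [1.0, 2.0]})
df_b = pd.DataFrame({'burnup': [3.0, 4.0, 5.0]})

# Default: each frame keeps its own row labels, so the labels repeat.
stacked = pd.concat([df_a, df_b])
print(list(stacked.index))      # [0, 1, 0, 1, 2]

# ignore_index=True renumbers the rows from 0 to n-1.
renumbered = pd.concat([df_a, df_b], ignore_index=True)
print(list(renumbered.index))   # [0, 1, 2, 3, 4]
```

Whether to renumber is a judgment call: keeping the original labels makes it easier to trace a row back to its source file, while a fresh index avoids duplicate labels.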
This allows you to see the data at the end of your combined DataFrame.
In [5]:
df_tot.tail()
Out[5]:
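To build the reactor column, create a list that holds one reactor name per row. For each reactor, change the DataFrame in the range line and the string in the append line to match that reactor. The loops must follow the same order in which the DataFrames were combined.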
In [18]:
reactor_list = []
# Change the DataFrame inside range(len(...)) here
for y in range(len(df_BWR)):
    # Change string here
    reactor_list.append('BWR')
for y in range(len(df_VVER)):
    reactor_list.append('VVER')
for y in range(len(df_RBMK)):
    reactor_list.append('RBMK')
print(reactor_list)
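As a possible shortcut, the same kind of label list can be built in one call with `np.repeat`, which repeats each name by a matching count. This is only a sketch: the `row_counts` values below are made-up stand-ins for `len(df_BWR)`, `len(df_VVER)`, and `len(df_RBMK)`, and `names` and `reactor_labels` are hypothetical variable names.

```python
import numpy as np

# Hypothetical row counts standing in for the lengths of the reactor DataFrames.
row_counts = [3, 2, 1]
names = ['BWR', 'VVER', 'RBMK']

# np.repeat repeats each name by its matching count, preserving the order.
reactor_labels = np.repeat(names, row_counts).tolist()
print(reactor_labels)
# ['BWR', 'BWR', 'BWR', 'VVER', 'VVER', 'RBMK']
```

The order of `names` has to match the order in which the DataFrames were combined, exactly as with the loops.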
Check to make sure that the length of the list of reactor names you just made matches the length of your DataFrame.
In [19]:
print('list %d' % len(reactor_list))
print('df %d' % len(df_tot))
Create a new column with the reactor names in it and then check to make sure it added to the DataFrame correctly.
In [20]:
df_tot['reactor'] = reactor_list
In [21]:
df_tot.tail()
Out[21]:
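Change the order of the columns so that the reactor column is first. Note that the slicing below also moves the column that was previously last (now second to last) into the second position.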
In [22]:
col = list(df_tot.columns)
col = col[-1:] + col[-2:-1] + col[:-2]
print(col)
df_tot = df_tot[col]
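If only the reactor column needs to move, a name-based reorder may be easier to read than slicing by position. The sketch below uses a made-up frame (`df` and the columns `a` and `b` are hypothetical) and, unlike the slicing above, leaves every other column in its existing order.

```python
import pandas as pd

# Hypothetical toy frame; 'reactor' was added last, as in the notebook.
df = pd.DataFrame({'a': [1], 'b': [2], 'reactor': ['BWR']})

# Put 'reactor' first and keep every other column in its existing order.
cols = ['reactor'] + [c for c in df.columns if c != 'reactor']
df = df[cols]
print(list(df.columns))   # ['reactor', 'a', 'b']
```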
In [24]:
df_tot.tail()
Out[24]:
Use this to check that the labels for a specific reactor type were applied correctly. You can look at a different reactor by changing the string in get_group(). Look at both .head() and .tail() to make sure the indexes match those of the original reactor DataFrame.
In [25]:
grouped = df_tot.groupby('reactor')
grouped.get_group('BWR').head()
Out[25]:
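A quick way to check every group at once is to count the rows per reactor label and compare each count with the length of the matching source DataFrame. The sketch below assumes a made-up frame (`df` and the `value` column are hypothetical) rather than the real reactor data.

```python
import pandas as pd

# Hypothetical toy frame with a 'reactor' label column.
df = pd.DataFrame({
    'reactor': ['BWR', 'BWR', 'VVER', 'RBMK', 'RBMK', 'RBMK'],
    'value': [0, 1, 2, 3, 4, 5],
})

# size() returns the number of rows in each group, keyed by label.
counts = df.groupby('reactor').size()
summary = {name: int(n) for name, n in counts.items()}
print(summary)   # {'BWR': 2, 'RBMK': 3, 'VVER': 1}
```

In the notebook, each count would be expected to equal the length of the corresponding reactor DataFrame.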
Export the DataFrame to an Excel sheet. index = False means that the index column will not be exported.
In [26]:
df_tot.to_excel('VVER_RBMK_BWR_generated.xlsx', sheet_name='Sheet1', index = False)