Data to be submitted for ESGF data publication must follow the rules outlined in the CMIP6 Archive Design Document
(https://...)
Thus file names have to follow the pattern:
VariableName_Domain_GCMModelName_CMIP5ExperimentName_CMIP5EnsembleMember_RCMModelName_RCMVersionID_Frequency[_StartTime-EndTime].nc
Example: tas_AFR-44_MPI-M-MPI-ESM-LR_rcp26_r1i1p1_MPI-CSC-REMO2009_v1_mon_yyyymm-yyyymm.nc
The directory structure in which these files are stored follow the pattern:
activity/product/Domain/Institution/ GCMModelName/CMIP5ExperimentName/CMIP5EnsembleMember/ RCMModelName/RCMVersionID/Frequency/VariableName
Example: CORDEX/output/AFR-44/MPI-CSC/MPI-M-MPI-ESM-LR/rcp26/r1i1p1/MPI-CSC-REMO2009/v1/mon/tas/tas_AFR-44_MPI-M-MPI-ESM-LR_rcp26_r1i1p1_MPI-CSC-REMO2009_v1_mon_yyyymm-yyyymm.nc
Notice: If your model is not yet registered, please contact contact ....
This 'data submission form' is used to improve initial information exchange between data providers and the data center. The form has to be filled before the publication process can be started. In case you have questions pleas contact the individual data center:
o DKRZ: cmip6@dkrz.de
In [ ]:
# initialize your CORDEX submission form template
from dkrz_forms import form_handler
from dkrz_forms import checks
please provide information on the contact person for this CORDEX data submission request
In [ ]:
my_email = "..." # example: sf.email = "Mr.Mitty@yahoo.com"
my_first_name = "..." # example: sf.first_name = "Harold"
my_last_name = "..." # example: sf.last_name = "Mitty"
my_keyword = "..." # example: sf.keyword = "mymodel_myrunid"
sf = form_handler.init_form("CORDEX",my_first_name,my_last_name,my_email,my_keyword)
In [ ]:
sf.submission_type = "..." # example: sf.submission_type = "initial_version"
In [ ]:
sf.institution = "..." # example: sf.institution = "Alfred Wegener Institute"
The value of this field has to equal the value of the global NetCDF attribute 'institute_id' in the data files and must equal the 4th directory level. It is needed before the publication process is started in order that the value can be added to the relevant CORDEX list of CV1 if not yet there. Note that 'institute_id' has to be the first part of 'model_id'
In [ ]:
sf.institute_id = "..." # example: sf.institute_id = "AWI"
The value of this field has to be the value of the global NetCDF attribute 'model_id' in the data files. It is needed before the publication process is started in order that the value can be added to the relevant CORDEX list of CV1 if not yet there. Note that it must be composed by the 'institute_id' follwed by the RCM CORDEX model name, separated by a dash. It is part of the file name and the directory structure.
In [ ]:
sf.model_id = "..." # example: sf.model_id = "AWI-HIRHAM5"
Experiment has to equal the value of the global NetCDF attribute 'experiment_id' in the data files. Time_period gives the period of data for which the publication request is submitted. If you intend to submit data from multiple experiments you may add one line for each additional experiment or send in additional publication request sheets.
In [ ]:
sf.experiment_id = "..." # example: sf.experiment_id = "evaluation"
# ["value_a","value_b"] in case of multiple experiments
sf.time_period = "..." # example: sf.time_period = "197901-201412"
# ["time_period_a","time_period_b"] in case of multiple values
In [ ]:
sf.example_file_name = "..." # example: sf.example_file_name = "tas_AFR-44_MPI-M-MPI-ESM-LR_rcp26_r1i1p1_MPI-CSC-REMO2009_v1_mon_yyyymm-yyyymm.nc"
In [ ]:
# Please run this cell as it is to check your example file name structure
# to_do: implement submission_form_check_file function - output result (attributes + check_result)
form_handler.cordex_file_info(sf,sf.example_file_name)
In [ ]:
sf.grid_mapping_name = "..." # example: sf.grid_mapping_name = "rotated_latitude_longitude"
Does the grid configuration exactly follow the specifications in ADD2 (Table 1) in case the native grid is 'rotated_pole'? If not, comment on the differences; otherwise write 'yes' or 'N/A'. If the data is not delivered on the computational grid it has to be noted here as well.
In [ ]:
sf.grid_as_specified_if_rotated_pole = "..." # example: sf.grid_as_specified_if_rotated_pole = "yes"
Please answer 'no', 'QC1', 'QC2-all', 'QC2-CORDEX', or 'other'.
'QC1' refers to the compliancy checker that can be downloaded at http://cordex.dmi.dk. 'QC2' refers to the quality checker developed at DKRZ.
If your answer is 'other' give some informations.
In [ ]:
sf.data_qc_status = "..." # example: sf.data_qc_status = "QC2-CORDEX"
sf.data_qc_comment = "..." # any comment of quality status of the files
Please give the terms of use that shall be asigned to the data. The options are 'unrestricted' and 'non-commercial only'. For the full text 'Terms of Use' of CORDEX data refer to http://cordex.dmi.dk/joomla/images/CORDEX/cordex_terms_of_use.pdf
In [ ]:
sf.terms_of_use = "..." # example: sf.terms_of_use = "unrestricted"
If there is any directory structure deviation from the CORDEX standard please specify here. Otherwise enter 'compliant'. Please note that deviations MAY imply that data can not be accepted.
In [ ]:
sf.directory_structure = "..." # example: sf.directory_structure = "compliant"
Give the path where the data reside, for example: blizzard.dkrz.de:/scratch/b/b364034/. If not applicable write N/A and give data access information in the data_information string
In [ ]:
sf.data_path = "..." # example: sf.data_path = "mistral.dkrz.de:/mnt/lustre01/work/bm0021/k204016/CORDEX/archive/"
sf.data_information = "..." # ...any info where data can be accessed and transfered to the data center ... "
In each CORDEX file there may be only one variable which shall be published and searchable at the ESGF portal (target variable). In order to facilitate publication, all non-target variables are included in a list used by the publisher to avoid publication. A list of known non-target variables is [time, time_bnds, lon, lat, rlon ,rlat ,x ,y ,z ,height, plev, Lambert_Conformal, rotated_pole]. Please enter other variables into the left field if applicable (e.g. grid description variables), otherwise write 'N/A'.
In [ ]:
sf.exclude_variables_list = "..." # example: sf.exclude_variables_list=["bnds", "vertices"]
In case any of your files is replacing a file already published, it must not have the same tracking_id nor the same creation_date as the file it replaces. Did you make sure that that this is not the case ? Reply 'yes'; otherwise adapt the new file versions.
In [ ]:
sf.uniqueness_of_tracking_id = "..." # example: sf.uniqueness_of_tracking_id = "yes"
In [ ]:
sf.variable_list_day = [
"clh","clivi","cll","clm","clt","clwvi",
"evspsbl","evspsblpot",
"hfls","hfss","hurs","huss","hus850",
"mrfso","mrro","mrros","mrso",
"pr","prc","prhmax","prsn","prw","ps","psl",
"rlds","rlus","rlut","rsds","rsdt","rsus","rsut",
"sfcWind","sfcWindmax","sic","snc","snd","snm","snw","sund",
"tas","tasmax","tasmin","tauu","tauv","ta200","ta500","ta850","ts",
"uas","ua200","ua500","ua850",
"vas","va200","va500","va850","wsgsmax",
"zg200","zg500","zmla"
]
sf.variable_list_mon = [
"clt",
"evspsbl",
"hfls","hfss","hurs","huss","hus850",
"mrfso","mrro","mrros","mrso",
"pr","psl",
"rlds","rlus","rlut","rsds","rsdt","rsus","rsut",
"sfcWind","sfcWindmax","sic","snc","snd","snm","snw","sund",
"tas","tasmax","tasmin","ta200",
"ta500","ta850",
"uas","ua200","ua500","ua850",
"vas","va200","va500","va850",
"zg200","zg500"
]
sf.variable_list_sem = [
"clt",
"evspsbl",
"hfls","hfss","hurs","huss","hus850",
"mrfso","mrro","mrros","mrso",
"pr","psl",
"rlds","rlus","rlut","rsds","rsdt","rsus","rsut",
"sfcWind","sfcWindmax","sic","snc","snd","snm","snw","sund",
"tas","tasmax","tasmin","ta200","ta500","ta850",
"uas","ua200","ua500","ua850",
"vas","va200","va500","va850",
"zg200","zg500"
]
sf.variable_list_fx = [
"areacella",
"mrsofc",
"orog",
"rootd",
"sftgif","sftlf"
]
In [ ]:
# simple consistency check report for your submission form
res = form_handler.check_submission(sf)
sf.sub['status_flag_validity'] = res['valid_submission']
form_handler.DictTable(res)
In [ ]:
form_handler.form_save(sf)
In [ ]:
#evaluate this cell if you want a reference to the saved form emailed to you
# (only available if you access this form via the DKRZ form hosting service)
form_handler.email_form_info()
In [ ]:
# evaluate this cell if you want a reference (provided by email)
# (only available if you access this form via the DKRZ hosting service)
form_handler.email_form_info(sf)
In [ ]:
form_handler.email_form_info(sf)
form_handler.form_submission(sf)