DKRZ CMIP6 submission form for ESGF data publication

General Information (to be completed based on official CMIP6 references)

Data to be submitted for ESGF data publication must follow the rules outlined in the CMIP6 Archive Design
(https://...)

Thus file names have to follow the pattern:

VariableName_Domain_GCMModelName_CMIP5ExperimentName_CMIP5EnsembleMember_RCMModelName_RCMVersionID_Frequency[_StartTime-EndTime].nc
Example: tas_AFR-44_MPI-M-MPI-ESM-LR_rcp26_r1i1p1_MPI-CSC-REMO2009_v1_mon_yyyymm-yyyymm.nc
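
The naming pattern can be checked mechanically. The following is a minimal sketch only: `parse_file_name` and the facet keys are hypothetical names chosen for illustration (they are not part of dkrz_forms), and the sketch assumes that no facet value contains an underscore.

```python
# Hypothetical helper (not part of dkrz_forms): split a file name into facets.
FACETS = [
    "VariableName", "Domain", "GCMModelName", "CMIP5ExperimentName",
    "CMIP5EnsembleMember", "RCMModelName", "RCMVersionID", "Frequency",
]


def parse_file_name(file_name):
    """Return a dict of facets; raise ValueError if the pattern does not match."""
    if not file_name.endswith(".nc"):
        raise ValueError("file name must end with '.nc'")
    parts = file_name[:-3].split("_")
    # Eight mandatory facets, plus the optional StartTime-EndTime part.
    if len(parts) not in (len(FACETS), len(FACETS) + 1):
        raise ValueError(f"expected 8 or 9 '_'-separated parts, got {len(parts)}")
    info = dict(zip(FACETS, parts))
    if len(parts) == len(FACETS) + 1:
        start, sep, end = parts[-1].partition("-")
        if not sep:
            raise ValueError("time range must look like StartTime-EndTime")
        info["StartTime"], info["EndTime"] = start, end
    return info


# The example file name from this form, including its yyyymm placeholders.
info = parse_file_name(
    "tas_AFR-44_MPI-M-MPI-ESM-LR_rcp26_r1i1p1_MPI-CSC-REMO2009_v1_mon_yyyymm-yyyymm.nc"
)
print(info["Domain"], info["RCMModelName"])
```

Time-independent files have no StartTime-EndTime part, so the sketch accepts both eight and nine parts.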

The directory structure in which these files are stored follows the pattern:

activity/product/Domain/Institution/GCMModelName/CMIP5ExperimentName/CMIP5EnsembleMember/RCMModelName/RCMVersionID/Frequency/VariableName
Example: CORDEX/output/AFR-44/MPI-CSC/MPI-M-MPI-ESM-LR/rcp26/r1i1p1/MPI-CSC-REMO2009/v1/mon/tas/tas_AFR-44_MPI-M-MPI-ESM-LR_rcp26_r1i1p1_MPI-CSC-REMO2009_v1_mon_yyyymm-yyyymm.nc
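
As an illustration of the directory pattern, the sketch below joins the facets in the required order. The function `build_directory` and the dictionary keys are hypothetical (not part of dkrz_forms); the facet values are taken from the example above.

```python
import posixpath


# Hypothetical helper (not part of dkrz_forms): build the directory for a file.
def build_directory(facets, activity="CORDEX", product="output"):
    """Join the facets in the order required by the directory pattern."""
    order = [
        "Domain", "Institution", "GCMModelName", "CMIP5ExperimentName",
        "CMIP5EnsembleMember", "RCMModelName", "RCMVersionID",
        "Frequency", "VariableName",
    ]
    return posixpath.join(activity, product, *(facets[key] for key in order))


facets = {
    "Domain": "AFR-44",
    "Institution": "MPI-CSC",
    "GCMModelName": "MPI-M-MPI-ESM-LR",
    "CMIP5ExperimentName": "rcp26",
    "CMIP5EnsembleMember": "r1i1p1",
    "RCMModelName": "MPI-CSC-REMO2009",
    "RCMVersionID": "v1",
    "Frequency": "mon",
    "VariableName": "tas",
}
directory = build_directory(facets)
print(directory)
# CORDEX/output/AFR-44/MPI-CSC/MPI-M-MPI-ESM-LR/rcp26/r1i1p1/MPI-CSC-REMO2009/v1/mon/tas
```

The institution is passed explicitly because it cannot be derived unambiguously from the file name alone.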

Notice: If your model is not yet registered, please contact ....

This 'data submission form' is used to improve the initial information exchange between data providers and the DKRZ data managers. The form has to be filled in before the publication process can be started. If you have questions, please contact cmip6@dkrz.de


In [1]:
from dkrz_forms import form_widgets
form_widgets.show_status('form-submission')


Start submission procedure

The submission is based on this interactive document consisting of "cells" that you can modify and then evaluate. To evaluate a cell, select it and press "Shift" + "Enter".
Please evaluate the following cell to initialize your form based on the information provided as part of the form generation (name, email, etc.).


In [ ]:
MY_LAST_NAME = "...."   # e.g. MY_LAST_NAME = "schulz"
#-------------------------------------------------


from dkrz_forms import form_handler, form_widgets, checks
form_info = form_widgets.check_pwd(MY_LAST_NAME)
sf = form_handler.init_form(form_info)
form = sf.sub.entity_out.form_info

Please provide information on the contact person for this CORDEX data submission request

Type of submission

Please specify the type of this data submission:

  • "initial_version" for first submission of data
  • "new_version" for a re-submission of previously submitted data
  • "retract" for the request to retract previously submitted data

In [ ]:
sf.submission_type = "..."  # example: sf.submission_type = "initial_version"

Requested general information

... to be finalized as soon as CMIP6 specification is finalized ....

Please provide model and institution info as well as an example of a file name

institution

The value of this field has to equal the value of the optional NetCDF attribute 'institution' (long version) in the data files if the latter is used.


In [ ]:
sf.institution = "..." # example: sf.institution = "Alfred Wegener Institute"

institute_id

The value of this field has to equal the value of the global NetCDF attribute 'institute_id' in the data files and must equal the 4th directory level. It is needed before the publication process starts so that the value can be added to the relevant CORDEX list of CV1 if it is not yet there. Note that 'institute_id' has to be the first part of 'model_id'.


In [ ]:
sf.institute_id = "..." # example: sf.institute_id = "AWI"

model_id

The value of this field has to be the value of the global NetCDF attribute 'model_id' in the data files. It is needed before the publication process starts so that the value can be added to the relevant CORDEX list of CV1 if it is not yet there. Note that it must be composed of the 'institute_id' followed by the RCM CORDEX model name, separated by a dash. It is part of the file name and the directory structure.


In [ ]:
sf.model_id = "..." # example: sf.model_id = "AWI-HIRHAM5"
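
The rule that 'model_id' starts with 'institute_id' followed by a dash can be tested before submission. This is a minimal sketch; `check_model_id` is a hypothetical helper, not a dkrz_forms function, and it does not verify that the values are registered in the controlled vocabulary.

```python
# Hypothetical helper (not part of dkrz_forms).
def check_model_id(institute_id, model_id):
    """Return True if model_id has the form '<institute_id>-<RCM model name>'."""
    prefix = institute_id + "-"
    # The part after the dash (the RCM model name) must not be empty.
    return model_id.startswith(prefix) and len(model_id) > len(prefix)


print(check_model_id("AWI", "AWI-HIRHAM5"))  # True
print(check_model_id("AWI", "HIRHAM5"))      # False
```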

experiment_id and time_period

Experiment has to equal the value of the global NetCDF attribute 'experiment_id' in the data files. Time_period gives the period of data for which the publication request is submitted. If you intend to submit data from multiple experiments you may add one line for each additional experiment or send in additional publication request sheets.


In [ ]:
sf.experiment_id = "..."  # example: sf.experiment_id = "evaluation"
                          # ["value_a","value_b"] in case of multiple experiments
sf.time_period = "..." # example: sf.time_period = "197901-201412" 
                       # ["time_period_a","time_period_b"] in case of multiple values
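
A time period string can be validated before it is entered. The sketch below assumes the yyyymm-yyyymm form used in the example above; `parse_time_period` is a hypothetical helper, not part of dkrz_forms.

```python
import re


# Hypothetical helper (not part of dkrz_forms); assumes the yyyymm-yyyymm form.
def parse_time_period(time_period):
    """Return ((start_year, start_month), (end_year, end_month))."""
    match = re.fullmatch(r"(\d{4})(\d{2})-(\d{4})(\d{2})", time_period)
    if match is None:
        raise ValueError("time period must look like yyyymm-yyyymm")
    start_year, start_month, end_year, end_month = (int(g) for g in match.groups())
    if not (1 <= start_month <= 12 and 1 <= end_month <= 12):
        raise ValueError("month must be between 01 and 12")
    if (start_year, start_month) > (end_year, end_month):
        raise ValueError("start of period lies after its end")
    return (start_year, start_month), (end_year, end_month)


print(parse_time_period("197901-201412"))  # ((1979, 1), (2014, 12))
```

For multiple experiments, the same check can be applied to each entry of the list.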

Example file name

Please provide an example file name of a file in your data collection. This name will be used to derive the other


In [ ]:
sf.example_file_name = "..." # example: sf.example_file_name = "tas_AFR-44_MPI-M-MPI-ESM-LR_rcp26_r1i1p1_MPI-CSC-REMO2009_v1_mon_yyyymm-yyyymm.nc"

In [ ]:
# Please run this cell as it is to check your example file name structure 
# to_do: implement submission_form_check_file function - output result (attributes + check_result)
form_handler.cordex_file_info(sf,sf.example_file_name)

Information on the grid_mapping

The NetCDF/CF name of the data grid ('rotated_latitude_longitude', 'lambert_conformal_conic', etc.), i.e. either that of the native model grid, or 'latitude_longitude' for the regular -XXi grids


In [ ]:
sf.grid_mapping_name = "..." # example: sf.grid_mapping_name = "rotated_latitude_longitude"

Does the grid configuration exactly follow the specifications in ADD2 (Table 1) in case the native grid is 'rotated_pole'? If not, comment on the differences; otherwise write 'yes' or 'N/A'. If the data is not delivered on the computational grid it has to be noted here as well.


In [ ]:
sf.grid_as_specified_if_rotated_pole = "..." # example: sf.grid_as_specified_if_rotated_pole = "yes"

Please provide information on the quality checks performed on the data you plan to submit

Please answer 'no', 'QC1', 'QC2-all', 'QC2-CORDEX', or 'other'.

'QC1' refers to the compliance checker that can be downloaded at http://cordex.dmi.dk. 'QC2' refers to the quality checker developed at DKRZ.

If your answer is 'other', please give some further information.


In [ ]:
sf.data_qc_status = "..."  # example: sf.data_qc_status = "QC2-CORDEX"
sf.data_qc_comment = "..." # any comment on the quality status of the files

Terms of use

Please give the terms of use that shall be assigned to the data. The options are 'unrestricted' and 'non-commercial only'. For the full text of the 'Terms of Use' of CORDEX data, refer to http://cordex.dmi.dk/joomla/images/CORDEX/cordex_terms_of_use.pdf


In [ ]:
sf.terms_of_use = "..." # example: sf.terms_of_use = "unrestricted"

Information on directory structure and data access path

(and other information needed for data transport and data publication)

If the directory structure deviates from the CORDEX standard in any way, please specify it here. Otherwise enter 'compliant'. Please note that deviations MAY imply that data cannot be accepted.


In [ ]:
sf.directory_structure = "..." # example: sf.directory_structure = "compliant"

Give the path where the data reside, for example: blizzard.dkrz.de:/scratch/b/b364034/. If not applicable, write 'N/A' and give data access information in the data_information string.


In [ ]:
sf.data_path = "..."        # example: sf.data_path = "mistral.dkrz.de:/mnt/lustre01/work/bm0021/k204016/CORDEX/archive/"
sf.data_information = "..." # any info on where the data can be accessed and transferred to the data center

Exclude variable list

In each CORDEX file there may be only one variable which shall be published and searchable at the ESGF portal (target variable). To facilitate publication, all non-target variables are included in a list that the publisher uses to avoid publishing them. A list of known non-target variables is [time, time_bnds, lon, lat, rlon, rlat, x, y, z, height, plev, Lambert_Conformal, rotated_pole]. Please enter any other non-target variables below if applicable (e.g. grid description variables), otherwise write 'N/A'.


In [ ]:
sf.exclude_variables_list = "..." # example: sf.exclude_variables_list=["bnds", "vertices"]
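
To find out which additional names belong in the exclude list, the variables of a file can be compared with the known non-target variables. This is a minimal sketch: `extra_exclude_variables` is a hypothetical helper (not part of dkrz_forms), and the input list is written out by hand here rather than read from a NetCDF file.

```python
# Known non-target variables as listed in this form.
KNOWN_NON_TARGET = {
    "time", "time_bnds", "lon", "lat", "rlon", "rlat", "x", "y", "z",
    "height", "plev", "Lambert_Conformal", "rotated_pole",
}


# Hypothetical helper (not part of dkrz_forms).
def extra_exclude_variables(file_variables, target_variable):
    """Return the non-target variables that are not yet in the known list."""
    return sorted(
        name for name in file_variables
        if name != target_variable and name not in KNOWN_NON_TARGET
    )


# Illustrative variable names of one file with target variable 'tas'.
names = ["tas", "time", "time_bnds", "rlon", "rlat", "rotated_pole", "bnds", "vertices"]
print(extra_exclude_variables(names, "tas"))  # ['bnds', 'vertices']
```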

Uniqueness of tracking_id and creation_date

If any of your files replaces a file that is already published, it must have neither the same tracking_id nor the same creation_date as the file it replaces. Have you made sure that no replacement file reuses either value? Reply 'yes'; otherwise adapt the new file versions.


In [ ]:
sf.uniqueness_of_tracking_id = "..." # example: sf.uniqueness_of_tracking_id = "yes"
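
The uniqueness requirement can be checked by comparing the global attributes of the old and the new file. The sketch below is an assumption-laden illustration: `reused_attributes` is a hypothetical helper (not part of dkrz_forms), the attribute values are placeholders, and reading the attributes from the NetCDF files is left out.

```python
# Hypothetical helper (not part of dkrz_forms).
def reused_attributes(old_attrs, new_attrs):
    """Return the names of attributes the new file wrongly shares with the old one."""
    return [
        key for key in ("tracking_id", "creation_date")
        if old_attrs[key] == new_attrs[key]
    ]


# Placeholder attribute values, for illustration only.
old_file = {"tracking_id": "id-of-old-file", "creation_date": "date-of-old-file"}
new_file = {"tracking_id": "id-of-new-file", "creation_date": "date-of-old-file"}

print(reused_attributes(old_file, new_file))  # ['creation_date']
```

An empty list means that the replacement file differs in both attributes, as required.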

Variable list

List of variables submitted. Please remove the ones you do not provide:


In [ ]:
sf.variable_list_day = [
"clh","clivi","cll","clm","clt","clwvi",
"evspsbl","evspsblpot",
"hfls","hfss","hurs","huss","hus850",
"mrfso","mrro","mrros","mrso",
"pr","prc","prhmax","prsn","prw","ps","psl",
"rlds","rlus","rlut","rsds","rsdt","rsus","rsut",
"sfcWind","sfcWindmax","sic","snc","snd","snm","snw","sund",
"tas","tasmax","tasmin","tauu","tauv","ta200","ta500","ta850","ts",
"uas","ua200","ua500","ua850",
"vas","va200","va500","va850","wsgsmax",
"zg200","zg500","zmla"
]

sf.variable_list_mon = [
"clt",
"evspsbl",
"hfls","hfss","hurs","huss","hus850",
"mrfso","mrro","mrros","mrso",
"pr","psl",
"rlds","rlus","rlut","rsds","rsdt","rsus","rsut",
"sfcWind","sfcWindmax","sic","snc","snd","snm","snw","sund",
"tas","tasmax","tasmin","ta200",
"ta500","ta850",
"uas","ua200","ua500","ua850",
"vas","va200","va500","va850",
"zg200","zg500"
]
sf.variable_list_sem = [
"clt",
"evspsbl",
"hfls","hfss","hurs","huss","hus850",
"mrfso","mrro","mrros","mrso",
"pr","psl",
"rlds","rlus","rlut","rsds","rsdt","rsus","rsut",
"sfcWind","sfcWindmax","sic","snc","snd","snm","snw","sund",
"tas","tasmax","tasmin","ta200","ta500","ta850",
"uas","ua200","ua500","ua850",
"vas","va200","va500","va850",
"zg200","zg500"  
]

sf.variable_list_fx = [
"areacella",
"mrsofc",
"orog",
"rootd",
"sftgif","sftlf"   
]

Check your form before submission


In [ ]:
# simple consistency check report for your submission form
res = form_handler.check_submission(sf)
sf.sub['status_flag_validity'] = res['valid_submission']
form_handler.DictTable(res)

Save your form

Your form will be stored (the form name consists of your last name plus your keyword).


In [ ]:
form_handler.form_save(sf)

In [ ]:
# evaluate this cell if you want a reference (provided by email)
# (only available if you access this form via the DKRZ hosting service)
form_handler.email_form_info(sf)

Officially submit your form

The form will be submitted to the DKRZ team for processing. You will also receive a confirmation email with a reference to your online form for future modifications.


In [ ]:
form_handler.email_form_info(sf)
form_handler.form_submission(sf)