Do you want to store and publish CMIP6 data at DKRZ via ESGF? This form provides some background information and guides you through the process.
To organize the data ingest, we need specific information about the CMIP6 data collection you want to publish (e.g. concerning data structure, content and quality). The form has to be completed before the ESGF data ingest and publication process can start.
If you have questions, please contact esgf-publication@dkrz.de.
Before CMIP6 data can be submitted to DKRZ and published via ESGF, a set of technical requirements has to be met. They are collected on the official WCRP Coupled Model Intercomparison Project Phase 6 (CMIP6) site in the Guide to CMIP6 Participation. The key prerequisites are summarized below:
Contact and citation information has to be registered in the citation GUI (see the documentation of the GUI).
Your data has to conform to the CMIP6 specifications for file names, directory structures and the CMIP6 Data Reference Syntax (DRS).
Directory structure:
<mip_era>/<activity_id>/<institution_id>/<source_id>/
<experiment_id>/<member_id>/<table_id>/<variable_id>/<grid_label>/
File naming convention:
<variable_id>_<table_id>_<source_id>_<experiment_id>_<member_id>
_<grid_label>[_<time_range>].nc
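The two templates above can be sketched in a few lines of Python. This is only an illustration: the helper names `drs_directory` and `drs_filename` are hypothetical, and the facet values in `example` are illustrative placeholders, not values taken from this form.

```python
# Hypothetical helpers; they only join facets in the order of the templates above.
DIRECTORY_ORDER = ["mip_era", "activity_id", "institution_id", "source_id",
                   "experiment_id", "member_id", "table_id", "variable_id",
                   "grid_label"]
FILENAME_ORDER = ["variable_id", "table_id", "source_id", "experiment_id",
                  "member_id", "grid_label"]


def drs_directory(facets):
    """Build the directory path from a dict of facet values."""
    return "/".join(facets[key] for key in DIRECTORY_ORDER) + "/"


def drs_filename(facets, time_range=None):
    """Build the file name; the time range is optional, as in the template."""
    parts = [facets[key] for key in FILENAME_ORDER]
    if time_range:
        parts.append(time_range)
    return "_".join(parts) + ".nc"


# Illustrative facet values only
example = {
    "mip_era": "CMIP6",
    "activity_id": "CMIP",
    "institution_id": "MPI-M",
    "source_id": "MPI-ESM1-2-LR",
    "experiment_id": "historical",
    "member_id": "r1i1p1f1",
    "table_id": "Amon",
    "variable_id": "tas",
    "grid_label": "gn",
}

print(drs_directory(example))
print(drs_filename(example, "185001-201412"))
```

Comparing the generated strings with your actual directory and file names is a quick way to spot a missing or misplaced facet before running a full compliance check.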
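Please make sure your data is quality checked before submission to a data center. Two tools are recommended for checking: the CMOR PREPARE checker and the DKRZ tool (referred to as 'PREPARE' and 'DKRZ' in the compliance-check cell of this form).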
The submission is based on this interactive document, which consists of "cells" you can modify and then evaluate.
To evaluate a cell, select it and press "Shift" + "Enter".
Please evaluate the following cell to identify your form (it associates your name and email with this form).
Attention: the name selected must match the name at the top of this page!
In [ ]:
# Evaluate this cell to identify your form
from dkrz_forms import form_widgets, form_handler, checks
form_infos = form_widgets.show_selection()
In [ ]:
# Evaluate this cell to generate your personal form instance
form_info = form_infos[form_widgets.FORMS.value]
sf = form_handler.init_form(form_info)
form = sf.sub.entity_out.report
In [ ]:
form.submission_type = "init" # example: form.submission_type = "initial_version"
In [ ]:
form.cmor = '..' ## options: 'CMOR', 'CDO-CMOR', etc.
form.cmor_compliance_checks = '..' ## please name the tool you used to check your files with respect to CMIP6 compliance
## 'PREPARE' for the CMOR PREPARE checker and "DKRZ" for the DKRZ tool.
In [ ]:
form.es_doc = " .. " # 'yes' if related ES-DOC model information is or will be available, 'no' otherwise
form.errata = " .. " # 'yes' if errata information was provided via the CMIP6 errata mechanism
# fill in the following only if this form refers to new versions of already published ESGF data
form.errata_id = ".." # the errata id provided by the CMIP6 errata mechanism
form.errata_comment = "..." # any additional information on the reason for this new version that has not been provided yet
Do all your files have unique tracking_ids assigned in the structure required by CMIP6?
If any of your files replaces a file that is already published, it must not have the same tracking_id or the same creation_date as the file it replaces. Did you make sure that this is true?
Reply 'yes'; otherwise adapt your files first, because no ESGF publication is possible without this!
In [ ]:
form.uniqueness_of_tracking_id = "..." # example: form.uniqueness_of_tracking_id = "yes"
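A minimal sketch of such a uniqueness check is given below. It assumes that the tracking_ids have already been collected from the global attributes of your files into a dictionary, and that they follow the pattern `hdl:21.14100/<uuid>`; please verify this pattern against the CMIP6 global attributes specification. The function name `check_tracking_ids` and the example ids are hypothetical.

```python
import re
from collections import Counter

# Assumption: CMIP6 tracking_ids have the form "hdl:21.14100/<uuid>"
TRACKING_ID = re.compile(
    r"^hdl:21\.14100/"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def check_tracking_ids(ids_by_file):
    """Return (malformed, duplicated) lists of file names."""
    malformed = sorted(
        name for name, tid in ids_by_file.items() if not TRACKING_ID.match(tid)
    )
    counts = Counter(ids_by_file.values())
    duplicated = sorted(
        name for name, tid in ids_by_file.items() if counts[tid] > 1
    )
    return malformed, duplicated


# Made-up ids for illustration only
ids = {
    "a.nc": "hdl:21.14100/0a1b2c3d-0000-4000-8000-000000000001",
    "b.nc": "hdl:21.14100/0a1b2c3d-0000-4000-8000-000000000001",
    "c.nc": "0a1b2c3d-0000-4000-8000-000000000002",
}
malformed, duplicated = check_tracking_ids(ids)
print("malformed:", malformed)
print("duplicated:", duplicated)
```

Only reply 'yes' in the cell above if both lists are empty for your files.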
Please provide the directory names that characterize your submission:
CMIP6 directory structure:
<CMIP6>/<activity_id>/<institution_id>/<source_id>/
<experiment_id>/<member_id>/<table_id>/<variable_id>/
<grid_label>/<version>
For example, a directory name given down to the <table_id> level with the table_id 3hr addresses all 3hr data in the specified experiment/member.
In [ ]:
form.data_dir_1 = " ... "
# uncomment for additional entries ...
# form.data_dir_2 = " ... "
# form.data_dir_3 = " ... "
# ...
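To see which part of the collection a directory name addresses, it can be split into the facets of the directory structure above. The following is a sketch only; `split_data_dir` is a hypothetical helper and the example path uses illustrative values.

```python
# Facet order of the CMIP6 directory structure shown above
DRS_KEYS = ["mip_era", "activity_id", "institution_id", "source_id",
            "experiment_id", "member_id", "table_id", "variable_id",
            "grid_label", "version"]


def split_data_dir(path):
    """Map the components of a (possibly partial) DRS path to facet names."""
    parts = [part for part in path.strip().split("/") if part]
    if len(parts) > len(DRS_KEYS):
        raise ValueError("path has more components than the DRS defines")
    return dict(zip(DRS_KEYS, parts))


# Illustrative path, given down to the table_id level
facets = split_data_dir("CMIP6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/3hr")
print(facets)
```

A path that stops at the table_id level, as in this example, leaves variable_id, grid_label and version open and therefore covers everything below that level.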
In [ ]:
form.time_period = "..." # example: form.time_period = "197901-201412"
# use a list such as ["time_period_a","time_period_b"] in case of multiple values
form.grid = ".."
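Before entering a time period, you may want to check that it is well formed. The sketch below assumes the monthly `YYYYMM-YYYYMM` layout of the example above; other resolutions would need a different format string. The function name `parse_time_period` is hypothetical.

```python
from datetime import datetime


def parse_time_period(period):
    """Parse 'YYYYMM-YYYYMM' and return ((year, month), (year, month))."""
    start_text, end_text = period.split("-")
    start = datetime.strptime(start_text, "%Y%m")
    end = datetime.strptime(end_text, "%Y%m")
    if start > end:
        raise ValueError("start of time period is later than its end")
    return (start.year, start.month), (end.year, end.month)


print(parse_time_period("197901-201412"))
```

A malformed value (for example a month of 13, or a reversed range) raises a ValueError instead of silently ending up in the form.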
Each CMIP6 file may contain only one variable that is published and searchable at the ESGF portal (the target variable). To facilitate publication, all non-target variables are kept in a list that the publisher uses to exclude them from publication. The known non-target variables are [time, time_bnds, lon, lat, rlon, rlat, x, y, z, height, plev, Lambert_Conformal, rotated_pole]. Please enter any other such variables in the cell below (e.g. grid description variables); otherwise write 'N/A'.
In [ ]:
form.exclude_variables_list = "..." # example: form.exclude_variables_list = ["bnds", "vertices"]
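One way to find the additional entries for this list is to compare the variable names in one of your files with the known non-target variables. This is a sketch under assumptions: the variable names are assumed to have been read from the file beforehand, and `extra_exclude_variables` is a hypothetical helper.

```python
# Known non-target variables as listed in this form
KNOWN_NON_TARGET = {
    "time", "time_bnds", "lon", "lat", "rlon", "rlat", "x", "y", "z",
    "height", "plev", "Lambert_Conformal", "rotated_pole",
}


def extra_exclude_variables(file_variables, target_variable):
    """Return variables that are neither the target nor already known."""
    return sorted(
        name for name in file_variables
        if name != target_variable and name not in KNOWN_NON_TARGET
    )


# Illustrative variable names of a single file with target variable "tas"
names = ["time", "time_bnds", "lat", "lon", "lat_bnds", "lon_bnds", "tas"]
extra = extra_exclude_variables(names, "tas")
print(extra)
```

If the result is empty, 'N/A' is the appropriate entry.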
In [ ]:
form.terms_of_use = "..." # has to be "ok"
In [ ]:
form.data_path = "..." # example: form.data_path = "mistral.dkrz.de:/mnt/lustre01/work/bm0021/k204016/CORDEX/archive/"
form.data_information = "..." # any information on where the data can be accessed and how it can be transferred to the data center
In [ ]:
form.example_file_name = "..." # example: form.example_file_name = "tas_AFR-44_MPI-M-MPI-ESM-LR_rcp26_r1i1p1_MPI-CSC-REMO2009_v1_mon_yyyymm-yyyymm.nc"
In [ ]:
# simple consistency check report for your submission form - not completed
report = checks.check_report(sf,"sub")
checks.display_report(report)
In [ ]:
form_handler.save_form(sf,"any comment you want") # add a comment
In [ ]:
# evaluate this cell if you want a reference (provided by email)
# (only available if you access this form via the DKRZ hosting service)
form_handler.email_form_info(sf)
In [ ]:
#form_handler.email_form_info(sf)
form_handler.form_submission(sf)