Generic DKRZ national archive ingest form

This form is intended for requesting that data be made locally available in the DKRZ national data archive, in addition to the data that is ingested as part of the CMIP6 replication. For replication requests, a separate form is available.

Please provide information on the following aspects of your data ingest request:

  • scientific context of data
  • specific data access rights
  • technical details, like
    • amount of data
    • source of data

In [1]:
from dkrz_forms import form_widgets
form_widgets.show_status('form-submission')


Please provide information to unlock your form

  • last name
  • password

In [2]:
from dkrz_forms import form_handler, form_widgets

# please provide your last name, replacing the example value below
MY_LAST_NAME = "ki" 


form_info = form_widgets.check_pwd(MY_LAST_NAME)

sf = form_handler.init_form(form_info)
form = sf.sub.entity_out.form_info


Enter your form key: ········
---- Your Name:  ki ki
---- Your email:  ki
---- Name of this submission form:  DKRZ_CDP_ki_123
Form Handler: Initialized form for project: DKRZ_CDP
DKRZ_CDP
/home/stephan/tmp/Repos/form_repo/DKRZ_CDP
DKRZ_CDP_ki_123.ipynb
/home/stephan/Repos/ENES-EUDAT/submission_forms/test/forms/DKRZ_CDP/DKRZ_CDP_ki_123.ipynb
/home/stephan/tmp/Repos/form_repo/DKRZ_CDP/DKRZ_CDP_ki_123.ipynb
submission form intitialized: sf
(For the curious: the sf object is used in the following to store and manage all your information)

In [14]:
import pprint
from dkrz_forms import form_handler
pprint.pprint(form_handler.form_to_dict(sf))


{'__doc__': '\n             Form object for data replication related information\n             \n             Workflow steps related sub-forms (filled by data manager):\n               - sub: data submission form\n               - rev: data review_form\n               - ing: data ingest form\n               - qua: data quality assurance form\n               - pub: data publication form\n               \n             Each workfow step form is structured according to\n               - entity_in : input information for this step\n               - entity_out: output information for this step\n               - agent: information related to responsible party for this step\n               - activity: information related the workflow step execution\n               \n              End user provided form information is stored in:\n             \n             _this_form_object.sub.entity_out.form  \n             \n             The the documentation of the sub.entity_out subform for \n             the end-user filled information entities \n             ',
 'form_dir': '/home/stephan/Repos/ENES-EUDAT/submission_forms/test/forms/DKRZ_CDP',
 'ing': {'__doc__': '\n       Attributes characterizing the data ingest workflow step:\n       - entity_in : data review report\n       - entity_out : data ingest report\n       - agent : person or tool related ingest step information \n       - activity : information on the ingest process\n       ',
         'activity': {'__doc__': '\n        Attributes characterizing the data ingest activity:\n        - status: status information\n        - timestamp_started\n        - timestamp_finished : data ingest timing information\n        - comment : free text comment\n        - ticket_id: related RT ticket number\n        - ingest_report: dictionary with ingest related information (tbd.)\n        ',
                      'comment': '',
                      'ingest_report': {},
                      'status': '',
                      'ticket_id': 0,
                      'timestamp_finished': '',
                      'timestamp_started': ''},
         'agent': {'__doc__': ' \n                   Attributes characterizing the person or tool managing the data ingest:\n                    - responsible_person:  \n                    ',
                   'responsible_person': ''},
         'entity_in': {'__doc__': "\n         Attributes characterizing the review results:\n         - data: time stamp of last results\n         - tag: git tag for repo conatining report information\n         - repo: (gitlab) repo containing report information,\n         - comment: free text comment for this review,\n         - status : review status information: ok, undef, uncomplete, error,\n         - dict: report details in dictionary,\n         - dict_type': predefined value: review_report   \n     ",
                       'comment': '',
                       'date': '',
                       'dict': '',
                       'dict_type': 'review_report',
                       'repo': '',
                       'status': '',
                       'tag': ''},
         'entity_out': {'__doc__': "\n     Attributes characterizing the ingest report summary:\n       - date: 'last modification date,\n       - tag : '(git) tag for repo containing report information,\n       - repo: '(gitlab) repo containing report information,\n       - comment : 'free text comment for this review,\n       - status : 'data ingest status information: ok, undef, uncomplete, error,\n       - dict: 'report details in dictionary,\n       - dict_type: 'predefined value: ingest_report   \n     ",
                        'comment': '',
                        'date': '',
                        'dict': '',
                        'dict_type': 'ingest_report',
                        'repo': '',
                        'status': '',
                        'tag': ''}},
 'project': 'DKRZ_CDP',
 'pub': {'__doc__': '\n        Attributes characterizing the data publication workflow step:\n        - entity_in: \n        - entity_out:\n        - agent:\n        - activity:\n     ',
         'activity': {'__doc__': ' Attributes characterizing the data publication activity:\n        - status: status information\n        - timestamp_started\n        - timestamp_finished : data ingest timing information\n        - comment : free text comment\n        - ticket_id: related RT ticket number\n        - follow_up_ticket: in case new data has to be provided\n        ',
                      'comment': '',
                      'follow_up_ticket': '',
                      'status': '',
                      'ticket_id': '',
                      'timestamp_finished': '',
                      'timestamp_started': ''},
         'agent': {'__doc__': '\n        Attributes characterizing the persons performing the data publication:\n         - responsible_person: person name\n         - publication_tool: string characterizing the publication tool\n     ',
                   'publication_tool': '',
                   'responsible_person': ''},
         'entity_in': {'__doc__': "\n     Attributes characterizing the quality report summary:\n       - date: 'last modification date,\n       - tag : '(git) tag for repo containing report information,\n       - repo: '(gitlab) repo containing report information,\n       - comment : 'free text comment for this review,\n       - status : 'data ingest status information: ok, undef, uncomplete, error,\n       - dict: 'report details in dictionary,\n       - dict_type: 'predefined value: ingest_report   \n     ",
                       'comment': '',
                       'date': '',
                       'dict': '',
                       'dict_type': 'qua_report',
                       'repo': '',
                       'status': '',
                       'tag': ''},
         'entity_out': {'__doc__': " \n               Attributes characterizing the data publication report\n               - date: last modification date,\n               - tag : (git) tag for repo containing report information,\n               - repo: (gitlab) repo containing report information,\n               - comment : free text comment for this review,\n               - status : data ingest status information: ok, undef, uncomplete, error,\n               - dict: 'report details in dictionary,\n               - dict_type: predefined value: publication_report, \n             ",
                        'comment': '',
                        'date': '',
                        'dict': '',
                        'dict_type': 'publication_report',
                        'facet_string': '# e.g. project=A&model=B& ....',
                        'repo': '',
                        'status': '',
                        'tag': ''}},
 'qua': {'__doc__': '\n        Attributes characterizing the data quality assurance step:\n        - entity_in: data ingest report\n        - entity_out:  data quality assurance report\n        - agent: person and tool responsible for qua checking\n        - activity: info on qua checking process\n    ',
         'activity': {'__doc__': '\n        Attributes characterizing the data quality assurance activity:\n        - status: status information\n        - timestamp_started\n        - timestamp_finished : data ingest timing information\n        - comment : free text comment\n        - ticket_id: related RT ticket number\n        - follow_up_ticket: in case new data has to be provided\n        - quality_report: dictionary with quality related information (tbd.)\n        ',
                      'comment': '',
                      'follow_up_ticket': '',
                      'status': '',
                      'ticket_id': '',
                      'timestamp_finished': '',
                      'timestamp_started': ''},
         'agent': {'__doc__': ' \n                   Attributes characterizing the person or tool managing the data ingest:\n                    - responsible_person:  \n                    ',
                   'responsible_person': ''},
         'entity_in': {'__doc__': "\n     Attributes characterizing the ingest report summary:\n       - date: 'last modification date,\n       - tag : '(git) tag for repo containing report information,\n       - repo: '(gitlab) repo containing report information,\n       - comment : 'free text comment for this review,\n       - status : 'data ingest status information: ok, undef, uncomplete, error,\n       - dict: 'report details in dictionary,\n       - dict_type: 'predefined value: ingest_report   \n     ",
                       'comment': '',
                       'date': '',
                       'dict': '',
                       'dict_type': 'ingest_report',
                       'repo': '',
                       'status': '',
                       'tag': ''},
         'entity_out': {'__doc__': "\n     Attributes characterizing the quality report summary:\n       - date: 'last modification date,\n       - tag : '(git) tag for repo containing report information,\n       - repo: '(gitlab) repo containing report information,\n       - comment : 'free text comment for this review,\n       - status : 'data ingest status information: ok, undef, uncomplete, error,\n       - dict: 'report details in dictionary,\n       - dict_type: 'predefined value: ingest_report   \n     ",
                        'comment': '',
                        'date': '',
                        'dict': '',
                        'dict_type': 'qua_report',
                        'repo': '',
                        'status': '',
                        'tag': ''}},
 'rev': {'__doc__': " \n           Attributes characterizing the data submission review step:\n           - entity_in :  submission form\n           - entity_out: adapted submission form\n           - agent: person or tool related information\n           - activity': submission activity related information\n           ",
         'activity': {'__doc__': '\n           Attributes characterizing the form review activity:\n            - ticket_url: assigned RT Ticket\n            - ticket_id: RT Ticket id \n            - review_comment: free text comment \n            - review_status: progress status of review\n            - review_report: dictionary with review results and issues\n        ',
                      'review_comment': '',
                      'review_status': '',
                      'ticket_id': 0,
                      'ticket_url': ''},
         'agent': {'__doc__': ' \n                   Attributes characterizing the person or tool checking the form:\n                    - responsible_person:  \n                    ',
                   'responsible_person': ''},
         'entity_in': {'__doc__': " \n           Attributes characterizing the data submission workflow step:\n           - entity_in : Input (form template and init info)\n           - entity_out: Output (submission form information)\n           - agent: person or tool related information\n           - activity': submission activity related information\n           ",
                       'activity': {'__doc__': '\n                         Attributes characterizing the form submission activity:\n                         \n                         - submission_comment : free text comment\n                         - submission_method  : How the submission was generated and submitted to DKRZ: email or DKRZ form server based       \n                         ',
                                    'pwd': ' password to access form ',
                                    'submission_comment': '',
                                    'submission_method': ''},
                       'agent': {'__doc__': 'Attributes characterizing the person responsible for form completion and submission:\n            \n                                   - last_name: Last name of the person responsible for the submission form content\n                                   - first_name: Corresponding first name\n                                   - email: Valid user email address: all follow up activities will use this email to contact end user\n                                   - key_word : user provided key word to remember and separate submission\n                             ',
                                 'email': '',
                                 'first_name': '',
                                 'key_word': '',
                                 'last_name': ''},
                       'entity_in': {'__doc__': '\n                           Attributes defining the form template used:\n                               \n                               - source_path: path to the form template used (jupyter notebook)         \n                               - form_template_version:  version string for the form template used\n                               - tag: git tag of repo containing templates (=source code repo in most cases)\n                ',
                                     'form_template_version': '',
                                     'source_path': '',
                                     'tag': ''},
                       'entity_out': {'__doc__': "\n                     Attributes characterizing the form submission process and context:\n                     - form: Form object for all end user provided information\n                     - form_name: consistent prefix for the form name (postfix=.ipynb and .json)\n                     - form_repo: git repo where forms are stored (before submission)\n                     - form_json, form_repo_path: full paths to json and ipynb representations\n                     - form_dir: directory where form are served to the notebook interface\n                     - status: status information\n                     - checks_done: form consistency checks done\n                     - tag: 'git tag of submission form in submission form repo',\n                     - repo: '(gitlab) repo where the tag relates to','\n                     \n                   ",
                                      'checks_done': '',
                                      'form': '',
                                      'form_dir': '',
                                      'form_info': {},
                                      'form_json': '',
                                      'form_name': '',
                                      'form_path': '',
                                      'form_repo': '',
                                      'form_repo_path': '',
                                      'repo': '',
                                      'status': '',
                                      'tag': ''}},
         'entity_out': {'__doc__': "\n         Attributes characterizing the review results:\n         - data: time stamp of last results\n         - tag: git tag for repo conatining report information\n         - repo: (gitlab) repo containing report information,\n         - comment: free text comment for this review,\n         - status : review status information: ok, undef, uncomplete, error,\n         - dict: report details in dictionary,\n         - dict_type': predefined value: review_report   \n     ",
                        'comment': '',
                        'date': '',
                        'dict': '',
                        'dict_type': 'review_report',
                        'repo': '',
                        'status': '',
                        'tag': ''}},
 'sub': {'__doc__': " \n           Attributes characterizing the data submission workflow step:\n           - entity_in : Input (form template and init info)\n           - entity_out: Output (submission form information)\n           - agent: person or tool related information\n           - activity': submission activity related information\n           ",
         'activity': {'__doc__': '\n                         Attributes characterizing the form submission activity:\n                         \n                         - submission_comment : free text comment\n                         - submission_method  : How the submission was generated and submitted to DKRZ: email or DKRZ form server based       \n                         ',
                      'keyword': u'123',
                      'pwd': '4VY08A',
                      'submission_comment': '',
                      'submission_method': ''},
         'agent': {'__doc__': 'Attributes characterizing the person responsible for form completion and submission:\n            \n                                   - last_name: Last name of the person responsible for the submission form content\n                                   - first_name: Corresponding first name\n                                   - email: Valid user email address: all follow up activities will use this email to contact end user\n                                   - key_word : user provided key word to remember and separate submission\n                             ',
                   'email': u'kim',
                   'first_name': u'ki',
                   'key_word': '',
                   'last_name': u'ki'},
         'entity_in': {'__doc__': '\n                           Attributes defining the form template used:\n                               \n                               - source_path: path to the form template used (jupyter notebook)         \n                               - form_template_version:  version string for the form template used\n                               - tag: git tag of repo containing templates (=source code repo in most cases)\n                ',
                       'form_json': u'/home/stephan/Repos/ENES-EUDAT/submission_forms/test/forms/DKRZ_CDP/DKRZ_CDP_ki_123.json',
                       'form_path': u'/home/stephan/Repos/ENES-EUDAT/submission_forms/test/forms/DKRZ_CDP/DKRZ_CDP_ki_123.ipynb',
                       'form_template_version': '',
                       'source_path': '',
                       'tag': ''},
         'entity_out': {'__doc__': "\n                     Attributes characterizing the form submission process and context:\n                     - form: Form object for all end user provided information\n                     - form_name: consistent prefix for the form name (postfix=.ipynb and .json)\n                     - form_repo: git repo where forms are stored (before submission)\n                     - form_json, form_repo_path: full paths to json and ipynb representations\n                     - form_dir: directory where form are served to the notebook interface\n                     - status: status information\n                     - checks_done: form consistency checks done\n                     - tag: 'git tag of submission form in submission form repo',\n                     - repo: '(gitlab) repo where the tag relates to','\n                     \n                   ",
                        'checks_done': '',
                        'form': {'__doc__': '\n             Form object for data replication related information\n             \n             Workflow steps related sub-forms (filled by data manager):\n               - sub: data submission form\n               - rev: data review_form\n               - ing: data ingest form\n               - qua: data quality assurance form\n               - pub: data publication form\n               \n             Each workfow step form is structured according to\n               - entity_in : input information for this step\n               - entity_out: output information for this step\n               - agent: information related to responsible party for this step\n               - activity: information related the workflow step execution\n               \n              End user provided form information is stored in:\n             \n             _this_form_object.sub.entity_out.form  \n             \n             The the documentation of the sub.entity_out subform for \n             the end-user filled information entities \n             ',
                                 'project': 'CMIP6_CDP',
                                 'workflow': [('sub', 'data_submission'),
                                              ('rev',
                                               'data_submission_review'),
                                              ('ing', 'data_ingest'),
                                              ('qua',
                                               'data_quality_assurance'),
                                              ('pub', 'data_publication')]},
                        'form_dir': '',
                        'form_info': {'access_group': '....',
                                      'access_rights': '....',
                                      'best_ingest_before': '....',
                                      'context_comment': '....',
                                      'data_path': '....',
                                      'data_type': '....',
                                      'directory_structure': '...',
                                      'directory_structure_comment': '...',
                                      'directory_structure_convention': '...',
                                      'metadata_comment': '...',
                                      'metadata_convention_name': '...',
                                      'scientific_context': '...',
                                      'terms_of_use': '....',
                                      'usage': '....'},
                        'form_json': u'/home/stephan/tmp/Repos/form_repo/DKRZ_CDP/DKRZ_CDP_ki_123.json',
                        'form_name': u'DKRZ_CDP_ki_123',
                        'form_path': '',
                        'form_repo': '/home/stephan/tmp/Repos/form_repo/DKRZ_CDP',
                        'form_repo_path': u'/home/stephan/tmp/Repos/form_repo/DKRZ_CDP/DKRZ_CDP_ki_123.ipynb',
                        'pwd': '4VY08A',
                        'repo': '',
                        'status': '',
                        'tag': ''},
         'id': '79ed688a-3280-11e7-b816-080027f178b4',
         'submission_repo': '/home/stephan/tmp/Repos/submission_repo/DKRZ_CDP',
         'timestamp': '2017-05-06 19:21:44.081274'},
 'workflow': [('sub', 'data_submission'),
              ('rev', 'data_submission_review'),
              ('ing', 'data_ingest'),
              ('qua', 'data_quality_assurance'),
              ('pub', 'data_publication')]}
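
The dictionary printed above mixes the stored values with the '__doc__' strings of every sub-form, which makes it hard to scan. The following is a minimal sketch of how such a nested dictionary could be reduced to its values only. The helper name strip_docs and the small example dictionary are illustrative assumptions and are not part of dkrz_forms.

```python
def strip_docs(node):
    """Return a copy of a nested dict without any '__doc__' entries."""
    if isinstance(node, dict):
        return {key: strip_docs(value)
                for key, value in node.items()
                if key != '__doc__'}
    # non-dict values (strings, lists, numbers) are returned unchanged
    return node


# small stand-in for the output of form_handler.form_to_dict(sf)
example = {
    '__doc__': 'Form object ...',
    'project': 'DKRZ_CDP',
    'sub': {
        '__doc__': 'submission step',
        'agent': {'__doc__': 'responsible person', 'last_name': 'ki'},
    },
}

compact = strip_docs(example)
```

Applied to the real form dictionary, the same helper could be combined with pprint.pprint to obtain a shorter overview.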

Please provide the following information

Please provide some generic context information about the data that should be available as part of the DKRZ CMIP Data Pool (CDP).


In [3]:
# (informal) type of data
form.data_type = "...."  # e.g. model data, observational data, ..
# free text describing the scientific context of the data
form.scientific_context = "..."
# free text describing the expected usage as part of the DKRZ CMIP Data Pool
form.usage = "...."
# free text describing access rights (who is allowed to read the data)
form.access_rights = "...."
# generic terms of use / policy information
form.terms_of_use = "...."  # e.g. unrestricted, restricted
# access group (e.g. the user group that is allowed to read the data)
form.access_group = "...."
# any additional comment on context
form.context_comment = "...."

Technical information concerning your request


In [4]:
# information on where the data is stored and can be accessed
# e.g. file system path if on DKRZ storage, url etc. if on web accessible resources (cloud,thredds server,..)
form.data_path = "...."

# timing constraints, when the data ingest should be completed 
# (e.g. because the data source is only accessible in specific time frame)  
form.best_ingest_before = "...."

# directory structure information
form.directory_structure = "..."  # e.g. institute/experiment/file.nc
form.directory_structure_convention = "..." # e.g. CMIP5, CMIP6, CORDEX, your_convention_name
form.directory_structure_comment = "..."  # free text, e.g. with link describing the directory structure convention you used

# metadata information
form.metadata_convention_name = "..." # e.g. CF1.6 etc. None if not applicable
form.metadata_comment = "..." # information about metadata, e.g. links to metadata info etc.

Check your submission form

Please evaluate the following cell to check your submission form.

In case of errors, please go back to the corresponding information cells and update your information accordingly.


In [5]:
# to be completed ..
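
The check cell above is still marked as to be completed. As a minimal sketch of what such a check might look like, the following helper lists all entries that are empty or still hold the "..." / "...." placeholders of the template. The name find_unfilled and the example dictionary are assumptions for illustration; this is not the check implemented in dkrz_forms.

```python
def find_unfilled(form_info, placeholders=("...", "....")):
    """Return the sorted keys whose values are empty or still placeholders."""
    unfilled = []
    for key, value in form_info.items():
        text = str(value).strip()
        if not text or text in placeholders:
            unfilled.append(key)
    return sorted(unfilled)


# stand-in for the form_info entries filled in the cells above
example_info = {
    'data_type': 'model data',
    'data_path': '....',
    'usage': '',
    'metadata_comment': '...',
}

missing = find_unfilled(example_info)
```

An empty result would indicate that every entry has been given a value; it says nothing about whether the values themselves are meaningful.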

Save your form

Your form will be stored (the form name consists of the project name, your last name and your keyword, e.g. DKRZ_CDP_ki_123).


In [6]:
form_handler.save_form(sf,"..my comment..") # edit my comment info



Form Handler - save form status message:
/home/stephan/Repos/ENES-EUDAT/submission_forms/test/forms/DKRZ_CDP/DKRZ_CDP_ki_123.ipynb
/home/stephan/tmp/Repos/form_repo/DKRZ_CDP/DKRZ_CDP_ki_123.ipynb
 --- form stored in transfer format in: /home/stephan/tmp/Repos/form_repo/DKRZ_CDP/DKRZ_CDP_ki_123.json
 
 --- commit message:[master d29a38a] Form Handler: submission form for user ki saved using prefix DKRZ_CDP_ki_123 ## ..my comment..
 2 files changed, 53 insertions(+), 7 deletions(-)
Out[6]:
DKRZ Form object 
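
The file names in the status message above follow the pattern project, last name and keyword, joined by underscores, with .ipynb and .json as postfixes. A small sketch of this naming scheme, assuming a hypothetical helper build_form_name (the actual construction inside form_handler may differ):

```python
def build_form_name(project, last_name, keyword):
    """Join project, last name and keyword into a form name prefix."""
    return "_".join([project, last_name, keyword])


form_name = build_form_name("DKRZ_CDP", "ki", "123")

# the notebook and its JSON transfer format share the same prefix
notebook_file = form_name + ".ipynb"
json_file = form_name + ".json"
```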

Officially submit your form

The form will be submitted to the DKRZ team for processing. You will also receive a confirmation email with a reference to your online form for future modifications.


In [7]:
form_handler.email_form_info(sf)
form_handler.form_submission(sf)


This form is not hosted at DKRZ! Thus form information is stored locally on your computer 

Here is a summary of the generated and stored information:
-- form for project:  DKRZ_CDP
-- form name:  DKRZ_CDP_ki_123
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-28be4728f00a> in <module>()
----> 1 form_handler.email_form_info(sf)
      2 form_handler.form_submission(sf)

/home/stephan/Repos/ENES-EUDAT/submission_forms/dkrz_forms/form_handler.pyc in email_form_info(sf)
    536      print("-- form for project: ",sf.project)
    537      print("-- form name: ",sf.sub.entity_out.form_name)
--> 538      print("-- submission form path: ", sf.sub.entity_out.subform_path)
    539      print("-- package path: ", sf.sub.entity_out.package_path)
    540      print("-- package name: ", sf.sub.entity_out.package_name)

AttributeError: 'Form' object has no attribute 'subform_path'
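
The traceback above shows that email_form_info reads the attribute subform_path, which the Form object used here does not define. The following sketch illustrates one defensive way of printing such a summary without failing on missing attributes. It uses a SimpleNamespace as a stand-in for sf.sub.entity_out; the helper describe is hypothetical and is not the fix applied in dkrz_forms.

```python
from types import SimpleNamespace


def describe(entity, fields):
    """Build summary lines, marking attributes that are not set."""
    lines = []
    for label, attr in fields:
        # getattr with a default avoids the AttributeError seen above
        value = getattr(entity, attr, None)
        shown = value if value is not None else "(not set)"
        lines.append("-- {}: {}".format(label, shown))
    return lines


# stand-in object that has form_name but no subform_path
entity_out = SimpleNamespace(form_name="DKRZ_CDP_ki_123")

summary = describe(entity_out, [
    ("form name", "form_name"),
    ("submission form path", "subform_path"),
])
```

Whether a missing attribute should be reported like this or treated as an error depends on how the submission workflow uses it.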