The DKRZ data infrastructure team supports data producers, submitters and analysts at every step of the data workflow.


Fig. 3 Figure 1: The data workflow into, and the services on top of, the CDP.

  • An Earth System Model produces output in a raw format, which needs to be post-processed to reach a project-compliant format. This post-processing includes diagnostics and standardization. The DKRZ develops and maintains tools for these processing steps. Afterwards, the data quality is assured with a QA tool. We summarize these steps under the term Preparation.

  • Next, the data is ingested safely and quickly into the CMIP Data Pool (CDP). Since the data can be either primary data created on the DKRZ HPC system or replicated data, we use different tools for each purpose, as described in the subchapters.

  • Once ingested, the data can be published.

  • The DKRZ CDP is curated: outdated and retracted data is removed, catalogs are maintained, and backups are created. Most of these processes are automated and continuously improved.

  • To support users with data analysis, we prepare tutorials and use cases in notebooks.
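
The curation step above (removing outdated and retracted data) can be illustrated with a minimal Python sketch. It is not the DKRZ tooling: the `DatasetRecord` class, its fields and the `curate` function are hypothetical names chosen for illustration, and the rule "keep the highest non-retracted version of each dataset" is an assumption about what "outdated" means here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical catalog entry; the field names are assumptions."""
    dataset_id: str
    version: int
    retracted: bool = False


def curate(records):
    """Return the newest non-retracted version of each dataset.

    Retracted records are dropped first; among the remaining records,
    only the highest version per dataset_id is kept (assumed rule).
    """
    latest = {}
    for rec in records:
        if rec.retracted:
            continue  # retracted data is removed from the pool
        current = latest.get(rec.dataset_id)
        if current is None or rec.version > current.version:
            latest[rec.dataset_id] = rec  # newer version supersedes older one
    # Sort by dataset_id so the result is deterministic
    return sorted(latest.values(), key=lambda r: r.dataset_id)


records = [
    DatasetRecord("tas", 1),
    DatasetRecord("tas", 2),
    DatasetRecord("pr", 1, retracted=True),
]
curated = curate(records)
```

In this sketch, `curated` contains only version 2 of `tas`: version 1 is outdated and `pr` is retracted. Note that if the newest version of a dataset is retracted, this rule falls back to the newest remaining version; a real curation policy may handle that case differently.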