Curation

The CDP is curated through several measures that keep it up to date, accessible, and backed up.

Catalogs provide clear, structured insight into the data collection of the CDP. Intake catalogs let users browse the collection quickly and also open and load the data into memory. Intake is based on xarray and pandas. Over time, many data centers have decided to provide intake catalogs for their CMIP data collections. Because the CMIP data standard gives these catalogs a shared global namespace, the same scripts can be applied interoperably across data centers.
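The idea of a shared namespace can be illustrated with a minimal sketch that uses only the Python standard library. It is not the intake API: the catalog excerpt, the model names, the paths, and the helper functions `load_catalog` and `search` are all hypothetical. Only the column names follow CMIP6 facet names. The point is that a query written against these facet names works unchanged on any catalog that follows the same standard.

```python
import csv
import io

# Hypothetical catalog excerpt. The column names follow CMIP6 facet names;
# the rows and paths are made up for illustration.
CATALOG_CSV = """activity_id,source_id,experiment_id,table_id,variable_id,path
CMIP,MODEL-A,historical,Amon,tas,/hypothetical/tree/a/tas.nc
CMIP,MODEL-A,historical,Amon,pr,/hypothetical/tree/a/pr.nc
ScenarioMIP,MODEL-B,ssp585,Amon,tas,/hypothetical/tree/b/tas.nc
"""


def load_catalog(text):
    """Parse the CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def search(rows, **facets):
    """Return the rows whose facet values match all keyword arguments."""
    return [
        row for row in rows
        if all(row[key] == value for key, value in facets.items())
    ]


catalog = load_catalog(CATALOG_CSV)

# The same query would work on any catalog that uses these facet names.
hits = search(catalog, variable_id="tas", table_id="Amon")
for row in hits:
    print(row["source_id"], row["experiment_id"], row["path"])
```

With a real intake catalog, the search result would additionally be opened and loaded as xarray datasets instead of only listing paths.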

The ESGF index is searched daily for both new and retracted datasets. The egest service separates outdated and retracted datasets from the clean CMIP data tree. This data is retained on the file system for a few months, depending on the need for free disk space. During this time, data analysts can still easily reproduce results based on this data. Afterwards, the data is moved into the archive. Additional backups of the main part of the CDP in the archive protect against unlikely failure events.
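The daily comparison described above can be sketched as plain set arithmetic. This is only an illustration under stated assumptions, not the implementation of the egest service: the function `plan_curation`, the shortened dataset identifiers, and the assumption that the index yields one list of latest datasets and one list of retracted datasets are all hypothetical.

```python
def plan_curation(local_ids, index_latest_ids, index_retracted_ids):
    """Compare the local data tree with the index and return a work plan.

    All three arguments are iterables of versioned dataset identifiers.
    (Hypothetical interface; the real services may work differently.)
    """
    local = set(local_ids)
    latest = set(index_latest_ids)
    retracted = set(index_retracted_ids)

    return {
        # Listed as latest in the index but not yet available locally.
        "new": sorted(latest - local),
        # Available locally but retracted in the index.
        "retracted": sorted(local & retracted),
        # Available locally, not retracted, but no longer the latest version.
        "outdated": sorted(local - latest - retracted),
    }


# Made-up, shortened identifiers of the form <variable>.<version>.
plan = plan_curation(
    local_ids=["tas.v1", "pr.v1", "psl.v1"],
    index_latest_ids=["tas.v2", "pr.v1"],
    index_retracted_ids=["psl.v1"],
)
print(plan)
```

In this sketch, the datasets under `retracted` and `outdated` are the ones that would be moved out of the clean data tree, while those under `new` would be candidates for download.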

In the near future, we will provide tutorials on how to retrieve data from the archive.