Replication and Synchronization
In addition to the primary publication, the data pool is filled by replicating datasets from other ESGF nodes. Here, we document the tools we use for replication and which datasets are given higher priority.
The tools for replication have the following attributes:
The synchronization process is designed to run “eternally” because publication at the other nodes is ongoing. Therefore, the process must be run as a cronjob and as a service that can restart itself.
The continuous update of an entire repository requires that data from other nodes is tested for compatibility. This includes checks for replication errors and for design errors that occurred at the publishing ESGF node.
Because the entire CMIP6 data repository is so large, the data that the AR6 working groups define as important is replicated with higher priority.
- Load management
The synchronization distributes downloads across the ESGF nodes so that new datasets are not all downloaded from a single node.
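The attributes above can be illustrated with a minimal Python sketch. It is not the tool used for the data pool; the function names, the dataset fields (`id`, `priority`, `nodes`) and the node names are hypothetical, and the check assumes that the publishing node provides a SHA-256 checksum for each file. The sketch orders datasets by priority, assigns each download to the least-loaded node that offers a copy, and verifies a downloaded file against its checksum.

```python
import hashlib
import heapq
from collections import Counter


def plan_downloads(datasets):
    """Order datasets by priority and spread them over the nodes that hold them.

    Each dataset is a dict with the hypothetical keys "id", "priority"
    (lower number = replicate sooner) and "nodes" (nodes offering a copy).
    """
    # Heap entries are (priority, sequence, dataset); the sequence number keeps
    # the input order for equal priorities and avoids comparing dicts.
    heap = [(d["priority"], i, d) for i, d in enumerate(datasets)]
    heapq.heapify(heap)

    load = Counter()  # number of downloads assigned to each node so far
    plan = []
    while heap:
        _, _, dataset = heapq.heappop(heap)
        # Pick the node with the fewest assigned downloads; ties go to the
        # alphabetically first node so the result is deterministic.
        node = min(sorted(dataset["nodes"]), key=lambda n: load[n])
        load[node] += 1
        plan.append((dataset["id"], node))
    return plan


def checksum_matches(path, expected_sha256, chunk_size=1 << 20):
    """Return True if the file's SHA-256 digest equals the expected one."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256.lower()


if __name__ == "__main__":
    # Illustrative records only; real dataset identifiers look different.
    datasets = [
        {"id": "ds-low", "priority": 2, "nodes": ["node-a", "node-b"]},
        {"id": "ds-ar6-1", "priority": 1, "nodes": ["node-a", "node-b"]},
        {"id": "ds-ar6-2", "priority": 1, "nodes": ["node-a", "node-b"]},
    ]
    for dataset_id, node in plan_downloads(datasets):
        print(dataset_id, "<-", node)
```

In this sketch the two priority-1 datasets are planned first and land on different nodes, and a file whose checksum does not match would be treated as a replication error and fetched again. Restarting after a failure is left to the cronjob or service that invokes the script.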
If ESGF CMIP(6) data that is available at other ESGF nodes is missing here, you can request a data replication by contacting firstname.lastname@example.org. If you have requirements regarding the storage of derived data products or the inclusion of datasets that are not accessible via ESGF, please contact email@example.com.