FAQ

FAQ#

WARNING: This is ongoing work.

What data do you offer?

We offer data from numerical experiments that contribute to internationally coordinated model intercomparison projects, the so-called MIPs. The most prominent global MIP activity is CMIP, the Coupled Model Intercomparison Project. CMIP is currently in its 6th phase, and the data produced under its protocols form the scientific basis for the climate projection analyses in IPCC’s AR6 (Sixth Assessment Report). The data pool holds model data covering the CMIP3, CMIP5 and CMIP6 activities.

What about observations?

Not (yet) available in the pool, but have a look at ‘What is a reanalysis?’ and the ERA5 data. Maybe that serves your purpose.

What kind of models produce MIP data?

Dynamical models: Earth System Models (ESMs), Global Climate Models (GCMs), Regional Climate Models (RCMs) and weather models (the ECMWF model for reanalysis).

How and where can I access the data on DKRZ’s High Performance Computer?

We link model data repositories and catalogs in the file system under /pool/data. If you are not a DKRZ user yet, you may want to apply for the transnational access service of IS-ENES3 or the ENES Climate Analytics (ECAS) service. In that case, you need to request to become a member of project bk1088.

Basic data management terminology#

What is a dataset?

Datasets in CMIP are defined to be all files required to cover the entire time series of a single variable of a single simulation of a single experiment of a single model. That can be multiple files, but they are all in ONE directory uniquely defined by the Data Reference Syntax (DRS).

How is the path to the data constructed? What is the path template? What is the DRS?

The Data Reference Syntax is a set of required attributes which uniquely identify and describe a dataset. Since only DRS elements are used to construct the directory structure and file names of a dataset, you can use the DRS to find data on the filesystem. The path template for CMIP6 is:

<mip_era>/<activity_id>/<institution_id>/<source_id>/<experiment_id>/<member_id>/<table_id>/<variable_id>/<grid_label>/<version>
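As a rough illustration of how the template maps to a directory, the following Python sketch joins DRS values in template order. The root directory, the attribute values and the helper name `drs_directory` are assumptions made for this example; the actual layout below /pool/data may differ.

```python
from pathlib import PurePosixPath

# Order of the DRS elements in the CMIP6 path template.
CMIP6_DRS_ORDER = (
    "mip_era", "activity_id", "institution_id", "source_id", "experiment_id",
    "member_id", "table_id", "variable_id", "grid_label", "version",
)


def drs_directory(root, attrs):
    """Join the DRS attribute values in template order below `root`."""
    missing = [key for key in CMIP6_DRS_ORDER if key not in attrs]
    if missing:
        raise KeyError(f"missing DRS elements: {missing}")
    return PurePosixPath(root).joinpath(*(attrs[key] for key in CMIP6_DRS_ORDER))


# Hypothetical attribute values, chosen only for illustration.
example = {
    "mip_era": "CMIP6", "activity_id": "CMIP", "institution_id": "MPI-M",
    "source_id": "MPI-ESM1-2-HR", "experiment_id": "historical",
    "member_id": "r1i1p1f1", "table_id": "Amon", "variable_id": "tas",
    "grid_label": "gn", "version": "v20190710",
}
path = drs_directory("/pool/data", example)  # the root is an assumption
print(path)
```

Keeping the element order in one tuple means the same definition can be reused for building, parsing or searching paths.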

Output of Earth System and Climate Models#

Why do climate scientists use Models?

While physicists can usually conduct experiments in laboratories, climate scientists do not have this option: testing a hypothesis about the effects of climate change on our one and only planet is morally highly questionable (unless it is about emitting a lot of CO₂, obviously…). The only way to approach such questions is through a digital representation of the earth in models. Using these models, hypotheses about causal relationships between earth system changes can be tested, e.g. by implementing corresponding boundary or initial conditions in a simulation.

Attribute	Meaning	Example
product	product type	“model-output” (“output”) is the only allowed value in CMIP6 (CMIP5)

How is earth described by models?

Dynamical models are based on physical and biogeochemical equations that have been confirmed in laboratory experiments. However, these equations cannot be solved or applied for every possible location and point in time, because of the complexity and nonlinearity of the earth system as well as limited computational resources. This is why scientists use numerical methods to find approximate solutions:

The equations are ‘discretized’ for specific points in space and time, resulting in a spatial grid covering the earth and a calculation time step or output frequency. The value of a variable at such a grid point is valid for the entire area that the grid point covers. Assuming a resolution of 1° in space, which is a state-of-the-art earth system resolution for global experiments, a grid cell covers about 100 × 100 km.

Note that results have to be interpreted accordingly: if a town is located inside a grid cell, it is wrong to say that the value for that grid cell is a prediction or a precise measurement for that town. The value is an average over about 10,000 km². The effect of orography makes this clear: if a grid cell covers a mountain and a valley, one value describes the conditions for both.

Attribute	Meaning	Example
grid	Can be used to describe the horizontal grid and regridding procedure.	“data regridded to a CMIP6 standard 1x1 degree lonxlat grid from the native T63 grid using an area-average preserving method.”
grid_label	Allows distinction when the variable is reported on more than one grid.	“gn” for native grid. “gr” for regridded data reported on the data provider’s preferred target grid.
nominal_resolution	Provides an indication of approximate output grid resolution.	“50 km”, “100 km”, “250 km”
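The grid-cell sizes quoted above can be checked with a short calculation. The sketch below assumes a spherical earth with a mean radius of 6371 km and a regular lon-lat grid; the function name `cell_area_km2` is introduced here for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius, spherical approximation


def cell_area_km2(lat_south, lat_north, dlon_deg):
    """Area of a lon-lat cell on a sphere, bounded by two latitudes (degrees)."""
    dlon = math.radians(dlon_deg)
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_KM ** 2 * dlon * band


equator = cell_area_km2(0.0, 1.0, 1.0)    # 1° x 1° cell touching the equator
mid_lat = cell_area_km2(53.0, 54.0, 1.0)  # 1° x 1° cell at about 53.5° N
print(f"equator: {equator:.0f} km², mid-latitudes: {mid_lat:.0f} km²")
```

A 1° cell is largest at the equator and shrinks towards the poles because the meridians converge, so "about 100 × 100 km" is an order-of-magnitude figure rather than a fixed cell size.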

What are the definitions of Earth System, Earth System Model and Climate Model?

The Earth System is the combination of all physical, chemical, biological and social components, processes and feedbacks that influence the state and the change of planet earth. https://de.wikipedia.org/wiki/Erdsystemwissenschaft#cite_note-Leemans-1

An Earth System Model integrates and couples submodels, each specialized for one component of the earth system, which in combination describe the earth system as a whole. This means that, besides an atmospheric model that represents the atmospheric state, other models are implemented to calculate the physics and dynamics of ocean, ice and land as well as biogeochemical processes. Are you a biologist? Then the output of the land model may be of interest to you: you can find simulations of vegetation and land cover types, CO₂ budgets and many more variables for many experiments.

Note that this can differ from weather forecast models: because their focus is on a short time period, it can be sufficient to use an atmospheric model that only calculates the atmospheric conditions. Climate-related questions, on the other hand, cover longer time scales as well as all parts of the earth system and therefore demand a more extensive representation of the earth. In addition, non-atmospheric processes such as the oceanic circulation have a considerable influence on the earth system on such time scales. Therefore, their precise description in ESMs has become a requirement for the results to be useful.

A climate model is not necessarily an ESM; an ESM, however, can certainly be used for climate simulations. If only the atmospheric part of the ESM was used for an experiment, the term climate model may still be used. As with weather forecasts, whether a full representation of the earth system is needed at all depends on the focus of the experiment and its underlying scientific question.

Attribute	Meaning	Example
source_id, model_id, driving_model_id	Model identifier, the short form of “source”. Values without forbidden characters like spaces. CV registered values only.	“GFDL-CM2-1”, “MPI-ESM1-2-HR”
source, model	Used to fully identify the model and version. It must include the year (i.e., model vintage) when this model version was first used in a scientific application. It should also include information concerning the component models.	“CCSM2 (2002): atmos: CAM2 (cam2_0_brnchT_itea_2, T42L26); ocean: POP (pop2_0_ver_1.4.3, 3x2L15); seaIce: CSIM4; land: CLM2.0”
source_type	Experiments define what components of an ESM are required. source_type contains all components of the ESM that were switched on for the experiment. “AOGCM”, which is for atmosphere-ocean global climate model, means that a coupled simulation is conducted with atmosphere-ocean interaction. “AGCM” on the other hand means that only the Atmospheric Model was switched on.	“AOGCM”, “AGCM CHEM”

How fine can Climate Models resolve earth?

The grid spacings of climate models range from the kilometre scale up to about 250 km. While a state-of-the-art global ESM has a grid spacing on the order of 100 × 100 km, regional climate models can simulate on a 10 × 10 km scale. At even finer resolutions, climate models have to solve an additional set of physical equations for processes such as turbulence and convection, which can normally be parameterized appropriately. Solving them explicitly costs a lot of additional computational resources.

A regional climate model (RCM) only simulates a part of the earth, often a continent. Limiting the model area frees computational resources and allows simulations at finer resolution. Therefore, RCMs provide information on much smaller scales, supporting more detailed impact and adaptation assessment and planning.

Attribute	Meaning	Example
grid_label	Allows distinction when the variable is reported on more than one grid.	“gm”: global mean output is reported, so data are not gridded
nominal_resolution	Characterizes the resolution of the grid used to report model output fields. The respective measure d^max is the average, weighted by grid-cell area, of the maximum distance between the vertices of each cell. For lonxlat grid cells, for example, d^max would be the diagonal distance.	“100 km” if 72 km ≤ d^max < 160 km

Why does the ESM science community need so much data?

Assuming a resolution of 1° in space, which is a state-of-the-art earth system resolution for global experiments, 90 levels and monthly output frequency for 100 years, this results in 360*180*90*12*100 (about 7 billion) data points for one variable of one experiment. If one value uses 4 bytes of disk space, this is equivalent to about 28 GB (uncompressed). Further multiplication factors for a project are the numbers of variables, experiments and participating models. For the CMIP6 project, those are about 100 experiments and 2000 variables.
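The estimate can be reproduced with a few lines of Python. The numbers are the ones stated in the answer; the variable names are chosen for this sketch.

```python
# Grid and output dimensions as stated in the text.
n_lon, n_lat = 360, 180   # 1° x 1° global grid
n_levels = 90             # vertical levels
n_steps = 12 * 100        # monthly output for 100 years
bytes_per_value = 4       # single-precision float

n_values = n_lon * n_lat * n_levels * n_steps
n_bytes = n_values * bytes_per_value

print(f"{n_values:,} values")
print(f"{n_bytes / 1e9:.1f} GB ({n_bytes / 2**30:.1f} GiB), uncompressed")
```

The result applies to one three-dimensional variable of one simulation; two-dimensional variables are smaller by the factor of 90 levels.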

Model Intercomparison Projects, Downscaling and Reanalysis#

What is a reanalysis? What is ERA5?

A reanalysis combines observations and the dynamics of a weather model to find the most realistic description of the earth system state at a specific time. The integration of observations into the weather model is a dedicated field of science called data assimilation. The reanalysis process includes back-and-forth calculation in time: from a specific starting point, the model simulates up to the next observation time, where the observations are compared with the simulation. The simulation is fitted to the observations and the model calculates backward in time. One of the best weather models is the Integrated Forecasting System (IFS) of the ECMWF. The most recent reanalysis product of the ECMWF is ERA5, which is available in DKRZ’s climate data pool. Click here for documentation.

The path to the data is /pool/data/ERA5

How can Climate models be compared? How good are Climate models? What is CMIP?

In order to evaluate and compare climate models, a globally organized intercomparison project is conducted periodically. The Coupled Model Intercomparison Project (CMIP) is in its 6th phase and provides the data basis for reports of the Intergovernmental Panel on Climate Change (IPCC). Many international institutions participate in this project with their models. ‘Coupled’ means that the atmosphere and the ocean model interact with each other.

CMIP defines a range of standard experiments required to evaluate the basic features of climate models. Those include piControl, historical, abrupt-4xCO2, 1pctCO2 and amip. The pre-industrial control simulation (piControl) is the reference for other experiments and ensures that the climate model is able to simulate a stable climate for over 500 years. The historical experiment covers the years 1850-2014, in which the climate models are forced with time series of adequate aerosol and land use fields according to observations. Therefore, the model’s ability to simulate a realistic evolution of historical climate can be evaluated with a statistical analysis of that experiment’s output. The abrupt-4xCO2 and 1pctCO2 experiments address the feedback to CO₂ forcing: abrupt-4xCO2 represents an abrupt quadrupling of CO₂, and 1pctCO2 an increase of the atmospheric CO₂ concentration by 1% each year for 140 years. The latter two are essential to evaluate how plausible a future scenario output of that model can be. The amip experiment is an atmosphere-only experiment in which the sea surface temperature is prescribed according to observations in order to better analyse the atmospheric part of the model.
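The 140 years of the 1pctCO2 experiment tie in with the quadrupling in abrupt-4xCO2, as a small compound-growth calculation shows. This is only the arithmetic behind the experiment design; the function name is chosen for this sketch.

```python
def co2_ratio(years, rate=0.01):
    """CO2 concentration relative to the start after compounding `rate` per year."""
    return (1.0 + rate) ** years


ratio_140 = co2_ratio(140)

# First year in which the concentration reaches four times its starting value.
years_to_quadruple = next(n for n in range(1, 500) if co2_ratio(n) >= 4.0)

print(f"after 140 years: {ratio_140:.2f} x initial CO2")
print(f"quadrupling reached in year {years_to_quadruple}")
```

With 1% growth per year the concentration doubles after roughly 70 years and quadruples after roughly 140 years, so both experiments end at a comparable CO₂ level, reached gradually in one case and abruptly in the other.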

If you want to provide a baseline for your analysis, refer to the results of the CMIP standard experiment evaluation.

Attribute	Meaning	Example
activity_id	Identifier of the activity (MIP) that the experiment belongs to.	“CMIP”, “ScenarioMIP”
experiment_id	Short identifier of the experiment.	“historical”, “piControl”
experiment	Short description of the experiment.	“pre-industrial control”

How does Regional Downscaling (https://cordex.org/about/what-is-regional-downscaling/) work? What is CORDEX?

Regional Climate Model simulations are driven by data obtained from global simulations, i.e. they use them as initial and boundary conditions. These data can stem from either experiments or reanalyses. Since the focus is on a small scale, regional climate models are often combinations of atmosphere and land submodels only, without ocean and biogeochemistry. The integration of all submodels is part of ongoing research efforts.

Are you looking for robust information for a localised domain? Then have a look at CORDEX (Coordinated Regional Climate Downscaling Experiment, https://cordex.org/). Under the CORDEX protocol, RCM results have been made comparable and evaluable. We provide CORDEX data in the climate data pool.

What is an endorsed MIP? Why are there other MIPs inside CMIP6? How is CMIP structured?

Phase 6 of CMIP is designed as a framework that allows smaller model intercomparison projects (MIPs) with a specific focus to be endorsed by CMIP6. This means that each model that runs the standard CMIP experiments (see ‘How good are Climate models?’) can participate in CMIP6 and in further MIPs. The most frequently used activities are CMIP, which contains the standard experiments, and ScenarioMIP, which contains the future scenarios.

While these endorsed MIPs pursue largely independent science, the modelers and the data users benefit from a shared data infrastructure, including a condensed data request, a data pool and a data portal for all MIPs. We at DKRZ provide services for infrastructure tools that simplify navigating through the CMIP6 requirements (e.g. https://c6dreq.dkrz.de ).

What scientific questions are addressed by endorsed MIPs of CMIP6?

The Coupled Model Intercomparison Project Phase 6 is designed (doi:10.5194/gmd-9-1937-2016) in order to tackle three main questions and the Grand Science Challenges defined by the World Climate Research Programme (Grand Challenges Overview). Each of those can be associated with one or more endorsed MIPs (see ‘What is an endorsed MIP?’) (see the upcoming table).

The three questions are:

How does the Earth system respond to forcing?
What are the origins and consequences of systematic model biases?
How can we assess future climate changes given internal climate variability, predictability, and uncertainties in scenarios?

The Grand Science Challenges relate to

advancing understanding of the role of clouds in the general atmospheric circulation and climate sensitivity
assessing the response of the cryosphere to a warming climate and its global consequences
understanding the factors that control water availability over land
assessing climate extremes, what controls them, how they’ve changed in the past and how they might change in the future
understanding and predicting regional sea level change and its coastal impacts
improving near-term climate predictions
determining how biogeochemical cycles and feedbacks control greenhouse gas concentrations and climate change.

What does the definition of a numerical experiment include? What are the assumptions? How are experiments for Climate Models constructed?

Navigation - how can I find the data I need?#

How can I find the project I need?

Are you looking for robust historical data? → ERA5 (other reanalyses go back further in time; ERA5 at DKRZ currently starts in 1979)
Are you looking for climate model output? → CORDEX, CMIP
Are you looking for results valid on a small scale? → CORDEX
Are you looking for the database of the IPCC? → CMIP (and CORDEX)
Are you looking for new results on future scenarios? → CMIP6

How can I navigate through CMIP6?

Try to determine as many DRS elements as you can for your specific research idea. For CMIP6, try to select at least the following before you search for variables:

1. The CMIP6 endorsed MIPs and/or CMIP6 experiments that are of interest:
What MIP targets the topic of your interest? → See ‘What scientific questions are addressed by endorsed MIPs of CMIP6?’
Which experiments are suitable to analyse your research questions? → See ‘CMIP6 experiments’
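Once some DRS elements are fixed, the remaining ones can be left open as wildcards. The sketch below builds such a pattern and searches a directory tree with it. It assumes that the directories follow the CMIP6 path template exactly; the function names are hypothetical, and searching a large file system this way can be slow compared with using a data catalog.

```python
from pathlib import Path

# Order of the DRS elements in the CMIP6 path template.
DRS_ORDER = (
    "mip_era", "activity_id", "institution_id", "source_id", "experiment_id",
    "member_id", "table_id", "variable_id", "grid_label", "version",
)


def drs_glob_pattern(**known):
    """Build a glob pattern with '*' for every DRS element that is not given."""
    unknown = set(known) - set(DRS_ORDER)
    if unknown:
        raise KeyError(f"not DRS elements: {sorted(unknown)}")
    return "/".join(known.get(key, "*") for key in DRS_ORDER)


def find_datasets(root, **known):
    """Return all dataset directories below `root` that match the known elements."""
    return sorted(Path(root).glob(drs_glob_pattern(**known)))


pattern = drs_glob_pattern(
    mip_era="CMIP6", activity_id="ScenarioMIP", table_id="Amon", variable_id="tas"
)
print(pattern)
```

The more elements are known before the search, the fewer directories have to be scanned, which is the practical reason for selecting MIPs and experiments first.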

How can I find the variables I need?

Most of the data in the data pool complies with the Climate and Forecast (CF) Conventions. These define so-called standard_names, which need to be assigned to variables as a variable attribute inside the data. Since these names are very long, the name of the variable in the data is shorter: a so-called short name. This short name is saved in the data catalogs, which can be searched.

So how can the short name for a standard_name be found? For CMIP6, one way is this page: http://clipc-services.ceda.ac.uk/dreq/mipVars.html . But for air_temperature, for example, you get many results. The reason is that in CMIP there are multiple definitions for one ‘physical’ variable like air_temperature: each is a combination of the frequency, time cell methods (average or instantaneous), vertical level (e.g. interpolated on pressure levels), grid and realm (e.g. atmosphere model output or ocean model output), depending on the interests of all MIPs. Each combination is assigned to a project table, the so-called MIP-table or cmor-cmip-table, where similar combinations are collected.

Here is a table of the most often used variables:

Variable (short_name)	standard_name	Example table with monthly mean values on a latlon grid	Level
tas	air_temperature	Amon	Near-surface (usually 2m)
pr	precipitation_flux	Amon	Surface
uas	eastward_wind	Amon	10m
vas	northward_wind	Amon	10m
psl	air_pressure_at_mean_sea_level	Amon
ts	surface_temperature	Amon
zg	geopotential_height	Amon	19 pressure levels
co2	mole_fraction_of_carbon_dioxide_in_air	Amon	19 pressure levels
tos	sea_surface_temperature	Omon
zostoga	global_average_thermosteric_sea_level_change	Omon
fgco2	surface_downward_mass_flux_of_carbon_dioxide_expressed_as_carbon	Omon	Near-surface (usually 0m)
volcello	ocean_volume	Omon	ocean model level
siconc	sea_ice_area_fraction	SImon
snw	surface_snow_amount	LImon
nbp	surface_net_downward_mass_flux_of_carbon_dioxide_expressed_as_carbon_due_to_all_land_processes	Lmon
rlus	surface_upwelling_longwave_flux_in_air	Amon

The combination of a variable identifier (the name of the variable inside the MIP-table) and a MIP-table is therefore unique.
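The relation between standard_names, short names and MIP-tables can be pictured as a lookup keyed by the pair (MIP-table, short name). The entries below are a small hand-written excerpt for illustration, not the full CMIP6 data request, and the function name is hypothetical.

```python
# (MIP-table, short name, standard_name); illustrative excerpt only.
REQUESTED = [
    ("Amon", "tas", "air_temperature"),
    ("Amon", "ta", "air_temperature"),
    ("day", "tas", "air_temperature"),
    ("Amon", "ts", "surface_temperature"),
    ("Omon", "tos", "sea_surface_temperature"),
]

# The pair (MIP-table, short name) serves as the unique key.
by_key = {(table, name): standard for table, name, standard in REQUESTED}


def variables_for(standard_name):
    """Return all (MIP-table, short name) pairs that carry this standard_name."""
    return sorted(key for key, standard in by_key.items() if standard == standard_name)


print(variables_for("air_temperature"))
```

One standard_name can map to several pairs, which is why a search for air_temperature returns many results, while each pair points to exactly one variable definition.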

Since it is not clear which combination exists for the variables.
