Hazard assessment: Prepare climate datasets (EURO-CORDEX)#
A workflow from the CLIMAAX Handbook and MULTI_HAZARD GitHub repository.
See our "How to use risk workflows" page for information on how to run this notebook.
Preparation Work#
Load libraries#
import os
import zipfile
import cdsapi
import dask.diagnostics
import gisco_geodata
import numpy as np
import regionmask
import xarray as xr
Define the area of interest#
Specify a name for your area of interest (region_name) and load a corresponding shape with geopandas (region_gdf).
For example, access NUTS regions from GISCO:
# Specify a name for labelling outputs
region_name = 'IT'
nuts = gisco_geodata.NUTS()
# Geopandas dataframe with the shape of the region
region_gdf = nuts.get(
    countries=region_name,  # put a NUTS ID here
    nuts_level='LEVL_0',    # adjust the NUTS level to match your ID
    scale='20M',            # select data resolution (1M, 3M, 10M, 20M or 60M)
    spatial_type='RG',
    projection='4326'
)
region_gdf.plot()
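If your area of interest is not a NUTS region, `region_gdf` can instead be built from your own outline. The sketch below is a hypothetical alternative: it constructs a simple lon/lat bounding box with shapely, and shows (commented out) how a local vector file could be loaded instead; the box coordinates and file path are placeholders, not part of the original workflow.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical alternative to the GISCO NUTS lookup above
region_name = 'my_region'  # used for labelling output files

# Option 1: a simple lon/lat bounding box (minx, miny, maxx, maxy)
region_gdf = gpd.GeoDataFrame(geometry=[box(8.0, 44.0, 12.0, 47.0)], crs='EPSG:4326')

# Option 2: any vector file readable by geopandas (shapefile, GeoJSON, GeoPackage)
# region_gdf = gpd.read_file('path/to/my_region.geojson').to_crs('EPSG:4326')
```

Whichever option you use, make sure the result is in EPSG:4326 (lon/lat), since the masking step later matches the shape against the dataset's latitude/longitude coordinates; `region_gdf.plot()` gives a quick visual check, as in the GISCO example.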
Path configuration#
# Download folder for CORDEX datasets (inputs for preprocessing)
input_folder = 'data'
os.makedirs(input_folder, exist_ok=True)
# Local data after preprocessing
output_folder = f'data_{region_name}'
os.makedirs(output_folder, exist_ok=True)
CDS client setup#
To access the data on CDS, you need an ECMWF account. See the guide on how to set up the API for more information.
URL = 'https://cds.climate.copernicus.eu/api'
KEY = None # add your key or provide via ~/.cdsapirc
cds_client = cdsapi.Client(url=URL, key=KEY)
EURO-CORDEX projections#
The EURO-CORDEX (Coordinated Regional Climate Downscaling Experiment for Europe) initiative provides a set of high-resolution regional climate projections for Europe, designed to support impact, adaptation, and vulnerability assessments under various climate change scenarios. EURO-CORDEX simulations integrate global climate model (GCM) output with regional climate models (RCMs), enabling the simulation of regional climate patterns and extremes. The simulations explore different Representative Concentration Pathways (RCPs) from CMIP5 (RCP2.6, RCP4.5, RCP8.5) and Shared Socioeconomic Pathways (SSPs) from CMIP6 (SSP1-2.6, SSP5-8.5), covering historical periods (1950–2005) and future projections (2006–2100).
Limitations#
EURO-CORDEX offers high-resolution data (typically 0.11°, about 12.5 km, or 0.44°, about 50 km), but it may still not fully capture localized phenomena such as urban heat islands, small-scale topographic effects, and small-scale meteorological events.
Like all climate models, EURO-CORDEX RCMs and their driving GCMs exhibit biases compared to observed data. These biases can vary regionally and seasonally.
Climate models may struggle to accurately simulate extreme weather events such as heatwaves, heavy precipitation, or storms.
While the dataset captures trends in extremes, very high thresholds (e.g. temperatures above 45 °C or rainfall above 100 mm/day) carry higher uncertainty, as observational data for evaluating such rare events are limited.
Model, scenario and period#
Data from the CDS: CORDEX regional climate model data on single levels.
# Global (GCM) and regional (RCM) climate model combination and its ensemble member
gcm = 'mpi_m_mpi_esm_lr'
rcm = 'clmcom_clm_cclm4_8_17'
ens = 'r1i1p1'
# Optional: define a custom shorthand for this model configuration
cordex_id = f'{gcm}-{rcm}-{ens}'
# Representative concentration pathway
scenario = 'rcp_4_5'
# Future period (data is available in 5-year blocks from 2006 to 2100)
future_start = 2041
future_end = 2070
# Optional: define a custom shorthand for this scenario and period configuration
scenario_id = f'{scenario}-{future_start}-{future_end}'
For the historical period, we retrieve data between 1981 and 2010 by default.
Tip
Use the CDS download form to see which combinations of GCM, RCM, ensemble member, and scenario are valid. The API request code shown at the bottom of the form contains the values that need to be inserted here.
Expect roughly 4 GB of data per 30-year period of daily values.
Helper functions#
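The cells below call a few helper functions (`cordex_request_hist`, `cordex_request_future`, `extract`, `clip_and_mask`) whose definitions are not shown in this excerpt. The following is a minimal sketch of their logic, assuming the libraries from the "Load libraries" cell and the configuration variables defined above are available. The CDS dataset name and request keys are taken from the "CORDEX regional climate model data on single levels" download form; verify the exact values (e.g. `domain`, `horizontal_resolution`) against the form for your model choice. Note also that on the CDS the `historical` experiment ends in 2005, so the 2006–2010 block of the default 1981–2010 historical period may need to be requested from the scenario experiment; the sketch does not handle that detail.

```python
# Assumes os, zipfile, cdsapi and regionmask are imported (see "Load libraries")
# and that region_gdf, cds_client, input_folder, gcm, rcm, ens, cordex_id,
# scenario, future_start and future_end are defined as above.
import os
import zipfile


def year_blocks(start, end):
    """Start and end years (as strings) of the 5-year blocks covering [start, end]."""
    starts = [str(y) for y in range(start, end, 5)]
    ends = [str(y) for y in range(start + 4, end + 1, 5)]
    return starts, ends


def cordex_request(variable, name, experiment, start, end):
    """Request daily EURO-CORDEX data from the CDS and download the zip archive."""
    starts, ends = year_blocks(start, end)
    zip_path = os.path.join(input_folder, f'CORDEX-{cordex_id}-{experiment}-{name}.zip')
    cds_client.retrieve(
        'projections-cordex-domains-single-levels',
        {
            'format': 'zip',
            'domain': 'europe',
            'experiment': experiment,
            'horizontal_resolution': '0_11_degree_x_0_11_degree',
            'temporal_resolution': 'daily_mean',
            'variable': variable,
            'gcm_model': gcm,
            'rcm_model': rcm,
            'ensemble_member': ens,
            'start_year': starts,
            'end_year': ends,
        },
        zip_path,
    )
    return zip_path


def cordex_request_hist(variable, name):
    # Historical period retrieved between 1981 and 2010 by default (see note above)
    return cordex_request(variable, name, 'historical', 1981, 2010)


def cordex_request_future(variable, name):
    return cordex_request(variable, name, scenario, future_start, future_end)


def extract(zip_path, delete_archive=True):
    """Unpack a downloaded zip archive into a folder of the same name."""
    target = os.path.splitext(zip_path)[0]
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(target)
    if delete_archive:
        os.remove(zip_path)
    return target


def clip_and_mask(ds, gdf):
    """Restrict a dataset to the region outline via a regionmask lon/lat mask."""
    mask = regionmask.mask_geopandas(gdf, ds['longitude'], ds['latitude'])
    return ds.where(mask.notnull(), drop=True)
```

The actual implementations in the workflow repository may differ in detail; the sketch only conveys the shape of each step so the cells below can be followed.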
Maximum Temperature#
def load_and_preprocess_temperature(path):
    ds = xr.open_mfdataset(os.path.join(path, '*.nc'), chunks='auto')
    ds = ds.rename({'rlon': 'x', 'rlat': 'y', 'lat': 'latitude', 'lon': 'longitude', 'tasmax': 'tmax2m'})
    ds = clip_and_mask(ds, region_gdf)
    ds = ds['tmax2m'] - 273.15  # temperature from Kelvin to Celsius
    ds.attrs = {'units': 'C', 'gcm': gcm, 'rcm': rcm, 'ens': ens}
    return ds
Data for the historical period:
# Submit a CDS request, download the zip archive with the data and unpack
zip_tasmax_hist = cordex_request_hist('maximum_2m_temperature_in_the_last_24_hours', 'tasmax')
path_tasmax_hist = extract(zip_tasmax_hist, delete_archive=False)
# Load the data, clip to the region of interest, and adapt the structure
# for compatibility with the next steps of the workflow
ds_tmax2m = load_and_preprocess_temperature(path_tasmax_hist)
ds_tmax2m.attrs['scenario'] = 'historical'
# Save the processed and clipped data to a new NetCDF file
with dask.diagnostics.ProgressBar():
    output_file_path = os.path.join(output_folder, f'CORDEX-{cordex_id}-historical-tmax2m.nc')
    ds_tmax2m.to_netcdf(output_file_path, encoding={'tmax2m': {'compression': 'zlib'}})
print(f'Saved processed file to: {output_file_path}')
Data for the future period:
# Submit a CDS request, download the zip archive with the data and unpack
zip_tasmax_future = cordex_request_future('maximum_2m_temperature_in_the_last_24_hours', 'tasmax')
path_tasmax_future = extract(zip_tasmax_future, delete_archive=False)
# Load the data, clip to the region of interest, and adapt the structure
# for compatibility with the next steps of the workflow
ds_tmax2m = load_and_preprocess_temperature(path_tasmax_future)
ds_tmax2m.attrs['scenario'] = scenario
ds_tmax2m.attrs['period'] = f"{future_start}-{future_end}"
# Save the processed and clipped data to a new NetCDF file
with dask.diagnostics.ProgressBar():
    output_file_path = os.path.join(output_folder, f'CORDEX-{cordex_id}-{scenario_id}-tmax2m.nc')
    ds_tmax2m.to_netcdf(output_file_path, encoding={'tmax2m': {'compression': 'zlib'}})
print(f'Saved processed file to: {output_file_path}')
Total Precipitation#
def load_and_preprocess_precipitation(path):
    ds = xr.open_mfdataset(os.path.join(path, '*.nc'), chunks='auto')
    ds = ds.rename({'rlon': 'x', 'rlat': 'y', 'lat': 'latitude', 'lon': 'longitude', 'pr': 'tp'})
    ds = clip_and_mask(ds, region_gdf)
    ds = ds['tp'] * (60 * 60 * 24)  # precipitation flux from kg/m²/s to mm/day
    ds.attrs = {'units': 'mm/d', 'gcm': gcm, 'rcm': rcm, 'ens': ens}
    return ds
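The conversion factor in the function above follows from the fact that 1 kg of water spread over 1 m² forms a layer 1 mm deep, so a flux in kg/m²/s is numerically a rate in mm/s; multiplying by the number of seconds per day gives the daily accumulation. A quick worked example with an arbitrary flux value:

```python
# 1 kg/m²/s of precipitation flux corresponds to 1 mm/s of water depth,
# so seconds-per-day converts the flux to a daily total in mm/day.
seconds_per_day = 60 * 60 * 24   # 86400
flux = 2.5e-5                    # example flux in kg/m²/s
daily_total = flux * seconds_per_day  # ≈ 2.16 mm/day
```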
Data for the historical period:
# Submit a CDS request, download the zip archive with the data and unpack
zip_pr_hist = cordex_request_hist('mean_precipitation_flux', 'pr')
path_pr_hist = extract(zip_pr_hist, delete_archive=False)
# Load the data, clip to the region of interest, and adapt the structure
# for compatibility with the next steps of the workflow
ds_tp = load_and_preprocess_precipitation(path_pr_hist)
ds_tp.attrs['scenario'] = 'historical'
# Save the processed and clipped data to a new NetCDF file
with dask.diagnostics.ProgressBar():
    output_file_path = os.path.join(output_folder, f'CORDEX-{cordex_id}-historical-tp.nc')
    ds_tp.to_netcdf(output_file_path, encoding={'tp': {'compression': 'zlib'}})
print(f'Saved processed file to: {output_file_path}')
Data for the future period:
# Submit a CDS request, download the zip archive with the data and unpack
zip_pr_future = cordex_request_future('mean_precipitation_flux', 'pr')
path_pr_future = extract(zip_pr_future, delete_archive=False)
# Load the data, clip to the region of interest, and adapt the structure
# for compatibility with the next steps of the workflow
ds_tp = load_and_preprocess_precipitation(path_pr_future)
ds_tp.attrs['scenario'] = scenario
ds_tp.attrs['period'] = f"{future_start}-{future_end}"
# Save the processed and clipped data to a new NetCDF file
with dask.diagnostics.ProgressBar():
    output_file_path = os.path.join(output_folder, f'CORDEX-{cordex_id}-{scenario_id}-tp.nc')
    ds_tp.to_netcdf(output_file_path, encoding={'tp': {'compression': 'zlib'}})
print(f'Saved processed file to: {output_file_path}')
Next step#
Continue with computing climate indicators from the data prepared here.