Hazard assessment: Compute climate indicators#

Preparation work#

Load libraries#

xclim is a Python library tailored for climate services, offering a wide range of tools to compute climate-related indicators. In this notebook, it is the primary library used to calculate the climate indicator “number of days above a threshold temperature”.

import os

import numpy as np
from scipy.stats import gumbel_r
from tqdm import tqdm
import xarray as xr
import xclim

Area of interest#

region_name = 'IT'

Path configuration#

# Folder with temperature and total precipitation data
folder = f'data_{region_name}'

input_file_prefix = 'UERRA'

# Output folder for the climate indicators
output_folder = os.path.join(folder, 'indicators/uerra')
os.makedirs(output_folder, exist_ok=True)

Tip

The default configuration for the input and output file names is for the UERRA dataset. To load data from another dataset, adjust the variable input_file_prefix so that the relevant files are selected, and output_folder so that previously processed data isn't overwritten and the outputs can be identified in the following steps of the workflow.

Example: With data from EURO-CORDEX available in files

data_IT/CORDEX-mpi_m_mpi_esm_lr-clmcom_clm_cclm4_8_17-r1i1p1-historical-tmax2m.nc
data_IT/CORDEX-mpi_m_mpi_esm_lr-clmcom_clm_cclm4_8_17-r1i1p1-historical-tp.nc

a possible configuration in the cell above is

input_file_prefix = 'CORDEX-mpi_m_mpi_esm_lr-clmcom_clm_cclm4_8_17-r1i1p1-historical'
output_folder = os.path.join(folder, 'indicators/cordex-hist')

The data created in this step of the workflow can then be picked up via the cordex-hist identifier in the following steps.

Re-run this notebook as often as necessary to process all retrieved CORDEX models, scenarios, and time periods.

Process the data#

Process the temperature and precipitation files within folder and calculate climate indicators. Specifically:

  • Calculate the number of days exceeding specific temperature thresholds (30 °C, 35 °C, 40 °C and 45 °C).

  • Calculate percentiles for both temperature and precipitation.

  • Estimate return levels for extreme precipitation events.

The results are saved as new NetCDF files in the path configured in output_folder. Recomputation is avoided by checking whether an output file already exists, and missing values are handled to prevent errors in the calculations.
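The skip-if-exists pattern used throughout this notebook can be sketched in isolation with a hypothetical helper (save_if_missing is illustrative, not part of the workflow):

```python
import os
import tempfile

def save_if_missing(path, compute):
    """Run compute() and write its result only if path does not exist yet."""
    if os.path.exists(path):
        print(f'The file {os.path.basename(path)} already exists. Skipping calculation.')
        return
    with open(path, 'w') as f:
        f.write(compute())
    print(f'Saved {os.path.basename(path)}.')

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'indicator.txt')
    save_if_missing(target, lambda: 'result')  # first call computes and saves
    save_if_missing(target, lambda: 'result')  # second call is skipped
```

Because each loop below follows this pattern, the notebook can be interrupted and re-run without redoing finished work.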

Daily maximum temperature#

Load data#

ds = xr.open_mfdataset(os.path.join(folder, f'{input_file_prefix}-tmax2m.nc'))

dailyMaxTemp = ds['tmax2m'].chunk({'time': -1})

Number of days above thresholds#

# Define the output files for Number of Days Above Thresholds
output_files_numDays = {
    'Temp_DaysAbove30.nc': '30 C',
    'Temp_DaysAbove35.nc': '35 C',
    'Temp_DaysAbove40.nc': '40 C',
    'Temp_DaysAbove45.nc': '45 C',
}

for fname, threshold in output_files_numDays.items():
    output_path = os.path.join(output_folder, fname)

    if os.path.exists(output_path):
        print(f'The file {fname} already exists. Skipping calculation.')
    else:
        # Compute the number of days above the threshold
        with xclim.set_options(cf_compliance='log'):
            NumbDaysAbove = xclim.atmos.tx_days_above(tasmax=dailyMaxTemp, thresh=threshold, freq='YS')
        NumbDaysAbove_avg = NumbDaysAbove.mean(dim='time', skipna=True)
        # Preserve metadata from input dataset
        NumbDaysAbove_avg.attrs = {**dailyMaxTemp.attrs, 'units': '1'}
        # Write to disk in NetCDF format
        NumbDaysAbove_avg.to_netcdf(output_path)
        print(f'Saved {fname} to {output_path}.')
Saved Temp_DaysAbove30.nc to data_IT/indicators/uerra/Temp_DaysAbove30.nc.
Saved Temp_DaysAbove35.nc to data_IT/indicators/uerra/Temp_DaysAbove35.nc.
Saved Temp_DaysAbove40.nc to data_IT/indicators/uerra/Temp_DaysAbove40.nc.
Saved Temp_DaysAbove45.nc to data_IT/indicators/uerra/Temp_DaysAbove45.nc.
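Conceptually, xclim's tx_days_above is a per-year count of days strictly above the threshold. A minimal plain-xarray equivalent on toy data (a sketch of what the indicator computes, not the library's implementation) is:

```python
import numpy as np
import pandas as pd
import xarray as xr

# 15 toy days: 10 at 25 °C, then 5 at 32 °C
time = pd.date_range('2000-01-01', periods=15, freq='D')
tasmax = xr.DataArray([25.0] * 10 + [32.0] * 5, dims='time', coords={'time': time})

# tx_days_above(thresh='30 C', freq='YS') boils down to a strict
# threshold count per calendar year
days_above = (tasmax > 30.0).resample(time='YS').sum()
print(int(days_above.values[0]))  # 5
```

xclim adds unit handling and CF-metadata checks on top of this, which is why the notebook wraps the call in xclim.set_options(cf_compliance='log').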

Percentiles#

# Define the output files for Percentiles
output_files_percent = {
    'Temp_P95.nc': 0.95,
    'Temp_P999.nc': 0.999,
}

for fname_percent, percentile in output_files_percent.items():
    output_path_percent = os.path.join(output_folder, fname_percent)

    if os.path.exists(output_path_percent):
        print(f"The file {fname_percent} already exists. Skipping calculation.")
    else:
        # Calculate the percentiles across all time steps
        dailyMaxTemp_nonan = dailyMaxTemp.dropna(dim='time', how='all')
        calc_percentile = dailyMaxTemp_nonan.quantile(percentile, dim='time')
        # Preserve metadata from input dataset
        calc_percentile.attrs = dailyMaxTemp.attrs
        # Write to disk in NetCDF format
        calc_percentile.to_netcdf(output_path_percent)
        print(f"Saved {percentile * 100}th percentile to {output_path_percent}.")
Saved 95.0th percentile to data_IT/indicators/uerra/Temp_P95.nc.
Saved 99.9th percentile to data_IT/indicators/uerra/Temp_P999.nc.
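The quantile call above reduces along the time dimension independently for every grid cell; it agrees with NumPy's default linear-interpolation quantile, as a small check on random data shows:

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
da = xr.DataArray(rng.normal(20.0, 5.0, size=(365, 2, 2)), dims=('time', 'y', 'x'))

# Reduce along 'time' only: one percentile value per (y, x) cell
p95 = da.quantile(0.95, dim='time')

# Matches numpy's default (linear-interpolation) quantile
expected = np.quantile(da.values, 0.95, axis=0)
print(np.allclose(p95.values, expected))  # True
```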

Daily precipitation#

Load data#

ds = xr.open_mfdataset(os.path.join(folder, f'{input_file_prefix}-tp.nc'))

dailyTotPrep = ds['tp'].chunk({'time': -1})

Percentiles#

# Define the output files for Percentiles
output_files_percent_prec = {
    'Precip_P99.nc': 0.99,
    'Precip_P995.nc': 0.995,
    'Precip_P999.nc': 0.999,
}

for fname_percent_prep, percentile_prep in output_files_percent_prec.items():
    output_path_percent_prep = os.path.join(output_folder, fname_percent_prep)

    if os.path.exists(output_path_percent_prep):
        print(f"The file {fname_percent_prep} already exists. Skipping calculation.")
    else:
        # Calculate the percentiles across all time steps
        dailyTotPrep_nonan = dailyTotPrep.dropna(dim='time', how='all')
        calc_percentile_prep = dailyTotPrep_nonan.quantile(percentile_prep, dim='time')
        # Preserve metadata from input dataset
        calc_percentile_prep.attrs = dailyTotPrep.attrs
        # Write to disk in NetCDF format
        calc_percentile_prep.to_netcdf(output_path_percent_prep)
        print(f"Saved {percentile_prep * 100}th percentile to {output_path_percent_prep}.")
Saved 99.0th percentile to data_IT/indicators/uerra/Precip_P99.nc.
Saved 99.5th percentile to data_IT/indicators/uerra/Precip_P995.nc.
Saved 99.9th percentile to data_IT/indicators/uerra/Precip_P999.nc.

Return levels for extreme precipitation events#

Calculate the maximum one-day precipitation per year:

annual_max_precip = dailyTotPrep.resample(time='YE').max().compute()

We define return periods and exceedance probabilities: return periods (from 2 to 500 years) are specified, and their corresponding probabilities are calculated. A T-year event has an exceedance probability of 1/T, i.e. the likelihood of surpassing a certain precipitation threshold in any given year; its complement, 1 - 1/T, is the quantile level at which the return level is read off the fitted distribution.

return_periods = np.array([2, 5, 10, 20, 30, 50, 100, 150, 200, 500])
# Non-exceedance probability 1 - 1/T: the quantile level expected by ppf
exceedance_probs = 1 - (1 / return_periods)
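For example, the 100-year event has a 1% chance of being exceeded in any given year, so its return level is the 0.99 quantile of the fitted distribution:

```python
import numpy as np

T = np.array([2, 10, 100])
p_exceed = 1 / T           # yearly chance of exceeding the T-year level
p_quantile = 1 - p_exceed  # the quantile level passed to gumbel_r.ppf
print(p_exceed)    # [0.5  0.1  0.01]
print(p_quantile)  # [0.5  0.9  0.99]
```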

Loop over every grid cell (each x/y pair, i.e. each longitude/latitude location) and extract its series of annual maximum precipitation values. For each grid cell, a Gumbel distribution is fitted to the annual maxima (gumbel_r.fit()) to determine the location (loc) and scale (scale) parameters, which are then used to calculate the return levels for the specified return periods.
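The fit-and-evaluate step applied to each grid cell can be checked in isolation on synthetic data (a sketch; the parameter values are illustrative):

```python
import numpy as np
from scipy.stats import gumbel_r

# Synthetic annual maxima drawn from a known Gumbel distribution
sample = gumbel_r.rvs(loc=50.0, scale=10.0, size=2000, random_state=42)

# Recover the parameters by maximum likelihood, as done per grid cell below
loc, scale = gumbel_r.fit(sample)

# The return level for a T-year period is the quantile at 1 - 1/T
rl_10 = gumbel_r.ppf(1 - 1 / 10, loc, scale)
rl_100 = gumbel_r.ppf(1 - 1 / 100, loc, scale)
print(loc, scale)      # approximately the true values 50 and 10
print(rl_10 < rl_100)  # True: rarer events have higher return levels
```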

# Prepare a new dataset to store return levels for each period and each grid cell
return_levels_ds = xr.Dataset()
for rp in return_periods:
    return_levels_ds[f"return_period_{rp}_y"] = xr.full_like(annual_max_precip.isel(time=0), np.nan)

# Loop over each grid cell (x, y) and fit the Gumbel distribution
for y in tqdm(annual_max_precip['y']):
    for x in annual_max_precip['x']:
        # Extract the annual maximum series for the current grid cell
        annual_max_values = annual_max_precip.sel(x=x, y=y).values
        annual_max_values = annual_max_values[~np.isnan(annual_max_values)]  # Remove NaNs
        if len(annual_max_values) > 0:
            # Fit the Gumbel distribution to the annual maxima for this grid cell
            loc, scale = gumbel_r.fit(annual_max_values)
            # Calculate return levels for the specified return periods using the Gumbel parameters
            return_levels = gumbel_r.ppf(exceedance_probs, loc, scale)
            # Store return levels in the dataset for each return period
            for rp, rl in zip(return_periods, return_levels):
                return_levels_ds[f"return_period_{rp}_y"].loc[{'x': x, 'y': y}] = rl

# Add coordinates and attributes to the new dataset
return_levels_ds['x'] = annual_max_precip['x']
return_levels_ds['y'] = annual_max_precip['y']
# Preserve metadata from input dataset
return_levels_ds.attrs = {
    **annual_max_precip.attrs,
    'description': 'Return levels for specified return periods based on Gumbel distribution fit'
}
100%|██████████| 211/211 [00:44<00:00,  4.78it/s]

The return_levels_ds dataset represents the precipitation amount expected for each return period in each grid cell.

# Save the return levels to a NetCDF file
return_levels_ds.to_netcdf(os.path.join(output_folder, 'Precip_return_levels_gumbel.nc'))
return_levels_ds
<xarray.Dataset> Size: 2MB
Dimensions:              (y: 211, x: 184)
Coordinates:
    latitude             (y, x) float64 311kB 36.85 36.85 36.85 ... 46.5 46.49
    longitude            (y, x) float64 311kB 6.854 6.914 6.974 ... 19.74 19.81
    time                 datetime64[ns] 8B 1981-12-31
  * x                    (x) int64 1kB 0 1 2 3 4 5 6 ... 178 179 180 181 182 183
  * y                    (y) int64 2kB 0 1 2 3 4 5 6 ... 205 206 207 208 209 210
Data variables:
    return_period_2_y    (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_5_y    (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_10_y   (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_20_y   (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_30_y   (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_50_y   (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_100_y  (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_150_y  (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_200_y  (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_500_y  (y, x) float32 155kB nan nan nan nan ... nan nan nan
Attributes: (12/36)
    GRIB_paramId:                             228228
    GRIB_dataType:                            an
    GRIB_numberOfPoints:                      1142761
    GRIB_typeOfLevel:                         surface
    GRIB_stepUnits:                           1
    GRIB_stepType:                            accum
    ...                                       ...
    GRIB_units:                               kg m**-2
    long_name:                                Total Precipitation
    units:                                    mm/d
    standard_name:                            unknown
    GRIB_surface:                             0.0
    description:                              Return levels for specified ret...

Next step#

Continue with visualising the climate indicators computed here.