Hazard assessment: Compute climate indicators#
A workflow from the CLIMAAX Handbook and MULTI_HAZARD GitHub repository.
See our how to use risk workflows page for information on how to run this notebook.
Preparation work#
Load libraries#
xclim is a Python library tailored for climate services, offering a wide range of tools to compute climate-related indicators. In this notebook, it is the primary library used to calculate the climate indicator “number of days above a threshold temperature”.
import os
import numpy as np
from scipy.stats import gumbel_r
from tqdm import tqdm
import xarray as xr
import xclim
Area of interest#
region_name = 'IT'
Path configuration#
# Folder with temperature and total precipitation data
folder = f'data_{region_name}'
input_file_prefix = 'UERRA'
# Output folder for the climate indicators
output_folder = os.path.join(folder, 'indicators/uerra')
os.makedirs(output_folder, exist_ok=True)
Tip
The default configuration for the input and output file names is for the UERRA dataset.
To load data from another dataset, adjust input_file_prefix so that the relevant files are selected, and output_folder so that previously processed data isn’t overwritten and the outputs can be identified in the following steps of the workflow.
Example: With data from EURO-CORDEX available in files
data_IT/CORDEX-mpi_m_mpi_esm_lr-clmcom_clm_cclm4_8_17-r1i1p1-historical-tmax2m.nc
data_IT/CORDEX-mpi_m_mpi_esm_lr-clmcom_clm_cclm4_8_17-r1i1p1-historical-tp.nc
a possible configuration in the cell above is
input_file_prefix = 'CORDEX-mpi_m_mpi_esm_lr-clmcom_clm_cclm4_8_17-r1i1p1-historical'
output_folder = os.path.join(folder, 'indicators/cordex-hist')
The data created in this step of the workflow can then be picked up via the cordex-hist identifier in the following steps.
Re-run this notebook as often as necessary to process all retrieved CORDEX models, scenarios, and time periods.
Process the data#
Process the temperature and precipitation files within folder and calculate climate indicators.
Specifically:
Calculate the number of days exceeding specific temperature thresholds (30°, 35°, 40° and 45° Celsius).
Calculate percentiles for both temperature and precipitation.
Estimate return levels for extreme precipitation events.
The results are saved as new NetCDF files in the path configured in output_folder.
To avoid recomputation, we skip any indicator whose output file already exists, and we handle missing values to prevent errors in the calculations.
Daily maximum temperature#
Load data#
ds = xr.open_mfdataset(os.path.join(folder, f'{input_file_prefix}-tmax2m.nc'))
dailyMaxTemp = ds['tmax2m'].chunk({'time': -1})
Number of days above thresholds#
# Define the output files for Number of Days Above Thresholds
output_files_numDays = {
'Temp_DaysAbove30.nc': '30 C',
'Temp_DaysAbove35.nc': '35 C',
'Temp_DaysAbove40.nc': '40 C',
'Temp_DaysAbove45.nc': '45 C',
}
for fname, threshold in output_files_numDays.items():
output_path = os.path.join(output_folder, fname)
if os.path.exists(output_path):
print(f'The file {fname} already exists. Skipping calculation.')
else:
# Compute the number of days above the threshold
with xclim.set_options(cf_compliance='log'):
NumbDaysAbove = xclim.atmos.tx_days_above(tasmax=dailyMaxTemp, thresh=threshold, freq='YS')
NumbDaysAbove_avg = NumbDaysAbove.mean(dim='time', skipna=True)
# Preserve metadata from input dataset
NumbDaysAbove_avg.attrs = {**dailyMaxTemp.attrs, 'units': '1'}
# Write to disk in NetCDF format
NumbDaysAbove_avg.to_netcdf(output_path)
print(f'Saved {fname} to {output_path}.')
Saved Temp_DaysAbove30.nc to data_IT/indicators/uerra/Temp_DaysAbove30.nc.
Saved Temp_DaysAbove35.nc to data_IT/indicators/uerra/Temp_DaysAbove35.nc.
Saved Temp_DaysAbove40.nc to data_IT/indicators/uerra/Temp_DaysAbove40.nc.
Saved Temp_DaysAbove45.nc to data_IT/indicators/uerra/Temp_DaysAbove45.nc.
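Conceptually, tx_days_above is a per-year comparison-and-count. The same idea can be sketched with plain NumPy for a single grid cell (all values below are synthetic and purely illustrative):

```python
import numpy as np

# Synthetic daily maximum temperatures (degrees C) for one grid cell, one year
rng = np.random.default_rng(42)
daily_tmax = rng.normal(loc=28.0, scale=6.0, size=365)

# Count the days exceeding each threshold, as tx_days_above does per year
thresholds = [30, 35, 40, 45]
days_above = {t: int(np.sum(daily_tmax > t)) for t in thresholds}
print(days_above)
```

xclim wraps this counting in CF-metadata handling, unit conversion, and the `freq='YS'` annual resampling used above.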
Percentiles#
# Define the output files for Percentiles
output_files_percent = {
'Temp_P95.nc': 0.95,
'Temp_P999.nc': 0.999,
}
for fname_percent, percentile in output_files_percent.items():
output_path_percent = os.path.join(output_folder, fname_percent)
if os.path.exists(output_path_percent):
print(f"The file {fname_percent} already exists. Skipping calculation.")
else:
# Calculate the percentiles across all time steps
dailyMaxTemp_nonan = dailyMaxTemp.dropna(dim='time', how='all')
calc_percentile = dailyMaxTemp_nonan.quantile(percentile, dim='time')
# Preserve metadata from input dataset
calc_percentile.attrs = dailyMaxTemp.attrs
# Write to disk in NetCDF format
calc_percentile.to_netcdf(output_path_percent)
print(f"Saved {percentile * 100}th percentile to {output_path_percent}.")
Saved 95.0th percentile to data_IT/indicators/uerra/Temp_P95.nc.
Saved 99.9th percentile to data_IT/indicators/uerra/Temp_P999.nc.
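The percentile step is a NaN-aware quantile over the time dimension. A minimal NumPy sketch on synthetic data, where np.nanquantile ignores missing values much like the dropna call above:

```python
import numpy as np

# Synthetic daily series with some missing values (hypothetical)
rng = np.random.default_rng(0)
series = rng.gamma(shape=2.0, scale=5.0, size=1000)
series[::50] = np.nan  # inject NaNs

# np.quantile would return NaN here; np.nanquantile skips the gaps
p95 = np.nanquantile(series, 0.95)
p999 = np.nanquantile(series, 0.999)
print(p95, p999)
```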
Daily precipitation#
Load data#
ds = xr.open_mfdataset(os.path.join(folder, f'{input_file_prefix}-tp.nc'))
dailyTotPrep = ds['tp'].chunk({'time': -1})
Percentiles#
# Define the output files for Percentiles
output_files_percent_prec = {
'Precip_P99.nc': 0.99,
'Precip_P995.nc': 0.995,
'Precip_P999.nc': 0.999,
}
for fname_percent_prep, percentile_prep in output_files_percent_prec.items():
output_path_percent_prep = os.path.join(output_folder, fname_percent_prep)
if os.path.exists(output_path_percent_prep):
print(f"The file {fname_percent_prep} already exists. Skipping calculation.")
else:
# Calculate the percentiles across all time steps
dailyTotPrep_nonan = dailyTotPrep.dropna(dim='time', how='all')
calc_percentile_prep = dailyTotPrep_nonan.quantile(percentile_prep, dim='time')
# Preserve metadata from input dataset
calc_percentile_prep.attrs = dailyTotPrep.attrs
# Write to disk in NetCDF format
calc_percentile_prep.to_netcdf(output_path_percent_prep)
print(f"Saved {percentile_prep * 100}th percentile to {output_path_percent_prep}.")
Saved 99.0th percentile to data_IT/indicators/uerra/Precip_P99.nc.
Saved 99.5th percentile to data_IT/indicators/uerra/Precip_P995.nc.
Saved 99.9th percentile to data_IT/indicators/uerra/Precip_P999.nc.
Return levels for extreme precipitation events#
Calculate the maximum one-day precipitation per year:
annual_max_precip = dailyTotPrep.resample(time='YE').max().compute()
We define return periods and exceedance probabilities: return periods (e.g., 10, 20, 30, 50, 100, 150 years) are specified, and their corresponding probabilities are calculated. The exceedance probability 1/T represents the likelihood of surpassing a certain precipitation threshold in any given year; its complement 1 − 1/T is the cumulative (non-exceedance) probability passed to the quantile function below.
return_periods = np.array([2, 5, 10, 20, 30, 50, 100, 150, 200, 500])
# Non-exceedance (cumulative) probabilities corresponding to each return period
exceedance_probs = 1 - (1 / return_periods)
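For a single grid cell, the whole procedure amounts to fitting gumbel_r to the annual maxima and evaluating its quantile function (ppf) at the cumulative probabilities. A self-contained sketch with a synthetic annual maximum series (all values hypothetical):

```python
import numpy as np
from scipy.stats import gumbel_r

return_periods = np.array([2, 5, 10, 20, 30, 50, 100, 150, 200, 500])
cum_probs = 1 - 1 / return_periods  # non-exceedance probabilities

# Synthetic annual maximum precipitation series (mm/day) for one grid cell
rng = np.random.default_rng(1)
annual_max = gumbel_r.rvs(loc=40.0, scale=10.0, size=40, random_state=rng)

# Fit the Gumbel distribution and compute the return levels
loc, scale = gumbel_r.fit(annual_max)
return_levels = gumbel_r.ppf(cum_probs, loc, scale)
print(dict(zip(return_periods.tolist(), np.round(return_levels, 1))))
```

The gridded loop below repeats exactly this fit-and-evaluate step for every (x, y) cell.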
Loop over every grid cell (longitude and latitude) and extract the annual maximum precipitation series for each.
For each grid cell the Gumbel distribution (gumbel_r.fit()) is fitted to the annual maxima series to determine location (loc) and scale (scale) parameters.
Then we calculate the return levels for the specified return periods using the previous parameters.
# Prepare a new dataset to store return levels for each period and each grid cell
return_levels_ds = xr.Dataset()
for rp in return_periods:
return_levels_ds[f"return_period_{rp}_y"] = xr.full_like(annual_max_precip.isel(time=0), np.nan)
# Loop over each grid cell (x, y) and fit the Gumbel distribution
for y in tqdm(annual_max_precip['y']):
for x in annual_max_precip['x']:
# Extract the annual maximum series for the current grid cell
annual_max_values = annual_max_precip.sel(x=x, y=y).values
annual_max_values = annual_max_values[~np.isnan(annual_max_values)] # Remove NaNs
if len(annual_max_values) > 0:
# Fit the Gumbel distribution to the annual maxima for this grid cell
loc, scale = gumbel_r.fit(annual_max_values)
# Calculate return levels for the specified return periods using the Gumbel parameters
return_levels = gumbel_r.ppf(exceedance_probs, loc, scale)
# Store return levels in the dataset for each return period
for rp, rl in zip(return_periods, return_levels):
return_levels_ds[f"return_period_{rp}_y"].loc[{'x': x, 'y': y}] = rl
# Add coordinates and attributes to the new dataset
return_levels_ds['x'] = annual_max_precip['x']
return_levels_ds['y'] = annual_max_precip['y']
# Preserve metadata from input dataset
return_levels_ds.attrs = {
**annual_max_precip.attrs,
'description': 'Return levels for specified return periods based on Gumbel distribution fit'
}
100%|██████████| 211/211 [00:44<00:00, 4.78it/s]
The return_levels_ds dataset represents the precipitation amount expected for each return period in each grid cell.
# Save the return levels to a NetCDF file
return_levels_ds.to_netcdf(os.path.join(output_folder, 'Precip_return_levels_gumbel.nc'))
return_levels_ds
<xarray.Dataset> Size: 2MB
Dimensions: (y: 211, x: 184)
Coordinates:
latitude (y, x) float64 311kB 36.85 36.85 36.85 ... 46.5 46.49
longitude (y, x) float64 311kB 6.854 6.914 6.974 ... 19.74 19.81
time datetime64[ns] 8B 1981-12-31
* x (x) int64 1kB 0 1 2 3 4 5 6 ... 178 179 180 181 182 183
* y (y) int64 2kB 0 1 2 3 4 5 6 ... 205 206 207 208 209 210
Data variables:
return_period_2_y (y, x) float32 155kB nan nan nan nan ... nan nan nan
return_period_5_y (y, x) float32 155kB nan nan nan nan ... nan nan nan
return_period_10_y (y, x) float32 155kB nan nan nan nan ... nan nan nan
return_period_20_y (y, x) float32 155kB nan nan nan nan ... nan nan nan
return_period_30_y (y, x) float32 155kB nan nan nan nan ... nan nan nan
return_period_50_y (y, x) float32 155kB nan nan nan nan ... nan nan nan
return_period_100_y (y, x) float32 155kB nan nan nan nan ... nan nan nan
return_period_150_y (y, x) float32 155kB nan nan nan nan ... nan nan nan
return_period_200_y (y, x) float32 155kB nan nan nan nan ... nan nan nan
return_period_500_y (y, x) float32 155kB nan nan nan nan ... nan nan nan
Attributes: (12/36)
GRIB_paramId: 228228
GRIB_dataType: an
GRIB_numberOfPoints: 1142761
GRIB_typeOfLevel: surface
GRIB_stepUnits: 1
GRIB_stepType: accum
... ...
GRIB_units: kg m**-2
long_name: Total Precipitation
units: mm/d
standard_name: unknown
GRIB_surface: 0.0
description: Return levels for specified ret...
Next step#
Continue with visualising the climate indicators computed here.