⚠️ This is the first operational version of the handbook, but it is still a work in progress and will be heavily updated during 2024! ⚠️

Hazard assessment for relative drought#

Click Binder to launch this workflow on MyBinder.

In this workflow drought hazard (dH) for a given region is estimated as the probability of exceedance the median of regional (e.g., EU level) severe precipitation deficits for an historical reference period (e.g. 1979-2019) or for a future projection period (e.g. 2015-2100). The methodology used here was developed and applied globally by Carrão et al. (2016) \(^1\).

Workflow on how to quantify drough risk as the product of drought hazard, exposure and vulnerability can be found in the Risk_assessement notebook. Visualization tools based on preprocessed results for both drought hazard and drought risk can be found in Risk_visualization notebook.

Below is a description of the data and tools used to calculate droguth hazard, both for the historic period and for future scenarios, and the outputs of this workflow.


We use the weighted anomaly of standardized precipitation (WASP) index to define the severity of precipitation deficit. The WASP-index takes into account the annual seasonality of the precipitation cycle and is computed by summing weighted standardized monthly precipitation anomalies (see Eq. 1). Where \(P_{n,m}\) is each region’s monthly precipitation, \(T_m\) is a monthly threshold defining precipitation severity, and \(T_A\) is an annual threshold for precipitation severity. The thresholds are defined by dividing multi-annual monthly observed rain using the ‘Fisher-jenks’ classification algorithm \(^2\).

Eq. 1:

\[WASP_j = \Sigma_{P_{n,m} < T_m}^{P_{n,m} >= T_m}( \frac{P_{n,m} - T_m}{T_m})*\frac{T_m}{T_A}\]

Workflow implementation#

Load libraries#

import os
import urllib
import pooch
os.environ['USE_PYGEOS'] = '0'
import pandas as pd
import geopandas as gpd
import numpy as np
import scipy
import jenkspy
import json
import pyproj
import plotly.express as px
import matplotlib.pyplot as plt
from datetime import datetime
from datetime import date

Define working environment and global parameters#

This workflow relies on pre-processed data. The user will define the path to the data folder and the code below would create a folder for outputs.

# Set working environment
workflow_folder = './sample_data_nuts3/'

# Define scenario 0: historic; 1: SSP1-2.6; 2: SSP3-7.0. 3: SSP5-8.5
scn = 0

# Define time (applicible only for the future): 0: near-future (2050); 1: far-future (2080)
time = 0

pattern = "historic"
pattern_h = "historic"
if scn != 0:
    pattern_h = ['ssp126', 'ssp370', 'ssp585'][scn - 1]
    pattern = ['ssp126', 'ssp370', 'ssp585'][scn - 1] + '_' + ['nf', 'ff'][time]

# debug if folder does not exist - issue an error to check path

# Create outputs folder
name_output_folder = 'outputs_hazards'
os.makedirs(os.path.join(workflow_folder, name_output_folder), exist_ok=True)

Access to sample dataset#

Load the file registry for the droughtrisk_sample_nuts3 dataset in the CLIMAAX cloud storage with pooch.

sample_data_pooch = pooch.create(

If any files requested below were downloaded before, pooch will inspect the local file contents and skip the download if the contents match expectations.

Choose country code#

Choose country code from:
‘HR’, ‘DE’, ‘BG’, ‘AT’, ‘AL’, ‘BE’, ‘ES’, ‘CH’, ‘CZ’, ‘EL’, ‘FR’, ‘FI’, ‘EE’, ‘DK’, ‘CY’, ‘HU’, ‘NL’, ‘NO’, ‘LV’, ‘LT’, ‘IS’, ‘MK’, ‘MT’, ‘IT’, ‘TR’, ‘PL’, ‘RO’, ‘SE’, ‘RS’, ‘PT’, ‘IE’, ‘UK’, ‘ME’, ‘LU’, ‘SK’, ‘SI’ ,‘LI’

ccode = "AL"

Load and visualize precipitation data#

# Download precipitation data for selected scenario
precip_file = sample_data_pooch.fetch(f"drought_hazard_{pattern_h}.csv")

print("Analyzing drought hazard. This process may take few minutes...")

# Load precipitation data
precip = pd.read_csv(precip_file)
# Convert timing column to datetime
precip['timing'] = pd.to_datetime(precip['timing'], format='%Y-%m-%d')

# time  subset

if scn != 0:
    if time == 0:
        precip = precip.loc[(precip['timing'].dt.date >= date(2020,1,1)) & (precip['timing'].dt.date  < date(2060,1,1)), :]
        precip = precip.loc[(precip['timing'].dt.date >= date(2060,1,1)) & (precip['timing'].dt.date  < date(2100,1,1)), :]
    precip = precip.loc[(precip['timing'].dt.date >= date(1979,1,1)) & (precip['timing'].dt.date  < date(2020,1,1)), :]

# col_subset aims to extract the relevant results
precip = precip.reset_index()

col_subset = list(precip.columns.str.contains(ccode))
col_subset[1] = True
precip = precip.loc[:, col_subset]

# clean NaN rows & missing columns
precip = precip.loc[~np.array(precip.isna().all(axis = 1)),:]

drop_regions = []

# missing data in columns
col_subset = np.array(precip.isna().all(axis = 0))
drop_regions += list(precip.columns[col_subset])
precip = precip.loc[:, ~col_subset]

regions = precip.columns[1:]
output = pd.DataFrame(regions, columns = ['NUTS_ID'])

print("The following regions are dropped due to missing data: "+ str(drop_regions))
print('Input precipitation data (top 3 rows): ')
Analyzing drought hazard. This process may take few minutes...

The following regions are dropped due to missing data: []

Input precipitation data (top 3 rows): 
      timing     AL011     AL012     AL013    AL014     AL015     AL021  \
0 2020-01-31  0.000220  0.000073  0.000197  0.00013  0.000377  0.000199   
1 2020-02-29  0.000279  0.000100  0.000245  0.00016  0.000459  0.000255   
2 2020-03-31  0.000258  0.000082  0.000235  0.00015  0.000418  0.000224   

      AL022     AL031     AL032     AL033     AL034     AL035  
0  0.000187  0.000174  0.000174  0.000278  0.000230  0.000245  
1  0.000242  0.000231  0.000218  0.000394  0.000300  0.000343  
2  0.000208  0.000188  0.000179  0.000299  0.000249  0.000249  

Calculate WASP Index (Weighted Anomaly Standardized Precipitation) monthly threshold#

# create empty arrays and tables for intermediate and final results
WASP = []
WASP_global = []
drought_class = precip.copy()

# prepare output for drought event index - WASP_j- list of lists wasp = [[rid1], [rid2], ...]
for i in range(1, len(precip.columns)):

    # For every NUTS3 out of all regions - do the following:

    # empty array for the monthly water deficit thresholds
    if scn == 0:
        t_m = pd.DataFrame(np.tile([0], (12, len(precip.columns) - 1)))
        t_m_file = download_sample_data(os.path.join(name_output_folder, f'tm_hist_{ccode}.csv'))
        t_a_file = download_sample_data(os.path.join(name_output_folder, f'ta_hist_{ccode}.csv'))
        t_m = pd.read_csv(t_m_file)
        t_a = list(pd.read_csv(t_a_file).t_a)

    for mon_ in range(1, 13):
        # For every month out of all all months (January, ..., December) - do the following:

        # calculate monthly drought threshold -\
            # using a division of the data into to clusters with the Jenks' (Natural breaks) algorithm
        r_idx = precip.index[precip.timing.dt.month == mon_].tolist()
        if pattern == 'historic':
            t_m_last = jenkspy.jenks_breaks(precip.iloc[r_idx, i], n_classes = 2)[1]
            t_m.iloc[mon_ - 1, i - 1] = t_m_last

        # Define every month with water deficity (precipitation < threshold) as a drought month
        drought_class.iloc[r_idx, i] = (drought_class.iloc[r_idx, i] < t_m.iloc[mon_ - 1, i - 1]).astype(int)

    # calculate annual water deficit threshold
    if pattern == 'historic':
        t_a = list(t_m.sum(axis = 0))

    t_m0 = t_m.iloc[:, i - 1]
    t_a0 = t_a[i-1]
    # calculate droughts' magnitude and duration using the WASP indicator
    WASP_tmp = []
    index = []
    for k in range(1, len(precip)):
        # for every row (ordered month-year combinations):
            # check if drought month -> calculate drought accumulated magnitude (over 1+ months)
        if drought_class.iloc[k, i]== 1:
            # In case of a drought month
            # calculate monthly WASP index
            index = int(drought_class.timing.dt.month[k] - 1)
            # WASP monthly index: [(precipitation - month_threshold)/month_threshold)]*[month_threshold/annual_treshold]
            WASP_last=((precip.iloc[k,i] - t_m0[index])/t_m0[index])* (t_m0[index]/t_a0)

            if first_true==0:
                # if this is the first month in a drought event:
                # append calculated monthly wasp to WASP array.
                # if this is NOT the first month in a drought event:
                # add the calculated monthly wasp to last element in the WASP array (accumulative drought).
                WASP_tmp[-1]=WASP_tmp[-1] + WASP_last
            # check if not drought month - do not calculate WASP

if pattern == 'historic':
    t_m.to_csv(os.path.join(workflow_folder, name_output_folder, f'tm_hist_{ccode}.csv'),\
              index = False)
    pd.DataFrame({'t_a': t_a}).to_csv(os.path.join(workflow_folder, name_output_folder, f'ta_hist_{ccode}.csv'),\
                                    index = False)
dH = []
WASP = np.array(WASP, dtype=object)
# calculate global median deficit severity -
    # set drought hazard (dH) as the probability of exceeding the global median water deficit.

median_global_wasp = np.nanmedian(WASP_global)

# calculate dH per region i
for i in range(WASP.shape[0]):
    # The more negative the WASP index, the more severe is the deficit event, so
    # probability of exceedence the severity is 1 - np.nansum(WASP[i] >= median_global_wasp) / len(WASP[i])
    if len(WASP[i]) > 0:
        dH.append(round(1 - np.nansum(WASP[i] >= median_global_wasp) / len(WASP[i]), 3))

# https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/2004GL020901 - WASP Indicator description

output['wasp_raw_mean'] = [np.nanmean(x) for x in WASP]
output['wasp_raw_q25'] = [np.nanquantile(x, q = 0.25) for x in WASP]
output['wasp_raw_median'] = [np.nanmedian(x) for x in WASP]
output['wasp_raw_q75'] = [np.nanquantile(x, q = 0.75) for x in WASP]
output['wasp_raw_count'] = [x.shape[0] for x in WASP]

output['hazard_raw'] = dH
print('>>>>> Drought hazard is completed.')

output.to_csv(os.path.join(workflow_folder, name_output_folder, f'droughthazard_{ccode}_{pattern}.csv'),\
              index = False)
print('>>>>> Drought hazard indices were saved.')
>>>>> Drought hazard is completed.
>>>>> Drought hazard indices were saved.


The above workflow calculates the Drought hazard (dH) index that can be used as an input to calculate drought risk in the workflow described in the file Risk_Assessment.ipynb.


The workflow has beend developed by Silvia Artuso and Dor Fridman from IIASA’s Water Security Research Group, and supported by Michaela Bachmann from IIASA’s Systemic Risk and Reslience Research Group.


[1] Carrão, H., Naumann, G., & Barbosa, P. (2016). Mapping global patterns of drought risk: An empirical framework based on sub-national estimates of hazard, exposure and vulnerability. Global Environmental Change, 39, 108-124.