kdiagram.utils.compute_interval_width

kdiagram.utils.compute_interval_width(df, *quantile_pairs, prefix='width_', inplace=False)

Computes the width of one or more prediction intervals.

This data preparation utility calculates the difference between the upper and lower quantile columns of each forecast interval supplied. The resulting interval width is a key measure of a forecast’s sharpness.

Parameters:
df : pd.DataFrame

The input DataFrame containing the quantile forecast columns.

quantile_pairs : list of (str or float)

One or more lists or tuples, each containing two elements in the order: [lower_quantile_col, upper_quantile_col].

prefix : str, default='width_'

The prefix for the new interval width column names. The new name will be f"{prefix}{upper_col_name}".

inplace : bool, default=False

If True, modifies the original DataFrame by adding the new columns. If False (default), returns a new DataFrame.

Returns:
pd.DataFrame

The DataFrame with the new interval width column(s) added.

Raises:
ValueError

If a provided pair does not contain exactly two column names.
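The documented contract (column naming from `prefix`, copy-versus-in-place behavior, and the `ValueError` on malformed pairs) can be sketched in plain pandas. The function `interval_width_sketch` below is a hypothetical stand-in written for illustration; it is not kdiagram's implementation and may differ from it in details such as the error message.

```python
import pandas as pd


def interval_width_sketch(df, *quantile_pairs, prefix="width_", inplace=False):
    """Hypothetical stand-in mirroring the documented contract (not kdiagram's code)."""
    # Work on the caller's frame only when inplace=True; otherwise copy.
    out = df if inplace else df.copy()
    for pair in quantile_pairs:
        # Each pair must be [lower_quantile_col, upper_quantile_col].
        if len(pair) != 2:
            raise ValueError(f"Expected [lower_col, upper_col], got {pair!r}")
        lower_col, upper_col = pair
        # The new column is named after the upper column, as documented.
        out[f"{prefix}{upper_col}"] = out[upper_col] - out[lower_col]
    return out


df = pd.DataFrame({"q10": [1, 2], "q90": [10, 12]})
result = interval_width_sketch(df, ["q10", "q90"])

print(result["width_q90"].tolist())  # [9, 10]
print("width_q90" in df.columns)     # False: the original is untouched
```

With `inplace=True` the same call would add `width_q90` to `df` itself instead of to a copy.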

See also

plot_polar_sharpness

A plot that directly uses this metric.

compute_winkler_score

A score that uses interval width as a component.

Notes

The width of a prediction interval is the most direct measure of a forecast’s sharpness, a key property of probabilistic forecasts [1]. A smaller width indicates a more precise, or sharper, forecast.

For a given observation \(i\), the interval width \(w_i\) is the simple difference between the upper and lower quantile forecasts:

\[w_i = q_{\text{upper}, i} - q_{\text{lower}, i} \tag{1}\]
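As a quick check of the formula, the widths can be computed by hand with plain pandas column subtraction, using the same values as the example in this page. Averaging the widths at the end is only one possible way to summarize sharpness and is an assumption of this sketch, not something the function itself does.

```python
import pandas as pd

# Same quantile forecasts as in the Examples section.
df = pd.DataFrame({
    "q10_model_A": [1, 2], "q90_model_A": [10, 12],
    "q05_model_A": [0, 1], "q95_model_A": [11, 13],
})

# w_i = q_upper,i - q_lower,i for each observation i
w80 = df["q90_model_A"] - df["q10_model_A"]  # 80% interval
w90 = df["q95_model_A"] - df["q05_model_A"]  # 90% interval

print(w80.tolist())  # [9, 10]
print(w90.tolist())  # [11, 12]

# Illustrative summary: a smaller mean width indicates a sharper forecast.
print(w80.mean(), w90.mean())  # 9.5 11.5
```

The 90% interval is wider than the 80% interval for every observation, as expected when nominal coverage increases.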

References

Examples

>>> import pandas as pd
>>> from kdiagram.utils.forecast_utils import compute_interval_width
>>>
>>> df = pd.DataFrame({
...     'q10_model_A': [1, 2], 'q90_model_A': [10, 12],
...     'q05_model_A': [0, 1], 'q95_model_A': [11, 13]
... })
>>>
>>> # Calculate the 80% and 90% interval widths
>>> widths_df = compute_interval_width(
...     df, ['q10_model_A', 'q90_model_A'], ['q05_model_A', 'q95_model_A']
... )
>>> print(widths_df)
   q10_model_A  q90_model_A  q05_model_A  q95_model_A  width_q90_model_A  width_q95_model_A
0            1           10            0           11                  9                 11
1            2           12            1           13                 10                 12