kdiagram.utils.compute_interval_width
- kdiagram.utils.compute_interval_width(df, *quantile_pairs, prefix='width_', inplace=False)
Computes the width of one or more prediction intervals.
This is a fundamental data preparation utility that calculates the difference between upper and lower quantile columns for one or more forecast intervals. The resulting interval width is a key measure of a forecast’s sharpness.
- Parameters:
- df
pd.DataFrame The input DataFrame containing the quantile forecast columns.
- quantile_pairs
list of (str or float) One or more lists or tuples, each containing two elements in the order: [lower_quantile_col, upper_quantile_col].
- prefix
str, default='width_' The prefix for the new interval width column names. The new name will be f"{prefix}{upper_col_name}".
- inplace
bool, default=False If True, modifies the original DataFrame by adding the new columns. If False (default), returns a new DataFrame.
- Returns:
pd.DataFrame The DataFrame with the new interval width column(s) added.
- Raises:
ValueError If a provided pair does not contain exactly two column names.
- Return type:
DataFrame
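To make the parameter descriptions concrete, here is a minimal pandas sketch that mimics the documented behaviour. It is not the library's implementation: `interval_width_sketch` is a hypothetical stand-in, the column names `q10` and `q90` are invented for the illustration, and returning the DataFrame when `inplace=True` is an assumption based on the Returns entry.

```python
import pandas as pd


def interval_width_sketch(df, *quantile_pairs, prefix="width_", inplace=False):
    """Hypothetical stand-in that follows the documented parameters."""
    # Work on the caller's DataFrame only when inplace=True; otherwise copy.
    out = df if inplace else df.copy()
    for pair in quantile_pairs:
        # Each pair must be exactly [lower_col, upper_col].
        if len(pair) != 2:
            raise ValueError(f"Expected [lower_col, upper_col], got {pair!r}")
        lower_col, upper_col = pair
        # New column name follows the documented f"{prefix}{upper_col_name}" rule.
        out[f"{prefix}{upper_col}"] = out[upper_col] - out[lower_col]
    return out


df = pd.DataFrame({"q10": [1, 2], "q90": [10, 12]})

# Custom prefix: the new column is named after the upper column.
result = interval_width_sketch(df, ["q10", "q90"], prefix="w_")
print(list(result.columns))
print("w_q90" in df.columns)  # the input is untouched when inplace=False

# inplace=True adds the column to the original DataFrame.
interval_width_sketch(df, ["q10", "q90"], inplace=True)
print("width_q90" in df.columns)
```

Naming the new column after the upper quantile column keeps the names unique as long as each pair uses a different upper column.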
See also
plot_polar_sharpness: A plot that directly uses this metric.
compute_winkler_score: A score that uses interval width as a component.
Notes
The width of a prediction interval is the most direct measure of a forecast’s sharpness, a key property of probabilistic forecasts [1]. A smaller width indicates a more precise, or sharper, forecast.
For a given observation \(i\), the interval width \(w_i\) is the simple difference between the upper and lower quantile forecasts:
\[w_i = q_{\text{upper}, i} - q_{\text{lower}, i} \tag{1}\]
References
Examples
>>> import pandas as pd
>>> from kdiagram.utils.forecast_utils import compute_interval_width
>>>
>>> df = pd.DataFrame({
...     'q10_model_A': [1, 2], 'q90_model_A': [10, 12],
...     'q05_model_A': [0, 1], 'q95_model_A': [11, 13]
... })
>>>
>>> # Calculate the 80% and 90% interval widths
>>> widths_df = compute_interval_width(
...     df, ['q10_model_A', 'q90_model_A'], ['q05_model_A', 'q95_model_A']
... )
>>> print(widths_df)
   q10_model_A  q90_model_A  q05_model_A  q95_model_A  width_q90_model_A  width_q95_model_A
0            1           10            0           11                  9                 11
1            2           12            1           13                 10                 12
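As a quick sanity check on the width formula from the Notes, the same numbers can be recomputed with plain Python lists. This sketch does not call kdiagram. The mean width at the end is only one common way to summarise sharpness in a single number and is added here for illustration, not taken from the library.

```python
# Quantile values copied from the example DataFrame above.
q10, q90 = [1, 2], [10, 12]
q05, q95 = [0, 1], [11, 13]

# w_i = q_upper_i - q_lower_i, applied row by row.
width_80 = [hi - lo for lo, hi in zip(q10, q90)]
width_90 = [hi - lo for lo, hi in zip(q05, q95)]

print(width_80)  # [9, 10]
print(width_90)  # [11, 12]

# Mean width per interval (illustrative summary, not a kdiagram function).
mean_80 = sum(width_80) / len(width_80)
mean_90 = sum(width_90) / len(width_90)
print(mean_80, mean_90)  # 9.5 11.5
```

As expected, the 90% interval is wider on average than the 80% interval.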