kdiagram.utils.compute_winkler_score
- kdiagram.utils.compute_winkler_score(y_true, y_pred_lower, y_pred_upper, alpha=0.1)
Computes the Winkler score for a given prediction interval.
The Winkler score is a proper scoring rule that evaluates a prediction interval by combining its width (sharpness) with a penalty for observations that fall outside the interval. A lower score indicates a better forecast.
- Parameters:
- y_true
np.ndarray: 1D array of the true observed values.
- y_pred_lower
np.ndarray: 1D array of the lower bounds of the prediction intervals.
- y_pred_upper
np.ndarray: 1D array of the upper bounds of the prediction intervals.
- alpha
float, default=0.1: The significance level for the prediction interval. For example, alpha=0.1 corresponds to a (1 - 0.1) * 100 = 90% prediction interval.
- Returns:
float: The average Winkler score over all observations.
See also
compute_coverage_score: A metric that only assesses coverage.
compute_interval_width: A metric that only assesses sharpness.
Notes
The Winkler score [1] is designed to evaluate both the sharpness and calibration of a prediction interval simultaneously. The score for a single observation \(y\) and a \((1-\alpha)\) prediction interval \([l, u]\) is defined as:
\[\begin{split}S_{\alpha}(l, u, y) = (u - l) + \begin{cases} \frac{2}{\alpha}(l - y) & \text{if } y < l \\ 0 & \text{if } l \le y \le u \\ \frac{2}{\alpha}(y - u) & \text{if } y > u \end{cases}\end{split}\tag{1}\]
The first term, \((u - l)\), is the interval width, which rewards sharpness (narrower intervals). The second term is a penalty that is applied only if the observation falls outside the interval. The penalty grows linearly with the distance between the observation and the violated bound, scaled by \(2/\alpha\). This function returns the average of this score over all observations.
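As an illustration of Equation (1), the following is a minimal NumPy sketch of the averaged score. The name winkler_score_sketch is hypothetical and is not part of kdiagram; the sketch assumes that every lower bound is less than or equal to its upper bound and does not necessarily mirror the library's actual implementation or its input validation.

```python
import numpy as np


def winkler_score_sketch(y_true, y_pred_lower, y_pred_upper, alpha=0.1):
    """Average Winkler score following Eq. (1).

    Hypothetical helper for illustration only; not the kdiagram code.
    Assumes y_pred_lower <= y_pred_upper element-wise.
    """
    y = np.asarray(y_true, dtype=float)
    lower = np.asarray(y_pred_lower, dtype=float)
    upper = np.asarray(y_pred_upper, dtype=float)

    # First term of Eq. (1): the interval width (sharpness).
    width = upper - lower

    # Distance by which y falls below the lower bound (0 if y >= lower).
    below = np.maximum(lower - y, 0.0)
    # Distance by which y exceeds the upper bound (0 if y <= upper).
    above = np.maximum(y - upper, 0.0)

    # Second term of Eq. (1): at most one of `below` / `above` is nonzero
    # for each observation, so their sum selects the active case.
    penalty = (2.0 / alpha) * (below + above)

    return float(np.mean(width + penalty))


score = winkler_score_sketch([1, 5, 12], [2, 4, 8], [8, 6, 10], alpha=0.1)
print(f"{score:.2f}")  # 23.33
```

Writing the two outside-the-interval cases with np.maximum avoids explicit branching: an observation inside the interval yields zero for both distances, so its score reduces to the width alone.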
References
Examples
>>> import numpy as np
>>> from kdiagram.utils.mathext import compute_winkler_score
>>>
>>> y_true = np.array([1, 5, 12])
>>> y_lower = np.array([2, 4, 8])
>>> y_upper = np.array([8, 6, 10])
>>>
>>> # For a 90% interval (alpha=0.1)
>>> # Obs 1 (y=1): outside. Width=6. Penalty=(2/0.1)*(2-1)=20. Score=26.
>>> # Obs 2 (y=5): inside. Width=2. Penalty=0. Score=2.
>>> # Obs 3 (y=12): outside. Width=2. Penalty=(2/0.1)*(12-10)=40. Score=42.
>>> # Average = (26 + 2 + 42) / 3 = 23.33
>>>
>>> score = compute_winkler_score(
...     y_true, y_lower, y_upper, alpha=0.1
... )
>>> print(f"Average Winkler Score: {score:.2f}")
Expected Output
Average Winkler Score: 23.33