kdiagram.utils.compute_coverage_score
- kdiagram.utils.compute_coverage_score(y_true, y_pred_lower, y_pred_upper, *, method='within', return_counts=False)
Computes the coverage score for a given prediction interval.
This utility calculates the fraction (or count) of true values that fall within, above, or below the specified prediction interval. It is a fundamental metric for assessing the calibration of a forecast’s uncertainty bounds.
- Parameters:
- y_true : np.ndarray
1D array of the true observed values.
- y_pred_lower : np.ndarray
1D array of the lower bound of the prediction interval.
- y_pred_upper : np.ndarray
1D array of the upper bound of the prediction interval.
- method : {'within', 'above', 'below'}, default='within'
The type of coverage to calculate:
'within': The standard coverage score. Calculates the proportion of true values such that lower <= true <= upper.
'above': Calculates the proportion of true values that are strictly above the upper bound (true > upper).
'below': Calculates the proportion of true values that are strictly below the lower bound (true < lower).
- return_counts : bool, default=False
If True, returns the raw count of observations matching the condition instead of the proportion (a float between 0 and 1).
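- Returns:
The proportion of observations matching the chosen condition (a float between 0 and 1), or the raw count of matching observations if return_counts is True.
- Return type:
float or int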
See also
plot_coverage: A visualization of this score.
compute_winkler_score: A score that penalizes for lack of coverage.
Notes
The empirical coverage is a key diagnostic for checking if a model’s prediction intervals are well-calibrated. For a given \((1-\alpha) \cdot 100\%\) prediction interval, the empirical coverage should be close to \(1-\alpha\).
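As a rough illustration of this calibration check, the sketch below simulates a nominal 90% interval and compares its empirical coverage with \(1-\alpha\). It uses plain NumPy rather than the library function so it runs on its own. The Gaussian noise model, the sample size, and the variable names are assumptions made for the example, and the constant 1.6449 is the approximate 95th percentile of the standard normal distribution.

```python
import numpy as np

# Assumed setup: point forecasts plus unit-variance Gaussian noise.
rng = np.random.default_rng(0)
n = 10_000
alpha = 0.10
z = 1.6449  # approx. 95th percentile of N(0, 1), giving a 90% interval

mu = rng.normal(size=n)           # hypothetical point forecasts
y_true = mu + rng.normal(size=n)  # simulated observations

# Well-calibrated interval: half-width matches the noise quantile.
y_lower, y_upper = mu - z, mu + z
empirical = float(np.mean((y_lower <= y_true) & (y_true <= y_upper)))

# Deliberately too-narrow interval: half the correct width.
narrow = float(np.mean((mu - 0.5 * z <= y_true) & (y_true <= mu + 0.5 * z)))

print(f"nominal:   {1 - alpha:.2f}")
print(f"empirical: {empirical:.3f}")
print(f"too narrow: {narrow:.3f}")
```

The first empirical value should land close to 0.90, while the narrowed interval covers far fewer observations, which is the signature of overconfident uncertainty bounds.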
For the standard ‘within’ method, the coverage for a set of \(N\) observations is calculated as:
\[\text{Coverage} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\{y_{lower,i} \le y_{true,i} \le y_{upper,i}\} \tag{1}\]
where \(\mathbf{1}\) is the indicator function. The ‘above’ and ‘below’ methods are useful for diagnosing the direction of miscalibration.
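The indicator sum in Eq. (1) maps directly onto a boolean mask. The following is a minimal NumPy sketch of the three documented conditions, not the library's actual source. The function name `coverage_sketch` and the sample arrays are hypothetical and chosen only for illustration.

```python
import numpy as np

def coverage_sketch(y_true, y_lower, y_upper, method="within", return_counts=False):
    """Illustrative re-implementation of the documented conditions."""
    y_true = np.asarray(y_true, dtype=float)
    y_lower = np.asarray(y_lower, dtype=float)
    y_upper = np.asarray(y_upper, dtype=float)

    if method == "within":
        mask = (y_lower <= y_true) & (y_true <= y_upper)  # indicator in Eq. (1)
    elif method == "above":
        mask = y_true > y_upper   # strictly above the upper bound
    elif method == "below":
        mask = y_true < y_lower   # strictly below the lower bound
    else:
        raise ValueError(f"Unknown method: {method!r}")

    # The mean of a boolean mask is (1/N) * sum of the indicators.
    return int(mask.sum()) if return_counts else float(mask.mean())

# Hypothetical data: 2 points within, 2 above, 1 below.
y_true = np.array([10, 12, 15, 9, 20])
y_lower = np.array([8, 13, 11, 5, 14])
y_upper = np.array([11, 16, 14, 10, 18])

shares = {
    m: coverage_sketch(y_true, y_lower, y_upper, method=m)
    for m in ("within", "above", "below")
}
print(shares)
```

When every lower bound is at most its upper bound, the three conditions are mutually exclusive and exhaustive, so the three proportions sum to 1. Comparing the 'above' and 'below' shares then shows whether misses are concentrated on one side of the interval.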
Examples
>>> import numpy as np
>>> from kdiagram.utils.mathext import compute_coverage_score
>>>
>>> y_true = np.array([1, 2, 3, 4, 5, 6])
>>> y_lower = np.array([0, 3, 2, 5, 4, 7])
>>> y_upper = np.array([2, 4, 4, 6, 6, 8])
>>>
>>> # Calculate the standard coverage (3 out of 6 are within)
>>> coverage = compute_coverage_score(y_true, y_lower, y_upper)
>>> print(f"Coverage Score: {coverage:.2f}")
Expected Output
Coverage Score: 0.50
>>> # Calculate the number of points below the interval
>>> count_below = compute_coverage_score(
...     y_true, y_lower, y_upper, method='below', return_counts=True
... )
>>> print(f"Count below interval: {count_below}")
Expected Output
Count below interval: 3