kdiagram.utils.compute_pit

kdiagram.utils.compute_pit(y_true, y_preds_quantiles, quantiles)

Computes the Probability Integral Transform (PIT) for each observation.

Parameters:
y_true : np.ndarray

1D array of the true observed values.

y_preds_quantiles : np.ndarray

2D array of quantile forecasts, with shape (n_samples, n_quantiles).

quantiles : np.ndarray

1D array of the quantile levels.

Returns:
np.ndarray

A 1D array of PIT values, one for each observation.

Return type:

ndarray

See also

plot_pit_histogram

A visualization of these PIT values.

calculate_calibration_error

A summary score based on PIT values.

Notes

The Probability Integral Transform (PIT) is a fundamental tool for evaluating the calibration of probabilistic forecasts [1].

When the predictive distribution is represented by a finite set of \(M\) quantiles, the PIT value for each observation \(y_i\) is approximated as the fraction of forecast quantiles that are less than or equal to the observation:

\[\text{PIT}_i = \frac{1}{M} \sum_{j=1}^{M} \mathbf{1}\{q_{i,j} \le y_i\} \tag{1}\]

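where \(q_{i,j}\) is the \(j\)-th quantile forecast for observation \(i\), and \(\mathbf{1}\{\cdot\}\) is the indicator function. Each \(\text{PIT}_i\) therefore takes one of the \(M + 1\) values \(0, 1/M, \dots, 1\). For a well-calibrated forecast, the PIT values are approximately uniformly distributed on \([0, 1]\); systematic departures from uniformity indicate miscalibration.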
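The approximation in Eq. (1) can be sketched in a few lines of NumPy. This is only an illustration of the formula, not necessarily how `compute_pit` is implemented; the helper name `pit_from_quantiles` is hypothetical, and it ignores the `quantiles` argument because the quantile levels do not appear in Eq. (1).

```python
import numpy as np


def pit_from_quantiles(y_true, y_preds_quantiles):
    """Hypothetical helper: Eq. (1) as a row-wise mean of indicators."""
    y_true = np.asarray(y_true, dtype=float)
    q = np.asarray(y_preds_quantiles, dtype=float)
    # Broadcast each observation against its own row of quantile forecasts:
    # indicator[i, j] is True when q[i, j] <= y_true[i].
    indicator = q <= y_true[:, np.newaxis]
    # Averaging over the quantile axis gives the fraction (1/M) * sum_j 1{...}.
    return indicator.mean(axis=1)


pit = pit_from_quantiles(
    [10, 1, 5.5],
    [[8, 11, 13], [0, 0.5, 2], [4, 5, 6]],
)
# pit is approximately [1/3, 2/3, 2/3]
```

For the first observation only one of the three forecasts (8) is at or below 10, so the PIT value is 1/3; for the other two observations, two of the three forecasts qualify.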

References

Examples

>>> import numpy as np
>>> from kdiagram.utils.mathext import compute_pit
>>>
>>> # Define true values and quantile forecasts for 3 observations
>>> y_true = np.array([10, 1, 5.5])
>>> quantiles = np.array([0.1, 0.5, 0.9])
>>> y_preds = np.array([
...     [8, 11, 13],  # Forecast for y_true = 10
...     [0, 0.5, 2],  # Forecast for y_true = 1
...     [4, 5, 6]     # Forecast for y_true = 5.5
... ])
>>>
>>> # Calculate the PIT value for each observation
>>> pit_values = compute_pit(y_true, y_preds, quantiles)
>>> print(pit_values)
[0.33333333 0.66666667 0.66666667]
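To see the uniformity property from the Notes in practice, the following sketch bins PIT values from a synthetic, perfectly calibrated forecast. Everything here is an assumption made for illustration: the observations are drawn from Uniform(0, 1), whose \(p\)-quantile is simply \(p\), and the PIT values are computed with a plain NumPy stand-in for Eq. (1) rather than with `compute_pit` itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup (assumed for illustration): observations are Uniform(0, 1),
# so the true quantile at level p is p and the forecast below is calibrated.
levels = np.linspace(0.1, 0.9, 9)
n_samples = 20_000
y_obs = rng.uniform(0.0, 1.0, size=n_samples)
q_forecasts = np.tile(levels, (n_samples, 1))  # same forecast for every row

# Stand-in for Eq. (1): fraction of forecast quantiles <= observation.
pit = (q_forecasts <= y_obs[:, np.newaxis]).mean(axis=1)

# Bin the PIT values into 10 equal-width bins on [0, 1].
counts, _ = np.histogram(pit, bins=10, range=(0.0, 1.0))
rel_freq = counts / n_samples
print(np.round(rel_freq, 3))  # each entry should be close to 0.1
```

With a miscalibrated forecast (for example, intervals that are too narrow), the relative frequencies would pile up in the outer bins instead of staying flat.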