kdiagram.utils.compute_pit

kdiagram.utils.compute_pit(y_true, y_preds_quantiles, quantiles)

Computes the Probability Integral Transform (PIT) for each observation.

Parameters:

y_true (np.ndarray): 1D array of the true observed values, with shape (n_samples,).
y_preds_quantiles (np.ndarray): 2D array of quantile forecasts, with shape (n_samples, n_quantiles).
quantiles (np.ndarray): 1D array of the quantile levels, with shape (n_quantiles,).

Returns:

np.ndarray: A 1D array of PIT values, one for each observation.

See also

plot_pit_histogram: A visualization of these PIT values.
calculate_calibration_error: A summary score based on PIT values.

Notes

The Probability Integral Transform (PIT) is a fundamental tool for evaluating the calibration of probabilistic forecasts [1].

When the predictive distribution is represented by a finite set of \(M\) quantiles, the PIT value for each observation \(y_i\) is approximated as the fraction of forecast quantiles that are less than or equal to the observation:

\[\text{PIT}_i = \frac{1}{M} \sum_{j=1}^{M} \mathbf{1}\{q_{i,j} \le y_i\} \tag{1}\]

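where \(q_{i,j}\) is the \(j\)-th quantile forecast for observation \(i\), and \(\mathbf{1}\{\cdot\}\) is the indicator function. Because the sum counts quantiles, each PIT value is a multiple of \(1/M\) in \([0, 1]\). For a well-calibrated forecast, the PIT values are approximately uniformly distributed over this range; systematic departures from uniformity indicate miscalibration.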
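The approximation in equation (1) can be sketched directly in NumPy. This is only an illustration of the formula, not the library's implementation: `pit_from_quantiles` is a hypothetical helper name, and it ignores the `quantiles` argument because the formula as written does not use the quantile levels.

```python
import numpy as np


def pit_from_quantiles(y_true, y_preds_quantiles):
    """Hypothetical helper: fraction of forecast quantiles <= the observation."""
    y_true = np.asarray(y_true, dtype=float)
    q = np.asarray(y_preds_quantiles, dtype=float)
    # Broadcast each observation against its own row of M quantile forecasts,
    # then average the indicator 1{q_ij <= y_i} over the M columns.
    return np.mean(q <= y_true[:, np.newaxis], axis=1)


y_true = np.array([10, 1, 5.5])
y_preds = np.array([
    [8, 11, 13],   # one quantile (8) is <= 10
    [0, 0.5, 2],   # two quantiles (0, 0.5) are <= 1
    [4, 5, 6],     # two quantiles (4, 5) are <= 5.5
])
pit = pit_from_quantiles(y_true, y_preds)
```

With three quantiles per observation, the only possible PIT values under this formula are 0, 1/3, 2/3, and 1.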

References

Examples

>>> import numpy as np
>>> from kdiagram.utils.mathext import compute_pit
>>>
>>> # Define true values and quantile forecasts for 3 observations
>>> y_true = np.array([10, 1, 5.5])
>>> quantiles = np.array([0.1, 0.5, 0.9])
>>> y_preds = np.array([
...     [8, 11, 13],  # Forecast for y_true = 10
...     [0, 0.5, 2],  # Forecast for y_true = 1
...     [4, 5, 6]     # Forecast for y_true = 5.5
... ])
>>>
>>> # Calculate the PIT value for each observation
>>> pit_values = compute_pit(y_true, y_preds, quantiles)
>>> print(pit_values)

Expected Output

[0.33333333 0.66666667 0.66666667]
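
The link between uniform PIT values and calibration described in the Notes can be checked on synthetic data. The sketch below is an assumption-laden illustration: it applies equation (1) directly in NumPy instead of calling `compute_pit`, draws observations from a standard normal distribution, and uses the exact quantiles of that same distribution as the forecast, so the forecast is calibrated by construction.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
n_samples = 10_000
levels = np.linspace(0.1, 0.9, 9)  # M = 9 quantile levels

# Synthetic, calibrated-by-construction setup (an assumption for illustration):
# observations come from N(0, 1) and every forecast row holds the exact
# N(0, 1) quantiles at the chosen levels.
y_true = rng.standard_normal(n_samples)
true_quantiles = np.array([NormalDist().inv_cdf(p) for p in levels])
y_preds = np.tile(true_quantiles, (n_samples, 1))

# Equation (1): fraction of forecast quantiles <= the observation.
pit = np.mean(y_preds <= y_true[:, np.newaxis], axis=1)

# PIT values are multiples of 1/M, so recover the integer count k = PIT * M
# and tabulate how often each of the M + 1 possible values occurs.
k = np.rint(pit * len(levels)).astype(int)
freqs = np.bincount(k, minlength=len(levels) + 1) / n_samples
```

Under these assumptions each of the ten possible PIT values should occur with relative frequency close to 0.1. A forecast that is too narrow would pile mass onto the extreme values 0 and 1, and one that is too wide would concentrate it in the middle.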