kdiagram.metrics.cluster_aware_severity_score
- kdiagram.metrics.cluster_aware_severity_score(y_true, y_pred, *, sample_weight=None, window_size=21, sort_by=None, normalize='band', density_source='indicator', kernel='triangular', lambda_=1.0, gamma=1.0, eps=1e-12, multioutput='uniform_average', nan_policy='omit', return_details=False)
Compute the Cluster-Aware Severity (CAS) score.
This metric evaluates prediction intervals by penalizing not only the magnitude of interval failures (anomalies) but also their local concentration in time or space. CAS highlights models that generate runs of misses, which are often more operationally risky than isolated errors with similar size.
Formally, for observation \(y_t\) and interval \([L_t, U_t]\) at level \(1-\alpha\), define the excess magnitude \(m_t=\max(L_t-y_t,0)+\max(y_t-U_t,0)\), which is zero whenever \(y_t\) lies inside the interval. With the band width \(w_t=U_t-L_t\) and small \(\varepsilon>0\), the normalized excess is \(\tilde m_t = m_t / (w_t+\varepsilon)\). Let \(A_t=\mathbf{1}\{y_t<L_t \text{ or } y_t>U_t\}\) and let \(d_t\) be a centered kernel average of either the indicators (\(A_t\)) or the magnitudes (\(\tilde m_t\)) over a window of size window_size. The pointwise severity is
\[S_t \;=\; \tilde m_t \Bigl(1 + \lambda\, d_t^{\gamma}\Bigr), \tag{1}\]
with \(\lambda\ge 0\) and \(\gamma\ge 1\). The CAS score is the average \(n^{-1}\sum_t S_t\). Lower values indicate fewer and less clustered violations.
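As a rough illustration of Eq. (1), the following NumPy sketch computes band-normalized excesses, a box-kernel average of the miss indicator, and the resulting severities. It is not the library's implementation: the helper name `cas_sketch`, the zero-padded edges, and the reading of `window_size` as the full (odd) window length are all assumptions made for this example.

```python
import numpy as np


def cas_sketch(y_true, lower, upper, window_size=3,
               lambda_=1.0, gamma=1.0, eps=1e-12):
    """Hypothetical helper: CAS with band normalization and a box kernel."""
    y = np.asarray(y_true, dtype=float)
    lo = np.asarray(lower, dtype=float)
    hi = np.asarray(upper, dtype=float)

    # m_t: distance by which y_t falls outside [L_t, U_t] (0 when covered).
    m = np.maximum(lo - y, 0.0) + np.maximum(y - hi, 0.0)
    # Normalized excess: m_t / (w_t + eps), with w_t = U_t - L_t.
    m_norm = m / ((hi - lo) + eps)
    # A_t: 1 for a miss, 0 otherwise.
    a = ((y < lo) | (y > hi)).astype(float)

    # d_t: centered box average of A_t. Assumes window_size is odd and
    # no larger than n; samples beyond the edges count as zeros.
    box = np.ones(window_size) / window_size
    d = np.convolve(a, box, mode="same")

    # Eq. (1): S_t = m~_t * (1 + lambda * d_t**gamma).
    severity = m_norm * (1.0 + lambda_ * d**gamma)
    return severity.mean(), severity


y_true = np.array([10, 25, 30, 45, 50])
bounds = np.array([[8, 12], [24, 26], [32, 33], [44, 46], [48, 52]])
score, severity = cas_sketch(y_true, bounds[:, 0], bounds[:, 1])
print(score)
```

Only the third sample misses its interval (30 < 32, with band width 1), so under these assumptions its severity is about 2 × (1 + 1/3) and the sketch's score is about 0.533. The library's value can differ if its window or edge handling differs.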
- Parameters:
- y_true : array_like of shape (n_samples,) or (n_samples, n_outputs)
Ground-truth targets. For multioutput, the same prediction interval (from y_pred) is applied to each output unless your wrapper expands bounds per output.
- y_pred : array_like of shape (n_samples, 2)
Predicted interval bounds. Column 0 is the lower bound \(L_t\); column 1 is the upper bound \(U_t\).
- sample_weight : array_like of shape (n_samples,), default=None
Optional weights for averaging the final severities.
- window_size : int, default=21
Half-width plus one for the centered smoothing window used to compute \(d_t\). Larger values capture longer runs.
- sort_by : array_like of shape (n_samples,), optional
Key used to order samples before computing \(d_t\). Typical choices are time, a spatial coordinate, or any ordering that makes clustering meaningful.
- normalize : {'band', 'mad', 'none'}, default='band'
Normalization for the excess \(m_t\).
'band': divide by \(w_t=U_t-L_t\) (unit-free).
'mad': divide by a robust global scale (median absolute deviation).
'none': no normalization (units of the series).
- density_source : {'indicator', 'magnitude'}, default='indicator'
Source for computing \(d_t\).
'indicator': kernel average of \(A_t\) (0/1 misses), matching the CAS definition in the paper.
'magnitude': kernel average of the normalized magnitude \(\tilde m_t\) (more sensitive to large single misses).
- kernel : {'box', 'triangular', 'epan', 'gaussian'}, default='box'
Smoothing kernel used to compute the local density \(d_t\). The kernel's shape determines how neighboring points are weighted when calculating the concentration of anomalies.
'box': A rectangular (uniform) kernel that gives equal weight to all points inside the window. It is best for emphasizing the raw run length of anomalies, as it effectively counts misses within a fixed-size region.
'triangular': A linear kernel where the central point receives the maximum weight, which then decreases linearly to zero at the window's edges. It provides a smoother density estimate than the 'box' kernel.
'epan': The Epanechnikov kernel, which assigns weights using an inverted parabola. It is statistically efficient and gives more weight to central points while smoothly tapering to zero. It is a good choice for emphasizing the local prevalence of anomalies near the center of a cluster.
'gaussian': A smooth kernel that assigns weights using a Gaussian (bell curve) function. It provides the smoothest density estimate, implying that an anomaly's influence decays exponentially with distance from the center point.
- lambda_ : float, default=1.0
Cluster penalty weight \(\lambda\). Larger values increase the contribution of \(d_t\).
- gamma : float, default=1.0
Density nonlinearity \(\gamma\). Values \(>1\) accentuate dense clusters relative to sparse ones.
- eps : float, default=1e-12
Small positive number used in the band normalization denominator \((w_t+\varepsilon)\).
- multioutput : {'raw_values', 'uniform_average'}, default='uniform_average'
Aggregation across outputs when y_true is 2D.
'raw_values': return per-output scores.
'uniform_average': return the average score.
- nan_policy : {'omit', 'propagate', 'raise'}, default='omit'
How to handle NaN/inf in any inputs (y_true, bounds, sort_by, sample_weight). After optional sorting, a mask is built over all required columns.
'omit': drop invalid rows before computing CAS.
'propagate': return NaN (and None for details).
'raise': raise ValueError with a row count.
- return_details : bool, default=False
If True, also return a DataFrame with per-sample fields (is_anomaly, type, magnitude, local_density, severity). For multioutput, a list of DataFrames may be returned.
- Returns:
See also
clustered_anomaly_severity : Helper that accepts arrays or DataFrame columns and returns the CAS score (and details if requested).
kdiagram.utils.plot.plot_cas_layers : Layered, publication-ready line plot of intervals, severity stems, and anomalies.
kdiagram.utils.plot.plot_anomaly_glyphs : Polar glyph visualization that emphasizes clustering.
Notes
CAS complements proper scoring rules and coverage by focusing on the organization of errors rather than only their average frequency or size. It is translation-invariant and, with normalize='band', unit-free. Setting lambda_=0 reduces CAS to an average normalized excess outside the interval, akin to the distance penalty in interval/Winkler scores. In contrast, lambda_>0 increases the score when violations cluster, capturing burstiness that aggregate scores may blur. The default density source ('indicator') follows the definition in the paper and is recommended for diagnostics.
Time complexity for a box kernel with window W is \(\mathcal{O}(nW)\) and memory \(\mathcal{O}(n)\). With FFT-based convolution for smooth kernels, the cost is typically \(\mathcal{O}(n\log n)\).
References
[1] Gneiting, T., Balabdaoui, F., & Raftery, A. E. (2007). Probabilistic forecasts, calibration and sharpness. JRSS Series B, 69(2), 243–268.
[2] Koenker, R., & Xiao, Z. (2006). Quantile autoregression. JASA, 101, 980–990.
[3] Podsztavek, O., Jordan, A. I., Tvrdík, P., & Polsterer, K. L. (2024). Automatic Miscalibration Diagnosis: Interpreting PIT Histograms. ESANN.
[4] Sokol, A. (2025). Fan charts 2.0: Flexible forecast distributions with expert judgement. International Journal of Forecasting, 41(3), 1148–1164.
Examples
Basic usage
>>> import numpy as np
>>> from kdiagram.metrics import cluster_aware_severity_score
>>> y_true = np.array([10, 25, 30, 45, 50])
>>> y_pred = np.array([[8, 12], [24, 26], [32, 33],
...                    [44, 46], [48, 52]])
>>> cas = cluster_aware_severity_score(
...     y_true, y_pred, window_size=3
... )
>>> float(cas)
Sorting to control clustering
>>> sort_key = np.array([0, 2, 4, 1, 3])
>>> cas_unsorted = cluster_aware_severity_score(
...     y_true, y_pred, window_size=3
... )
>>> cas_sorted = cluster_aware_severity_score(
...     y_true, y_pred, window_size=3, sort_by=sort_key
... )
>>> (float(cas_sorted), float(cas_unsorted))
Adjusting density source and kernel
>>> cas_mag = cluster_aware_severity_score(
...     y_true, y_pred, window_size=5,
...     density_source="magnitude", kernel="triangular"
... )
>>> float(cas_mag)
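To make the kernel choice concrete, here is a minimal sketch of how normalized weights for the four kernel shapes could be built and applied to a miss indicator. The helper `kernel_weights`, the scaling of offsets, and the Gaussian bandwidth are assumptions for illustration, not the library's internals.

```python
import numpy as np


def kernel_weights(kind, window_size):
    """Hypothetical helper: normalized weights for a centered, odd-length window."""
    half = window_size // 2
    # Scale offsets so the first points outside the window sit at |u| = 1.
    u = np.arange(-half, half + 1) / (half + 1)
    if kind == "box":
        w = np.ones_like(u)
    elif kind == "triangular":
        w = 1.0 - np.abs(u)
    elif kind == "epan":
        w = 1.0 - u**2
    elif kind == "gaussian":
        # The bandwidth (sigma = 0.5 on the u scale) is an arbitrary assumption.
        w = np.exp(-0.5 * (u / 0.5) ** 2)
    else:
        raise ValueError(f"unknown kernel: {kind!r}")
    return w / w.sum()


# A run of three consecutive misses in a series of nine samples.
a = np.array([0, 0, 1, 1, 1, 0, 0, 0, 0], dtype=float)
for kind in ("box", "triangular", "epan", "gaussian"):
    # Symmetric weights, so convolution equals a centered weighted average.
    d = np.convolve(a, kernel_weights(kind, 5), mode="same")
    print(kind, np.round(d, 3))
```

Every kernel in this sketch is symmetric and sums to one, so the density stays in [0, 1] when the source is the 0/1 indicator; the kernels differ only in how quickly a neighbor's influence falls off.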
Weighting and stronger cluster penalty
>>> w = np.array([1, 1, 5, 1, 1])
>>> cas_w = cluster_aware_severity_score(
...     y_true, y_pred, sample_weight=w,
...     lambda_=2.0, gamma=2.0
... )
>>> float(cas_w)
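The effect of `sample_weight`, `lambda_`, and `gamma` on the final average can be sketched with plain NumPy. The arrays below are made-up values for illustration (one miss with normalized excess 2 and local density 1/3), and treating the weighted score as a normalized weighted mean via `np.average` is an assumption about how the weights enter.

```python
import numpy as np

# Illustrative per-sample quantities (hypothetical values, not library output).
m_norm = np.array([0.0, 0.0, 2.0, 0.0, 0.0])   # normalized excess
d = np.array([0.0, 1 / 3, 1 / 3, 1 / 3, 0.0])  # local miss density
w = np.array([1, 1, 5, 1, 1])                  # sample weights


def weighted_score(m_norm, d, weights, lambda_, gamma):
    """Hypothetical helper: weighted mean of S_t = m~_t * (1 + lambda * d_t**gamma)."""
    severity = m_norm * (1.0 + lambda_ * d**gamma)
    return np.average(severity, weights=weights)


no_penalty = weighted_score(m_norm, d, w, lambda_=0.0, gamma=1.0)
strong = weighted_score(m_norm, d, w, lambda_=2.0, gamma=2.0)
print(no_penalty, strong)
```

With `lambda_=0` the result is just the weighted mean of the normalized excess (10/9 here). With `lambda_=2` and `gamma=2` the single miss is scaled by 1 + 2 × (1/3)² = 11/9, giving 110/81.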