kdiagram.plot.uncertainty.plot_interval_consistency
- kdiagram.plot.uncertainty.plot_interval_consistency(df, qlow_cols, qup_cols, q50_cols=None, theta_col=None, use_cv=True, cmap='coolwarm', acov='default', title=None, figsize=(9, 9), s=30, alpha=0.85, show_grid=True, mask_angle=False, savefig=None, dpi=300, ax=None)
Polar plot showing consistency of prediction interval widths.
This function generates a polar scatter plot to visualize the temporal consistency (or variability) of prediction interval widths (e.g., Q90 - Q10) across different locations over multiple time steps or forecast horizons:
- The angular position (`theta`) represents each location, currently derived from the DataFrame index and mapped onto the specified angular coverage (acov).
- The radial distance (`r`) quantifies the inconsistency or variability of the interval width over time for each location. It is calculated as either the standard deviation (absolute variability) or the coefficient of variation (CV, relative variability) of the interval widths (Upper Quantile - Lower Quantile) across the specified time steps. Higher r values indicate locations where the predicted uncertainty range fluctuates more over time.
- The color of each point represents the average median prediction (Q50) across the time steps, if q50_cols are provided. This adds context, helping to identify whether interval inconsistency occurs in regions of high or low average predictions. If q50_cols are not provided, the color represents the inconsistency measure r instead.
This plot is useful for diagnosing model reliability, identifying locations or conditions where the model’s uncertainty estimates are unstable or vary considerably across different forecast horizons.
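The index-based angular mapping described above can be sketched as follows. This is a minimal illustration and not the library's code: the helper `index_to_theta` and the `ACOV_SPANS` lookup are hypothetical names, and the formula theta_j = j / N * S is the one given in the Notes for this function.

```python
import numpy as np

# Hypothetical lookup of angular spans (in radians) for the documented
# acov options; not taken from the kdiagram source.
ACOV_SPANS = {
    "default": 2 * np.pi,         # 360 degrees
    "half_circle": np.pi,         # 180 degrees
    "quarter_circle": np.pi / 2,  # 90 degrees
    "eighth_circle": np.pi / 4,   # 45 degrees
}


def index_to_theta(n_rows, acov="default"):
    """Map row positions 0..n_rows-1 onto [0, S) via theta_j = j / N * S."""
    # Unknown options fall back to the full circle, as documented for acov.
    span = ACOV_SPANS.get(acov, ACOV_SPANS["default"])
    return np.arange(n_rows) / n_rows * span


theta = index_to_theta(4, acov="half_circle")
print(np.degrees(theta))  # four angles evenly spaced over the half circle
```

Because the angle depends only on row order, sorting the DataFrame (for example by latitude) before plotting changes where each location appears on the circle.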
- Parameters:
- df : pd.DataFrame
Input DataFrame containing the data. Must include the columns specified in qlow_cols and qup_cols. The @isdf decorator ensures this is a pandas DataFrame, and the @check_non_emptiness decorator ensures it is not empty.
- qlow_cols : list of str
List of column names representing the lower quantile (e.g., Q10) predictions for consecutive time steps (e.g., years). Order should correspond to the time steps. Example: ['subsidence_2023_q10', 'subsidence_2024_q10', ...].
- qup_cols : list of str
List of column names representing the upper quantile (e.g., Q90) predictions for the same consecutive time steps as qlow_cols. Must be the same length as qlow_cols. Example: ['subsidence_2023_q90', 'subsidence_2024_q90', ...].
- q50_cols : list of str, optional
List of column names representing the median quantile (Q50) predictions for the same time steps. If provided, the average Q50 value across these columns is used to color the points, and the list must be the same length as qlow_cols. If None, the color represents the radial value r (the inconsistency measure). Default is None.
- theta_col : str, optional
Intended column name to determine the angular position (theta) for each location (e.g., 'latitude', 'longitude', or a spatial index). If None, the DataFrame index is conceptually used. Note: The current implementation maps the DataFrame row index to the angular range specified by `acov`, regardless of whether `theta_col` is provided. Providing `theta_col` currently triggers a warning but does not affect the plot's angular axis. Default is None.
- use_cv : bool, default=True
Determines the measure of interval width variability used for the radial coordinate r:
If True, r is the Coefficient of Variation (CV) of the interval widths (Std Dev / Mean). CV measures relative variability, useful when mean widths differ substantially.
If False, r is the Standard Deviation (Std Dev) of the interval widths. Std Dev measures absolute variability.
- cmap : str, default='coolwarm'
The name of the Matplotlib colormap used to color the scatter points based on the average Q50 value (or r if q50_cols is None).
- acov : str, default='default'
Angular coverage defining the span of the polar plot's theta axis. Options: 'default' (360°), 'half_circle' (180°), 'quarter_circle' (90°), 'eighth_circle' (45°). Invalid options default to 'default'.
- title : str, optional
The title displayed above the polar plot. If None, a default title like "Prediction Interval Consistency (Q90-Q10)" is used. Default is None.
- figsize : tuple of (float, float), default=(9, 9)
The width and height of the figure in inches.
- s : float or int, default=30
The marker size for the scatter points.
- alpha : float, default=0.85
The transparency level of the scatter points (0=transparent, 1=opaque).
- show_grid : bool, default=True
If True, display the polar grid lines.
- mask_angle : bool, default=False
If True, hide the angular tick labels. Useful if the index-based angle is not directly interpretable.
- savefig : str, optional
File path to save the plot image. If None, displays the plot interactively. Default is None.
- Returns:
- ax : matplotlib.axes.Axes
The Matplotlib Axes object containing the polar scatter plot.
- Raises:
AssertionError If qlow_cols and qup_cols have different lengths.
ValueError If specified columns in qlow_cols, qup_cols, or q50_cols are not found in the DataFrame.
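The documented failure modes can also be checked before calling the function. The helper below is a hypothetical pre-flight check, not part of kdiagram; it only mirrors the conditions listed above and reports problems instead of raising.

```python
import pandas as pd


def check_quantile_columns(df, qlow_cols, qup_cols, q50_cols=None):
    """Hypothetical helper: return a list of problems, empty if none found."""
    problems = []
    # Lower and upper quantile lists must pair up one-to-one.
    if len(qlow_cols) != len(qup_cols):
        problems.append("qlow_cols and qup_cols differ in length")
    if q50_cols is not None and len(q50_cols) != len(qlow_cols):
        problems.append("q50_cols and qlow_cols differ in length")
    # Every requested column must exist in the DataFrame.
    requested = list(qlow_cols) + list(qup_cols) + list(q50_cols or [])
    missing = [c for c in requested if c not in df.columns]
    if missing:
        problems.append(f"columns not found: {missing}")
    return problems


df = pd.DataFrame({"a_q10": [1.0], "a_q90": [3.0]})
print(check_quantile_columns(df, ["a_q10"], ["a_q90"]))           # no problems
print(check_quantile_columns(df, ["a_q10"], ["a_q90", "b_q90"]))  # two problems
```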
See also
plot_velocity : Plot average velocity in polar coordinates.
numpy.std : Compute the standard deviation.
numpy.mean : Compute the arithmetic mean.
matplotlib.pyplot.scatter : Create scatter plots.
Notes
Interval-width consistency is assessed from paired lower/upper quantiles for multiple time steps. For each location and time step, the width is computed as upper minus lower. The radial value encodes either the standard deviation of these widths (absolute variability) or their coefficient of variation (relative variability), with safe handling of zero means by setting the CV to zero when the average width is numerically indistinguishable from zero. Angles are derived from the row index and mapped linearly across the angular span determined by acov; the current implementation does not use theta_col for positioning. Rows containing missing values in any required column are dropped prior to computation. These diagnostics relate to standard notions of predictive-interval calibration and stability; see Gneiting et al. [1], Jolliffe and Stephenson [2].
Interval widths. Let \(\mathbf L\) and \(\mathbf U\) be matrices extracted from df using qlow_cols and qup_cols, respectively, both of shape \((N,M)\), with \(N\) locations and \(M\) time steps. Define the width matrix \(\mathbf W\) by
\[W_{j,i} = U_{j,i} - L_{j,i} \tag{1}\]
where \(j\) indexes locations (\(0\) to \(N-1\)) and \(i\) indexes time steps (\(0\) to \(M-1\)).
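The width matrix can be reproduced with a few lines of pandas and NumPy. This is a sketch on made-up toy data (the column names are illustrative), showing the row-dropping step and the element-wise subtraction from the definition above.

```python
import numpy as np
import pandas as pd

# Toy data: 3 locations, 2 time steps; the last row has a missing value.
df = pd.DataFrame({
    "v_2023_q10": [1.0, 2.0, np.nan],
    "v_2024_q10": [1.5, 2.0, 3.0],
    "v_2023_q90": [4.0, 6.0, 9.0],
    "v_2024_q90": [5.5, 6.0, 9.5],
})
qlow_cols = ["v_2023_q10", "v_2024_q10"]
qup_cols = ["v_2023_q90", "v_2024_q90"]

# Drop rows with missing values in any required column, as the Notes describe.
clean = df.dropna(subset=qlow_cols + qup_cols)

# L and U have shape (N, M); W[j, i] = U[j, i] - L[j, i].
L = clean[qlow_cols].to_numpy()
U = clean[qup_cols].to_numpy()
W = U - L
print(W.shape)
print(W)
```

The column order in qlow_cols and qup_cols must match, since the subtraction pairs columns by position rather than by name.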
Radial Coordinate Calculation (`r`): Let \(\mathbf{w}_j = (W_{j,0}, \dots, W_{j,M-1})\) be the vector of widths over time for location \(j\). Let \(\bar{w}_j = \text{mean}(\mathbf{w}_j)\) and \(\sigma_{w_j} = \text{std}(\mathbf{w}_j)\).
If use_cv=False (Standard Deviation):
\[r_j = \sigma_{w_j} \tag{2}\]
If use_cv=True (Coefficient of Variation):
\[r_j = \begin{cases} \dfrac{\sigma_{w_j}}{\bar{w}_j} & \text{if } |\bar{w}_j| > \epsilon \\ 0 & \text{if } |\bar{w}_j| \le \epsilon \end{cases} \tag{3}\]
where \(\epsilon\) is a small threshold to prevent division by zero.
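The two radial measures can be compared on a small width matrix. The sketch below is not the library's implementation: the function name `radial_values`, the threshold value `eps=1e-9`, and the use of the population standard deviation (NumPy's default, ddof=0) are all assumptions.

```python
import numpy as np

# Hypothetical width matrix W: 3 locations (rows) x 4 time steps (columns).
W = np.array([
    [2.0, 2.0, 2.0, 2.0],      # constant width
    [1.0, 3.0, 1.0, 3.0],      # narrow intervals that fluctuate
    [10.0, 12.0, 10.0, 12.0],  # wide intervals, same absolute spread
])


def radial_values(W, use_cv=True, eps=1e-9):
    """Per-location std dev or CV of widths; eps is an assumed threshold."""
    sigma = W.std(axis=1)  # population std (ddof=0), an assumption
    if not use_cv:
        return sigma
    mean = W.mean(axis=1)
    r = np.zeros_like(sigma)
    ok = np.abs(mean) > eps  # leave r at 0 where the mean width is ~0
    r[ok] = sigma[ok] / mean[ok]
    return r


print(radial_values(W, use_cv=False))  # last two rows have equal std dev
print(radial_values(W, use_cv=True))   # CV is larger for the narrow intervals
```

The second and third rows have the same standard deviation, but the CV separates them: the same absolute fluctuation is a larger fraction of a narrow interval than of a wide one.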
Color Value Calculation (`c`): Let \(\mathbf{Q50}\) be the data matrix (shape \((N, M)\)) from q50_cols.
If q50_cols is provided: Let \(\mathbf{q50}_j = (Q50_{j,0}, \dots, Q50_{j,M-1})\).
\[c_j = \text{mean}(\mathbf{q50}_j) = \frac{1}{M} \sum_{i=0}^{M-1} Q50_{j,i} \tag{4}\]
If q50_cols is None: \(c_j = r_j\)
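The color rule amounts to a row-wise mean with a fallback. A minimal sketch, assuming toy Q50 columns and placeholder radial values; `color_values` is a hypothetical helper name, not a kdiagram function.

```python
import numpy as np
import pandas as pd

# Toy Q50 predictions for 2 locations over 2 time steps.
df = pd.DataFrame({
    "v_2023_q50": [5.0, 8.0],
    "v_2024_q50": [7.0, 8.0],
})
r = np.array([0.5, 0.1])  # placeholder radial values for the same two rows


def color_values(df, r, q50_cols=None):
    """Return the row-wise mean of the Q50 columns, or r if none are given."""
    if q50_cols is None:
        return np.asarray(r)
    return df[q50_cols].to_numpy().mean(axis=1)


print(color_values(df, r, ["v_2023_q50", "v_2024_q50"]))  # mean Q50 per row
print(color_values(df, r))                                # falls back to r
```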
Angular Coordinate Calculation (`theta`): Same index-based calculation as plot_velocity. Let \(S\) be the angular span from acov.
\[\theta_j = \frac{j}{N} \times S \tag{5}\]
References
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from kdiagram.plot.uncertainty import plot_interval_consistency
1. Random Example:
>>> np.random.seed(1)
>>> N_points = 120
>>> df_rand_interval = pd.DataFrame({
...     'id': range(N_points),
...     'lat': np.linspace(30, 31, N_points),
...     'val_2021_q10': np.random.rand(N_points) * 5,
...     'val_2021_q50': np.random.rand(N_points) * 5 + 5,
...     'val_2021_q90': np.random.rand(N_points) * 5 + 10,
...     'val_2022_q10': np.random.rand(N_points) * 6,  # Slightly wider
...     'val_2022_q50': np.random.rand(N_points) * 6 + 6,
...     'val_2022_q90': np.random.rand(N_points) * 6 + 12,
...     'val_2023_q10': np.random.rand(N_points) * 4,  # Narrower
...     'val_2023_q50': np.random.rand(N_points) * 4 + 7,
...     'val_2023_q90': np.random.rand(N_points) * 4 + 11,
... })
>>> q10_cols_rand = ['val_2021_q10', 'val_2022_q10', 'val_2023_q10']
>>> q90_cols_rand = ['val_2021_q90', 'val_2022_q90', 'val_2023_q90']
>>> q50_cols_rand = ['val_2021_q50', 'val_2022_q50', 'val_2023_q50']
>>> ax_rand_ic = plot_interval_consistency(
...     df=df_rand_interval,
...     qlow_cols=q10_cols_rand,
...     qup_cols=q90_cols_rand,
...     q50_cols=q50_cols_rand,
...     theta_col='lat',  # Note: Ignored for positioning
...     use_cv=True,  # Use CV for radial axis
...     cmap='viridis',
...     acov='half_circle',
...     title='Random Interval Width Consistency (CV)',
...     s=35
... )
>>> # plt.show() called internally
2. Concrete Example (Subsidence Data - adapted from docstring):
>>> # Assume zhongshan_pred_2023_2026 is a loaded DataFrame.
>>> # Create dummy data if it doesn't exist.
>>> try:
...     zhongshan_pred_2023_2026
... except NameError:
...     print("Creating dummy subsidence data for example...")
...     N_sub = 150
...     zhongshan_pred_2023_2026 = pd.DataFrame({
...         'latitude': np.linspace(22.2, 22.8, N_sub),
...         **{f'subsidence_{yr}_q10': np.random.rand(N_sub)*(yr-2020)+1
...            for yr in range(2023, 2027)},
...         **{f'subsidence_{yr}_q50': np.random.rand(N_sub)*(yr-2019)+5
...            + np.linspace(0, (yr-2022)*2, N_sub)
...            for yr in range(2023, 2027)},
...         **{f'subsidence_{yr}_q90': np.random.rand(N_sub)*(yr-2018)+10
...            + np.linspace(0, (yr-2022)*4, N_sub)
...            for yr in range(2023, 2027)},
...     })
>>> qlow_sub = [f'subsidence_{yr}_q10' for yr in range(2023, 2027)]
>>> qup_sub = [f'subsidence_{yr}_q90' for yr in range(2023, 2027)]
>>> q50_sub = [f'subsidence_{yr}_q50' for yr in range(2023, 2027)]
>>> ax_sub_ic = plot_interval_consistency(
...     df=zhongshan_pred_2023_2026,
...     qlow_cols=qlow_sub,
...     qup_cols=qup_sub,
...     q50_cols=q50_sub,
...     theta_col='latitude',  # Ignored for positioning, triggers warning
...     acov='default',
...     title='Subsidence Uncertainty Consistency (2023-2026)',
...     use_cv=False,  # Use Std Dev for radius
...     cmap='coolwarm',
...     s=28,
...     alpha=0.8,
...     mask_angle=True
... )