kdiagram.plot.uncertainty.plot_temporal_uncertainty
- kdiagram.plot.uncertainty.plot_temporal_uncertainty(df, q_cols='auto', theta_col=None, names=None, acov='default', figsize=(8.0, 8.0), title=None, cmap='tab10', normalize=True, show_grid=True, grid_props=None, alpha=0.7, s=25, dot_style='o', legend_loc='upper right', mask_label=False, mask_angle=True, savefig=None, dpi=300, ax=None)
Visualize multiple data series using polar scatter plots.
This function creates a general-purpose polar scatter plot to visualize and compare one or more data series (columns) from a DataFrame in a circular layout. Each series is plotted with a distinct color [1].
Angular Position (`theta`): Represents each data point or sample, ordered by the DataFrame index after removing rows with NaNs in the selected q_cols (and theta_col if used for NaN alignment). The points are mapped linearly onto the specified angular coverage (acov). The theta_col parameter is currently ignored for sorting/positioning but helps align data if it has NaNs.
Radial Distance (`r`): Represents the magnitude of the values from each column specified in q_cols. Values can be optionally normalized independently for each series using min-max scaling (normalize=True).
Color: Each data series (column in q_cols) is assigned a unique color based on the specified cmap [2].
This plot is flexible and can be used for various purposes, such as:
Comparing different model predictions for the same target.
Visualizing different quantile predictions (e.g., Q10, Q50, Q90) for a single time step to show uncertainty spread.
Plotting related variables against each other in a polar context.
- Parameters:
- df
pd.DataFrame Input DataFrame containing the data columns to be plotted. Decorators ensure it’s a valid, non-empty pandas DataFrame.
- q_cols
str or list of str, default='auto' Specifies the columns to plot. If 'auto', attempts to automatically detect quantile columns (e.g., names containing '_q' followed by numbers) using the helper function detect_quantiles_in. Raises an error if detection fails or no such columns are found. If a list of strings, these column names are used directly. Must contain at least one valid column name from df.
- theta_col
str, optional Intended for ordering points angularly based on this column's values. Note: currently ignored for sorting/positioning; index order is used instead. However, if provided and present in `df`, it is included when checking for and dropping NaN values, so that angles and plotted values stay aligned. A warning is issued regarding its limited use. Default is None.
- names
list of str, optional Custom labels for each data series specified in q_cols, used in the plot legend. Must match the number of columns in q_cols. If None, generic labels like 'Q1', 'Q2', … (or labels based on detected quantile names) are generated. Default is None.
- acov
{'default', 'half_circle', 'quarter_circle', 'eighth_circle'}, default='default' Specifies the angular coverage (span) of the polar plot: 'default' (360°), 'half_circle' (180°), 'quarter_circle' (90°), 'eighth_circle' (45°).
- figsize
tuple of (float, float), default=(8.0, 8.0) Width and height of the figure in inches.
- title
str, optional Custom title for the plot. If None, a default title is used.
- cmap
str, default='tab10' Name of the Matplotlib colormap used to assign distinct colors to each data series specified in q_cols.
- normalize
bool, default=True If True, normalize the values within each column specified in q_cols independently to the range [0, 1] using min-max scaling before plotting. Useful for comparing shapes of series with different scales. If False, raw values are plotted.
- show_grid
bool, default=True If True, display the polar grid lines.
- alpha
float, default=0.7 Transparency level for the scatter points (0=transparent, 1=opaque).
- s
int, default=25 Marker size for the scatter points.
- dot_style
str, default=’o’ Marker style used for the scatter points (e.g., ‘o’, ‘.’, ‘x’, ‘+’). Passed to matplotlib.pyplot.scatter.
- legend_loc
str, default=’upper right’ Location of the legend identifying the data series. Common values include ‘best’, ‘upper right’, ‘lower left’, etc.
- mask_angle
bool, default=True If True, hide the angular tick labels (degrees). Recommended if the angle is based on index.
- savefig
str, optional File path to save the plot image. If None, the plot is displayed interactively.
- Returns:
- ax
matplotlib.axes._axes.Axes The Matplotlib Axes object (PolarAxesSubplot) containing the plot.
- Raises:
ValueError If q_cols='auto' fails to find columns or detect_quantiles_in fails; if q_cols (explicitly provided or auto-detected) is empty or contains columns not found in df; if names is provided but its length does not match q_cols; or if the acov value is invalid.
TypeError If data in q_cols is not numeric.
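The 'auto' mode of q_cols relies on names containing '_q' followed by numbers. The following is a minimal sketch of that naming rule using only the standard library. It is not the detect_quantiles_in helper itself: the function name `find_quantile_columns`, the regular expression, and the sort-by-level behavior are assumptions made for illustration.

```python
import re

# Assumed pattern: the name ends with "_q" followed by digits,
# e.g. "subsidence_2023_q10". The real helper may accept other forms.
QUANTILE_PATTERN = re.compile(r"_q(\d+(?:\.\d+)?)$")


def find_quantile_columns(columns):
    """Return quantile-like column names, sorted by quantile level."""
    matches = []
    for name in columns:
        m = QUANTILE_PATTERN.search(str(name))
        if m:
            matches.append((float(m.group(1)), name))
    if not matches:
        # Mirrors the documented behavior of raising when nothing is found.
        raise ValueError("No quantile-like columns found; pass q_cols explicitly.")
    return [name for _, name in sorted(matches)]


cols = ["latitude", "subsidence_2023_q90",
        "subsidence_2023_q10", "subsidence_2023_q50"]
detected = find_quantile_columns(cols)
print(detected)
```

If automatic detection is too permissive or too strict for a given naming scheme, passing an explicit list to q_cols avoids the question entirely.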
See also
plot_polar_uncertainty_spread: Previous function focused on plotting quantiles for uncertainty (potentially similar).
detect_quantiles_in: Helper function for automatic column detection.
matplotlib.pyplot.scatter: Underlying function for plotting points.
Notes
Normalization \(\aleph\) (normalize=True) is applied independently to each column in q_cols, scaling each series to its own [0, 1] range.
Rows containing NaN values in any of the selected q_cols (and theta_col if present) are dropped before plotting to ensure alignment. theta angles are based on the index of this cleaned data.
The theta_col parameter currently does not affect the order or position of points but is used for consistent NaN handling.
Radial axis ticks and labels are hidden by default (set_yticklabels([])) as the radius represents potentially normalized values from multiple series, making a single scale less meaningful.
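The NaN handling described in these notes can be sketched with plain pandas. This is an illustration under assumptions rather than the function's actual code: the column names and the `check_cols` variable are hypothetical, and resetting the index is one way to make "index of the cleaned data" concrete.

```python
import numpy as np
import pandas as pd

# Toy data: one NaN in the theta column, one NaN in a plotted column.
df = pd.DataFrame({
    "latitude": [22.2, np.nan, 22.4, 22.5],
    "q10": [1.0, 2.0, np.nan, 4.0],
    "q90": [5.0, 6.0, 7.0, 8.0],
})
q_cols = ["q10", "q90"]
theta_col = "latitude"

# theta_col only joins the NaN check; it does not affect ordering.
check_cols = list(q_cols)
if theta_col is not None and theta_col in df.columns:
    check_cols.append(theta_col)

# Drop any row with a NaN in the checked columns, then renumber the rows
# so that positions 0..N-1 can be mapped to angles.
clean = df.dropna(subset=check_cols).reset_index(drop=True)
print(len(clean))
```

Dropping rows jointly across all selected columns keeps every series the same length, so a single set of angles can be shared by all of them.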
Let \(\mathbf{V}_i\) be the data vector for column i in q_cols (length \(N\), after NaN removal based on all selected columns).
Normalization (if normalize=True, applied per column i):
\[\mathbf{v}'_i = \aleph(\mathbf{V}_i) \tag{1}\]
where
\[\aleph(\mathbf{x})_j = \frac{x_j - \min(\mathbf{x})}{\max(\mathbf{x}) - \min(\mathbf{x})} \tag{2}\]
(If the range is zero, the original vector is returned unchanged.)
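As a rough illustration of the per-column min-max scaling, including the zero-range case, the sketch below uses NumPy. The helper name `min_max_normalize` is hypothetical and is not part of kdiagram.

```python
import numpy as np


def min_max_normalize(x):
    """Scale x to [0, 1]; return x unchanged if all values are equal."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        # Zero range: avoid dividing by zero and keep the original values.
        return x
    return (x - x.min()) / span


a = min_max_normalize([50.0, 55.0, 60.0])  # evenly spaced values
b = min_max_normalize([3.0, 3.0, 3.0])     # constant series
print(a, b)
```

Because each column is scaled on its own, two series with very different magnitudes end up on the same [0, 1] radial range, which helps compare shapes but hides absolute differences.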
Angular Coordinate (`theta`): Let \(S\) be the angular span and \(\theta_{min}\) the start angle from acov. For index \(j\) (\(j=0, \dots, N-1\) of cleaned data):
\[\theta_j = \left( \frac{j}{N} \times S \right) + \theta_{min} \tag{3}\]
Radial Coordinate (`r`): For series i and point j:
\(r_{i,j} = v'_{i,j}\) (if normalized) or \(r_{i,j} = V_{i,j}\) (if not normalized).
Plotting: For each series i, plot points \((r_{i,j}, \theta_j)\) using scatter with a distinct color derived from cmap.
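The angular mapping and the per-series scatter call can be sketched as follows. This is a simplified stand-in, not the library's implementation: the `ACOV_SPANS` dictionary, the `index_to_theta` helper, and a start angle of 0 are assumptions, and the data are random placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Angular span S (radians) for each documented acov option.
ACOV_SPANS = {
    "default": 2 * np.pi,
    "half_circle": np.pi,
    "quarter_circle": np.pi / 2,
    "eighth_circle": np.pi / 4,
}


def index_to_theta(n_points, acov="default", theta_min=0.0):
    """theta_j = (j / N) * S + theta_min, for j = 0 .. N-1."""
    if acov not in ACOV_SPANS:
        raise ValueError(f"Invalid acov: {acov!r}")
    j = np.arange(n_points)
    return j / n_points * ACOV_SPANS[acov] + theta_min


rng = np.random.default_rng(0)
series = {"Q10": rng.random(8), "Q90": rng.random(8) + 1.0}
theta = index_to_theta(8, acov="half_circle")

fig, ax = plt.subplots(subplot_kw={"projection": "polar"}, figsize=(8, 8))
cmap = plt.get_cmap("tab10")
for k, (label, r) in enumerate(series.items()):
    # Matplotlib's polar scatter takes the angle first, then the radius.
    ax.scatter(theta, r, s=25, alpha=0.7, marker="o",
               color=cmap(k), label=label)
ax.set_thetamax(180)    # restrict the visible sector to the half circle
ax.set_yticklabels([])  # hide radial labels, as the notes describe
ax.legend(loc="upper right")
```

Dividing by N rather than N-1 means the last point stops one step short of the full span, so on a 360° plot the final sample does not overlap the first.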
References
[1]Kouadio, K. L., Liu, R., Loukou, K. G. H., Liu, J., & Liu, W. (2025). Analytics Framework for Interpreting Spatiotemporal Probabilistic Forecasts. International Journal of Forecasting. Manuscript submitted.
[2]Matplotlib documentation: https://matplotlib.org/stable/
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from kdiagram.plot.uncertainty import plot_temporal_uncertainty
1. Random Example (Comparing two series):
>>> np.random.seed(42)
>>> N = 100
>>> df_comp_rand = pd.DataFrame({
...     'Index': range(N),
...     'ModelA_Pred': 50 + 10 * np.sin(np.linspace(0, 3 * np.pi, N)) + np.random.randn(N)*5,
...     'ModelB_Pred': 55 + 12 * np.sin(np.linspace(0, 3 * np.pi, N) - 0.5) + np.random.randn(N)*4,
... })
>>> ax_comp_rand = plot_temporal_uncertainty(
...     df=df_comp_rand,
...     q_cols=['ModelA_Pred', 'ModelB_Pred'],
...     names=['Model A', 'Model B'],
...     theta_col=None,    # Use index order
...     acov='default',
...     title='Comparison of Model A vs Model B',
...     normalize=True,    # Normalize for shape comparison
...     cmap='Set1',
...     dot_style='x',     # Use 'x' markers
...     mask_angle=False   # Show angle ticks
... )
>>> # plt.show() called internally
2. Concrete Example (Subsidence Quantiles for 2023):
>>> # Assume zhongshan_pred_2023_2026 is a loaded DataFrame
>>> # Create dummy data if it doesn't exist
>>> try:
...     zhongshan_pred_2023_2026
... except NameError:
...     print("Creating dummy subsidence data for example...")
...     N_sub = 150
...     zhongshan_pred_2023_2026 = pd.DataFrame({
...         'latitude': np.linspace(22.2, 22.8, N_sub),
...         'subsidence_2023_q10': np.random.rand(N_sub)*5 + 1 + np.linspace(0, 2, N_sub),
...         'subsidence_2023_q50': np.random.rand(N_sub)*5 + 3 + np.linspace(1, 3, N_sub),
...         'subsidence_2023_q90': np.random.rand(N_sub)*5 + 5 + np.linspace(2, 4, N_sub),
...     })
>>> # Ensure Q90 > Q50 > Q10 roughly
>>> zhongshan_pred_2023_2026['subsidence_2023_q50'] = np.maximum(
...     zhongshan_pred_2023_2026['subsidence_2023_q50'],
...     zhongshan_pred_2023_2026['subsidence_2023_q10'] + 0.1)
>>> zhongshan_pred_2023_2026['subsidence_2023_q90'] = np.maximum(
...     zhongshan_pred_2023_2026['subsidence_2023_q90'],
...     zhongshan_pred_2023_2026['subsidence_2023_q50'] + 0.1)
>>> ax_tu_sub = plot_temporal_uncertainty(
...     df=zhongshan_pred_2023_2026.head(100),  # Use subset
...     # Explicitly list quantile columns for 2023
...     q_cols=['subsidence_2023_q10', 'subsidence_2023_q50',
...             'subsidence_2023_q90'],
...     theta_col='latitude',    # Used for NaN alignment, not order
...     names=["Lower Bound (Q10)", "Median (Q50)", "Upper Bound (Q90)"],
...     acov='eighth_circle',    # Use smaller angle span
...     title='Uncertainty Spread for 2023 (Zhongshan)',
...     normalize=False,         # Plot raw values
...     cmap='coolwarm',         # Use diverging map for bounds
...     mask_angle=True,
...     s=30
... )
>>> # plt.show() called internally