kdiagram.plot.uncertainty.plot_temporal_uncertainty

kdiagram.plot.uncertainty.plot_temporal_uncertainty(df, q_cols='auto', theta_col=None, names=None, acov='default', figsize=(8.0, 8.0), title=None, cmap='tab10', normalize=True, show_grid=True, grid_props=None, alpha=0.7, s=25, dot_style='o', legend_loc='upper right', mask_label=False, mask_angle=True, savefig=None, dpi=300, ax=None)

Visualize multiple data series using polar scatter plots.

This function creates a general-purpose polar scatter plot to visualize and compare one or more data series (columns) from a DataFrame in a circular layout. Each series is plotted with a distinct color [1].

  • Angular Position (`theta`): Represents each data point or sample, ordered by the DataFrame index after rows with NaNs in the selected q_cols (and in theta_col, if provided) are removed. The points are mapped linearly onto the specified angular coverage (acov). The theta_col parameter is currently not used for sorting or positioning; it only takes part in NaN removal so that all series stay aligned.

  • Radial Distance (`r`): Represents the magnitude of the values from each column specified in q_cols. Values can be optionally normalized independently for each series using min-max scaling (normalize=True).

  • Color: Each data series (column in q_cols) is assigned a unique color based on the specified cmap [2].

This plot is flexible and can be used for various purposes, such as:

  • Comparing different model predictions for the same target.

  • Visualizing different quantile predictions (e.g., Q10, Q50, Q90) for a single time step to show uncertainty spread.

  • Plotting related variables against each other in a polar context.

Parameters:
df : pd.DataFrame

Input DataFrame containing the data columns to be plotted. Decorators ensure it’s a valid, non-empty pandas DataFrame.

q_cols : str or list of str, default='auto'

Specifies the columns to plot.

  • If 'auto', attempts to automatically detect quantile columns (e.g., names containing ‘_q’ followed by numbers) using the helper function detect_quantiles_in. Raises an error if detection fails or no such columns are found.

  • If a list of strings, these column names are used directly. Must contain at least one valid column name from df.

theta_col : str, optional

Intended for ordering points angularly based on this column’s values. Note: Currently ignored for sorting/positioning; uses index order. However, if provided and present in `df`, it is included when checking for and dropping NaN values to ensure data alignment between angles and plotted values. A warning is issued regarding its limited use. Default is None.

names : list of str, optional

Custom labels for each data series specified in q_cols, used in the plot legend. Must match the number of columns in q_cols. If None, generic labels like ‘Q1’, ‘Q2’, … (or based on detected quantile names) are generated. Default is None.

acov : {'default', 'half_circle', 'quarter_circle', 'eighth_circle'}, default='default'

Specifies the angular coverage (span) of the polar plot: 'default' (360°), 'half_circle' (180°), 'quarter_circle' (90°), 'eighth_circle' (45°).

figsize : tuple of (float, float), default=(8.0, 8.0)

Width and height of the figure in inches.

title : str, optional

Custom title for the plot. If None, a default title is used.

cmap : str, default='tab10'

Name of the Matplotlib colormap used to assign distinct colors to each data series specified in q_cols.

normalize : bool, default=True

If True, normalize the values within each column specified in q_cols independently to the range [0, 1] using min-max scaling before plotting. Useful for comparing shapes of series with different scales. If False, raw values are plotted.

show_grid : bool, default=True

If True, display the polar grid lines.

alpha : float, default=0.7

Transparency level for the scatter points (0=transparent, 1=opaque).

s : int, default=25

Marker size for the scatter points.

dot_style : str, default='o'

Marker style used for the scatter points (e.g., ‘o’, ‘.’, ‘x’, ‘+’). Passed to matplotlib.pyplot.scatter.

legend_loc : str, default='upper right'

Location of the legend identifying the data series. Common values include ‘best’, ‘upper right’, ‘lower left’, etc.

mask_angle : bool, default=True

If True, hide the angular tick labels (degrees). Recommended if the angle is based on index.

savefig : str, optional

File path to save the plot image. If None, displays interactively.

Returns:
ax : matplotlib.axes._axes.Axes

The Matplotlib Axes object (PolarAxesSubplot) containing the plot.

Raises:
ValueError

  • If q_cols='auto' fails to find columns or detect_quantiles_in fails.

  • If q_cols (explicitly provided or auto-detected) is empty or contains columns not found in df.

  • If names is provided but its length does not match q_cols.

  • If the acov value is invalid.

TypeError

If data in q_cols is not numeric.

See also

plot_polar_uncertainty_spread

Related function focused on plotting quantile-based uncertainty spread.

detect_quantiles_in

Helper function for automatic column detection.

matplotlib.pyplot.scatter

Underlying function for plotting points.

Notes

  • Normalization \(\aleph\) (normalize=True) is applied independently to each column in q_cols, scaling each series to its own [0, 1] range.

  • Rows containing NaN values in any of the selected q_cols (and theta_col if present) are dropped before plotting to ensure alignment. theta angles are based on the index of this cleaned data.

  • The theta_col parameter currently does not affect the order or position of points but is used for consistent NaN handling.

  • Radial axis ticks and labels are hidden by default (set_yticklabels([])) as the radius represents potentially normalized values from multiple series, making a single scale less meaningful.
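The NaN handling described in the notes above can be sketched with pandas. This is a minimal illustration under stated assumptions, not the library's actual code: the toy DataFrame, its column names, and the `check_cols` variable are hypothetical, and only the documented behavior (drop rows with NaNs in q_cols, plus theta_col when present) is reproduced.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two series plus a column passed as theta_col.
df = pd.DataFrame({
    "pred_q10": [1.0, np.nan, 3.0, 4.0],
    "pred_q90": [2.0, 3.0, 4.0, 5.0],
    "latitude": [22.2, 22.4, np.nan, 22.8],
})
q_cols = ["pred_q10", "pred_q90"]
theta_col = "latitude"

# Columns checked for NaNs: the plotted series, plus theta_col when it is
# given and present in the frame.
check_cols = list(q_cols)
if theta_col is not None and theta_col in df.columns:
    check_cols.append(theta_col)

# Drop any row with a NaN in the checked columns. Every series keeps the
# same rows, so point j refers to the same sample in each series.
clean = df.dropna(subset=check_cols)

print(len(df), "->", len(clean))   # 4 -> 2
print(clean.index.tolist())        # [0, 3]
```

Row 1 is dropped because of the NaN in a plotted column, and row 2 because of the NaN in the theta_col column, even though theta_col does not influence where the remaining points are placed.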

Let \(\mathbf{V}_i\) be the data vector for column i in q_cols (length \(N\), after NaN removal based on all selected columns).

  1. Normalization (if normalize=True, applied per column i):

    \[\mathbf{v}'_i = \aleph(\mathbf{V}_i) \tag{1}\]

    where

    \[\aleph(\mathbf{x})_j = \frac{x_j - \min(\mathbf{x})}{\max(\mathbf{x}) - \min(\mathbf{x})} \tag{2}\]

    (If the range is zero, i.e. \(\max(\mathbf{x}) = \min(\mathbf{x})\), the original vector is returned unchanged.)

  2. Angular Coordinate (`theta`): Let \(S\) be the angular span and \(\theta_{min}\) the start angle from acov. For index \(j\) (\(j=0, \dots, N-1\) of cleaned data):

    \[\theta_j = \left( \frac{j}{N} \times S \right) + \theta_{min} \tag{3}\]

  3. Radial Coordinate (`r`): For series i and point j:

    \(r_{i,j} = v'_{i,j}\) (if normalized) or \(r_{i,j} = V_{i,j}\) (if not normalized).

  4. Plotting: For each series i, plot points \((r_{i,j}, \theta_j)\) using scatter with a distinct color derived from cmap.
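
Steps 1 to 4 above can be sketched with NumPy and Matplotlib. This is an illustrative sketch, not the function's actual implementation: the helper names `minmax` and `polar_coords`, the `ACOV_SPAN` table, and the toy series are hypothetical, and the start angle \(\theta_{min}\) is assumed to be 0 because its value is not stated here.

```python
import numpy as np
import matplotlib.pyplot as plt

# Angular span S in radians for each documented acov option.
ACOV_SPAN = {
    "default": 2 * np.pi,
    "half_circle": np.pi,
    "quarter_circle": np.pi / 2,
    "eighth_circle": np.pi / 4,
}


def minmax(x):
    """Min-max scale x to [0, 1]; return x unchanged if its range is zero."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return x if span == 0 else (x - x.min()) / span


def polar_coords(values, acov="default", normalize=True, theta_min=0.0):
    """Return (theta, r) for one NaN-free series, following steps 1-3.

    theta_min=0.0 is an assumption made for this sketch.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    theta = np.arange(n) / n * ACOV_SPAN[acov] + theta_min   # equation (3)
    r = minmax(values) if normalize else values              # steps 1 and 3
    return theta, r


# Hypothetical series, assumed already free of NaNs.
series = {
    "Model A": [10.0, 20.0, 30.0, 40.0],
    "Model B": [5.0, 5.0, 5.0, 5.0],   # zero range: left unscaled
}

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
colors = plt.get_cmap("tab10")
for i, (name, vals) in enumerate(series.items()):
    theta, r = polar_coords(vals, acov="half_circle")
    # Matplotlib's polar scatter takes the angle first, then the radius.
    ax.scatter(theta, r, color=colors(i), label=name, alpha=0.7, s=25)
ax.legend(loc="upper right")
```

Because the angle is \(j/N \times S\), the last point sits one step short of the full span, so with acov='default' the first and last samples do not overlap.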

References

[1]

Kouadio, K. L., Liu, R., Loukou, K. G. H., Liu, J., & Liu, W. (2025). Analytics Framework for Interpreting Spatiotemporal Probabilistic Forecasts. International Journal of Forecasting. Manuscript submitted.

[2]

Matplotlib documentation: https://matplotlib.org/stable/

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from kdiagram.plot.uncertainty import plot_temporal_uncertainty

1. Random Example (Comparing two series):

>>> np.random.seed(42)
>>> N = 100
>>> df_comp_rand = pd.DataFrame({
...     'Index': range(N),
...     'ModelA_Pred': 50 + 10 * np.sin(np.linspace(0, 3 * np.pi, N)) + np.random.randn(N)*5,
...     'ModelB_Pred': 55 + 12 * np.sin(np.linspace(0, 3 * np.pi, N) - 0.5) + np.random.randn(N)*4,
... })
>>> ax_comp_rand = plot_temporal_uncertainty(
...     df=df_comp_rand,
...     q_cols=['ModelA_Pred', 'ModelB_Pred'],
...     names=['Model A', 'Model B'],
...     theta_col=None,           # Use index order
...     acov='default',
...     title='Comparison of Model A vs Model B',
...     normalize=True,           # Normalize for shape comparison
...     cmap='Set1',
...     dot_style='x',            # Use 'x' markers
...     mask_angle=False          # Show angle ticks
... )
>>> # plt.show() called internally

2. Concrete Example (Subsidence Quantiles for 2023):

>>> # Assume zhongshan_pred_2023_2026 is a loaded DataFrame
>>> # Create dummy data if it doesn't exist
>>> try:
...     zhongshan_pred_2023_2026
... except NameError:
...     print("Creating dummy subsidence data for example...")
...     N_sub = 150
...     zhongshan_pred_2023_2026 = pd.DataFrame({
...         'latitude': np.linspace(22.2, 22.8, N_sub),
...         'subsidence_2023_q10': np.random.rand(N_sub)*5 + 1 + np.linspace(0, 2, N_sub),
...         'subsidence_2023_q50': np.random.rand(N_sub)*5 + 3 + np.linspace(1, 3, N_sub),
...         'subsidence_2023_q90': np.random.rand(N_sub)*5 + 5 + np.linspace(2, 4, N_sub),
...     })
>>> # Ensure Q90 > Q50 > Q10 roughly
>>> zhongshan_pred_2023_2026['subsidence_2023_q50'] = np.maximum(
...     zhongshan_pred_2023_2026['subsidence_2023_q50'],
...     zhongshan_pred_2023_2026['subsidence_2023_q10'] + 0.1)
>>> zhongshan_pred_2023_2026['subsidence_2023_q90'] = np.maximum(
...     zhongshan_pred_2023_2026['subsidence_2023_q90'],
...     zhongshan_pred_2023_2026['subsidence_2023_q50'] + 0.1)
>>> ax_tu_sub = plot_temporal_uncertainty(
...     df=zhongshan_pred_2023_2026.head(100), # Use subset
...     # Explicitly list quantile columns for 2023
...     q_cols=['subsidence_2023_q10', 'subsidence_2023_q50',
...             'subsidence_2023_q90'],
...     theta_col='latitude',       # Used for NaN alignment, not order
...     names=["Lower Bound (Q10)", "Median (Q50)", "Upper Bound (Q90)"],
...     acov='eighth_circle',    # Use smaller angle span
...     title='Uncertainty Spread for 2023 (Zhongshan)',
...     normalize=False,          # Plot raw values
...     cmap='coolwarm',          # Use diverging map for bounds
...     mask_angle=True,
...     s=30
... )
>>> # plt.show() called internally