kdiagram.plot.uncertainty.plot_temporal_uncertainty
- kdiagram.plot.uncertainty.plot_temporal_uncertainty(df, q_cols='auto', theta_col=None, names=None, acov='default', figsize=(8.0, 8.0), title=None, cmap='tab10', normalize=True, show_grid=True, grid_props=None, alpha=0.7, s=25, dot_style='o', legend_loc='upper right', mask_label=False, mask_angle=True, savefig=None)
Visualize multiple data series using polar scatter plots.
This function creates a general-purpose polar scatter plot to visualize and compare one or more data series (columns) from a DataFrame in a circular layout. Each series is plotted with a distinct color.
Angular Position (`theta`): Represents each data point or sample, ordered by the DataFrame index after removing rows with NaNs in the selected q_cols (and theta_col if used for NaN alignment). The points are mapped linearly onto the specified angular coverage (acov). The theta_col parameter is currently ignored for sorting/positioning but helps align data if it has NaNs.
Radial Distance (`r`): Represents the magnitude of the values from each column specified in q_cols. Values can be optionally normalized independently for each series using min-max scaling (normalize=True).
Color: Each data series (column in q_cols) is assigned a unique color based on the specified cmap.
This plot is flexible and can be used for various purposes, such as:
- Comparing different model predictions for the same target.
- Visualizing different quantile predictions (e.g., Q10, Q50, Q90) for a single time step to show uncertainty spread.
- Plotting related variables against each other in a polar context.
- df : pd.DataFrame
Input DataFrame containing the data columns to be plotted. Decorators ensure it is a valid, non-empty pandas DataFrame.
- q_cols : str or list of str, default='auto'
Specifies the columns to plot.
  - If 'auto', attempts to automatically detect quantile columns (e.g., names containing '_q' followed by numbers) using the helper function detect_quantiles_in. Raises an error if detection fails or no such columns are found.
  - If a list of strings, these column names are used directly. Must contain at least one valid column name from df.
- theta_col : str, optional
Intended for ordering points angularly based on this column's values. Note: currently ignored for sorting/positioning; index order is used instead. However, if provided and present in df, it is included when checking for and dropping NaN values to ensure data alignment between angles and plotted values. A warning is issued regarding its limited use. Default is None.
- names : list of str, optional
Custom labels for each data series specified in q_cols, used in the plot legend. Must match the number of columns in q_cols. If None, generic labels like 'Q1', 'Q2', … (or based on detected quantile names) are generated. Default is None.
- acov : {'default', 'half_circle', 'quarter_circle', 'eighth_circle'}, default='default'
Specifies the angular coverage (span) of the polar plot: 'default' (360°), 'half_circle' (180°), 'quarter_circle' (90°), 'eighth_circle' (45°).
- figsize : tuple of (float, float), default=(8.0, 8.0)
Width and height of the figure in inches.
- title : str, optional
Custom title for the plot. If None, a default title is used.
- cmap : str, default='tab10'
Name of the Matplotlib colormap used to assign distinct colors to each data series specified in q_cols.
- normalize : bool, default=True
If True, normalize the values within each column specified in q_cols independently to the range [0, 1] using min-max scaling before plotting. Useful for comparing shapes of series with different scales. If False, raw values are plotted.
- show_grid : bool, default=True
If True, display the polar grid lines.
- alpha : float, default=0.7
Transparency level for the scatter points (0=transparent, 1=opaque).
- s : int, default=25
Marker size for the scatter points.
- dot_style : str, default='o'
Marker style used for the scatter points (e.g., 'o', '.', 'x', '+'). Passed to matplotlib.pyplot.scatter.
- legend_loc : str, default='upper right'
Location of the legend identifying the data series. Common values include 'best', 'upper right', 'lower left', etc.
- mask_angle : bool, default=True
If True, hide the angular tick labels (degrees). Recommended if the angle is based on index.
- savefig : str, optional
File path to save the plot image. If None, displays interactively.
- ax : matplotlib.axes._axes.Axes
The Matplotlib Axes object (PolarAxesSubplot) containing the plot.
- ValueError
If q_cols='auto' fails to find columns or detect_quantiles_in fails. If q_cols (explicitly provided or auto-detected) is empty or contains columns not found in df. If names is provided but its length does not match q_cols. If the acov value is invalid.
- TypeError
If data in q_cols is not numeric.
- plot_polar_uncertainty_spread : Previous function focused on plotting quantiles for uncertainty (potentially similar).
- detect_quantiles_in : Helper function for automatic column detection.
- matplotlib.pyplot.scatter : Underlying function for plotting points.
Normalization (normalize=True) is applied independently to each column in q_cols, scaling each series to its own [0, 1] range.
Rows containing NaN values in any of the selected q_cols (and theta_col if present) are dropped before plotting to ensure alignment. theta angles are based on the index of this cleaned data.
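The row-dropping rule above can be illustrated with plain pandas. This is a minimal sketch, not the library's internal code: the DataFrame and its column names are hypothetical, and the use of `dropna(subset=...)` is an assumption about how such alignment could be done.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({
    "latitude": [22.2, 22.3, np.nan, 22.5, 22.6],
    "pred_q10": [1.0, np.nan, 1.4, 1.6, 1.8],
    "pred_q90": [3.0, 3.2, 3.4, 3.6, np.nan],
})

q_cols = ["pred_q10", "pred_q90"]
theta_col = "latitude"

# Check the selected columns together with theta_col (when present in df),
# so every surviving row has a value in each plotted series.
check_cols = q_cols + ([theta_col] if theta_col in df.columns else [])
clean = df.dropna(subset=check_cols)

# Angles are then assigned by position in the cleaned frame, not by value.
n_points = len(clean)
positions = np.arange(n_points)
```

Only the rows with no missing value in any checked column survive, so all series share the same set of angular positions.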
The theta_col parameter currently does not affect the order or position of points but is used for consistent NaN handling.
Radial axis ticks and labels are hidden by default (set_yticklabels([])) as the radius represents potentially normalized values from multiple series, making a single scale less meaningful.
Let \(\mathbf{V}_i\) be the data vector for column i in q_cols (length \(N\), after NaN removal based on all selected columns).
Normalization (if normalize=True, applied per column i): \(\mathbf{v}'_i = \operatorname{normalize}(\mathbf{V}_i)\), where
\[ \operatorname{normalize}(\mathbf{x})_j = \frac{x_j - \min(\mathbf{x})}{\max(\mathbf{x}) - \min(\mathbf{x})} \]
(Handles zero range by returning the original vector).
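The per-column min-max scaling, including the zero-range case, might look like the following sketch. The function name `min_max_normalize` is hypothetical and is not claimed to match the library's private helper.

```python
import numpy as np

def min_max_normalize(x):
    """Scale x to [0, 1]; return it unchanged when its range is zero."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        # Zero range: avoid dividing by zero and keep the original values.
        return x
    return (x - x_min) / (x_max - x_min)

scaled = min_max_normalize([50.0, 55.0, 60.0])    # spans the full [0, 1] range
constant = min_max_normalize([3.0, 3.0, 3.0])     # returned as-is
```

Because each column is scaled on its own, two series with very different magnitudes end up on the same radial range, which is what makes their shapes comparable.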
Angular Coordinate (`theta`): Let \(S\) be the angular span and \(\theta_{min}\) the start angle from acov. For index \(j\) (\(j=0, \dots, N-1\) of cleaned data):
\[ \theta_j = \left( \frac{j}{N} \times S \right) + \theta_{min} \]
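As a rough illustration of the angular mapping, the sketch below converts an acov name to a span in radians and spreads N points over it. The names `ACOV_SPAN` and `compute_theta` are hypothetical, and the start angle of 0 is an assumption, since the start angle is not stated here.

```python
import numpy as np

# Angular spans in radians for the documented acov options.
ACOV_SPAN = {
    "default": 2 * np.pi,         # 360 degrees
    "half_circle": np.pi,         # 180 degrees
    "quarter_circle": np.pi / 2,  # 90 degrees
    "eighth_circle": np.pi / 4,   # 45 degrees
}

def compute_theta(n_points, acov="default", theta_min=0.0):
    """Return theta_j = (j / N) * S + theta_min for j = 0..N-1."""
    if acov not in ACOV_SPAN:
        raise ValueError(f"Invalid acov: {acov!r}")
    span = ACOV_SPAN[acov]
    j = np.arange(n_points)
    return (j / n_points) * span + theta_min

theta = compute_theta(4, acov="half_circle")
```

Dividing by N rather than N - 1 means the last point stops one step short of the end of the span, so on a full circle the first and last points do not overlap.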
Radial Coordinate (`r`): For series i and point j: \(r_{i,j} = v'_{i,j}\) (if normalized) or \(r_{i,j} = v_{i,j}\) (if not normalized).
Plotting: For each series i, plot points \((r_{i,j}, \theta_j)\) using scatter with a distinct color derived from cmap.
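The plotting step can be approximated directly with Matplotlib. This is a sketch under stated assumptions rather than the function's actual implementation: the data are synthetic, the series labels are made up, and the `Agg` backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in data, already in [0, 1] as if normalize=True.
rng = np.random.default_rng(0)
n_points = 50
series = {"Series A": rng.random(n_points), "Series B": rng.random(n_points)}

# Index-based angles over a full circle.
theta = np.arange(n_points) / n_points * 2 * np.pi

fig, ax = plt.subplots(figsize=(8.0, 8.0), subplot_kw={"projection": "polar"})

# One distinct colour per series, taken from the colormap.
colors = plt.get_cmap("tab10")(np.arange(len(series)))
for (label, r), color in zip(series.items(), colors):
    ax.scatter(theta, r, s=25, marker="o", alpha=0.7, color=color, label=label)

ax.set_yticklabels([])        # hide radial labels, as described in the notes
ax.legend(loc="upper right")

n_collections = len(ax.collections)  # one scatter collection per series
plt.close(fig)
```

Note that on a polar axis `scatter` takes the angle first and the radius second.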
>>> import pandas as pd
>>> import numpy as np
>>> from kdiagram.plot.uncertainty import plot_temporal_uncertainty
1. Random Example (Comparing two series):
>>> np.random.seed(42)
>>> N = 100
>>> df_comp_rand = pd.DataFrame({
...     'Index': range(N),
...     'ModelA_Pred': 50 + 10 * np.sin(np.linspace(0, 3 * np.pi, N)) + np.random.randn(N)*5,
...     'ModelB_Pred': 55 + 12 * np.sin(np.linspace(0, 3 * np.pi, N) - 0.5) + np.random.randn(N)*4,
... })
>>> ax_comp_rand = plot_temporal_uncertainty(
...     df=df_comp_rand,
...     q_cols=['ModelA_Pred', 'ModelB_Pred'],
...     names=['Model A', 'Model B'],
...     theta_col=None,      # Use index order
...     acov='default',
...     title='Comparison of Model A vs Model B',
...     normalize=True,      # Normalize for shape comparison
...     cmap='Set1',
...     dot_style='x',       # Use 'x' markers
...     mask_angle=False     # Show angle ticks
... )
>>> # plt.show() called internally
2. Concrete Example (Subsidence Quantiles for 2023):
>>> # Assume zhongshan_pred_2023_2026 is a loaded DataFrame
>>> # Create dummy data if it doesn't exist
>>> try:
...     zhongshan_pred_2023_2026
... except NameError:
...     print("Creating dummy subsidence data for example...")
...     N_sub = 150
...     zhongshan_pred_2023_2026 = pd.DataFrame({
...         'latitude': np.linspace(22.2, 22.8, N_sub),
...         'subsidence_2023_q10': np.random.rand(N_sub)*5 + 1 + np.linspace(0, 2, N_sub),
...         'subsidence_2023_q50': np.random.rand(N_sub)*5 + 3 + np.linspace(1, 3, N_sub),
...         'subsidence_2023_q90': np.random.rand(N_sub)*5 + 5 + np.linspace(2, 4, N_sub),
...     })
>>> # Ensure Q90 > Q50 > Q10 roughly
>>> zhongshan_pred_2023_2026['subsidence_2023_q50'] = np.maximum(
...     zhongshan_pred_2023_2026['subsidence_2023_q50'],
...     zhongshan_pred_2023_2026['subsidence_2023_q10'] + 0.1)
>>> zhongshan_pred_2023_2026['subsidence_2023_q90'] = np.maximum(
...     zhongshan_pred_2023_2026['subsidence_2023_q90'],
...     zhongshan_pred_2023_2026['subsidence_2023_q50'] + 0.1)

>>> ax_tu_sub = plot_temporal_uncertainty(
...     df=zhongshan_pred_2023_2026.head(100),  # Use subset
...     # Explicitly list quantile columns for 2023
...     q_cols=['subsidence_2023_q10', 'subsidence_2023_q50',
...             'subsidence_2023_q90'],
...     theta_col='latitude',     # Used for NaN alignment, not order
...     names=["Lower Bound (Q10)", "Median (Q50)", "Upper Bound (Q90)"],
...     acov='eighth_circle',     # Use smaller angle span
...     title='Uncertainty Spread for 2023 (Zhongshan)',
...     normalize=False,          # Plot raw values
...     cmap='coolwarm',          # Use diverging map for bounds
...     mask_angle=True,
...     s=30
... )
>>> # plt.show() called internally