kdiagram.plot.uncertainty.plot_velocity
- kdiagram.plot.uncertainty.plot_velocity(df, q50_cols, theta_col=None, cmap='viridis', acov='default', normalize=True, use_abs_color=True, figsize=(9, 9), title=None, s=30, alpha=0.85, show_grid=True, savefig=None, cbar=True, mask_angle=False, dpi=300, ax=None)
Polar plot visualizing average velocity across locations.
Generates a polar scatter plot where each point represents a unique location or observation from the input DataFrame. The radial distance (r) of each point corresponds to the average rate of change (velocity) of the median prediction (Q50) over consecutive time periods (e.g., years), optionally normalized to [0, 1]. The angular position (theta) represents the location, currently determined by its index in the DataFrame, mapped onto a specified angular coverage. The color of each point provides an additional dimension, representing either the calculated velocity itself or the average absolute magnitude of the Q50 predictions over the considered time periods [1].
This visualization is useful for identifying spatial patterns in the dynamics of a phenomenon, such as locating areas of rapid or slow change (high/low velocity) in land subsidence predictions. Coloring by magnitude helps to contextualize the velocity (e.g., is high velocity occurring in areas of already high subsidence?).
- Parameters:
- df
pd.DataFrame Input DataFrame containing the data. Must include the columns specified in q50_cols. Decorator @isdf ensures this is a pandas DataFrame. Decorator @check_non_emptiness ensures it’s not empty.
- q50_cols
list of str An ordered list of column names representing the Q50 (median) predictions for consecutive time steps (e.g., years). The list must contain at least two column names to compute velocity. Example: ['subsidence_2022_q50', 'subsidence_2023_q50', 'subsidence_2024_q50'].
- theta_col
str, optional Intended column name to determine the angular position (theta) for each location (e.g., 'latitude', 'longitude', or a spatial index). If None, the DataFrame index is conceptually used. Note: The current implementation maps the DataFrame row index to the angular range specified by `acov`, regardless of whether `theta_col` is provided. Providing `theta_col` will currently trigger a warning but will not affect the plot's angular axis. Default is None.
- cmap
str, default='viridis' The name of the Matplotlib colormap used to color the scatter points based on color_vals (determined by use_abs_color).
- acov
str, default='default' Angular coverage defining the span of the polar plot's theta axis. Options are:
'default': Full circle (2π radians or 360 degrees).
'half_circle': Half circle (π radians or 180 degrees).
'quarter_circle': Quarter circle (π/2 radians or 90 degrees).
'eighth_circle': Eighth circle (π/4 radians or 45 degrees).
Invalid options default to 'default'.
- normalize
bool, default=True If True, the calculated average velocity values (r) are min-max normalized to the range [0, 1] before plotting radially. This emphasizes relative velocity patterns. If False, the raw average velocity values are used for the radial coordinate.
- use_abs_color
bool, default=True Determines the variable used for coloring the points:
If True, points are colored based on the average absolute magnitude of the Q50 values across the specified q50_cols. This highlights areas with high overall prediction values.
If False, points are colored based on the calculated average velocity (r) itself. This highlights areas of high or low rate of change.
- figsize
tuple of (float, float), default=(9, 9) The width and height of the figure in inches.
- title
str, optional The title displayed above the polar plot. If None, a default title “Normalized Subsidence Velocity” (or similar, depending on context, though not dynamically changed here) is used. Default is None.
- s
float or int, default=30 The marker size for the scatter points.
- alpha
float, default=0.85 The transparency level of the scatter points (0=transparent, 1=opaque). Useful for visualizing dense data.
- show_grid
bool, default=True If True, display the polar grid lines (radial and angular) on the plot.
- savefig
str, optional The file path (including extension, e.g., 'velocity_plot.pdf') where the plot image should be saved. If None, the plot is displayed interactively using plt.show(). Default is None.
- cbar
bool, default=True If True, display a color bar alongside the plot indicating the mapping between colors and the values defined by use_abs_color.
- mask_angle
bool, default=False If True, hide the angular tick labels (the degrees/radians around the circumference). This can be useful if the angular position based on index is not inherently meaningful.
- Returns:
- ax
matplotlib.axes.Axes The Matplotlib Axes object containing the polar scatter plot. Can be used for further customization.
- Raises:
- ValueError
If q50_cols contains fewer than two column names.
See also
numpy.diff: Computes the difference between consecutive elements.
numpy.mean: Computes the arithmetic mean.
matplotlib.pyplot.scatter: Creates scatter plots.
matplotlib.pyplot.polar: Creates polar plots.
kdiagram.plot.uncertainty.plot_uncertainty_drift: Visualizes uncertainty width changes over time.
Notes
The function assumes the columns in q50_cols represent equally spaced time steps for the velocity calculation to be meaningful as an average yearly (or per-step) velocity.
The average velocity (r) is calculated as the mean of the first-order differences between consecutive columns in q50_cols.
Normalization of r uses min-max scaling: \(r' = (r - \min(r)) / (\max(r) - \min(r))\).
The angular coordinate theta is currently derived from the DataFrame index, mapped linearly onto the angular range defined by acov. The theta_col parameter is not used for positioning in the current implementation, which might be revised in future versions. A warning is issued if theta_col is provided [2].
Let \(\mathbf{Q}\) be the data matrix extracted from df using columns q50_cols, with shape \((N, M)\), where \(N\) is the number of locations (rows) and \(M\) is the number of time points (columns). Note the transpose compared to the description in plot_feature_fingerprint.
Velocity Calculation: The differences between consecutive time points for each location \(j\) are computed:
\(\Delta Q_{j,i} = Q_{j, i+1} - Q_{j, i}\) for \(i = 0, \dots, M-2\).
The average velocity for location \(j\) is:
\[r_j = \frac{1}{M-1} \sum_{i=0}^{M-2} \Delta Q_{j,i} \tag{1}\]
Radial Normalization (if normalize=True):
Let \(\mathbf{r} = (r_0, \dots, r_{N-1})\).
\[r'_j = \frac{r_j - \min(\mathbf{r})}{\max(\mathbf{r}) - \min(\mathbf{r})} \tag{2}\]
If \(\max(\mathbf{r}) = \min(\mathbf{r})\), \(r'_j = 0\).
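The velocity and normalization steps can be sketched with plain NumPy. This is an illustrative sketch, not the library's source: the helper names `average_velocity` and `minmax_normalize` are hypothetical, and the input is assumed to be an \((N, M)\) array with rows as locations and columns as equally spaced time steps.

```python
import numpy as np


def average_velocity(q):
    """Mean first-order difference along the time axis, as in equation (1)."""
    q = np.asarray(q, dtype=float)
    if q.ndim != 2 or q.shape[1] < 2:
        # Mirrors the documented requirement of at least two Q50 columns.
        raise ValueError("need a 2-D array with at least two time steps")
    return np.diff(q, axis=1).mean(axis=1)


def minmax_normalize(r):
    """Min-max scale to [0, 1], as in equation (2); all zeros if r is constant."""
    r = np.asarray(r, dtype=float)
    span = r.max() - r.min()
    if span == 0:
        return np.zeros_like(r)
    return (r - r.min()) / span


# Three locations (rows) observed at three time steps (columns).
Q = np.array([
    [1.0, 2.0, 4.0],   # rising
    [5.0, 5.0, 5.0],   # flat
    [3.0, 2.0, 1.0],   # falling
])

r = average_velocity(Q)
r_norm = minmax_normalize(r)
```

Because the differences telescope, \(r_j\) equals \((Q_{j,M-1} - Q_{j,0}) / (M-1)\): only the first and last columns influence the average velocity. Note also that min-max scaling maps the most negative velocity to 0, so a normalized radius of 0 does not mean "no change".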
Color Value Calculation:
If use_abs_color=True: Average absolute magnitude.
\[c_j = \frac{1}{M} \sum_{i=0}^{M-1} |Q_{j,i}| \tag{3}\]
If use_abs_color=False: Use average velocity. \(c_j = r_j\)
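The two coloring modes can be sketched as follows. The function `color_values` is a hypothetical helper written for illustration, and it assumes the velocity used for coloring is the raw (un-normalized) \(r_j\), as the formula \(c_j = r_j\) suggests.

```python
import numpy as np


def color_values(q, use_abs_color=True):
    """Per-location color values for an (N, M) array of Q50 predictions."""
    q = np.asarray(q, dtype=float)
    if use_abs_color:
        # Equation (3): mean absolute magnitude across all M time steps.
        return np.abs(q).mean(axis=1)
    # Otherwise color by the average velocity (mean of consecutive differences).
    return np.diff(q, axis=1).mean(axis=1)


Q = np.array([
    [-2.0, 0.0, 2.0],
    [1.0, 2.0, 3.0],
])

c_abs = color_values(Q, use_abs_color=True)
c_vel = color_values(Q, use_abs_color=False)
```

In this toy input the first location has the higher velocity but the lower average magnitude, which is the kind of contrast the magnitude coloring is meant to expose.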
Angular Coordinate Calculation:
Let \(S\) be the angular span in radians determined by acov (e.g., \(2\pi\) for 'default'). The angle for location \(j\) (where \(j\) is the row position from \(0\) to \(N-1\)) is nominally:
\[\theta_j = \frac{j}{N} \times S \tag{4}\]
The code, however, uses np.linspace(0, 1, N) multiplied by the angular span. Because np.linspace includes the endpoint by default (endpoint=True), the angles effectively follow \(\theta_j = \frac{j}{N-1} \times S\) and run from 0 up to and including \(S\). With full-circle coverage, the first and last locations therefore share the same direction. Equation (4) corresponds to endpoint=False, which divides the span into \(N\) equal intervals.
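The effect of endpoint handling on the angles is easy to check directly. The sketch below is an assumption-laden illustration rather than the library's code: the `ANGULAR_SPANS` table and `angles_for_rows` helper are hypothetical names, built from the acov options and the np.linspace(0, 1, N) behavior described above.

```python
import numpy as np

# Angular span in radians for each documented acov option.
ANGULAR_SPANS = {
    "default": 2 * np.pi,
    "half_circle": np.pi,
    "quarter_circle": np.pi / 2,
    "eighth_circle": np.pi / 4,
}


def angles_for_rows(n, acov="default", endpoint=True):
    """Map row positions 0..n-1 linearly onto the span selected by acov."""
    # Unknown options fall back to the full circle, as documented for acov.
    span = ANGULAR_SPANS.get(acov, ANGULAR_SPANS["default"])
    return np.linspace(0.0, 1.0, n, endpoint=endpoint) * span


# endpoint=True (NumPy's default): theta_j = j / (N - 1) * S, last angle == S.
theta_inclusive = angles_for_rows(5, "half_circle")

# endpoint=False: theta_j = j / N * S, matching equation (4).
theta_exclusive = angles_for_rows(5, "half_circle", endpoint=False)
```

With the inclusive form on a full circle, rows 0 and N-1 land at angles 0 and \(2\pi\), which coincide on the plot; the exclusive form avoids that overlap.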
References
[1] Kouadio, K. L., Liu, R., Loukou, K. G. H., Liu, J., & Liu, W. (2025). Analytics Framework for Interpreting Spatiotemporal Probabilistic Forecasts. International Journal of Forecasting. Manuscript submitted.
[2] Hunter, J. D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3), 90-95.
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from kdiagram.plot.uncertainty import plot_velocity
1. Random Example:
>>> np.random.seed(0)
>>> N_points = 100
>>> df_random = pd.DataFrame({
...     'location_id': range(N_points),
...     'value_2020_q50': np.random.rand(N_points) * 10,
...     'value_2021_q50': (np.random.rand(N_points) * 10 +
...                        np.linspace(0, 5, N_points)),
...     'value_2022_q50': (np.random.rand(N_points) * 10 +
...                        np.linspace(0, 10, N_points)),
...     'latitude': np.linspace(22, 23, N_points)
... })
>>> q50_cols_random = ['value_2020_q50', 'value_2021_q50',
...                    'value_2022_q50']
>>> ax_random = plot_velocity(
...     df=df_random,
...     q50_cols=q50_cols_random,
...     theta_col='latitude',  # Note: currently ignored for pos
...     acov='default',
...     normalize=True,
...     use_abs_color=False,   # Color by velocity
...     title='Random Data Velocity Profile',
...     cmap='coolwarm',
...     s=40,
...     cbar=True
... )
>>> # plt.show() is called internally if savefig is None
2. Concrete Example (Subsidence Data - adapted from docstring):
>>> # Assume zhongshan_pred_2023_2026 is a loaded DataFrame like:
>>> # zhongshan_pred_2023_2026 = pd.DataFrame({
>>> #     'subsidence_2022_q50': np.random.rand(50)*5 + 5,
>>> #     'subsidence_2023_q50': np.random.rand(50)*6 + 6,
>>> #     'subsidence_2024_q50': np.random.rand(50)*7 + 7,
>>> #     'subsidence_2025_q50': np.random.rand(50)*8 + 8,
>>> #     'subsidence_2026_q50': np.random.rand(50)*9 + 9,
>>> #     'latitude': np.linspace(22.2, 22.8, 50)
>>> # })  # Dummy data for example execution
>>> # Create dummy data if zhongshan_pred_2023_2026 doesn't exist
>>> try:
...     zhongshan_pred_2023_2026
... except NameError:
...     print("Creating dummy subsidence data for example...")
...     zhongshan_pred_2023_2026 = pd.DataFrame({
...         'subsidence_2022_q50': np.random.rand(150)*5 + 5,
...         'subsidence_2023_q50': np.random.rand(150)*6 + 6 + np.linspace(0, 2, 150),
...         'subsidence_2024_q50': np.random.rand(150)*7 + 7 + np.linspace(0, 4, 150),
...         'subsidence_2025_q50': np.random.rand(150)*8 + 8 + np.linspace(0, 6, 150),
...         'subsidence_2026_q50': np.random.rand(150)*9 + 9 + np.linspace(0, 8, 150),
...         'latitude': np.linspace(22.2, 22.8, 150)
...     })
>>> subsidence_q50_cols = [
...     'subsidence_2022_q50', 'subsidence_2023_q50',
...     'subsidence_2024_q50', 'subsidence_2025_q50',
...     'subsidence_2026_q50',
... ]
>>> ax_subsidence = plot_velocity(
...     df=zhongshan_pred_2023_2026,
...     q50_cols=subsidence_q50_cols,
...     theta_col='latitude',   # Ignored for pos, triggers warning
...     acov='quarter_circle',  # Focus angular range
...     normalize=True,
...     use_abs_color=True,     # Color by Q50 magnitude
...     title='Subsidence Velocity Across Zhongshan (2022-2026)',
...     cmap='plasma',
...     s=25,
...     cbar=True,
...     mask_angle=True         # Hide angle labels
... )
>>> # plt.show() called internally