kdiagram.plot.errors.plot_error_violins
- kdiagram.plot.errors.plot_error_violins(df, *error_cols, names=None, title=None, figsize=(9.0, 9.0), cmap='viridis', colors=None, show_grid=True, grid_props=None, savefig=None, dpi=300, acov='default', ax=None, mode='optimized', bw_method=None, overlay=False, overlay_angle=None, show_stats=False, **violin_kws)
Plot polar violin plots to compare multiple error distributions.
This function creates a polar plot where each angular sector contains a violin plot representing the error distribution of a different model or dataset. It supports visual comparison of the bias, variance, and overall shape of error distributions [1].
- Parameters:
- df
pd.DataFrame The input DataFrame containing the error data.
- *error_cols
str One or more column names from df, each containing the error values (e.g., actual - predicted) for a model to be plotted.
- names
list of str, optional Display names for each of the models corresponding to error_cols. If not provided, generic names like 'Model 1' will be generated. The list length must match the number of error columns.
- title
str, optional The title for the plot. If None, a default is generated.
- figsize
tuple of (float, float), default=(9, 9) Figure size in inches.
- cmap
str, default=’viridis’ Matplotlib colormap used to assign a unique color to each violin plot.
- colors
list of str, optional An explicit list of colors to use for the violins. If provided, this overrides cmap. The list will cycle if it is shorter than the number of error columns.
- acov
{‘default’, ‘half_circle’, ‘quarter_circle’, ‘eighth_circle’}, default=’default’ Angular coverage (span) of the plot:
'default': \(2\pi\) (full circle)
'half_circle': \(\pi\)
'quarter_circle': \(\tfrac{\pi}{2}\)
'eighth_circle': \(\tfrac{\pi}{4}\)
- mode
{‘optimized’, ‘cbueth’, ‘basic’}, default=’optimized’ The plotting mode to use.
'optimized' or 'cbueth' (default): A mode inspired by reviewer feedback (see Notes). It maps error magnitude to the radius, splits violins into positive/negative lobes, and uses a central dot for the zero reference. This mode is optimized for detecting bias and skew.
'basic': The original implementation where the radial axis directly represents the error value (positive and negative), and violins are centered on their assigned angle. The zero reference is a dashed circle.
- bw_method
float, str, or None, default=None The method used to calculate the estimator bandwidth for the KDE. This is passed directly to scipy.stats.gaussian_kde. If None, it uses the default “scott”.
- overlay
bool or ‘auto’, default=’auto’ Applies to ‘cbueth’ mode only. If True, all violins are overlaid on a single shared spoke for direct comparison (best for k=1 or k=2). If False, each violin gets its own spoke. If 'auto' (default), overlay is enabled if k <= 2 and disabled otherwise.
- overlay_angle
float, optional Applies to ‘cbueth’ mode when overlay=True. The angle (in radians) for the shared spoke. If None, defaults to \(\pi/2\) (vertical North).
- show_stats
bool, default=False Applies to ‘cbueth’ mode only. If True, appends the median and skew of each distribution to its legend entry (e.g., “Model A (med=0.45; skew=-0.12)”).
- show_grid
bool, default=True Toggle gridlines via the package helper set_axis_grid.
- grid_props
dict, optional Keyword arguments passed to set_axis_grid for grid customization.
- savefig
str, optional If provided, save the figure to this path; otherwise the plot is shown interactively.
- dpi
int, default=300 Resolution for the saved figure.
- **violin_kws
dict, optional Additional keyword arguments passed to the ax.fill call for each violin (e.g., alpha, edgecolor).
- Returns:
- ax
matplotlib.axes.Axes or None The Matplotlib Axes object containing the plot, or None if the plot could not be generated.
- Parameters:
df (DataFrame)
error_cols (str)
title (str | None)
cmap (str)
show_grid (bool)
savefig (str | None)
dpi (int)
acov (Literal['default', 'half_circle', 'quarter_circle', 'eighth_circle'])
ax (Axes | None)
mode (Literal['basic', 'cbueth', 'optimized'])
overlay (bool)
overlay_angle (float | None)
show_stats (bool)
Notes
The plot visualizes and compares several one-dimensional error distributions. It adapts the standard violin plot [1] to a polar coordinate system for multi-model comparison.
Kernel Density Estimation (KDE): For each model’s error data \(\mathbf{x} = \{x_1, x_2, ..., x_n\}\), the probability density function (PDF), \(\hat{f}_h(x)\), is estimated using a Gaussian kernel. This creates a smooth curve representing the distribution’s shape.
\[\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) \tag{1}\]
where \(K\) is the Gaussian kernel and \(h\) is the bandwidth, a smoothing parameter.
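Equation (1) can be checked numerically. The following is a minimal sketch, not the library's implementation: it evaluates the Gaussian KDE directly with NumPy and, when no bandwidth is given, assumes Scott's rule for one-dimensional data (the rule named as the default in the bw_method description). The helper name gaussian_kde_1d is hypothetical.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, h=None):
    """Evaluate Eq. (1) on `grid` (hypothetical helper, illustration only)."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    if h is None:
        # Assumption: Scott's rule for 1-D data, h = n^(-1/5) * sample std.
        h = n ** (-1.0 / 5.0) * samples.std(ddof=1)
    # Scaled distances (x - x_i) / h for every grid point and sample.
    u = (grid[:, None] - samples[None, :]) / h
    # Standard Gaussian kernel K(u).
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    # (1 / (n h)) * sum_i K((x - x_i) / h)
    return kernel.sum(axis=1) / (n * h)


rng = np.random.default_rng(0)
errors = rng.normal(loc=0.5, scale=1.5, size=500)
grid = np.linspace(errors.min() - 5.0, errors.max() + 5.0, 400)
density = gaussian_kde_1d(errors, grid)

# A valid density integrates to roughly 1 over a wide enough grid.
area = float(density.sum() * (grid[1] - grid[0]))
print(round(area, 3))
```

Passing a larger h smooths the curve and a smaller h makes it follow individual samples more closely, which is the trade-off the bandwidth controls.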
Violin Construction: The violin shape is created by plotting the density curve \(\hat{f}_h(x)\) symmetrically around a central axis. The width of the violin at any given error value \(x\) is proportional to its estimated density.
Polar Arrangement: Each model’s violin is assigned a unique angular sector on the polar plot. In 'basic' mode, the radial axis represents the signed error value, and a dashed reference circle at zero error marks a perfect forecast. The violin is drawn radially within its assigned sector.
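The angular layout can be illustrated with a short sketch. It assumes, without confirmation from the text above, that the k violins are spaced evenly over the span selected by acov and that a violin's outline is obtained by offsetting its assigned angle by an amount proportional to the density. The names ACOV_SPAN, sector_angles, and violin_outline are hypothetical, and how negative error values are placed on the radial axis is left out.

```python
import numpy as np

# Angular spans as listed for the `acov` parameter.
ACOV_SPAN = {
    "default": 2 * np.pi,
    "half_circle": np.pi,
    "quarter_circle": np.pi / 2,
    "eighth_circle": np.pi / 4,
}


def sector_angles(k, acov="default"):
    """Centre angle of each of k equal sectors (assumed even spacing)."""
    span = ACOV_SPAN[acov]
    width = span / k
    return width * (np.arange(k) + 0.5), width


def violin_outline(grid, density, center, sector_width, fill=0.8):
    """Closed (theta, r) polygon for one violin in a 'basic'-style layout.

    r is the error value itself; the angular half-width is proportional
    to the density and capped at fill * sector_width / 2.
    """
    half = 0.5 * fill * sector_width * density / density.max()
    # Walk out along one side of the violin and back along the other.
    theta = np.concatenate([center - half, (center + half)[::-1]])
    r = np.concatenate([grid, grid[::-1]])
    return theta, r


centers, width = sector_angles(3)
grid = np.linspace(-4.0, 4.0, 101)
density = np.exp(-0.5 * grid**2) / np.sqrt(2 * np.pi)  # stand-in density
theta, r = violin_outline(grid, density, centers[0], width)
```

Arrays of this form are the kind of input a fill call on a polar Axes accepts, which is consistent with violin_kws being forwarded to ax.fill.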
The ‘cbueth’ Mode (Default)
The default mode, 'optimized' (alias 'cbueth'), was developed in response to insightful feedback during the JOSS paper review process. A reviewer noted that the original “basic” mode could sometimes make skewed distributions difficult to interpret, especially when comparing only two models.
> “…wouldn’t it be easier to just plot them on top of each other
> with transparency? … Maybe this is just a problem when just
> comparing two and not more models; with three it is already
> prettier.”
To honor this contribution, the new mode was named after the reviewer’s GitHub handle. It addresses this feedback with key design changes:
Radial Axis: The radius maps to absolute error \(|E|\), so all data starts from the center. The zero-error reference is a single point at the origin.
Two-Lobe Design: Each violin is split into two lobes around its central spoke:
Right Lobe: Distribution of positive errors (\(E > 0\)).
Left Lobe: Distribution of negative errors (\(E < 0\)).
Interpretation: This design makes key metrics instantly visible:
Bias: An imbalance in the size of the two lobes (e.g., a larger right lobe means a positive bias).
Skew: Asymmetry within a single lobe.
Variance: The overall radial extent of the lobes.
Auto-Overlay: As suggested, when only two models are plotted (overlay='auto'), they are drawn on top of each other with transparency for a direct, “face-off” style comparison. For 3+ models, they are given separate spokes.
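The reading rules above can be made concrete with summary numbers. The sketch below is an illustration under stated assumptions, not the function's internals: it splits a sample of errors into its positive and negative parts, reports the share of each as a proxy for relative lobe size, and reports the median, the sample skewness, and the largest absolute error as the radial extent on \(|E|\). The helper lobe_summary is hypothetical.

```python
import numpy as np


def lobe_summary(errors):
    """Summarise one error sample the way the two-lobe view is read."""
    e = np.asarray(errors, dtype=float)
    centered = e - e.mean()
    # Skewness: third central moment over the cubed std (population form).
    skew = float(np.mean(centered**3) / np.mean(centered**2) ** 1.5)
    return {
        "share_positive": float(np.mean(e > 0)),  # positive-error lobe
        "share_negative": float(np.mean(e < 0)),  # negative-error lobe
        "median": float(np.median(e)),
        "skew": skew,
        "max_abs_error": float(np.abs(e).max()),  # radial extent on |E|
    }


rng = np.random.default_rng(0)
biased = rng.normal(loc=-4.0, scale=1.5, size=1000)   # mostly negative errors
balanced = rng.normal(loc=0.0, scale=4.0, size=1000)  # no bias, wide spread

for name, sample in [("biased", biased), ("balanced", balanced)]:
    print(name, lobe_summary(sample))
```

A share_negative close to 1 corresponds to a dominant negative lobe, that is, a negative bias, while roughly equal shares with a large max_abs_error correspond to an unbiased but high-variance model.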
References
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from kdiagram.plot.errors import plot_error_violins
>>>
>>> # Simulate errors from three different models
>>> np.random.seed(0)
>>> n_points = 1000
>>> df_errors = pd.DataFrame({
...     'Model A (Good)': np.random.normal(
...         loc=0.5, scale=1.5, size=n_points),
...     'Model B (Biased)': np.random.normal(
...         loc=-4.0, scale=1.5, size=n_points),
...     'Model C (Inconsistent)': np.random.normal(
...         loc=0, scale=4.0, size=n_points),
... })
>>>
>>> # Generate the polar violin plot
>>> ax = plot_error_violins(
...     df_errors,
...     'Model A (Good)',
...     'Model B (Biased)',
...     'Model C (Inconsistent)',
...     title='Comparison of Model Error Distributions',
...     cmap='plasma',
...     alpha=0.7
... )