kdiagram.plot.errors.plot_error_bands
- kdiagram.plot.errors.plot_error_bands(df, error_col, theta_col, *, theta_period=None, theta_bins=24, n_std=1.0, title=None, figsize=(8.0, 8.0), cmap='viridis', show_grid=True, grid_props=None, mask_angle=False, savefig=None, dpi=300, acov='default', ax=None, **fill_kws)
Plot polar error bands to visualize systemic vs random error.
This function aggregates forecast errors across bins of a cyclical or ordered feature (like month or hour) and plots the mean error and its standard deviation. It is a powerful diagnostic tool for identifying systemic biases and variations in model performance.
- Parameters:
- df
pd.DataFrame The input DataFrame containing the error and feature data.
- error_col
str Name of the column containing the forecast error values, typically calculated as actual - predicted.
- theta_col
str Name of the column representing the feature to bin against, which will be mapped to the angular axis.
- theta_period
float, optional The period of the cyclical data in theta_col. For example, if theta_col is the month of the year, the period is 12. This ensures the data wraps around the circle correctly.
- theta_bins
int, default=24 The number of angular bins to group the data into for calculating statistics.
- n_std
float, default=1.0 The number of standard deviations to display in the shaded error band around the mean error line.
- title
str, optional The title for the plot. If None, a default is generated.
- figsize
tuple of (float, float), default=(8, 8) Figure size in inches.
- cmap
str, default='viridis' Note: This parameter is currently not used in this function, as colors are fixed for clarity (black, red, and a fill color).
- show_grid
bool, default=True Toggle gridlines via the package helper set_axis_grid.
- grid_props
dict, optional Keyword arguments passed to set_axis_grid for grid customization.
- mask_angle
bool, default=False If True, hide the angular tick labels.
- savefig
str, optional If provided, save the figure to this path; otherwise the plot is shown interactively.
- dpi
int, default=300 Resolution for the saved figure.
- **fill_kws
dict, optional Additional keyword arguments passed to the ax.fill_between call for the shaded error band (e.g., color, alpha).
- Returns:
- ax
matplotlib.axes.Axes or None The Matplotlib Axes object containing the plot, or None if the plot could not be generated.
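The wrap-around behavior described for theta_period can be illustrated with a short sketch. The helper name wrap_to_angle is hypothetical, and the modulo-then-scale mapping is an assumption about how a periodic feature is typically placed on a circle; it is not taken from the library's source.

```python
import numpy as np

def wrap_to_angle(values, period):
    """Map a cyclical feature onto angles in [0, 2*pi).

    Hypothetical helper for illustration only; the library's internal
    mapping may differ.
    """
    values = np.asarray(values, dtype=float)
    # Values one full period apart land on the same angle.
    return (values % period) / period * 2.0 * np.pi

# With period=12, the values 0 and 12 coincide on the circle.
angles = wrap_to_angle([0, 3, 6, 12], period=12)
```

Under this mapping a quarter of the period corresponds to a quarter turn, so 3 maps to pi/2 and 6 maps to pi.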
Notes
The plot visualizes the first two moments (mean and standard deviation) of the error distribution conditioned on the angular variable \(\theta\).
Binning: The data is first partitioned into \(K\) bins based on the values in theta_col. Let \(B_k\) be the set of indices of data points belonging to the \(k\)-th bin.
Mean Error Calculation: For each bin \(B_k\), the mean error \(\mu_{e,k}\) is calculated. This value is plotted as a point on the central black line.
\[\mu_{e,k} = \frac{1}{|B_k|} \sum_{i \in B_k} e_i \tag{1}\]
where \(e_i\) is the error for data point \(i\). A consistent deviation of this line from the zero-error circle indicates a systemic bias.
Error Variance Calculation: For each bin, the standard deviation of the error, \(\sigma_{e,k}\), is also calculated.
\[\sigma_{e,k} = \sqrt{\frac{1}{|B_k|-1} \sum_{i \in B_k} (e_i - \mu_{e,k})^2} \tag{2}\]
Band Construction: A shaded band is drawn between the lower and upper bounds, defined by the mean plus or minus a multiple of the standard deviation.
\[\begin{aligned}\text{Upper Bound}_k &= \mu_{e,k} + n_{std} \cdot \sigma_{e,k} \\ \text{Lower Bound}_k &= \mu_{e,k} - n_{std} \cdot \sigma_{e,k}\end{aligned} \tag{3}\]
The width of this band indicates the random error or inconsistency of the model within that bin.
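The steps in these notes can be reproduced numerically with NumPy and pandas. The following is a minimal sketch, not the library's implementation: the function name binned_error_bands is hypothetical, and it assumes equal-width bins over [0, 2*pi) and input angles already expressed in radians.

```python
import numpy as np
import pandas as pd

def binned_error_bands(theta, errors, n_bins=12, n_std=1.0):
    """Per-bin mean, sample std, and band bounds (hypothetical helper)."""
    theta = np.asarray(theta, dtype=float)
    errors = np.asarray(errors, dtype=float)

    # Binning: K equal-width bins over [0, 2*pi) (an assumption).
    edges = np.linspace(0.0, 2.0 * np.pi, n_bins + 1)
    k = np.clip(np.digitize(theta, edges) - 1, 0, n_bins - 1)

    grouped = pd.Series(errors).groupby(k)
    out = pd.DataFrame({
        "mean": grouped.mean(),      # Eq. (1)
        "std": grouped.std(ddof=1),  # Eq. (2), with the |B_k| - 1 divisor
    })
    # Eq. (3): the band is the mean plus or minus n_std standard deviations.
    out["lower"] = out["mean"] - n_std * out["std"]
    out["upper"] = out["mean"] + n_std * out["std"]
    return out

# Tiny check: two bins with two points each.
bands = binned_error_bands(
    theta=[0.1, 0.2, 3.5, 3.6],
    errors=[1.0, 3.0, -2.0, -4.0],
    n_bins=2,
    n_std=2.0,
)
```

Note that a bin containing a single point has an undefined sample standard deviation (NaN with ddof=1), so sparse bins yield no band in this sketch.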
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from kdiagram.plot.errors import plot_error_bands
>>>
>>> # Simulate a model with seasonal error patterns
>>> np.random.seed(42)
>>> n_points = 2000
>>> day_of_year = np.arange(n_points) % 365
>>> month = (day_of_year // 30) + 1
>>>
>>> # Create a bias (positive error) in summer and more noise in winter
>>> seasonal_bias = np.sin((day_of_year - 90) * np.pi / 180) * 5
>>> seasonal_noise = 2 + 2 * np.cos(day_of_year * np.pi / 180)**2
>>> errors = seasonal_bias + np.random.normal(0, seasonal_noise, n_points)
>>>
>>> df_seasonal = pd.DataFrame({'month': month, 'forecast_error': errors})
>>>
>>> # Generate the plot
>>> ax = plot_error_bands(
...     df=df_seasonal,
...     error_col='forecast_error',
...     theta_col='month',
...     theta_period=12,
...     theta_bins=12,
...     n_std=1.5,
...     title='Seasonal Forecast Error Analysis',
...     color='#2980B9',
...     alpha=0.3
... )