kdiagram.plot.errors.plot_error_bands

kdiagram.plot.errors.plot_error_bands(df, error_col, theta_col, *, theta_period=None, theta_bins=24, n_std=1.0, title=None, figsize=(8.0, 8.0), cmap='viridis', show_grid=True, grid_props=None, mask_angle=False, savefig=None, dpi=300, acov='default', ax=None, **fill_kws)

Plot polar error bands to visualize systematic vs. random error.

This function aggregates forecast errors across bins of a cyclical or ordered feature (such as month or hour) and plots the mean error and its standard deviation for each bin. The mean line exposes systematic bias, and the width of the band shows how much the errors vary within each bin.

Parameters:
df : pd.DataFrame

The input DataFrame containing the error and feature data.

error_col : str

Name of the column containing the forecast error values, typically calculated as actual - predicted.

theta_col : str

Name of the column representing the feature to bin against, which will be mapped to the angular axis.

theta_period : float, optional

The period of the cyclical data in theta_col. For example, if theta_col is the month of the year, the period is 12. This ensures the data wraps around the circle correctly.

theta_bins : int, default=24

The number of angular bins to group the data into for calculating statistics.

n_std : float, default=1.0

The number of standard deviations to display in the shaded error band around the mean error line.

title : str, optional

The title for the plot. If None, a default is generated.

figsize : tuple of (float, float), default=(8.0, 8.0)

Figure size in inches.

cmap : str, default='viridis'

Note: This parameter is currently not used in this function as colors are fixed for clarity (black, red, and a fill color).

show_grid : bool, default=True

Toggle gridlines via the package helper set_axis_grid.

grid_props : dict, optional

Keyword arguments passed to set_axis_grid for grid customization.

mask_angle : bool, default=False

If True, hide the angular tick labels.

savefig : str, optional

If provided, save the figure to this path; otherwise the plot is shown interactively.

dpi : int, default=300

Resolution for the saved figure.

**fill_kws : dict, optional

Additional keyword arguments passed to the ax.fill_between call for the shaded error band (e.g., color, alpha).

Returns:
ax : matplotlib.axes.Axes or None

The Matplotlib Axes object containing the plot, or None if the plot could not be generated.

Notes

The plot visualizes the first two moments (mean and standard deviation) of the error distribution conditioned on the angular variable \(\theta\).

  1. Binning: The data is first partitioned into \(K\) bins based on the values in theta_col. Let \(B_k\) be the set of indices of data points belonging to the \(k\)-th bin.

  2. Mean Error Calculation: For each bin \(B_k\), the mean error \(\mu_{e,k}\) is calculated. This value is plotted as a point on the central black line.

    (1)\[\mu_{e,k} = \frac{1}{|B_k|} \sum_{i \in B_k} e_i\]

    where \(e_i\) is the error for data point \(i\). A consistent deviation of this line from the zero-error circle indicates a systematic bias.

  3. Error Standard Deviation Calculation: For each bin, the sample standard deviation of the error, \(\sigma_{e,k}\), is also calculated.

    (2)\[\sigma_{e,k} = \sqrt{\frac{1}{|B_k|-1} \sum_{i \in B_k} (e_i - \mu_{e,k})^2}\]

  4. Band Construction: A shaded band is drawn between the lower and upper bounds, defined by the mean plus or minus a multiple of the standard deviation.

    (3)\[\begin{split}\text{Upper Bound}_k &= \mu_{e,k} + n_{std} \cdot \sigma_{e,k} \\ \text{Lower Bound}_k &= \mu_{e,k} - n_{std} \cdot \sigma_{e,k}\end{split}\]

    The width of this band indicates the random error or inconsistency of the model within that bin.
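
The four steps above can be sketched in plain NumPy. This is a minimal illustration of Eqs. (1) to (3), not the library's actual implementation: the helper name `binned_error_bands`, the equal-width bin edges, and the way values are wrapped onto the circle with `theta_period` are all assumptions made for the example.

```python
import numpy as np


def binned_error_bands(theta, errors, theta_period, n_bins, n_std=1.0):
    """Per-bin mean, sample std, and band bounds (hypothetical helper)."""
    theta = np.asarray(theta, dtype=float)
    errors = np.asarray(errors, dtype=float)

    # Assumption: cyclical values are wrapped onto [0, 2*pi) using the period.
    angles = (theta % theta_period) / theta_period * 2.0 * np.pi

    # Step 1: assign each point to one of K equal-width angular bins.
    edges = np.linspace(0.0, 2.0 * np.pi, n_bins + 1)
    bin_idx = np.clip(np.digitize(angles, edges) - 1, 0, n_bins - 1)

    mean = np.full(n_bins, np.nan)
    std = np.full(n_bins, np.nan)
    for k in range(n_bins):
        e_k = errors[bin_idx == k]
        if e_k.size > 0:
            mean[k] = e_k.mean()        # Step 2, Eq. (1)
        if e_k.size > 1:
            std[k] = e_k.std(ddof=1)    # Step 3, Eq. (2): divides by |B_k| - 1

    # Step 4, Eq. (3): band bounds around the mean.
    lower = mean - n_std * std
    upper = mean + n_std * std
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, mean, std, lower, upper


# Two bins over a 12-unit period, with three points in each bin.
centers, mean, std, lower, upper = binned_error_bands(
    theta=[1, 1, 1, 7, 7, 7],
    errors=[1.0, 2.0, 3.0, -1.0, -1.0, -1.0],
    theta_period=12,
    n_bins=2,
    n_std=2.0,
)
```

Bins with fewer than two points keep a NaN standard deviation here, because Eq. (2) is undefined when \(|B_k| \le 1\). How the library itself treats such bins is not stated in this page.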

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from kdiagram.plot.errors import plot_error_bands
>>>
>>> # Simulate a model with seasonal error patterns
>>> np.random.seed(42)
>>> n_points = 2000
>>> day_of_year = np.arange(n_points) % 365
>>> month = np.minimum(day_of_year // 30 + 1, 12)  # keep days 360-364 in month 12
>>>
>>> # Create a bias (positive error) in summer and more noise in winter
>>> seasonal_bias = np.sin((day_of_year - 90) * np.pi / 180) * 5
>>> seasonal_noise = 2 + 2 * np.cos(day_of_year * np.pi / 365)**2
>>> errors = seasonal_bias + np.random.normal(0, seasonal_noise, n_points)
>>>
>>> df_seasonal = pd.DataFrame({'month': month, 'forecast_error': errors})
>>>
>>> # Generate the plot
>>> ax = plot_error_bands(
...     df=df_seasonal,
...     error_col='forecast_error',
...     theta_col='month',
...     theta_period=12,
...     theta_bins=12,
...     n_std=1.5,
...     title='Seasonal Forecast Error Analysis',
...     color='#2980B9',
...     alpha=0.3
... )
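
To cross-check the plotted band numerically, the per-bin statistics can be tabulated directly with pandas. The sketch below uses a small made-up DataFrame and assumes one bin per month. Whether the bin edges used by `plot_error_bands` line up exactly with integer month values is not confirmed here, so treat the table as an approximate reference rather than the function's exact output.

```python
import pandas as pd

# Made-up errors for two months, for illustration only.
df = pd.DataFrame({
    "month": [1, 1, 1, 2, 2, 2],
    "forecast_error": [0.5, 1.0, 1.5, -2.0, 0.0, 2.0],
})
n_std = 1.5

# pandas' "std" aggregation uses ddof=1, i.e. the sample standard
# deviation of Eq. (2).
stats = df.groupby("month")["forecast_error"].agg(["mean", "std", "count"])
stats["lower"] = stats["mean"] - n_std * stats["std"]
stats["upper"] = stats["mean"] + n_std * stats["std"]
print(stats)
```

A mean far from zero in a row points to bias in that month, while a wide gap between `lower` and `upper` points to inconsistent errors.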