kdiagram.plot.context.plot_qq

kdiagram.plot.context.plot_qq(df, actual_col, pred_col, title=None, xlabel=None, ylabel=None, figsize=(7, 7), show_grid=True, grid_props=None, savefig=None, dpi=300, **scatter_kwargs)

Generates a Quantile-Quantile (Q-Q) plot of forecast errors.

This function creates a Q-Q plot, a standard graphical method for comparing a dataset’s distribution to a theoretical distribution (here, the normal distribution). It gives a visual check of whether the forecast errors are normally distributed, which is a key assumption of many statistical methods.

See the Q-Q Plot User Guide for more details.

Parameters:
df : pd.DataFrame

The input DataFrame containing the actual and predicted values.

actual_col : str

The name of the column containing the true observed values.

pred_col : str

The name of the column containing the point forecast values.

title : str, optional

The title for the plot.

xlabel : str, optional

The label for the x-axis.

ylabel : str, optional

The label for the y-axis.

figsize : tuple of (float, float), default=(7, 7)

The figure size in inches.

show_grid : bool, default=True

Toggle the visibility of the plot’s grid lines.

grid_props : dict, optional

Custom keyword arguments passed to the grid for styling.

savefig : str, optional

The file path where the plot is saved. If None, the plot is displayed interactively.

dpi : int, default=300

The resolution (dots per inch) for the saved figure.

**scatter_kwargs

Additional keyword arguments passed directly to the underlying scatter plot for the data points.

Returns:
ax : matplotlib.axes.Axes

The Matplotlib Axes object containing the plot.

See also

plot_error_distribution

A histogram/KDE plot of the same errors.

scipy.stats.probplot

The underlying SciPy function used.

Notes

A Q-Q plot is constructed by plotting the quantiles of two distributions against each other. This function compares the quantiles of the empirical distribution of the forecast errors, \(e_i = y_{true,i} - y_{pred,i}\), against the theoretical quantiles of a standard normal distribution, \(\mathcal{N}(0, 1)\).

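If the two distributions are identical, the points fall exactly along the identity line \(y=x\). More generally, errors drawn from a normal distribution with mean \(\mu\) and standard deviation \(\sigma\) fall along the straight line \(y = \sigma x + \mu\), so it is curvature or any other systematic deviation from a straight line that indicates a departure from normality.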
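The construction described in these notes can be sketched directly with scipy.stats.probplot, which the See also section names as the underlying function. This is a minimal illustration and not the implementation of plot_qq: the synthetic errors, the sample size, and the variable names are assumptions made for the example.

```python
import numpy as np
from scipy import stats

# Synthetic forecast errors e_i = y_true_i - y_pred_i (assumed for illustration)
rng = np.random.default_rng(0)
errors = rng.normal(loc=0.0, scale=5.0, size=2000)

# osm: theoretical quantiles of N(0, 1)
# osr: the errors sorted in ascending order (empirical quantiles)
# slope, intercept, r: least-squares line through the Q-Q points
(osm, osr), (slope, intercept, r) = stats.probplot(errors, dist="norm")

print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.4f}")
```

For normally distributed errors, the fitted slope should land near the standard deviation of the errors (5 here), the intercept near their mean (0 here), and r close to 1.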

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from kdiagram.plot.context import plot_qq
>>>
>>> # Generate synthetic data with normally distributed errors
>>> np.random.seed(0)
>>> n_samples = 200
>>> y_true = np.linspace(0, 50, n_samples)
>>> errors = np.random.normal(0, 5, n_samples) # Normal errors
>>> y_pred = y_true + errors
>>>
>>> df = pd.DataFrame({'actual': y_true, 'predicted': y_pred})
>>>
>>> # Generate the Q-Q plot
>>> ax = plot_qq(
...     df,
...     actual_col='actual',
...     pred_col='predicted',
...     title="Q-Q Plot of Normally Distributed Errors"
... )
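
For contrast, the sketch below compares normal errors with right-skewed errors numerically, without plotting. It calls scipy.stats.probplot directly rather than plot_qq. The exponential error model and the use of the fit correlation r as a rough linearity score are assumptions chosen for illustration, not part of the kdiagram API.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_samples = 500

# Normal errors, as in the example above
normal_errors = rng.normal(0.0, 5.0, n_samples)
# Shifted exponential errors: zero mean in expectation, strongly right-skewed
skewed_errors = rng.exponential(5.0, n_samples) - 5.0

# probplot returns ((theoretical, ordered), (slope, intercept, r))
_, (_, _, r_normal) = stats.probplot(normal_errors, dist="norm")
_, (_, _, r_skewed) = stats.probplot(skewed_errors, dist="norm")

print(f"r (normal errors): {r_normal:.4f}")
print(f"r (skewed errors): {r_skewed:.4f}")
```

The skewed sample should give a visibly lower r, which corresponds to the curved pattern such errors produce on a Q-Q plot.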