kdiagram.plot.context.plot_qq
- kdiagram.plot.context.plot_qq(df, actual_col, pred_col, title=None, xlabel=None, ylabel=None, figsize=(7, 7), show_grid=True, grid_props=None, savefig=None, dpi=300, **scatter_kwargs)
Generates a Quantile-Quantile (Q-Q) plot of forecast errors.
This function creates a Q-Q plot, a standard graphical method for comparing a dataset’s distribution to a theoretical distribution (in this case, the normal distribution). It is an essential tool for visually checking if the forecast errors are normally distributed, a key assumption for many statistical methods.
More details in Q-Q Plot User Guide.
- Parameters:
- df
pd.DataFrame The input DataFrame containing the actual and predicted values.
- actual_col
str The name of the column containing the true observed values.
- pred_col
str The name of the column containing the point forecast values.
- title
str, optional The title for the plot.
- xlabel
str, optional The label for the x-axis.
- ylabel
str, optional The label for the y-axis.
- figsize
tuple of (float, float), default=(7, 7) The figure size in inches.
- show_grid
bool, default=True Toggle the visibility of the plot’s grid lines.
- grid_props
dict, optional Custom keyword arguments passed to the grid for styling.
- savefig
str, optional The file path to save the plot. If None, the plot is displayed interactively.
- dpi
int, default=300 The resolution (dots per inch) for the saved figure.
- **scatter_kwargs
Additional keyword arguments passed directly to the underlying scatter plot for the data points.
- Returns:
- ax
matplotlib.axes.Axes The Matplotlib Axes object containing the plot.
See also
plot_error_distribution: A histogram/KDE plot of the same errors.
scipy.stats.probplot: The underlying SciPy function used.
Notes
A Q-Q plot is constructed by plotting the quantiles of two distributions against each other. This function compares the quantiles of the empirical distribution of the forecast errors, \(e_i = y_{true,i} - y_{pred,i}\), against the theoretical quantiles of a standard normal distribution, \(\mathcal{N}(0, 1)\).
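If the errors follow a standard normal distribution exactly, the points fall approximately along the identity line \(y=x\). More generally, normally distributed errors with mean \(\mu\) and standard deviation \(\sigma\) fall along the straight line \(y = \mu + \sigma x\). Systematic deviations from a straight line, such as curvature or diverging tails, indicate a departure from normality.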
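The construction described in these notes can be sketched directly with `scipy.stats.probplot`, which the See also section names as the underlying SciPy function. This is a minimal illustration on synthetic data, not the library's actual implementation; the variable names and the random data are assumptions made for the example.

```python
import numpy as np
from scipy import stats

# Synthetic data (illustrative only): true values plus normal noise.
rng = np.random.default_rng(0)
y_true = np.linspace(0, 50, 200)
y_pred = y_true + rng.normal(0, 5, 200)

# Forecast errors, e_i = y_true,i - y_pred,i
errors = y_true - y_pred

# probplot returns the theoretical N(0, 1) quantiles, the sorted sample
# values, and a least-squares line fitted through the resulting points.
(theoretical_q, ordered_errors), (slope, intercept, r) = stats.probplot(
    errors, dist="norm"
)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")
```

For normally distributed errors, the fitted slope roughly estimates the error standard deviation and the intercept roughly estimates the mean error, while `r` close to 1 indicates that the points lie near a straight line.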
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from kdiagram.plot.context import plot_qq
>>>
>>> # Generate synthetic data with normally distributed errors
>>> np.random.seed(0)
>>> n_samples = 200
>>> y_true = np.linspace(0, 50, n_samples)
>>> errors = np.random.normal(0, 5, n_samples)  # Normal errors
>>> y_pred = y_true + errors
>>>
>>> df = pd.DataFrame({'actual': y_true, 'predicted': y_pred})
>>>
>>> # Generate the Q-Q plot
>>> ax = plot_qq(
...     df,
...     actual_col='actual',
...     pred_col='predicted',
...     title="Q-Q Plot of Normally Distributed Errors"
... )
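The example above uses normally distributed errors. As a contrast, the following sketch compares normal and right-skewed errors numerically, using the correlation coefficient that `scipy.stats.probplot` reports for its fitted line. The skewed error model (a shifted exponential) and all variable names are assumptions chosen for illustration; the comparison is a rough numeric companion to the visual check, not a formal normality test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_samples = 200

# Two illustrative error samples with similar scale.
normal_errors = rng.normal(0, 5, n_samples)
skewed_errors = rng.exponential(5, n_samples) - 5  # right-skewed, mean near 0

# The third fit value is the correlation between the sample quantiles
# and the theoretical normal quantiles.
_, (_, _, r_normal) = stats.probplot(normal_errors, dist="norm")
_, (_, _, r_skewed) = stats.probplot(skewed_errors, dist="norm")

print(f"normal errors: r = {r_normal:.3f}")
print(f"skewed errors: r = {r_skewed:.3f}")
```

A lower correlation for the skewed sample corresponds to the curved pattern that such errors would produce on the Q-Q plot.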