kdiagram.plot.context.plot_error_pacf

kdiagram.plot.context.plot_error_pacf(df, actual_col, pred_col, title=None, xlabel=None, ylabel=None, figsize=(10, 5), show_grid=True, grid_props=None, savefig=None, dpi=300, **pacf_kwargs)

Plots the partial autocorrelation of forecast errors.

This function creates a Partial Autocorrelation Function (PACF) plot of the forecast errors. It complements the ACF plot by showing the direct relationship between an error and each of its past values, after removing the effects of the intervening lags. This plot requires the statsmodels package.

Additional details can be found in the Error Partial Autocorrelation (PACF) Plot User Guide.

Parameters:
df : pd.DataFrame

The input DataFrame containing the actual and predicted values.

actual_col : str

The name of the column containing the true observed values.

pred_col : str

The name of the column containing the point forecast values.

title : str, optional

The title for the plot.

xlabel : str, optional

The label for the x-axis.

ylabel : str, optional

The label for the y-axis.

figsize : tuple of (float, float), default=(10, 5)

The figure size in inches.

show_grid : bool, default=True

Toggle the visibility of the plot’s grid lines.

grid_props : dict, optional

Custom keyword arguments passed to the grid for styling.

savefig : str, optional

The file path to save the plot. If None, the plot is displayed interactively.

dpi : int, default=300

The resolution (dots per inch) for the saved figure.

**pacf_kwargs

Additional keyword arguments passed directly to the underlying statsmodels.graphics.tsaplots.plot_pacf function.

Returns:
ax : matplotlib.axes.Axes

The Matplotlib Axes object containing the plot.

See also

plot_error_autocorrelation

The companion plot for autocorrelation.

Notes

While the ACF at lag \(k\) shows the total correlation between \(e_t\) and \(e_{t-k}\), the PACF shows the partial correlation: the correlation between \(e_t\) and \(e_{t-k}\) after removing the linear dependence on the intermediate observations \(e_{t-1}, e_{t-2}, \ldots, e_{t-k+1}\).

This isolates the direct relationship at each lag, which makes the PACF a key tool for identifying the order of an autoregressive (AR) process in the residuals: for an AR(\(p\)) process, the partial autocorrelations are theoretically zero beyond lag \(p\).
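
The definition above can be made concrete with a small NumPy-only sketch. It estimates the partial autocorrelation at lag \(k\) as the last coefficient of an ordinary least-squares regression of \(e_t\) on \(e_{t-1}, \ldots, e_{t-k}\). The helper name `pacf_by_regression` is hypothetical and is not part of kdiagram or statsmodels; statsmodels may use a different estimator by default, so values can differ slightly from what the plot shows.

```python
import numpy as np


def pacf_by_regression(e, max_lag):
    """Estimate PACF at lags 1..max_lag (hypothetical helper, illustration only).

    The value at lag k is the coefficient on e_{t-k} when e_t is regressed
    on e_{t-1}, ..., e_{t-k}, i.e. the correlation left after the
    intermediate lags have been accounted for.
    """
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    n = len(e)
    out = []
    for k in range(1, max_lag + 1):
        # Column j holds the series lagged by j + 1; rows cover t = k .. n-1.
        X = np.column_stack([e[k - j - 1 : n - j - 1] for j in range(k)])
        y = e[k:]
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        out.append(coef[-1])
    return np.array(out)


# Simulated AR(2) errors: e_t = 0.6 e_{t-1} - 0.3 e_{t-2} + noise
rng = np.random.default_rng(0)
n = 5000
errors = np.zeros(n)
for t in range(2, n):
    errors[t] = 0.6 * errors[t - 1] - 0.3 * errors[t - 2] + rng.normal()

pacf_vals = pacf_by_regression(errors, max_lag=5)
print(np.round(pacf_vals, 3))
```

For this AR(2) process the theoretical values are \(0.6 / 1.3 \approx 0.46\) at lag 1 and \(-0.3\) at lag 2, with zeros afterwards, so the estimates should land close to those numbers and stay near zero from lag 3 onward.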

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from kdiagram.plot.context import plot_error_pacf
>>>
>>> # Generate synthetic data where errors have an AR(2) structure
>>> np.random.seed(0)
>>> n_samples = 200
>>> y_true = np.linspace(0, 50, n_samples)
>>> errors = np.zeros(n_samples)
>>> errors[0] = np.random.normal(0, 1)
>>> errors[1] = 0.6 * errors[0] + np.random.normal(0, 1)
>>> for t in range(2, n_samples):
...     errors[t] = 0.6 * errors[t-1] - 0.3 * errors[t-2] + np.random.normal(0, 1)
>>> y_pred = y_true - errors  # Subtracting so that error = actual - pred
>>>
>>> df = pd.DataFrame({'actual': y_true, 'predicted': y_pred})
>>>
>>> # Generate the PACF plot
>>> # The plot should show significant spikes at lags 1 and 2
>>> try:
...     ax = plot_error_pacf(
...         df,
...         actual_col='actual',
...         pred_col='predicted',
...         title="PACF of AR(2) Errors"
...     )
... except ImportError:
...     print("Skipping PACF plot: statsmodels is not installed.")
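
As a rough numerical cross-check of what "significant spikes" means in the example above, the sketch below rebuilds the same synthetic data, computes the PACF with the Durbin-Levinson recursion on the sample autocorrelations, and compares each lag against the common large-sample 95% band \(\pm 1.96 / \sqrt{n}\). The helper `pacf_durbin_levinson` is hypothetical and NumPy-only; the band formula is an assumption about how significance is judged, and the confidence region drawn by statsmodels may be computed differently.

```python
import numpy as np


def pacf_durbin_levinson(e, max_lag):
    """PACF at lags 0..max_lag via Durbin-Levinson (hypothetical helper)."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    n = len(e)
    # Sample autocorrelations r[0], ..., r[max_lag]
    r = np.array([np.dot(e[: n - k], e[k:]) for k in range(max_lag + 1)])
    r = r / r[0]
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    phi_prev = np.zeros(0)  # AR coefficients of the order k-1 fit
    for k in range(1, max_lag + 1):
        num = r[k] - np.dot(phi_prev, r[k - 1 : 0 : -1])
        den = 1.0 - np.dot(phi_prev, r[1:k])
        phi_kk = num / den
        # Update the order-k coefficients from the order k-1 ones.
        phi_prev = np.concatenate([phi_prev - phi_kk * phi_prev[::-1], [phi_kk]])
        pacf[k] = phi_kk
    return pacf


# Same synthetic AR(2) errors as in the example above
np.random.seed(0)
n_samples = 200
y_true = np.linspace(0, 50, n_samples)
errors = np.zeros(n_samples)
errors[0] = np.random.normal(0, 1)
errors[1] = 0.6 * errors[0] + np.random.normal(0, 1)
for t in range(2, n_samples):
    errors[t] = 0.6 * errors[t - 1] - 0.3 * errors[t - 2] + np.random.normal(0, 1)
y_pred = y_true - errors

residuals = y_true - y_pred  # error = actual - pred
pacf_vals = pacf_durbin_levinson(residuals, max_lag=10)
band = 1.96 / np.sqrt(n_samples)  # approximate 95% band
significant_lags = [k for k in range(1, 11) if abs(pacf_vals[k]) > band]
print("band:", round(band, 3), "significant lags:", significant_lags)
```

With only 200 samples, an occasional higher lag can cross the band by chance, so the list is a guide for reading the plot rather than a strict test.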