kdiagram.plot.feature_based.plot_feature_fingerprint

kdiagram.plot.feature_based.plot_feature_fingerprint(importances, features=None, labels=None, normalize=True, fill=True, cmap='tab10', title='Feature Impact Fingerprint', figsize=None, show_grid=True, savefig=None)

Create a radar chart visualizing feature importance profiles.

This function generates a polar (radar) chart that compares the importance or contribution profiles of a set of features across different groups, conditions, or time periods (e.g., geographical zones, yearly data, different models). Each group is drawn as a distinct polygon (layer), and the resulting shape is often referred to as that group’s ‘fingerprint’. Overlaying the layers makes it easy to identify patterns, dominant features, and shifts in feature influence across groups.

It is particularly useful for model interpretation, allowing a quick comparison of how feature rankings change under different circumstances.

Parameters:
  • importances (array-like of shape (n_layers, n_features)) – The core data containing feature importance values. Each row represents a different layer (e.g., a zone, a year, a model) and each column corresponds to a feature. Can be a list of lists or a NumPy array.

  • features (list of str, optional) – Names of the features corresponding to the columns in importances. The order must match the columns. If None, generic names like ‘feature 1’, ‘feature 2’, etc., will be generated. Default is None.

  • labels (list of str, optional) – Names for each layer (row) in importances. These labels will appear in the legend. If None, generic names like ‘Layer 1’, ‘Layer 2’, etc., will be generated. Default is None.

  • normalize (bool, default True) – If True, normalize the importance values within each layer (row) by dividing by the maximum importance value in that layer, so that non-negative importances fall in the range [0, 1]. This is useful for comparing the shape of the importance profiles independent of their absolute magnitudes. If False, the raw importance values are plotted.

  • fill (bool, default True) – If True, the area enclosed by each layer’s polygon on the radar chart will be filled with a semi-transparent color, enhancing visual distinction between layers. If False, only the outlines are plotted.

  • cmap (str or list, default 'tab10') – Matplotlib colormap name (e.g., ‘viridis’, ‘plasma’, ‘tab10’) or a list of valid color specifications (e.g., [‘red’, ‘#00FF00’, ‘blue’]) to color the different layers. If a colormap name is provided, colors will be sampled from it. If a list is provided, it should ideally have at least as many colors as there are layers.

  • title (str, default "Feature Impact Fingerprint") – The title displayed above the radar chart.

  • figsize (tuple of (float, float), optional) – The width and height of the figure in inches. The signature default is None, in which case the documented default size of (8, 8) applies.

  • show_grid (bool, default True) – If True, display the polar grid lines (both radial and angular) on the plot, which can aid in reading values. If False, the grid is hidden.

  • savefig (str, optional) – The file path (including extension, e.g., ‘fingerprint.png’) where the plot should be saved. If None, the plot is displayed interactively using plt.show() instead of being saved. Default is None.
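The shape and naming conventions described in the parameter list can be sketched with NumPy alone. This is an illustrative assumption, not the library's own validation code: prepare_inputs is a hypothetical helper, the length checks are a design choice of this sketch, and the default name patterns simply follow the parameter descriptions.

```python
import numpy as np


def prepare_inputs(importances, features=None, labels=None):
    """Hypothetical helper (not part of kdiagram): check shapes and fill in names."""
    # Accept a list of lists or a NumPy array and require a 2-D layout.
    matrix = np.asarray(importances, dtype=float)
    if matrix.ndim != 2:
        raise ValueError("importances must have shape (n_layers, n_features)")
    n_layers, n_features = matrix.shape

    # Generic names, following the patterns given in the parameter descriptions.
    if features is None:
        features = [f"feature {j + 1}" for j in range(n_features)]
    if labels is None:
        labels = [f"Layer {i + 1}" for i in range(n_layers)]

    # Assumption of this sketch: mismatched lengths are treated as errors.
    if len(features) != n_features:
        raise ValueError("features must match the number of columns")
    if len(labels) != n_layers:
        raise ValueError("labels must match the number of rows")
    return matrix, list(features), list(labels)


matrix, names, layer_names = prepare_inputs([[0.2, 0.4, 0.1], [0.3, 0.5, 0.2]])
```

Here matrix has two layers and three features, and both name lists are generated because neither features nor labels was supplied.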

Returns:

ax – The Matplotlib Axes object containing the radar chart. This can be used for further customization if needed.

Return type:

matplotlib.axes.Axes

See also

matplotlib.pyplot.polar

Underlying function for polar plots.

numpy.linspace

Used for calculating angles.

Notes

  • The function relies on the helper utilities ensure_2d and columns_manager, which are assumed to be available, for input validation and preprocessing.

  • To create closed polygons, the function appends the first data point and the first angle to the end of their respective lists before plotting each layer.

  • Normalization (normalize=True) scales each layer independently: \(r'_{ij} = r_{ij} / \max_{k}(r_{ik})\), where \(r_{ij}\) is the importance of feature \(j\) for layer \(i\) and the maximum is taken over all features \(k\) of that layer. This can highlight relative importance patterns but obscures absolute magnitude differences between layers.

  • The angular positions of features are evenly spaced around the circle: \(\theta_j = 2 \pi j / N\) for \(j=0, ..., N-1\), where \(N\) is the number of features.
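The effect of per-layer normalization noted above can be seen in a small NumPy sketch. It is not the library's implementation: normalize_rows is a hypothetical name, and the input values are made up for illustration. Two layers that differ tenfold in magnitude but share the same profile become identical after scaling, which is exactly the loss of absolute magnitude the note warns about.

```python
import numpy as np


def normalize_rows(values):
    """Hypothetical helper: divide each row by its own maximum."""
    values = np.asarray(values, dtype=float)
    row_max = values.max(axis=1, keepdims=True)
    # Guard against division by zero for rows whose maximum is 0.
    safe_max = np.where(row_max == 0, 1.0, row_max)
    return values / safe_max


# Illustrative data: same profile shape, tenfold difference in magnitude.
raw = np.array([[0.2, 0.4, 0.1],
                [2.0, 4.0, 1.0]])
scaled = normalize_rows(raw)

# Evenly spaced feature angles: theta_j = 2 * pi * j / N for j = 0, ..., N-1.
angles = np.linspace(0.0, 2.0 * np.pi, raw.shape[1], endpoint=False)
```

Passing endpoint=False to numpy.linspace keeps the angles at \(0, 2\pi/N, \dots, 2\pi(N-1)/N\) so that the last feature does not land on top of the first.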

Let \(\mathbf{R}\) be the input importances matrix of shape \((M, N)\), where \(M\) is the number of layers (labels) and \(N\) is the number of features.

  1. Angle Calculation: Angles for each feature axis are calculated as:

    \[\theta_j = \frac{2 \pi j}{N}, \quad j = 0, 1, \dots, N-1\]
  2. Normalization (if normalize=True): Each row \(\mathbf{r}_i = (r_{i0}, r_{i1}, \dots, r_{i,N-1})\) is normalized:

    \[r'_{ij} = \frac{r_{ij}}{\max_{k}(r_{ik})}\]

    If \(\max_{k}(r_{ik}) = 0\), \(r'_{ij}\) is set to 0. Let \(\mathbf{R}'\) be the matrix of normalized values.

  3. Plotting: For each layer \(i\), the function plots points in polar coordinates \((r'_{ij}, \theta_j)\) (or \((r_{ij}, \theta_j)\) if not normalized). To close the shape, the first point \((r'_{i0}, \theta_0)\) is repeated at the end of the sequence; its angle \(\theta_0 = 0\) coincides with \(2\pi\) on the polar axis. The points are connected by lines, and optionally, the enclosed area is filled.
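The three steps above can be sketched directly with NumPy and Matplotlib. This is a minimal illustration under stated assumptions, not the function's actual source: the variable names, the Agg backend, the fill transparency, and the legend placement are all choices made for this sketch. The data are the raw weights from the yearly example in the Examples section.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Rows are layers (years), columns are features.
R = np.array([
    [0.2, 0.4, 0.1, 0.6, 0.3],
    [0.3, 0.5, 0.2, 0.4, 0.4],
    [0.1, 0.6, 0.2, 0.5, 0.3],
])
names = ["rainfall", "GWL", "seismic", "density", "geo"]
layer_labels = ["2023", "2024", "2025"]
n_layers, n_features = R.shape

# Step 1: evenly spaced angles, theta_j = 2 * pi * j / N.
theta = np.linspace(0.0, 2.0 * np.pi, n_features, endpoint=False)

# Step 2: row-wise normalization; rows whose maximum is 0 are divided by 1.
row_max = R.max(axis=1, keepdims=True)
R_norm = R / np.where(row_max == 0, 1.0, row_max)

# Step 3: close each polygon by repeating the first value and the first angle.
theta_closed = np.append(theta, theta[0])
R_closed = np.column_stack([R_norm, R_norm[:, 0]])

fig, ax = plt.subplots(figsize=(8, 8), subplot_kw={"projection": "polar"})
for row, label in zip(R_closed, layer_labels):
    ax.plot(theta_closed, row, label=label)  # polygon outline
    ax.fill(theta_closed, row, alpha=0.25)   # alpha is an arbitrary choice
ax.set_xticks(theta)
ax.set_xticklabels(names)
ax.legend(loc="upper right")
```

Skipping Step 2 and plotting R instead of R_norm corresponds to normalize=False, and omitting the ax.fill call corresponds to fill=False.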

Examples

>>> import numpy as np
>>> from kdiagram.plot.feature_based import plot_feature_fingerprint

1. Random Example:

>>> np.random.seed(42) # for reproducibility
>>> random_importances = np.random.rand(3, 6) # 3 layers, 6 features
>>> feature_names = [f'Feature {i+1}' for i in range(6)]
>>> layer_labels = ['Model A', 'Model B', 'Model C']
>>> ax = plot_feature_fingerprint(
...     importances=random_importances,
...     features=feature_names,
...     labels=layer_labels,
...     title="Random Feature Importance Comparison",
...     cmap='Set3',
...     normalize=True,
...     fill=True
... )
>>> # plt.show() is called internally if savefig is None

2. Concrete Example (Yearly Weights):

>>> features = ['rainfall', 'GWL', 'seismic', 'density', 'geo']
>>> weights_per_year = [
...     [0.2, 0.4, 0.1, 0.6, 0.3],  # 2023
...     [0.3, 0.5, 0.2, 0.4, 0.4],  # 2024
...     [0.1, 0.6, 0.2, 0.5, 0.3],  # 2025
... ]
>>> years = ['2023', '2024', '2025']
>>> ax_yearly = plot_feature_fingerprint(
...     importances=weights_per_year,
...     features=features,
...     labels=years,
...     title="Feature Influence Over Years",
...     cmap='tab10',
...     normalize=False # Show raw weights
... )
>>> # plt.show() is called internally