.. _gallery_uncertainty:
==============================
Uncertainty Visualizations
==============================
This page showcases plots designed for exploring, diagnosing, and
communicating predictive uncertainty with ``k-diagram``.

.. note::
   Run the code snippets locally to generate the plot images
   referenced below (e.g., ``../images/gallery_actual_vs_predicted.png``).
   Make sure the paths in the ``.. image::`` directives match where
   you save the plots (likely an ``images`` subdirectory relative to
   this file, e.g., ``../images/``).

.. _gallery_plot_actual_vs_predicted:
----------------------
Actual vs. Predicted
----------------------
Compares actual observed values against point predictions (e.g.,
Q50) sample-by-sample. Useful for assessing basic accuracy and
bias.
.. code-block:: python
:linenos:
import kdiagram as kd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# --- Data Generation ---
np.random.seed(66)
n_points = 120
df = pd.DataFrame({'sample': range(n_points)})
signal = 20 + 15 * np.cos(np.linspace(0, 6 * np.pi, n_points))
df['actual'] = signal + np.random.randn(n_points) * 3
df['predicted'] = signal * 0.9 + np.random.randn(n_points) * 2 + 2
# --- Plotting ---
kd.plot_actual_vs_predicted(
df=df,
actual_col='actual',
pred_col='predicted',
title='Gallery: Actual vs. Predicted (Dots)',
line=False, # Use dots instead of lines
r_label="Value",
actual_props={'s': 25, 'alpha': 0.7, 'color':'black'}, # Explicit color
pred_props={'s': 35, 'marker': 'x', 'alpha': 0.7, 'color':'red'}, # Explicit color & size
# Save the plot (adjust path relative to docs/source/)
savefig="gallery/images/gallery_actual_vs_predicted.png"
)
plt.close() # Close the plot window after saving
.. image:: ../images/gallery_actual_vs_predicted.png
:alt: Actual vs. Predicted Plot Example
:align: center
:width: 70%
.. topic:: 🧠 Analysis and Interpretation
:class: hint
This polar plot provides a direct visual comparison between actual
values (black dots) and model-predicted medians (red crosses, Q50)
across a set of samples arranged angularly (by index). Each
point's distance from the center (radius) corresponds to the
magnitude of its value.
**Key Insights:**
* **Accuracy & Discrepancies:** Close alignment between black dots
and red crosses indicates accurate predictions for that sample.
Deviations highlight errors. The grey connecting lines (if
``line=True``) emphasize the error magnitude and direction.
* **Systematic Bias:** Look for consistent patterns where red
crosses are generally inside (under-prediction) or outside
(over-prediction) the black dots.
* **Outliers:** Samples with unusually large gaps between actual
and predicted values are easily spotted.
**🔍 In this Example:**
* The points form clear cyclical patterns, matching the
underlying cosine wave used in data generation.
* Predictions (red crosses) generally track the actual values
(black dots) but exhibit some scatter (noise) and slight
magnitude differences, particularly near the peaks (outer radius)
and troughs (inner radius) of the cycle.
* Because the simulated predictions are ``0.9 * signal + 2``, red
crosses tend to sit slightly inside the black dots near the peaks
and slightly outside them near the troughs: the model damps the
cycle's amplitude rather than under-predicting uniformly.
**💡 When to Use:**
Use this plot as a primary diagnostic tool to:
* Get an initial visual assessment of point-forecast accuracy.
* Quickly identify overall model bias (systematic over/under
prediction).
* Spot specific samples or regions (if angle is meaningful)
with large prediction errors.
* Complement numerical scores (MAE, RMSE) with an intuitive
overview of model fit, especially for cyclical or ordered data.
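
The visual impression can be backed by numbers. The following is a
minimal sketch, using only NumPy and the same synthetic recipe as the
example above, of the mean bias, MAE, and RMSE that this plot is meant
to complement. The variable names are illustrative and are not part of
``k-diagram``.

.. code-block:: python

   import numpy as np

   # Same synthetic recipe as the gallery example above
   np.random.seed(66)
   n_points = 120
   signal = 20 + 15 * np.cos(np.linspace(0, 6 * np.pi, n_points))
   actual = signal + np.random.randn(n_points) * 3
   predicted = signal * 0.9 + np.random.randn(n_points) * 2 + 2

   error = predicted - actual
   bias = error.mean()                  # > 0 means over-prediction on average
   mae = np.abs(error).mean()           # mean absolute error
   rmse = np.sqrt((error ** 2).mean())  # root mean squared error

   # Slope of predicted vs. actual; a value below 1 indicates damping
   slope = np.polyfit(actual, predicted, 1)[0]

   print(f"bias={bias:.2f}  MAE={mae:.2f}  RMSE={rmse:.2f}  slope={slope:.2f}")

A slope below 1 with a small overall bias is the numerical counterpart
of crosses sitting inside the dots at peaks and outside them at troughs.
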
.. _gallery_plot_anomaly_magnitude:
--------------------
Anomaly Magnitude
--------------------
Highlights instances where the actual value falls outside the
prediction interval [Qlow, Qup]. Shows the location (angle), type
(color), and severity (radius) of anomalies.
.. code-block:: python
:linenos:
import kdiagram as kd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# --- Data Generation ---
np.random.seed(42)
n_points = 180
df = pd.DataFrame({'sample_id': range(n_points)})
df['actual'] = np.random.normal(loc=20, scale=5, size=n_points)
df['q10'] = df['actual'] - np.random.uniform(2, 6, size=n_points)
df['q90'] = df['actual'] + np.random.uniform(2, 6, size=n_points)
# Add anomalies
under_indices = np.random.choice(n_points, 20, replace=False)
df.loc[under_indices, 'actual'] = df.loc[under_indices, 'q10'] - \
np.random.uniform(1, 5, size=20)
available = list(set(range(n_points)) - set(under_indices))
over_indices = np.random.choice(available, 20, replace=False)
df.loc[over_indices, 'actual'] = df.loc[over_indices, 'q90'] + \
np.random.uniform(1, 5, size=20)
# --- Plotting ---
kd.plot_anomaly_magnitude(
df=df,
actual_col='actual',
q_cols=['q10', 'q90'],
title="Gallery: Prediction Anomaly Magnitude",
cbar=True,
s=30,
verbose=0, # Keep output clean for gallery
# Save the plot (adjust path relative to docs/source/)
savefig="gallery/images/gallery_anomaly_magnitude.png"
)
plt.close()
.. image:: ../images/gallery_anomaly_magnitude.png
:alt: Anomaly Magnitude Plot Example
:align: center
:width: 75%
.. topic:: 🧠 Analysis and Interpretation
:class: hint
The **Anomaly Magnitude Plot** provides valuable insights into
prediction interval failures, showing how far actual values
deviate when they fall outside the predicted bounds (e.g.,
Q10 and Q90). Only points representing anomalies are plotted.
**Key Features:**
* **Angle (θ):** Represents the sample's position or index
in the dataset, arranged circularly.
* **Radius (r):** Directly corresponds to the **magnitude** of
the anomaly (:math:`|y_{actual} - y_{bound}|`). Larger radii
indicate more severe prediction interval failures.
* **Color:** Distinguishes the **type** of anomaly using
different colormaps (defaults: Blues for under-prediction,
Reds for over-prediction).
* **Color Intensity:** Further emphasizes the anomaly's
**severity**, with darker/more intense colors typically
representing larger magnitudes (larger radius).
**🔍 In this Example:**
* The plot clearly separates under-predictions (blue points,
where ``actual < q10``) and over-predictions (red points,
where ``actual > q90``).
* Points further from the center represent larger deviations from
the predicted [q10, q90] range. We can visually identify the
most significant prediction failures.
* The angular distribution shows where these failures occur within
the sample order. Clusters might indicate problematic regimes.
**💡 When to Use:**
This plot is essential for diagnosing model uncertainty calibration
and identifying high-risk predictions:
* **Pinpoint Interval Failures:** Identify exactly which samples
fall outside the expected range.
* **Assess Anomaly Severity:** Quantify *how far* outside the bounds
the actual values lie.
* **Analyze Error Type:** Determine if the model tends to fail more
often through under-prediction or over-prediction.
* **Guide Model Refinement:** Focus attention on samples/regions
with large anomalies where uncertainty estimation needs improvement.
It offers a sample-by-sample view of where and how prediction
*intervals* fail (spatially or temporally focused when the angle
encodes location or time), complementing plots that assess point
forecast accuracy.
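
The anomaly type and magnitude described above can be sketched with
plain NumPy. This is an illustration of the definition
:math:`|y_{actual} - y_{bound}|` on hypothetical values, not the
library's internal implementation.

.. code-block:: python

   import numpy as np

   # Small hand-made example (hypothetical values)
   actual = np.array([5.0, 12.0, 20.0, 31.0])
   q10 = np.array([8.0, 10.0, 15.0, 20.0])
   q90 = np.array([12.0, 14.0, 25.0, 28.0])

   under = actual < q10   # actual below the lower bound
   over = actual > q90    # actual above the upper bound

   # Distance to the violated bound; 0 where the interval holds
   magnitude = np.where(under, q10 - actual,
                        np.where(over, actual - q90, 0.0))
   anomaly_type = np.where(under, "under", np.where(over, "over", "none"))

   for kind, size in zip(anomaly_type, magnitude):
       print(kind, size)

Only the samples labelled ``under`` or ``over`` would appear on the
plot, at a radius equal to their magnitude.
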
.. _gallery_plot_overall_coverage:
--------------------
Overall Coverage
--------------------
Calculates and displays the overall empirical coverage rate(s)
compared to the nominal rate. Useful for comparing average
interval calibration across models. Shown here with a radar plot
for two simulated models.
.. code-block:: python
:linenos:
import kdiagram as kd
import numpy as np
import matplotlib.pyplot as plt
# --- Data Generation ---
np.random.seed(42)
y_true = np.random.rand(100) * 10
# Noisy interval centers (unit-variance error around the truth)
center = y_true + np.random.normal(scale=1.0, size=100)
# Model 1 (~80% expected coverage): half-width 1.28 around the center
y_pred_q1 = np.column_stack([center - 1.28, center + 1.28])
# Model 2 (~60% expected coverage): narrower half-width 0.84
y_pred_q2 = np.column_stack([center - 0.84, center + 0.84])
q_levels = [0.1, 0.9] # Nominal 80% interval
# --- Plotting ---
kd.plot_coverage(
y_true,
y_pred_q1,
y_pred_q2,
names=['Model A (Wider)', 'Model B (Narrower)'],
q=q_levels,
kind='radar', # Use radar chart for profile comparison
title='Gallery: Overall Coverage Comparison (Radar)',
cov_fill=True,
verbose=0,
# Save the plot (adjust path relative to docs/source/)
savefig="gallery/images/gallery_coverage_radar.png"
)
plt.close()
.. image:: ../images/gallery_coverage_radar.png
:alt: Overall Coverage Radar Plot Example
:align: center
:width: 70%
.. topic:: 🧠 Analysis and Interpretation
:class: hint
This plot compares the **overall empirical coverage rate**
between two simulated models using a radar plot. It helps assess
the **interval calibration** across models, evaluating how well
their predicted intervals (e.g., Q10 to Q90, implying 80%
nominal coverage here) capture the actual values *on average*
over the dataset.
**🔍 In this Example:**
* **Model A (Wider):** Exhibits a higher coverage rate
(closer to the outer edge, likely near the target 80% or
higher). This indicates its wider prediction intervals
successfully encompass a larger fraction of the true values.
While seemingly safer, it might suggest the model is
conservative, potentially overestimating uncertainty.
* **Model B (Narrower):** Shows a lower coverage rate (points
closer to the center). Its narrower intervals fail to capture
the true value more often. This model might seem more precise
but likely underestimates uncertainty, increasing the risk of
errors where reality falls outside the predicted range.
The radar layout contrasts the coverage profiles directly. Points
closer to the outer boundary (radius 1.0) indicate higher empirical
coverage; a well-calibrated model sits near the nominal rate (0.8
here) rather than at the boundary.
**💡 When to Use This Plot:**
* **Comparing Interval Calibration:** Ideal for a high-level
comparison of how well different models' uncertainty estimates
are calibrated (on average). Is one model consistently too wide
(over-covered) or too narrow (under-covered)?
* **Model Selection:** Aids in selecting a model based on risk
tolerance. Model A might be preferred for risk-averse tasks,
while Model B might be chosen if tighter (though less reliable)
intervals are desired.
* **Summarizing Reliability:** Provides a concise summary of the
average reliability of prediction intervals.
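
The empirical coverage rate is simply the fraction of observations
that fall inside their interval. Below is a minimal sketch, assuming
intervals centred on a forecast with unit-variance Gaussian error (an
assumption made for this illustration); ``empirical_coverage`` is a
hypothetical helper, not a ``k-diagram`` function.

.. code-block:: python

   import numpy as np

   rng = np.random.default_rng(0)
   n = 10_000
   y_true = rng.uniform(0, 10, size=n)
   # Assumption: interval centers are the truth plus unit-variance noise
   center = y_true + rng.normal(scale=1.0, size=n)

   def empirical_coverage(y, lower, upper):
       """Fraction of observations falling inside [lower, upper]."""
       return np.mean((y >= lower) & (y <= upper))

   # Half-widths chosen so the expected coverage is about 80% and 60%
   cov_wide = empirical_coverage(y_true, center - 1.2816, center + 1.2816)
   cov_narrow = empirical_coverage(y_true, center - 0.8416, center + 0.8416)
   print(f"wide: {cov_wide:.3f}  narrow: {cov_narrow:.3f}")

Comparing each value with the nominal rate (0.8 for a Q10 to Q90
interval) shows whether a model is over- or under-covered on average.
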
.. _gallery_plot_coverage_diagnostic:
----------------------
Coverage Diagnostic
----------------------
Visualizes coverage success (radius 1) or failure (radius 0) for
each individual data point. Helps diagnose *where* intervals fail.
The solid line shows the overall average coverage rate. Shown here
using bars.
.. code-block:: python
:linenos:
import kdiagram as kd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# --- Data Generation ---
np.random.seed(88)
n_points = 200
df = pd.DataFrame({'point_id': range(n_points)})
df['actual_val'] = np.random.normal(loc=5, scale=1.5, size=n_points)
df['q_lower'] = 5 - np.random.uniform(1, 3, n_points)
df['q_upper'] = 5 + np.random.uniform(1, 3, n_points)
# Some points deliberately outside
df.loc[::15, 'actual_val'] = df.loc[::15, 'q_upper'] + 1
# --- Plotting ---
kd.plot_coverage_diagnostic(
df=df,
actual_col='actual_val',
q_cols=['q_lower', 'q_upper'],
title='Gallery: Point-wise Coverage Diagnostic (Bars)',
as_bars=True, # Display as bars instead of scatter
fill_gradient=True, # Show background gradient
coverage_line_color='darkorange', # Example customization
verbose=0,
# Save the plot (adjust path relative to docs/source/)
savefig="gallery/images/gallery_coverage_diagnostic_bars.png"
)
plt.close()
.. image:: ../images/gallery_coverage_diagnostic_bars.png
:alt: Coverage Diagnostic Plot Example (Bars)
:align: center
:width: 75%
.. topic:: 🧠 Analysis and Interpretation
:class: hint
This plot provides a **point-wise coverage diagnostic**, showing
if the actual value for *each sample* falls within the
prediction interval (e.g., Q10-Q90). Each bar (or point if
``as_bars=False``) represents one sample, arranged angularly
by index.
**🔍 Key Insights from this Example:**
* **Bar Height/Radius:** Indicates coverage status. A bar
reaching radius **1** means the actual value was *inside* the
interval (success). A bar at radius **0** means the actual
value was *outside* (failure).
* **Color (Implied):** Although not the primary focus here,
the points/bars are often colored by coverage status (e.g.,
using the `cmap` parameter, green for 1, red for 0).
* **Average Coverage Line:** The solid circular line (orange
in this example code's customization) is drawn at the
radius corresponding to the **overall coverage rate**
(e.g., 0.75 if 75% of points are covered). This provides an
immediate visual benchmark against the nominal target (e.g.,
0.80 for a Q10-Q90 interval) and the plot boundaries (0 & 1).
* **Patterns:** Look for clusters of bars at radius 0. These
indicate ranges of samples (or specific conditions if the
angle represented something else) where the model's intervals
consistently fail.
**💡 When to Use This Plot:**
* **Diagnosing Interval Failures:** Go beyond the average score
provided by ``plot_coverage`` to see *which specific samples*
are missed by the prediction intervals.
* **Identifying Systematic Errors:** Determine if coverage
failures are random or concentrated in certain parts of the
data distribution (represented by angles).
* **Visual Calibration Assessment:** Get a detailed view of how
well the empirical coverage matches the nominal rate
point-by-point, complementing the overall average line.
* **Guiding Model Improvement:** Pinpoint problematic samples
or regimes where uncertainty quantification needs refinement.
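
The per-sample indicator and the average line can be reproduced in a
few lines. This sketch uses hypothetical values and plain NumPy; it
mirrors the description above rather than the library's code.

.. code-block:: python

   import numpy as np

   # Hypothetical values for six samples
   actual = np.array([4.8, 5.5, 9.0, 5.1, 8.7, 8.9])
   lower = np.array([3.0, 4.0, 3.5, 2.5, 3.0, 3.2])
   upper = np.array([7.0, 6.5, 7.5, 6.0, 7.0, 7.1])

   # 1 where the actual value is inside the interval, 0 otherwise
   covered = ((actual >= lower) & (actual <= upper)).astype(int)
   coverage_rate = covered.mean()          # radius of the average line
   missed = np.flatnonzero(covered == 0)   # indices of the failures

   print(covered, coverage_rate, missed)

Consecutive indices in ``missed`` (here the last two samples)
correspond to the clusters of bars at radius 0 that the plot makes
visible.
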
.. _gallery_plot_interval_consistency:
-------------------------
Interval Consistency
-------------------------
Analyzes the stability of the prediction interval width (Qup - Qlow)
for each location over multiple time steps. Radius shows
variability (CV or Std Dev); color often shows average Q50. High
radius means inconsistent width.
.. code-block:: python
:linenos:
import kdiagram as kd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# --- Data Generation ---
np.random.seed(42)
n_points = 100
n_years = 4
years = list(range(2021, 2021 + n_years))
df = pd.DataFrame({'id': range(n_points)})
qlow_cols, qup_cols, q50_cols = [], [], []
for i, year in enumerate(years):
    ql, qu, q50 = f'val_{year}_q10', f'val_{year}_q90', f'val_{year}_q50'
    qlow_cols.append(ql)
    qup_cols.append(qu)
    q50_cols.append(q50)
    base_low = np.random.rand(n_points)*5 + i*0.2
    # Width varies by year: the sine term grows with i
    width = np.random.rand(n_points)*3 + 1 + np.sin(
        np.linspace(0, np.pi, n_points))*i
    df[ql] = base_low
    df[qu] = base_low + width
    df[q50] = base_low + width/2 + np.random.randn(n_points)*0.5
# --- Plotting ---
kd.plot_interval_consistency(
df=df,
qlow_cols=qlow_cols,
qup_cols=qup_cols,
q50_cols=q50_cols, # Color by average Q50
use_cv=True, # Radius = Coefficient of Variation of width
title='Gallery: Interval Width Consistency (CV)',
acov='half_circle',
cmap='viridis',
# Save the plot (adjust path relative to docs/source/)
savefig="gallery/images/gallery_interval_consistency_cv.png"
)
plt.close()
.. image:: ../images/gallery_interval_consistency_cv.png
:alt: Interval Consistency Plot Example
:align: center
:width: 75%
.. topic:: 🧠 Analysis and Interpretation
:class: hint
This plot analyzes the **stability** of prediction interval
widths (e.g., Q90 - Q10) over multiple time steps or forecast
horizons for different samples (locations/indices arranged
angularly).
**Key Features:**
* **Radius (r):** Corresponds to the **variability** of the
interval width over time for each sample. By default
(``use_cv=True``), it shows the **Coefficient of Variation (CV)**,
representing relative variability. If ``use_cv=False``, it shows
the standard deviation (absolute variability).
* *Large Radius:* High inconsistency (width fluctuates a lot).
* *Small Radius:* High consistency (width is stable).
* **Color:** Typically represents the **average Q50** (median
prediction) across the time steps for each sample, providing
context about the prediction magnitude. Darker/cooler colors
often indicate lower average Q50, brighter/warmer colors
indicate higher average Q50 (depending on the `cmap`).
* **Angle (θ):** Represents the sample index or location.
**🔍 Key Insights from this Example:**
* Points far from the center indicate locations where the model's
uncertainty estimate (interval width) is **less stable** across
the different years included in the data.
* Points clustered near the center represent locations with
**consistent** interval widths over time.
* The color mapping (using `viridis`) shows whether high/low
consistency (radius) correlates with high/low average predicted
values (color). For instance, are the most inconsistent
predictions (large radius) happening in areas predicted to have
high values (yellow) or low values (purple)?
**💡 When to Use This Plot:**
* **Assess Model Stability:** Identify samples/locations where
uncertainty predictions are erratic or stable over time/horizons.
* **Diagnose Uncertainty Drift:** While other plots show average
drift, this shows the *variability* aspect of drift for each
point.
* **Compare Relative vs. Absolute Variability:** Toggle `use_cv`
to understand if large fluctuations are significant relative to
the mean width (CV) or just large in absolute terms (Std Dev).
* **Guide Risk Assessment:** Focus on predictions where interval
widths are stable (low radius) for more reliable planning, and
treat predictions with high variability (high radius) with more
caution.
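
The two radius options can be sketched as follows, on hypothetical
bounds for two locations. The population standard deviation
(``ddof=0``) is an assumption of this sketch; the convention used
inside ``k-diagram`` is not confirmed here.

.. code-block:: python

   import numpy as np
   import pandas as pd

   # Hypothetical bounds for two locations over three years
   df = pd.DataFrame({
       'q10_2021': [1.0, 1.0], 'q90_2021': [3.0, 2.0],
       'q10_2022': [2.0, 1.0], 'q90_2022': [4.0, 3.0],
       'q10_2023': [0.5, 1.0], 'q90_2023': [2.5, 4.0],
   })
   qlow_cols = ['q10_2021', 'q10_2022', 'q10_2023']
   qup_cols = ['q90_2021', 'q90_2022', 'q90_2023']

   # Interval width per location (rows) and year (columns)
   widths = df[qup_cols].to_numpy() - df[qlow_cols].to_numpy()
   std_width = widths.std(axis=1)      # absolute variability (ddof=0 assumed)
   mean_width = widths.mean(axis=1)
   cv_width = std_width / mean_width   # relative variability

   print(std_width, cv_width)

The first location has a constant width of 2 and therefore zero
variability; the second has widths 1, 2, 3 and would be plotted
further from the center.
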
.. _gallery_plot_interval_width:
-------------------
Interval Width
-------------------
Visualizes the magnitude of the prediction interval width (Qup - Qlow)
for each sample at a **single time point**. Radius directly represents
the width. Color can represent width or an optional third variable
(`z_col`), here showing the Q50 prediction.
.. code-block:: python
:linenos:
import kdiagram as kd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# --- Data Generation ---
np.random.seed(77)
n_points = 150
df = pd.DataFrame({'location': range(n_points)})
df['elevation'] = np.linspace(100, 500, n_points) # Example feature
df['q10_val'] = np.random.rand(n_points) * 20
# Width depends on elevation in this synthetic example
width = 5 + (df['elevation'] / 100) * np.random.uniform(0.5, 2, n_points)
df['q90_val'] = df['q10_val'] + width
df['q50_val'] = df['q10_val'] + width / 2 # Use as z_col
# --- Plotting ---
kd.plot_interval_width(
df=df,
q_cols=['q10_val', 'q90_val'],
z_col='q50_val', # Color points by Q50 value
title='Gallery: Interval Width (Colored by Q50)',
cmap='plasma',
cbar=True,
s=30,
# Save the plot (adjust path relative to docs/source/)
savefig="gallery/images/gallery_interval_width_z.png"
)
plt.close()
.. image:: ../images/gallery_interval_width_z.png
:alt: Interval Width Plot Example
:align: center
:width: 75%
.. topic:: 🧠 Analysis and Interpretation
:class: hint
This plot shows the **magnitude of predicted uncertainty**,
represented by the interval width (e.g., Q90 - Q10), for each
sample at a specific time point or forecast horizon.
**Key Features:**
* **Radius (r):** Directly proportional to the **interval width**.
Larger radius means greater predicted uncertainty for that sample.
* **Angle (θ):** Represents the sample index or location, arranged
circularly.
* **Color:** Represents the value of the column specified by the
``z_col`` parameter (here, the Q50 median prediction). If
``z_col`` is not provided, color defaults to representing the
interval width (radius).
**🔍 Key Insights from this Example:**
* We can visually identify samples with the widest (points furthest
from center) and narrowest (points closest to center) prediction
intervals.
* The ``plasma`` colormap colors points by their Q50 value (yellow =
high Q50, purple = low Q50). By combining radius and color, we
can assess if higher uncertainty (larger radius) tends to occur
for samples with higher or lower median predictions (color). In
this synthetic example, width grows with 'elevation', and Q50 is
built as ``q10 + width / 2``, so wider intervals come with only
slightly higher Q50 values and any radius-color pattern is weak.
**💡 When to Use This Plot:**
* **Visualize Uncertainty Magnitude:** Get a direct overview of how
much uncertainty the model predicts for each sample.
* **Identify High/Low Uncertainty Samples:** Quickly spot the most
and least certain predictions.
* **Explore Correlations:** Use the ``z_col`` parameter to investigate
if uncertainty width correlates with other factors like the
magnitude of the prediction itself (Q50), actual values, or
input features.
* **Assess Spatial Patterns:** If the angle represented spatial
location, this plot could reveal geographical areas of high/low
predicted uncertainty.
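
To check numerically what the radius and color suggest, the width and
its correlation with candidate variables can be computed directly with
pandas. This sketch reuses the synthetic recipe from the example
above; the ``width`` column name is introduced here for illustration.

.. code-block:: python

   import numpy as np
   import pandas as pd

   np.random.seed(77)
   n_points = 150
   df = pd.DataFrame({'elevation': np.linspace(100, 500, n_points)})
   df['q10_val'] = np.random.rand(n_points) * 20
   width = 5 + (df['elevation'] / 100) * np.random.uniform(0.5, 2, n_points)
   df['q90_val'] = df['q10_val'] + width
   df['q50_val'] = df['q10_val'] + width / 2

   # The quantity encoded by the radius
   df['width'] = df['q90_val'] - df['q10_val']

   # Pearson correlations with a feature and with the color variable
   corr_elev = df['width'].corr(df['elevation'])
   corr_q50 = df['width'].corr(df['q50_val'])
   print(f"corr(width, elevation)={corr_elev:.2f}")
   print(f"corr(width, q50)={corr_q50:.2f}")

   # The three widest intervals
   print(df.nlargest(3, 'width')[['elevation', 'width']])
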
.. _gallery_plot_model_drift:
----------------
Model Drift
----------------
Shows how *average* uncertainty (mean interval width) evolves
across different forecast horizons using a polar bar chart. Helps
diagnose model degradation over lead time.
.. code-block:: python
:linenos:
import kdiagram as kd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# --- Data Generation ---
np.random.seed(0)
years = [2023, 2024, 2025, 2026, 2027]
n_samples = 50
df = pd.DataFrame()
q10_cols, q90_cols = [], []
for i, year in enumerate(years):
    ql, qu = f'val_{year}_q10', f'val_{year}_q90'
    q10_cols.append(ql)
    q90_cols.append(qu)
    # Interval width tends to increase with the horizon index i
    q10 = np.random.rand(n_samples)*5 + i*0.5
    q90 = q10 + np.random.rand(n_samples)*2 + 1 + i*0.8
    df[ql] = q10
    df[qu] = q90
# --- Plotting ---
kd.plot_model_drift(
df=df,
q10_cols=q10_cols,
q90_cols=q90_cols,
horizons=years, # Label bars with years
acov='quarter_circle', # Use 90 degree span
title='Gallery: Model Drift Across Horizons',
# Save the plot (adjust path relative to docs/source/)
savefig="gallery/images/gallery_model_drift.png"
)
plt.close()
.. image:: ../images/gallery_model_drift.png
:alt: Model Drift Plot Example
:align: center
:width: 70%
.. topic:: 🧠 Analysis and Interpretation
:class: hint
This **Model Drift** plot uses a polar bar chart to visualize
how the **average uncertainty** (mean interval width, Q90-Q10)
evolves across different **forecast horizons** (years in this
case, arranged angularly).
**Analysis and Interpretation:**
* **Radius (Avg. Uncertainty Width):** The length of each bar
(its radius) directly represents the average width of the
prediction intervals for that specific horizon. Longer bars mean
wider average intervals and thus higher average uncertainty for
that forecast lead time.
* **Angle (Horizon):** Each bar corresponds to a successive
forecast horizon (e.g., 2023, 2024,...), arranged around the
circle.
* **Color Gradient:** The color often transitions (e.g., cool to
warm colors via the default `coolwarm` cmap) along the angular
axis, visually reinforcing the progression through forecast
horizons.
**🔍 Key Insights from this Example:**
* The bars **increase in length** as we move from earlier years
(e.g., 2023) to later years (e.g., 2027) along the angular
axis. This clearly indicates **model drift**: the average
uncertainty grows as the forecast horizon extends further into
the future.
* The **color transition** from blue towards red mirrors
this increase in uncertainty over time.
* This pattern is typical in forecasting, reflecting the
increasing difficulty and accumulated error when predicting
further ahead. The plot helps quantify this degradation rate.
**💡 When to Use This Plot:**
* **Assess Uncertainty Evolution:** Evaluate if and how quickly
average forecast uncertainty increases with lead time.
* **Monitor Model Degradation:** Identify horizons where the
uncertainty becomes unacceptably large, indicating the limits
of the model's reliable forecast range.
* **Inform Retraining/Updates:** Significant drift can signal the
need to retrain the model more frequently or incorporate
time-dependent features.
* **Communicate Forecast Reliability:** Show stakeholders how
confidence in forecasts typically decreases for longer lead times.
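
The bar lengths correspond to the mean interval width per horizon. A
minimal sketch with the same synthetic recipe as above, using only
pandas and NumPy, makes the degradation rate explicit; ``drift`` is
an illustrative name.

.. code-block:: python

   import numpy as np
   import pandas as pd

   np.random.seed(0)
   years = [2023, 2024, 2025, 2026, 2027]
   n_samples = 50
   df = pd.DataFrame()
   q10_cols, q90_cols = [], []
   for i, year in enumerate(years):
       ql, qu = f'val_{year}_q10', f'val_{year}_q90'
       q10_cols.append(ql)
       q90_cols.append(qu)
       q10 = np.random.rand(n_samples) * 5 + i * 0.5
       df[ql] = q10
       df[qu] = q10 + np.random.rand(n_samples) * 2 + 1 + i * 0.8

   # Mean interval width per horizon: the quantity each bar represents
   mean_width = (df[q90_cols].to_numpy() - df[q10_cols].to_numpy()).mean(axis=0)
   drift = pd.Series(mean_width, index=years)
   print(drift.round(2))
   print("step-to-step change:", np.diff(mean_width).round(2))

Positive step-to-step changes at every horizon are the numerical
signature of the growing bars in the plot.
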
.. _gallery_plot_temporal_uncertainty:
-------------------------
Temporal Uncertainty
-------------------------
A general polar scatter plot for visualizing multiple data series.
Often used to show different quantiles (e.g., Q10, Q50, Q90) for a
*single* time step to illustrate the uncertainty spread across
samples.
.. code-block:: python
:linenos:
import kdiagram as kd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# --- Data Generation ---
np.random.seed(99)
n_points = 80
df = pd.DataFrame({'id': range(n_points)})
base = 10 + 5*np.sin(np.linspace(0, 2*np.pi, n_points))
df['val_q10'] = base - np.random.rand(n_points)*2 - 1
df['val_q50'] = base + np.random.randn(n_points)*0.5
df['val_q90'] = base + np.random.rand(n_points)*2 + 1
# Ensure order for clarity in plot
df['val_q50'] = np.maximum(df['val_q10'] + 0.1, df['val_q50'])
df['val_q90'] = np.maximum(df['val_q50'] + 0.1, df['val_q90'])
# --- Plotting ---
kd.plot_temporal_uncertainty(
df=df,
q_cols=['val_q10', 'val_q50', 'val_q90'],
names=['Q10', 'Q50', 'Q90'],
title='Gallery: Uncertainty Spread (Q10, Q50, Q90)',
normalize=False, # Show raw values
cmap='coolwarm', # Use diverging map for bounds
s=20,
mask_angle=True,
# Save the plot (adjust path relative to docs/source/)
savefig="gallery/images/gallery_temporal_uncertainty_quantiles.png"
)
plt.close()
.. image:: ../images/gallery_temporal_uncertainty_quantiles.png
:alt: Temporal Uncertainty Plot Example (Quantiles)
:align: center
:width: 75%
.. topic:: 🧠 Analysis and Interpretation
:class: hint
This plot uses a **polar scatter** format to visualize the
**spread of uncertainty** at a single time point by plotting
multiple related series, typically different **quantile
predictions** (like Q10, Q50, Q90 shown here).
**Analysis and Interpretation:**
* **Angle (θ):** Each angular position represents a unique sample
or location from the dataset (ordered by index here).
* **Radius (r):** The distance from the center represents the
**actual predicted value** for a specific quantile at that sample
(since ``normalize=False`` was used).
* **Color:** Each quantile series (Q10, Q50, Q90) is assigned a
distinct color (using the `coolwarm` cmap here, blue for Q10,
red for Q90), allowing visual differentiation.
**🔍 Key Insights from this Example:**
* The **radial distance between the blue (Q10) and red (Q90) points**
at any given angle visually represents the **prediction interval
width** (uncertainty magnitude) for that specific sample.
* We can see how this **spread varies** around the circle. Some
samples (angles) have a larger gap between blue and red points
(higher uncertainty), while others have a smaller gap (lower
uncertainty).
* The grey points (Q50, median) trace the central tendency, lying
between the Q10 and Q90 bounds.
* The overall pattern follows the sinusoidal base signal used in
the data generation.
**💡 When to Use This Plot:**
* **Visualize Interval Spread:** Show the range between lower and
upper quantile bounds for each sample simultaneously at a specific
time/horizon.
* **Compare Multiple Series:** Plot predictions from different
models side-by-side against the same angular axis.
* **Identify Uncertainty Patterns:** See if uncertainty (spread
between quantiles) correlates with sample index or location
(angle) or with the magnitude of the prediction (radius/color).
* **Check Quantile Ordering:** Visually verify that Q10 <= Q50 <= Q90
holds for most samples.
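
The quantile-ordering check and the per-sample spread can also be
computed directly. This sketch rebuilds the synthetic data from the
example above and counts quantile crossings before and after the
ordering correction.

.. code-block:: python

   import numpy as np
   import pandas as pd

   np.random.seed(99)
   n_points = 80
   base = 10 + 5 * np.sin(np.linspace(0, 2 * np.pi, n_points))
   df = pd.DataFrame({
       'val_q10': base - np.random.rand(n_points) * 2 - 1,
       'val_q50': base + np.random.randn(n_points) * 0.5,
       'val_q90': base + np.random.rand(n_points) * 2 + 1,
   })

   # Quantile crossings before any correction
   crossed = (df['val_q10'] > df['val_q50']) | (df['val_q50'] > df['val_q90'])
   print("samples with crossed quantiles:", int(crossed.sum()))

   # Enforce Q10 <= Q50 <= Q90, as the gallery example does
   df['val_q50'] = np.maximum(df['val_q10'] + 0.1, df['val_q50'])
   df['val_q90'] = np.maximum(df['val_q50'] + 0.1, df['val_q90'])
   ordered = (df['val_q10'] <= df['val_q50']) & (df['val_q50'] <= df['val_q90'])

   # Radial gap between the Q10 and Q90 points at each angle
   spread = df['val_q90'] - df['val_q10']
   print("all ordered:", bool(ordered.all()))
   print(spread.describe().round(2))
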
.. _gallery_plot_uncertainty_drift:
--------------------
Uncertainty Drift
--------------------
Visualizes how the interval width pattern evolves across multiple time
steps using concentric rings. Each ring represents a time step,
showing the relative uncertainty width at each angle (location).
.. code-block:: python
:linenos:
import kdiagram as kd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# --- Data Generation ---
np.random.seed(55)
n_points = 90
n_years = 4
years = range(2020, 2020 + n_years)
df = pd.DataFrame({'id': range(n_points)})
qlow_cols, qup_cols = [], []
for i, year in enumerate(years):
    ql, qu = f'value_{year}_q10', f'value_{year}_q90'
    qlow_cols.append(ql)
    qup_cols.append(qu)
    base_low = np.random.rand(n_points)*3 + i*0.1
    width = (np.random.rand(n_points)+0.5)*(1.5+i*0.3 + np.cos(
        np.linspace(0, 2*np.pi, n_points)))
    df[ql] = base_low
    df[qu] = base_low + width
    df[qu] = np.maximum(df[qu], df[ql]) # Ensure non-negative width
# --- Plotting ---
kd.plot_uncertainty_drift(
df=df,
qlow_cols=qlow_cols,
qup_cols=qup_cols,
dt_labels=[str(y) for y in years],
title='Gallery: Uncertainty Drift (Rings)',
cmap='magma',
base_radius=0.1, band_height=0.1,
# Save the plot (adjust path relative to docs/source/)
savefig="gallery/images/gallery_uncertainty_drift_rings.png"
)
plt.close()
.. image:: ../images/gallery_uncertainty_drift_rings.png
:alt: Uncertainty Drift Rings Plot Example
:align: center
:width: 75%
.. topic:: 🧠 Analysis and Interpretation
:class: hint
This plot displays how the **prediction interval width pattern**
(Q90-Q10) changes over multiple **time steps** (e.g., years)
using **concentric rings**. Each ring represents a specific time
step, ordered radially outwards.
**Analysis and Interpretation:**
* **Rings & Time:** Each colored ring corresponds to a time step
(e.g., 2020 near center, 2023 further out). The legend links
colors to time steps.
* **Radius & Uncertainty:** The radius of a point on a specific
ring represents the **relative interval width** for that sample
at that time. The radius is calculated as a base offset for the
ring plus a component scaled by the *globally normalized* width.
Therefore, bulges or larger radii on a ring indicate higher
relative uncertainty for those samples at that time.
* **Comparing Rings:** Observe how the overall size and shape of
the rings change from inner (earlier) to outer (later).
Increasing average radius or increased "bumpiness" in outer
rings suggests **uncertainty drift** - uncertainty grows or becomes
more variable over time.
* **Angular Patterns:** Consistent high/low radii at specific angles
across multiple rings pinpoint locations/samples with persistently
high/low relative uncertainty.
**🔍 Key Insights from this Example:**
* The concentric rings clearly separate the uncertainty patterns for
different years (2020-2023).
* Comparing the rings reveals how the spatial distribution and
magnitude of relative uncertainty change over the forecast horizon.
For instance, one might observe uncertainty increasing overall
(outer rings generally larger) or becoming more pronounced in
certain angular sectors (locations).
* Potential cyclic patterns in width along the angular axis might
suggest seasonal or location-based effects on uncertainty.
**💡 When to Use This Plot:**
* **Visualize Uncertainty Evolution:** Track how the *entire
pattern* of uncertainty changes across multiple forecast periods.
* **Identify Temporal Drift Patterns:** See if uncertainty increases
uniformly, or only in specific regions/samples over time.
* **Compare Uncertainty Maps:** Overlay and compare the "uncertainty
map" (relative interval width vs. sample index/location) from
different time steps in a single view.
* **Assess Long-Term Reliability:** Evaluate if the model's
uncertainty estimates remain stable or degrade significantly as
forecasts extend further out.
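
The ring construction described above (a per-ring base offset plus a
component scaled by the globally normalized width) can be sketched as
follows. The exact offset rule, ``base_radius * (i + 1)``, and the
normalization by the global maximum are assumptions of this sketch,
not confirmed details of ``plot_uncertainty_drift``; the widths are
hypothetical.

.. code-block:: python

   import numpy as np

   rng = np.random.default_rng(55)
   n_points, n_steps = 90, 4
   # Hypothetical interval widths, one row per time step
   widths = np.stack([
       (rng.random(n_points) + 0.5) * (1.5 + 0.3 * i)
       for i in range(n_steps)
   ])

   base_radius, band_height = 0.1, 0.1
   # Global normalization: divide by the maximum over all steps (assumed)
   norm = widths / widths.max()

   # Assumed layout: ring i starts at base_radius * (i + 1)
   ring_offsets = base_radius * (np.arange(n_steps)[:, None] + 1)
   radii = ring_offsets + band_height * norm   # shape (n_steps, n_points)
   theta = np.linspace(0, 2 * np.pi, n_points, endpoint=False)

   print(radii.mean(axis=1).round(3))

Because the normalization is global, a ring whose widths are larger
overall bulges further beyond its own offset, which is what makes
rings comparable with one another.
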
.. _gallery_plot_prediction_velocity:
----------------------
Prediction Velocity
----------------------
Visualizes the average rate of change (velocity) of the median (Q50)
prediction over consecutive time periods for each location. Radius
indicates velocity magnitude; color can indicate velocity or average
Q50.
.. code-block:: python
:linenos:
import kdiagram as kd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# --- Data Generation ---
np.random.seed(123)
n_points = 100
years = range(2020, 2024)
df = pd.DataFrame({'location_id': range(n_points)})
q50_cols = []
base_val = np.random.rand(n_points)*10
trend = np.linspace(0, 5, n_points)
for i, year in enumerate(years):
    q50_col = f'val_{year}_q50'
    q50_cols.append(q50_col)
    noise = np.random.randn(n_points)*0.5
    df[q50_col] = base_val + trend*i + noise
# --- Plotting ---
kd.plot_velocity(
df=df,
q50_cols=q50_cols,
title='Gallery: Prediction Velocity (Colored by Avg Q50)',
use_abs_color=True, # Color by magnitude of Q50
normalize=True, # Normalize radius (velocity)
cmap='cividis',
cbar=True,
s=25,
# Save the plot (adjust path relative to docs/source/)
savefig="gallery/images/gallery_velocity_abs_color.png"
)
plt.close()
.. image:: ../images/gallery_velocity_abs_color.png
:alt: Prediction Velocity Plot Example
:align: center
:width: 75%
.. topic:: 🧠 Analysis and Interpretation
:class: hint
This plot visualizes the **average rate of change (velocity)**
of the median (Q50) prediction across consecutive time periods.
Each point represents a sample/location.
**Analysis and Interpretation:**
* **Radius (Velocity Magnitude):** The distance from the center
indicates the **average speed** at which the Q50 prediction
is changing over time for that sample. Larger radii mean
faster average change (positive or negative); smaller radii
mean more stable Q50 predictions. (Note: If ``normalize=False``,
radius shows raw velocity).
* **Angle (θ):** Represents the sample index/location, arranged
circularly.
* **Color (Context):** The color provides context.
* If ``use_abs_color=True`` (default, as in this example):
Color maps to the **average absolute Q50 value** across
periods. This helps see if rapid changes (high radius)
occur in high-value (e.g., yellow in ``cividis``) or
low-value (e.g., dark blue) regions.
* If ``use_abs_color=False``: Color maps directly to the
**velocity value**. Using a diverging colormap (like
'coolwarm') distinguishes between positive velocity
(increasing trend) and negative velocity (decreasing trend).
**🔍 Key Insights from this Example:**
* Points far from the center highlight locations where the median
prediction is changing most rapidly on average between the years
provided.
* The ``cividis`` colormap shows the average magnitude of the Q50
prediction at each location. We can observe if the high-velocity
points (large radius) coincide with high-magnitude (yellow) or
low-magnitude (dark blue) predictions.
* Clustering of points with similar radius/color might indicate
spatial patterns in the phenomenon's dynamics.
**💡 When to Use This Plot:**
* **Identify Dynamic Hotspots:** Find samples/locations where the
central forecast trend is changing most quickly.
* **Assess Prediction Stability:** Locate areas where predictions
are relatively stable (low velocity) vs. dynamic (high velocity).
* **Contextualize Change Rate:** Use ``use_abs_color=True`` to see
if rapid changes are happening in already critical high/low value
areas. Use ``use_abs_color=False`` with a diverging map to see
the direction (increase/decrease) of the average change.
* **Analyze Temporal Trends Spatially:** Understand the spatial
distribution of the rate of change across different locations.
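
The average velocity is the mean of the differences between
consecutive Q50 columns. Below is a minimal sketch on hypothetical
values. The min-max scaling is one possible reading of
``normalize=True``; how the library maps velocity to radius is not
confirmed here.

.. code-block:: python

   import numpy as np

   # Hypothetical Q50 predictions: rows are locations, columns are years
   q50 = np.array([
       [1.0, 2.0, 3.0, 4.0],    # steady increase
       [5.0, 5.0, 5.0, 5.0],    # stable
       [4.0, 2.0, 0.0, -2.0],   # steady decrease
   ])

   # Average change between consecutive periods (signed)
   velocity = np.diff(q50, axis=1).mean(axis=1)

   # Min-max scaling to [0, 1] (assumed normalization)
   span = velocity.max() - velocity.min()
   radius = (velocity - velocity.min()) / span if span > 0 else np.zeros_like(velocity)

   # Context variable used for color when use_abs_color=True
   avg_abs_q50 = np.abs(q50).mean(axis=1)

   print(velocity, radius, avg_abs_q50)

Keeping the signed ``velocity`` alongside the scaled radius is useful
when a diverging colormap is used to separate increasing from
decreasing trends.
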