Utility Function Examples

This section of the gallery demonstrates the utility functions provided by k-diagram. These functions help identify, validate, and reshape quantile data stored in pandas DataFrames, preparing it for analysis or visualization.

Each example includes Python code using sample data and shows the expected output printed to the console.

Detecting Quantile Columns

Uses detect_quantiles_in() to find columns matching quantile naming patterns (e.g., prefix_date_qX.X or prefix_qX.X). This example filters by prefix and by date, and shows three kinds of return value: matching column names, the quantile levels found, and a DataFrame restricted to the matching columns.

Expected Output
Detecting 'value' columns for 2023:
['value_2023_q0.1', 'value_2023_q0.9']

Detecting all quantile columns (returning levels):
[0.1, 0.5, 0.9]

Detecting 'temp' columns (returning frame):
   temp_2023_q0.5
0              15
1              16
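
As a rough illustration of the matching logic behind the output above, the following sketch reproduces the three results with plain pandas and a regular expression. It does not call detect_quantiles_in(); the helper names find_quantile_columns and quantile_levels, the regular expression, and the sample DataFrame are all assumptions made for this illustration (the sample columns are chosen to mirror the printed names).

```python
import re
import pandas as pd

# Hypothetical sample data, chosen to mirror the column names printed above.
df = pd.DataFrame({
    "site": ["A", "B"],
    "value_2023_q0.1": [10, 11],
    "value_2023_q0.9": [20, 21],
    "value_2024_q0.5": [14, 15],
    "temp_2023_q0.5": [15, 16],
})

# Matches prefix_date_qX.X (four-digit date assumed) or prefix_qX.X.
Q_PATTERN = re.compile(
    r"^(?P<prefix>.+?)(?:_(?P<date>\d{4}))?_q(?P<level>\d*\.?\d+)$"
)

def find_quantile_columns(frame, prefix=None, date=None):
    """Return column names that follow the quantile naming pattern."""
    matches = []
    for col in frame.columns:
        m = Q_PATTERN.match(str(col))
        if m is None:
            continue  # not a quantile column at all
        if prefix is not None and m.group("prefix") != prefix:
            continue
        if date is not None and m.group("date") != str(date):
            continue
        matches.append(col)
    return matches

def quantile_levels(frame):
    """Return the sorted, unique quantile levels found in the column names."""
    cols = find_quantile_columns(frame)
    return sorted({float(Q_PATTERN.match(c).group("level")) for c in cols})

print(find_quantile_columns(df, prefix="value", date=2023))
print(quantile_levels(df))
print(df[find_quantile_columns(df, prefix="temp")])
```

The library function may accept different arguments and handle more naming variants than this pattern does; the sketch only shows how prefix, date, and level can be read from a column name.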

Building Quantile Column Names

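Uses build_q_column_names() to construct the expected quantile column names from a prefix, a date, and a list of quantile levels, and to check that they exist in a DataFrame. As the second call in the output shows, names that are not present in the DataFrame are left out of the result.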

Expected Output
Building names for 2024, quantiles 0.1, 0.9:
['precip_2024_q0.1', 'precip_2024_q0.9']

Building names for 2025, quantiles 0.1, 0.9 (one missing):
['precip_2025_q0.1']
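
The name-building and existence check shown above can be approximated in a few lines of plain Python. This is a sketch, not the build_q_column_names() implementation: the helper existing_quantile_names, its parameters, and the sample DataFrame are hypothetical and serve only to mirror the printed output.

```python
import pandas as pd

# Hypothetical sample data; precip_2025_q0.9 is deliberately absent.
df = pd.DataFrame({
    "precip_2024_q0.1": [1.0, 1.2],
    "precip_2024_q0.9": [3.0, 3.4],
    "precip_2025_q0.1": [1.1, 1.3],
})

def existing_quantile_names(frame, prefix, date, quantiles):
    """Build prefix_date_qX.X names and keep only those present in frame."""
    candidates = [f"{prefix}_{date}_q{q}" for q in quantiles]
    return [name for name in candidates if name in frame.columns]

print(existing_quantile_names(df, "precip", 2024, [0.1, 0.9]))
print(existing_quantile_names(df, "precip", 2025, [0.1, 0.9]))
```

Silently dropping missing names is convenient for exploratory work; a stricter variant could raise an error instead when any expected column is absent.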

Reshaping Quantile Data (Wide to Semi-Long)

Uses reshape_quantile_data() to transform wide-format quantile data (e.g., prefix_date_qX.X columns) into a format where each row is a location/time combination and different quantiles become columns (e.g., prefix_qX.X).

Expected Output
Original Wide DataFrame:
     lon    lat  subs_2022_q0.1  subs_2022_q0.5  subs_2023_q0.1  subs_2023_q0.5
0 -118.25  34.05             1.2             1.5             1.7             1.9
1 -118.30  34.10             1.3             1.6             1.8             2.0

Reshaped (Semi-Long) DataFrame:
     lon    lat  year  subs_q0.1  subs_q0.5
0 -118.25  34.05  2022        1.2        1.5
1 -118.30  34.10  2022        1.3        1.6
2 -118.25  34.05  2023        1.7        1.9
3 -118.30  34.10  2023        1.8        2.0
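
To make the transformation above concrete, here is a minimal pandas-only sketch that produces the same semi-long layout from the same wide table. It stands in for reshape_quantile_data() rather than calling it; the helper wide_to_semi_long, its parameter names, and the assumption of a four-digit date in the column name are all illustrative choices.

```python
import re
import pandas as pd

wide = pd.DataFrame({
    "lon": [-118.25, -118.30],
    "lat": [34.05, 34.10],
    "subs_2022_q0.1": [1.2, 1.3],
    "subs_2022_q0.5": [1.5, 1.6],
    "subs_2023_q0.1": [1.7, 1.8],
    "subs_2023_q0.5": [1.9, 2.0],
})

def wide_to_semi_long(frame, prefix, id_cols, dt_name="year"):
    """Stack prefix_date_qX.X columns into one block of rows per date."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d{{4}})_q(\d*\.?\d+)$")

    # Group the wide columns by date and record their date-free names.
    by_date = {}
    for col in frame.columns:
        m = pattern.match(col)
        if m:
            date, level = m.groups()
            by_date.setdefault(date, {})[col] = f"{prefix}_q{level}"

    pieces = []
    for date, rename_map in by_date.items():
        piece = frame[id_cols + list(rename_map)].rename(columns=rename_map)
        piece.insert(len(id_cols), dt_name, int(date))  # add the date column
        pieces.append(piece)
    return pd.concat(pieces, ignore_index=True)

semi_long = wide_to_semi_long(wide, "subs", ["lon", "lat"])
print(semi_long)
```

Each date contributes one copy of the identifier columns, so the result has (number of rows) x (number of dates) rows and one column per quantile level.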

Melting Quantile Data (Wide to Long)

Uses melt_q_data() to convert a wide-format DataFrame into a fully long (“tidy”) format with separate columns for time, quantile level, and the measurement value.

(Note: The exact output structure of melt_q_data might depend on its specific implementation; this example shows a typical “melted” structure.)

Expected Output (Illustrative Long Format)
Original Wide DataFrame:
     lon    lat  subs_2022_q0.1  subs_2022_q0.5  subs_2023_q0.1
0 -118.25  34.05             1.2             1.5             1.7
1 -118.30  34.10             1.3             1.6             1.8

Melted (Long) DataFrame:
     lon    lat  year  quantile  subs
0 -118.25  34.05  2022       0.1   1.2
1 -118.30  34.10  2022       0.1   1.3
2 -118.25  34.05  2022       0.5   1.5
3 -118.30  34.10  2022       0.5   1.6
4 -118.25  34.05  2023       0.1   1.7
5 -118.30  34.10  2023       0.1   1.8
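
The illustrative long format above can be reproduced with pandas' own DataFrame.melt followed by parsing the former column names. This sketch does not use melt_q_data(); the helper wide_to_long, its parameters, and the temporary column name "column" are assumptions for the illustration.

```python
import re
import pandas as pd

wide = pd.DataFrame({
    "lon": [-118.25, -118.30],
    "lat": [34.05, 34.10],
    "subs_2022_q0.1": [1.2, 1.3],
    "subs_2022_q0.5": [1.5, 1.6],
    "subs_2023_q0.1": [1.7, 1.8],
})

def wide_to_long(frame, prefix, id_cols, dt_name="year"):
    """Melt prefix_date_qX.X columns into date, quantile, and value columns."""
    name_re = rf"{re.escape(prefix)}_\d{{4}}_q\d*\.?\d+"
    value_cols = [c for c in frame.columns if re.fullmatch(name_re, c)]

    # One row per (original row, quantile column) pair.
    tidy = frame.melt(
        id_vars=id_cols,
        value_vars=value_cols,
        var_name="column",
        value_name=prefix,
    )

    # Recover the date and quantile level from the former column name.
    parts = tidy["column"].str.extract(r"_(\d{4})_q(\d*\.?\d+)$")
    tidy[dt_name] = parts[0].astype(int)
    tidy["quantile"] = parts[1].astype(float)
    return tidy[id_cols + [dt_name, "quantile", prefix]]

tidy = wide_to_long(wide, "subs", ["lon", "lat"])
print(tidy)
```

Because melt stacks the value columns in their original order, the rows come out grouped by column (2022 q0.1, then 2022 q0.5, then 2023 q0.1), which matches the ordering shown above.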

Pivoting Quantile Data (Long to Wide)

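Uses pivot_q_data() to convert long-format data back into wide format, where each time step and quantile combination becomes a separate column (e.g., prefix_date_qX.X). In this example the input has one row per location and year and one column per quantile (prefix_qX.X), the same semi-long layout that reshape_quantile_data() produces. Note that the row order and the order of the identifier columns in the output differ from the input.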

Expected Output
Original Long DataFrame:
     lon    lat  year  subs_q0.1  subs_q0.5
0 -118.25  34.05  2022        1.2        1.5
1 -118.30  34.10  2022        1.3        1.6
2 -118.25  34.05  2023        1.7        1.9
3 -118.30  34.10  2023        1.8        2.0

Pivoted (Wide) DataFrame:
     lat      lon  subs_2022_q0.1  subs_2022_q0.5  subs_2023_q0.1  subs_2023_q0.5
0  34.10 -118.300             1.3             1.6             1.8             2.0
1  34.05 -118.250             1.2             1.5             1.7             1.9
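
For comparison, the same long-to-wide step can be sketched with DataFrame.pivot and a flattening of the resulting column index. This is a stand-in, not the pivot_q_data() implementation; the helper semi_long_to_wide and its parameters are hypothetical, and the identifier columns come out in the order passed in (lon, lat) rather than the order printed above.

```python
import pandas as pd

semi_long = pd.DataFrame({
    "lon": [-118.25, -118.30, -118.25, -118.30],
    "lat": [34.05, 34.10, 34.05, 34.10],
    "year": [2022, 2022, 2023, 2023],
    "subs_q0.1": [1.2, 1.3, 1.7, 1.8],
    "subs_q0.5": [1.5, 1.6, 1.9, 2.0],
})

def semi_long_to_wide(frame, prefix, id_cols, dt_col="year"):
    """Spread prefix_qX.X columns into one prefix_date_qX.X column per date."""
    q_cols = [c for c in frame.columns if c.startswith(f"{prefix}_q")]
    wide = frame.pivot(index=id_cols, columns=dt_col, values=q_cols)

    # Columns are now (quantile_column, date) pairs; flatten them into
    # prefix_date_qX.X names.
    wide.columns = [
        f"{prefix}_{date}_{q_col[len(prefix) + 1:]}"
        for q_col, date in wide.columns
    ]
    wide = wide[sorted(wide.columns)]  # group the columns by date
    return wide.reset_index()

wide = semi_long_to_wide(semi_long, "subs", ["lon", "lat"])
print(wide)
```

DataFrame.pivot sorts the result by the index values, so rows may appear in a different order than in the input; it also raises an error if an identifier and date combination occurs more than once.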