kdiagram.utils.detect_quantiles_in
- kdiagram.utils.detect_quantiles_in(df, col_prefix=None, dt_value=None, mode='soft', return_types='columns', verbose=0)
Detect quantile columns in a DataFrame using naming patterns and value validation.
Identifies columns containing quantile data through structured naming conventions and value validation [1]. Supports both absolute and normalized quantile representations through mode-based value adjustment [2].
- Parameters:
- df
pd.DataFrame Input DataFrame containing potential quantile columns. Column names must be strings.
- col_prefix
str, optional Column name prefix for targeted search (e.g., 'price' for price_q0.25). If None, scans all columns.
- dt_value
list of str, optional Date filters for temporal quantile detection (e.g., ['2023'] matches columns like price_2023_q0.5).
- mode {'soft', 'strict'}, default='soft'
Value handling strategy:
  - 'soft': Clips out-of-range quantile values into [0, 1] (e.g., values > 1 become 1.0)
  - 'strict': Excludes values outside the [0, 1] range
- return_types {'columns', 'q_val', 'values', 'frame'}, default='columns'
Return format specification:
  - 'columns': List of column names
  - 'q_val': Sorted unique quantile values
  - 'values': Column data arrays
  - 'frame': DataFrame subset
- verbose {0, 1, 2, 3}, default=0
Output verbosity:
  - 0: Silent
  - 1: Basic scan info
  - 2: Per-column matches
  - 3: Full diagnostic output
- Returns:
Union[List[str], List[float], List[np.ndarray], pd.DataFrame, None] Quantile data in the format specified by return_types. Returns None if no quantiles are detected.
See also
- kdiagram.utils.validate_quantiles: For quantile value validation
- pandas.DataFrame.filter: For column selection by pattern
Notes
The detection adjustment can be formulated as:

\[
q_{\text{adj}} =
\begin{cases}
\min(1, \max(0, q_{\text{raw}})) & \text{if mode} = \text{'soft'} \\
q_{\text{raw}} & \text{if } q_{\text{raw}} \in [0,1] \text{ and mode} = \text{'strict'}
\end{cases}
\tag{1}
\]

Column name pattern requirements:
- Requires a _qX suffix, where X is numeric
- Temporal format: {prefix}_{date}_q{value}
- Non-temporal format: {prefix}_q{value}

Value adjustment in soft mode uses a piecewise function:
- Clips values to the [0, 1] range
- Preserves original values within the valid range
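The piecewise adjustment in equation (1) can be illustrated with a minimal standalone sketch. The helper name `adjust_quantile` is hypothetical and is not part of kdiagram; returning None for an excluded value in strict mode is an assumption, since the equation leaves that case undefined.

```python
from typing import Optional


def adjust_quantile(q_raw: float, mode: str = "soft") -> Optional[float]:
    """Hypothetical helper mirroring equation (1); not kdiagram's own code."""
    if mode == "soft":
        # Clip into [0, 1]; values already in range are returned unchanged.
        return min(1.0, max(0.0, q_raw))
    if mode == "strict":
        # Keep only values already inside [0, 1].
        # Assumption: an excluded value is signalled with None.
        return q_raw if 0.0 <= q_raw <= 1.0 else None
    raise ValueError(f"mode must be 'soft' or 'strict', got {mode!r}")


raw_values = (0.25, 150.0, -0.1)
soft = [adjust_quantile(q, "soft") for q in raw_values]
strict = [adjust_quantile(q, "strict") for q in raw_values]
print("soft:", soft)
print("strict:", strict)
```

In this sketch, soft mode maps every raw value to some quantile in [0, 1], while strict mode drops anything that is not already a valid quantile.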
References
Examples
>>> from kdiagram.utils.diagnose_q import detect_quantiles_in
>>> import pandas as pd
>>>
>>> # Basic detection
>>> df = pd.DataFrame({'sales_q0.25': [4.2], 'sales_q0.75': [5.8]})
>>> detect_quantiles_in(df, col_prefix='sales')
['sales_q0.25', 'sales_q0.75']
>>>
>>> # Temporal quantile filtering
>>> df = pd.DataFrame({'temp_2023_q0.5': [22.1], 'temp_2024_q0.5': [23.4]})
>>> detect_quantiles_in(df, dt_value=['2023'], return_types='q_val')
[0.5]
>>>
>>> # Value normalization
>>> df = pd.DataFrame({'risk_q150': [0.8]})
>>> detect_quantiles_in(df, mode='soft', return_types='q_val')
[1.0]
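The naming conventions listed in the Notes ({prefix}_q{value} and {prefix}_{date}_q{value}) can also be explored without kdiagram. The sketch below uses a regular expression written for illustration only; the pattern, the assumption that the date token is a four-digit year, and the helper name `parse_quantile_column` are all hypothetical and may differ from what the library actually does.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: "{prefix}", an optional "_{date}" (four digits here),
# then "_q{value}" with a numeric value. Illustration only, not kdiagram's regex.
QUANTILE_COLUMN = re.compile(
    r"^(?P<prefix>.+?)(?:_(?P<date>\d{4}))?_q(?P<value>\d+(?:\.\d+)?)$"
)


def parse_quantile_column(name: str) -> Optional[Tuple[str, Optional[str], float]]:
    """Split a column name into (prefix, date, raw quantile value).

    Returns None when the name does not follow the assumed convention.
    """
    match = QUANTILE_COLUMN.match(name)
    if match is None:
        return None
    # match["date"] is None when the optional date group did not participate.
    return match["prefix"], match["date"], float(match["value"])


columns = ["sales_q0.25", "temp_2023_q0.5", "risk_q150", "region"]
for column in columns:
    print(column, "->", parse_quantile_column(column))
```

The raw value parsed here (for example 150.0 from risk_q150) is what the mode setting would then clip or exclude.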