Presentation Information
[4Yin-A-12] SAFE: Explaining Group Fairness Metrics via Meta-modeling and SHAP
〇Kazuki Ohara1 (1. Recruit Co., Ltd.)
Keywords: fairness, AI Governance, causal inference
Group fairness metrics such as impact ratio (IR), demographic parity difference (DPD), and equal opportunity
difference (EO) are widely used in audits, yet they are scalar summaries and rarely explain why a metric deteriorates
or why metrics disagree. We propose SAFE, a post-hoc diagnostic framework for explaining subgroup-level fairness
metric variation. SAFE builds a subgroup-level meta-dataset, learns a meta-model from distribution summaries to
metric values, and applies SHAP to reveal key drivers, directional effects, and value-range sensitivity profiles. We
further introduce a bootstrap-based “volatility region” to flag alarms dominated by estimation uncertainty, which
is particularly acute for ratio metrics. On the UCI Adult dataset, SAFE shows that IR exhibits a sharp sensitivity
shift in low-base-rate regimes while EO remains comparatively stable, explaining metric disagreement via distinct
sensitivity profiles.
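The core SAFE pipeline described above (subgroup-level meta-dataset, meta-model from distribution summaries to metric values, Shapley attribution) might be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the meta-features (`base_rate`, `log_size`, `sel_ref`), the toy data-generating mechanism, and the choice of a gradient-boosted meta-model are all assumptions, and Shapley values are computed by brute force (feasible here because the meta-feature space is tiny) rather than with the `shap` library.

```python
import numpy as np
from itertools import combinations
from math import factorial
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# 1. Toy subgroup-level meta-dataset: each row is one subgroup, described by
#    distribution summaries; the target is the subgroup's impact ratio (IR).
n_groups = 200
base_rate = rng.uniform(0.02, 0.5, n_groups)   # subgroup positive base rate
log_size = np.log(rng.integers(50, 2000, n_groups))
sel_ref = rng.uniform(0.2, 0.6, n_groups)      # reference selection rate
# Toy mechanism: subgroup selection rate shrinks in low-base-rate regimes.
sel_grp = np.clip(sel_ref * (0.5 + base_rate), 0.01, 1.0)
ir = sel_grp / sel_ref                         # impact ratio per subgroup

X = np.column_stack([base_rate, log_size, sel_ref])

# 2. Meta-model: learn the map from distribution summaries to the metric.
meta_model = GradientBoostingRegressor(random_state=0).fit(X, ir)

# 3. Exact interventional Shapley values for one subgroup: for each feature j,
#    average its marginal contribution over all coalitions S, replacing
#    absent features with values drawn from a background sample.
def shapley_values(model, x, background):
    p = len(x)
    phi = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for r in range(p):
            for S in combinations(others, r):
                cols = list(S)
                w = factorial(len(cols)) * factorial(p - len(cols) - 1) / factorial(p)
                x_with, x_without = background.copy(), background.copy()
                x_with[:, cols + [j]] = x[cols + [j]]
                x_without[:, cols] = x[cols]
                phi[j] += w * (model.predict(x_with).mean()
                               - model.predict(x_without).mean())
    return phi

# Attribution for one subgroup: which summaries drive its IR value?
phi = shapley_values(meta_model, X[0], X[:50].copy())
```

By the Shapley efficiency property, the attributions sum to the meta-model's prediction for the subgroup minus the mean prediction over the background sample, so each `phi[j]` reads as that summary's contribution to the metric value.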
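The bootstrap-based "volatility region" for ratio metrics might look like the sketch below. The function name, the 95% interval, and the width threshold are illustrative assumptions; the point is that an impact ratio estimated from a small subgroup can trigger an alarm whose bootstrap confidence interval is so wide that the alarm is dominated by estimation uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

def ir_in_volatility_region(sel_grp, sel_ref, n_boot=500, width_thresh=0.3):
    """Flag an impact-ratio estimate whose bootstrap 95% CI is wider than a
    threshold, i.e. an alarm likely dominated by estimation uncertainty.

    sel_grp / sel_ref are 0/1 selection indicators for the subgroup and the
    reference group; IR is the ratio of their selection rates."""
    boots = []
    for _ in range(n_boot):
        g = rng.choice(sel_grp, size=len(sel_grp), replace=True).mean()
        r = rng.choice(sel_ref, size=len(sel_ref), replace=True).mean()
        if r > 0:
            boots.append(g / r)
    lo, hi = np.percentile(boots, [2.5, 97.5])
    return (hi - lo) > width_thresh

# Tiny subgroup (4 of 40 selected) vs. large reference (800 of 2000):
# the IR point estimate is 0.25, but the bootstrap CI is wide -> volatile.
tiny = np.array([1] * 4 + [0] * 36)
ref = np.array([1] * 800 + [0] * 1200)
# Large subgroup (600 of 2000, IR = 0.75): narrow CI -> not volatile.
large = np.array([1] * 600 + [0] * 1400)
```

Because IR divides by a selection rate, small denominators inflate its variance much faster than for difference metrics like EO, which is why the abstract singles out ratio metrics as particularly prone to uncertainty-driven alarms.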
