The Dilemma and Path Forward in Defining Algorithmic Fairness: From Statistical Fairness to Causal Fairness

Studies in Science of Science (科学学研究), 2026, Vol. 44, Issue 5: 982-991.

Hu Jiawei (胡嘉伟)

School of Philosophy, Nanjing University

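Received: 2025-04-29 Revised: 2025-09-23 Online: 2026-05-15 Published: 2026-05-15
Corresponding author: Hu Jiawei (胡嘉伟)
Funding:
Major Project of the Jiangsu Provincial Social Science Fund, "Research on Major Philosophical and Logical Problems of New-Generation Artificial Intelligence"; Jiangsu Provincial Graduate Research and Innovation Project, "Research on Imaginative Cognition in Artificial Intelligence"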

Abstract

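Summary: Artificial intelligence algorithms are increasingly important across decision-making scenarios, yet they also face challenges regarding the fairness of their decisions. At present, the mainstream approach to defining algorithmic fairness is the statistical fairness approach, which measures an algorithm's fairness by setting specific statistical metrics. However, owing to its theoretical limitations, the statistical fairness approach faces dilemmas that are difficult to resolve. Research on algorithmic fairness has therefore gradually shifted toward a causal fairness approach grounded in the "causal paradigm." The causal fairness approach not only responds well to the dilemmas facing the statistical fairness approach but also fits a better methodology for algorithmic fairness. In addition, the shift from statistical fairness to causal fairness reflects the core demands of explainability, transparency, and trustworthiness in artificial intelligence.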

Abstract: Artificial intelligence algorithms have become increasingly important in a wide range of decision-making scenarios, yet they also face challenges regarding the fairness of their decisions. Currently, the dominant approach to defining algorithmic fairness is the statistical fairness approach, which measures fairness by setting specific statistical metrics. Its core idea can be summarized as follows: if the algorithmic decision outcomes and the protected attributes are statistically independent under a given metric, then the algorithm is fair according to that metric. Generally, the definitions of fairness under the statistical fairness approach fall into three types: statistical parity, statistical parity in accuracy, and calibration fairness.

However, the statistical fairness approach faces at least three unresolved dilemmas: the "incompatibility of definitions dilemma," the "fairness paradox dilemma," and the "actual label bias dilemma." The "incompatibility of definitions dilemma" is that certain statistical fairness definitions cannot be satisfied simultaneously, which makes it difficult to choose or balance between them. The "fairness paradox dilemma" is that statistical fairness definitions can yield contradictory conclusions about fairness in situations such as Simpson’s paradox, rendering statistical fairness standards ineffective. The "actual label bias dilemma" is that if the actual labels contain historical biases, statistical fairness definitions that rely on these actual outcome labels may perpetuate or even exacerbate those biases.

This paper argues that these three dilemmas stem from the inherent limitations of the statistical paradigm and cannot be fundamentally resolved within the statistical fairness approach. The key to overcoming them lies in shifting the paradigm for defining fairness.
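The "fairness paradox dilemma" described above can be illustrated with a minimal sketch. The admission counts, group names, and department names below are hypothetical and chosen only to produce a Simpson’s paradox; the helper functions are illustrative and are not taken from the paper. The sketch measures one common statistical metric, the statistical parity difference, first on aggregate data and then within each department.

```python
from collections import defaultdict

# Hypothetical admission counts: (group, department, applicants, admitted).
RECORDS = [
    ("A", "X", 80, 48),
    ("A", "Y", 20, 4),
    ("B", "X", 20, 13),
    ("B", "Y", 80, 20),
]


def positive_rates(records, by_department=False):
    """Return P(decision = 1 | group), optionally within each department."""
    totals = defaultdict(lambda: [0, 0])  # key -> [applicants, admitted]
    for group, dept, n, k in records:
        key = (dept, group) if by_department else group
        totals[key][0] += n
        totals[key][1] += k
    return {key: k / n for key, (n, k) in totals.items()}


def parity_gap(rates, key_a, key_b):
    """Statistical parity difference: rate for key_a minus rate for key_b."""
    return rates[key_a] - rates[key_b]


overall = positive_rates(RECORDS)
per_dept = positive_rates(RECORDS, by_department=True)

# Aggregate view: compare the two groups over all departments.
overall_gap = parity_gap(overall, "A", "B")
# Stratified view: compare the two groups inside each department.
dept_gaps = {d: parity_gap(per_dept, (d, "A"), (d, "B")) for d in ("X", "Y")}

print(f"overall gap (A - B): {overall_gap:+.2f}")
for dept, gap in dept_gaps.items():
    print(f"department {dept} gap (A - B): {gap:+.2f}")
```

On these made-up counts the aggregate gap favors group A (about +0.19), while the gap inside each department favors group B (about -0.05). The same metric therefore supports opposite fairness verdicts depending on the level of aggregation, and the data alone do not say which level is the right one.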
Judea Pearl’s theory of causation makes it evident that the statistical fairness approach adopts a data-centric "statistical paradigm," which relies heavily on data fitting while neglecting the causal mechanisms that explain how the data are generated. Within the statistical fairness approach, it is impossible to determine whether protected attributes actually influence the algorithmic predictions, whether such influence is direct or indirect, or how other variables mediate it.

Many researchers in algorithmic fairness have recognized this issue and have consequently pioneered a causal approach to algorithmic fairness. The core idea of the causal fairness approach is to first uncover the causal relationships within statistical data and then use causal inference methods to detect whether protected attributes have a causal effect on algorithmic predictions, thereby measuring algorithmic fairness. This approach relies primarily on Pearl’s structural causal model for causal inference, and its main definitions are "purely interventional fairness" and "counterfactual fairness."

The causal fairness approach can effectively address the three dilemmas faced by the statistical fairness approach by revealing the causal relationships behind statistical data. By measuring purely interventional fairness at the group level and counterfactual fairness at the individual level, it also addresses a shortcoming of the statistical fairness approach, which can only handle group fairness.
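The difference between a group-level interventional check and an individual-level counterfactual check can be sketched on a toy structural causal model. Everything below is an assumption made for illustration: the variables (protected attribute A, mediator M, decision Y), the structural equations, and the parameters `direct` and `indirect` are hypothetical, and the two quantities computed are simplified stand-ins rather than the paper's formal definitions.

```python
import random


def decide(a, u_m, u_y, direct=0.0, indirect=0.0):
    """Toy structural equations for mediator M and decision Y.

    a: protected attribute (0 or 1); u_m, u_y: exogenous noise in [0, 1).
    direct: strength of the path A -> Y; indirect: strength of A -> M.
    """
    # Mediator M: A shifts the probability that M = 1 (path A -> M -> Y).
    p_m = 0.5 + (indirect if a == 1 else -indirect)
    m = 1 if u_m < p_m else 0
    # Decision Y: depends on M, and on A directly when direct > 0.
    p_y = 0.2 + 0.5 * m + direct * a
    return 1 if u_y < p_y else 0


def causal_report(n=20000, seed=0, direct=0.0, indirect=0.0):
    """Estimate a group-level and an individual-level causal quantity.

    A has no parents in this toy model, so do(A = a) amounts to plugging
    a into the structural equations while keeping the noise fixed.
    """
    rng = random.Random(seed)
    diff_sum = 0
    flipped = 0
    for _ in range(n):
        u_m, u_y = rng.random(), rng.random()  # one individual's noise
        y1 = decide(1, u_m, u_y, direct, indirect)  # outcome under do(A = 1)
        y0 = decide(0, u_m, u_y, direct, indirect)  # outcome under do(A = 0)
        diff_sum += y1 - y0
        flipped += (y1 != y0)
    # Group level: estimate of P(Y=1 | do(A=1)) - P(Y=1 | do(A=0)).
    # Individual level: share of individuals whose decision changes when
    # only A is changed and their noise terms are held fixed.
    return diff_sum / n, flipped / n


fair = causal_report()
biased = causal_report(direct=0.2, indirect=0.2)
print("no causal path from A:", fair)
print("direct and indirect paths:", biased)
```

With both paths switched off, the interventional contrast and the share of flipped decisions are exactly zero. With both paths switched on, the contrast is close to 0.40 in this toy model, and the sketch also shows which individuals' decisions depend on A, which a purely statistical metric cannot reveal.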
Moreover, both definitions under the causal fairness approach rest on the well-established theoretical framework of structural causal models; the two definitions are distinct yet complementary and can be unified.

Finally, drawing on Ben Green’s distinction between formal algorithmic fairness and substantive algorithmic fairness, this paper finds that the statistical fairness approach merely imposes certain fairness constraints or measurements on algorithmic decision outcomes at a "formal" level, reflecting the narrow analytical framework of formal algorithmic fairness. In contrast, the causal fairness approach has a broader perspective. Its causal analytical framework can not only identify and address biases in upstream data but also assist in formulating practical policies to mitigate these biases. It can even help evaluate whether the algorithm itself effectively implements these policies. Green’s methodological insight aligns well with the causal fairness approach, and their integration can open up more avenues for research in algorithmic fairness.

Furthermore, the transition from statistical fairness to causal fairness reflects the core demands of explainability, transparency, and trustworthiness in artificial intelligence. Given these demands, this transition represents a highly important trend in algorithmic fairness research.

CLC number:

NO31

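Hu Jiawei. The Dilemma and Path Forward in Defining Algorithmic Fairness: From Statistical Fairness to Causal Fairness[J]. Studies in Science of Science, 2026, 44(5): 982-991.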