By Meike Zehlike, Philipp Hacker, and Emil Wiedemann
Discrimination by algorithms is increasingly perceived as a societal and legal problem. In response, a number of criteria for implementing algorithmic fairness in machine learning have been developed in the literature. However, some of them are known to contradict each other philosophically, mathematically, or both. In recently published work, BCCP Senior Fellow Philipp Hacker, together with co-authors Meike Zehlike and Emil Wiedemann, proposes the continuous fairness algorithm (CFAθ), which enables a continuous interpolation between two contradictory fairness definitions, namely individual and group fairness. Individual fairness is commonly understood as "similar individuals should be treated similarly." Group fairness posits that the chance of receiving a positive outcome should be equal across all protected demographic groups.
Consider loan approval as an example: individuals are assessed based on their creditworthiness, which is expressed as a credit score. Credit scores, in turn, are calculated using various individual features, such as net income, credit history, and marital status. One can see directly why the two definitions above may contradict each other: individual fairness requires that people with similar credit scores have equal chances of getting loan approval. Group fairness requires that different demographic groups have the same chances of getting loan approval. However, if a demographic group has been discriminated against throughout history, this group is less likely to achieve the same credit scores as its non-discriminated counterpart. The concept of group fairness then requires treating individuals from the discriminated group more favorably than those from the non-discriminated group, which in turn contradicts individual fairness. At the same time, individual fairness may fail to account for historic and ongoing injustice, setting the status quo in stone.
Individual and group fairness definitions are of interest for two reasons. First, these definitions are not only intensely discussed in the algorithmic fairness literature, but they also carry a legal meaning and are used in legal debates. This motivated the authors to provide an algorithm that may achieve compliance of machine learning models with current anti-discrimination legislation. Second, these two fairness definitions can be translated into mathematical definitions that, in turn, allow for a rigorous continuous interpolation between individual and group fairness. Reconsidering the example of credit scoring, the authors define a score distribution as individually fair if it reflects an individual's observable creditworthiness. A score distribution is defined as group-fair if it does not disclose any information about an individual's demographic group membership. This is also referred to as statistical parity in the decision outcome.
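To make statistical parity concrete, the following minimal sketch measures how far a simple score-threshold rule is from equal approval rates across groups. The function names, the threshold rule, and the sample numbers are illustrative assumptions and do not come from the paper.

```python
def approval_rates(scores, groups, threshold):
    """Share of each group whose score meets a (hypothetical) approval threshold."""
    totals, approved = {}, {}
    for score, group in zip(scores, groups):
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (score >= threshold)
    return {group: approved[group] / totals[group] for group in totals}


def parity_gap(scores, groups, threshold):
    """Largest difference in approval rates between any two groups.

    A gap of 0 means the threshold rule satisfies statistical parity
    on this sample.
    """
    rates = approval_rates(scores, groups, threshold).values()
    return max(rates) - min(rates)


# Made-up credit scores for two groups, "a" and "b".
scores = [600, 650, 700, 750, 550, 600, 650, 700]
groups = ["a"] * 4 + ["b"] * 4
print(approval_rates(scores, groups, threshold=650))
print(parity_gap(scores, groups, threshold=650))
```

In this made-up sample, applicants with equal scores are treated identically, yet the two groups end up with different approval rates, which is the tension between the two fairness notions described above.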
The authors then provide a continuous interpolation framework using optimal transport theory, a powerful theory of contemporary mathematical analysis, which allows the decision maker to continuously vary between these concepts of individual and group fairness. As a consequence, the algorithm enables the decision maker to adopt intermediate “worldviews” on the degree of discrimination encoded in algorithmic processes, adding nuance to the extreme cases of “we’re all equal,” which translates to group fairness, and “what you see is what you get,” which translates to individual fairness.
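For one-dimensional scores, the interpolation idea can be illustrated with quantile matching, since optimal transport on the real line moves each point to the target value at the same quantile level. The sketch below is not the authors' implementation of CFAθ: the function names, the use of a population-weighted average of group quantiles as the common target distribution, and the rank-based quantile levels are all assumptions made for illustration.

```python
from collections import defaultdict


def quantile(sorted_vals, p):
    """Linearly interpolated empirical quantile of a sorted list, p in [0, 1]."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def interpolate_scores(scores, groups, theta):
    """Move each group's scores a fraction theta toward a common distribution.

    theta = 0 leaves scores unchanged; theta = 1 maps every group onto the
    same target distribution (assumed here: weighted average of group quantiles).
    """
    members = defaultdict(list)
    for i, g in enumerate(groups):
        members[g].append(i)
    n = len(scores)
    sorted_scores = {g: sorted(scores[i] for i in idx) for g, idx in members.items()}

    def target_quantile(p):
        # Quantile function of the common target distribution.
        return sum(len(idx) / n * quantile(sorted_scores[g], p)
                   for g, idx in members.items())

    out = list(scores)
    for g, idx in members.items():
        order = sorted(idx, key=lambda i: scores[i])  # rank within own group
        m = len(order)
        for rank, i in enumerate(order):
            p = rank / (m - 1) if m > 1 else 0.5  # within-group quantile level
            out[i] = (1 - theta) * scores[i] + theta * target_quantile(p)
    return out


# Made-up scores: group "b" is shifted upward relative to group "a".
scores = [1, 2, 3, 4, 5, 6, 7, 8]
groups = ["a"] * 4 + ["b"] * 4
for theta in (0.0, 0.5, 1.0):
    print(theta, interpolate_scores(scores, groups, theta))
```

Intermediate values of theta shrink the gap between the group distributions proportionally while preserving the ranking of individuals within each group. Tied scores within a group receive different ranks in this sketch, which a more careful implementation would need to handle.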
The authors discuss three main examples (credit applications, college admissions, and insurance contracts) and map out the legal and policy implications. In areas where more group fairness is warranted, decision makers may decide to transform individually fair scores into group-fair scores that treat different groups so similarly, in statistical terms, that a finding of discrimination is practically impossible. Note, however, that the constraints of positive action law (the EU variety of affirmative action law) must be adhered to. On the other hand, if decision makers have valid reasons for treating different groups statistically differently, they may implement individually fair scores, further distancing themselves from the fulfillment of group fairness. In that case, differential outcomes between protected groups must be justified before the law. Courts must decide whether the reasons provided by the decision makers are adequate, which points to the enduring relevance of legal facts, discourse, and argumentation beyond the precinct of fairness metrics proper.
The full paper “Matching Code and Law: Achieving Algorithmic Fairness with Optimal Transport” is published in Data Mining and Knowledge Discovery.