Model fairness



One of history’s more notorious adventures in classifiers; Francois de Halleux at the Apartheid Museum

Which utilitarian ethical criteria does my model satisfy?

Consider the cautionary tale Automated Inference on Criminality using Face Images (Wu and Zhang 2016):

[…] we find some discriminating structural features for predicting criminality, such as lip curvature, eye inner corner distance, and the so-called nose-mouth angle. Above all, the most important discovery of this research is that criminal and non-criminal face images populate two quite distinctive manifolds. The variation among criminal faces is significantly greater than that of the non-criminal faces. The two manifolds consisting of criminal and non-criminal faces appear to be concentric, with the non-criminal manifold lying in the kernel with a smaller span, exhibiting a law of normality for faces of non-criminals. In other words, the faces of general law-biding public have a greater degree of resemblance compared with the faces of criminals, or criminals have a higher degree of dissimilarity in facial appearance than normal people.

Which lessons would you be happy with your local law enforcement authority taking home from this?

Maybe the in-progress textbook will have something to say? Solon Barocas, Moritz Hardt, and Arvind Narayanan, Fairness and Machine Learning.

Or maybe I want to do a post hoc analysis of whether my model was in fact using fair criteria when it made a decision. Model interpretation might help with that.

Think pieces on fairness in models in practice

Bias in data

  • Excavating AI: The Politics of Images in Machine Learning Training Sets, by Kate Crawford and Trevor Paglen

Fairness and causal reasoning

Here’s a thing that was so simple and necessary that I assumed it had been done long before it actually was (Kilbertus et al. 2017):

Recent work on fairness in machine learning has focused on various statistical discrimination criteria and how they trade off. Most of these criteria are observational: They depend only on the joint distribution of predictor, protected attribute, features, and outcome. While convenient to work with, observational criteria have severe inherent limitations that prevent them from resolving matters of fairness conclusively.

Going beyond observational criteria, we frame the problem of discrimination based on protected attributes in the language of causal reasoning. This viewpoint shifts attention from “What is the right fairness criterion?” to “What do we want to assume about the causal data generating process?” Through the lens of causality, we make several contributions. First, we crisply articulate why and when observational criteria fail, thus formalizing what was before a matter of opinion. Second, our approach exposes previously ignored subtleties and why they are fundamental to the problem. Finally, we put forward natural causal non-discrimination criteria and develop algorithms that satisfy them.
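
To make “observational” concrete, here is a minimal sketch of my own (not code from Kilbertus et al.) that computes three standard observational quantities, the per-group acceptance rate, true positive rate and false positive rate, from nothing but counts of (protected attribute, outcome, prediction) triples. The function name observational_rates and the toy counts are made up for illustration.

```python
from collections import Counter

def observational_rates(records):
    """Per-group acceptance rate, TPR and FPR from (group, outcome, prediction) triples.

    Assumes binary outcomes and predictions, and that every group contains
    at least one positive and one negative outcome.
    """
    counts = Counter(records)
    rates = {}
    for group in sorted({g for g, _, _ in counts}):
        # n[(y, y_hat)] is one cell of this group's confusion matrix.
        n = {(y, p): counts[(group, y, p)] for y in (0, 1) for p in (0, 1)}
        total = sum(n.values())
        positives = n[(1, 0)] + n[(1, 1)]
        negatives = n[(0, 0)] + n[(0, 1)]
        rates[group] = {
            "acceptance_rate": (n[(0, 1)] + n[(1, 1)]) / total,
            "tpr": n[(1, 1)] / positives,
            "fpr": n[(0, 1)] / negatives,
        }
    return rates

# Made-up counts: (group, true outcome, predicted label).
records = (
    [("a", 1, 1)] * 30 + [("a", 1, 0)] * 10 + [("a", 0, 1)] * 10 + [("a", 0, 0)] * 50
    + [("b", 1, 1)] * 15 + [("b", 1, 0)] * 5 + [("b", 0, 1)] * 25 + [("b", 0, 0)] * 55
)
rates = observational_rates(records)
for group, r in rates.items():
    print(group, r)
```

In this toy table both groups have the same acceptance rate and the same true positive rate but different false positive rates, so whether the classifier looks fair depends on which criterion we pick. Note also that the function sees only counts: any two causal stories that induce the same joint distribution get identical scores, which is the limitation the abstract is pointing at.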

Fairness-accuracy trade-offs

There are certain impossibility theorems around what we can do. That is, let us assume we have a perfectly unbiased dataset and an efficient algorithm that exploits it for the best possible accuracy (which is extremely non-trivial to get, but let us assume). How accurate can we be if we constrain our model to use only fair solutions (for some value of fairness), even if that reduces accuracy by making the model blind to features which are informative about the question? Fairness-accuracy trade-offs quantify the “cost” of fairness in terms of reduced accuracy, so that we can compare different possible degrees of trade-off. There are lots of very beautiful results in this area (Menon and Williamson 2018; Wang et al. 2021).
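
As a toy illustration of what such a trade-off curve looks like (a sketch under made-up assumptions, not the construction used in either cited paper), take a hypothetical population of two groups whose members are sorted into five score buckets, and let the classifier accept everyone at or above a per-group threshold. Brute-forcing all threshold pairs gives the best accuracy attainable when the gap in acceptance rates (demographic parity, one fairness criterion among many) is capped at max_gap. The counts in COUNTS and the helper names are invented for the example.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population: COUNTS[group][score] = (negatives, positives).
COUNTS = {
    "a": [(30, 2), (20, 5), (8, 10), (2, 13), (0, 10)],
    "b": [(40, 2), (25, 3), (10, 5), (4, 6), (1, 4)],
}

def summarize(buckets, threshold):
    """Return (accepted, correctly classified) when accepting score >= threshold."""
    accepted = sum(n + p for n, p in buckets[threshold:])
    true_pos = sum(p for _, p in buckets[threshold:])
    true_neg = sum(n for n, _ in buckets[:threshold])
    return accepted, true_pos + true_neg

def best_accuracy(max_gap):
    """Best accuracy over per-group thresholds with acceptance-rate gap <= max_gap."""
    size = {g: sum(n + p for n, p in b) for g, b in COUNTS.items()}
    total = sum(size.values())
    n_thresholds = len(COUNTS["a"]) + 1  # threshold 5 means "accept no-one"
    best = Fraction(0)
    for t_a, t_b in product(range(n_thresholds), repeat=2):
        acc_a, ok_a = summarize(COUNTS["a"], t_a)
        acc_b, ok_b = summarize(COUNTS["b"], t_b)
        # Exact rational arithmetic avoids floating-point ties at the cap.
        gap = abs(Fraction(acc_a, size["a"]) - Fraction(acc_b, size["b"]))
        if gap <= max_gap:
            best = max(best, Fraction(ok_a + ok_b, total))
    return best

for cap in (Fraction(0), Fraction(5, 100), Fraction(10, 100), Fraction(1)):
    print(float(cap), float(best_accuracy(cap)))
```

On these numbers the best accuracy is 0.70 under exact parity, 0.805 with a gap of at most 0.05, 0.83 with a gap of at most 0.10, and 0.84 unconstrained. Exact parity is only reachable here by accepting everyone or no-one, which is an artefact of deterministic thresholds on a tiny discrete population; randomized decision rules would fill in the gaps.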

In a certain sense the only fair model is no model at all. Who should our automated model extend a loan to? Everyone! No-one! All other decision rules impinge upon the impenetrable thicket of cause and effect and historical after-effects that characterise human moral calculus.

Chris Stucchio, at Crunch Conf, makes some points about marginalist allocative/procedural fairness and net utility versus group rights.

If we choose to service Hyderabad with no disparities, we’ll run out of money and stop serving Hyderabad. The other NBFCs won’t.

Net result: Hyderabad is redlined by competitors and still gets no service.

Our choice: Keep the fraudsters out, utilitarianism over group rights.

He does a good job of explaining some impossibility theorems via examples, especially that of Kleinberg, Mullainathan, and Raghavan (2016). Note the interesting intersection of two types of classification implicit in his model: uniformly reject, versus biased accept/reject, subject to capital constraints. I need to revisit that and think some more.
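
One way to see why such impossibility results bite is a piece of arithmetic on the confusion matrix. The sketch below is my own illustration for a binary accept/reject classifier, which is a simpler setting than the risk scores Kleinberg, Mullainathan, and Raghavan actually treat: once a group's base rate, positive predictive value (PPV) and false negative rate (FNR) are fixed, its false positive rate (FPR) is no longer free. The function name implied_fpr and the example numbers are hypothetical.

```python
def implied_fpr(base_rate, ppv, fnr):
    """False positive rate forced by a group's base rate, PPV and FNR.

    With N people and base rate p:
        TP = N * p * (1 - FNR)
        FP = N * (1 - p) * FPR
        PPV = TP / (TP + FP)
    Solving the last line for FPR gives the expression returned here.
    Assumes 0 < base_rate < 1 and ppv > 0.
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)

# Two groups, same PPV and same FNR, different base rates (made-up numbers).
fpr_a = implied_fpr(base_rate=0.4, ppv=0.75, fnr=0.25)
fpr_b = implied_fpr(base_rate=0.2, ppv=0.75, fnr=0.25)
print(fpr_a, fpr_b)
```

The two false positive rates come out at roughly 0.167 and 0.0625. So if two groups have different base rates, equalizing PPV and FNR forces unequal FPRs, except in degenerate cases such as a classifier that never produces a false positive.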

Han Zhao is an actual researcher in this area. See Inherent Tradeoffs in Learning Fair Representations, which covers two of their own results (Zhao et al. 2019; Zhao and Gordon 2019).

Han Zhao on statistical parity

In practice, Hutter (2019) argues, the beauty of these theorems can hide the messiness of reality, where both the definition of fairness and the accuracy objective are underspecified. This leaves us free to tune the arbitrary parameters of the fairness constraint and the model objective jointly until the discrepancy between them shrinks.

Fairness criteria

And in fact, what even is fairness? It turns out that there are lots of difficulties with codifying it.

Hedden (2021) has recently argued that many recent attempts are incoherent. Loi et al. (2021) attempt to salvage fairness by distinguishing group and individual fairness.

Beauty contest problems and mythic fairness

🏗 Think about fairness problems that arise when the model is supposed to be rewarded on the basis of being a good bet for the future, which is to say, when it is choosing people for participation in a self-fulfilling prophecy. Models that are supposed to predict credit risk have a feedback/reinforcing dimension: people in a poverty trap are bad credit risks, even if they got into the poverty trap because of lack of credit, and despite the fact that if they were not in a poverty trap they might not be bad credit risks. Of course, people who have a raging meth addiction and will spend all their loans on drugs are also in the trap. A beauty contest problem is a model for this kind of situation, although there is a time dimension as well. There is presumably a game-theoretic equilibrium problem. One imagines the Chinese restaurant process or something like it popping up, perhaps even the classic Pareto distribution or other Matthew effect models.
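
A minimal sketch of the kind of rich-get-richer dynamic gestured at above, assuming a Pólya-urn style rule that is purely my own toy and not a model from any of the cited papers: two applicants start out identical, each round a lender backs one of them with probability proportional to current wealth, and the loan increases the recipient's wealth by one unit. The function name polya_lending and all parameters are hypothetical.

```python
import random

def polya_lending(rounds, seed, start=(1.0, 1.0)):
    """Simulate a two-applicant urn where credit follows current wealth."""
    rng = random.Random(seed)  # seeded so that each run is reproducible
    wealth = list(start)
    for _ in range(rounds):
        # The lender picks applicant 0 with probability equal to their wealth share.
        share_0 = wealth[0] / (wealth[0] + wealth[1])
        winner = 0 if rng.random() < share_0 else 1
        # Receiving credit makes the winner look like an even better bet next round.
        wealth[winner] += 1.0
    return wealth

final_shares = []
for seed in range(5):
    w = polya_lending(rounds=1000, seed=seed)
    final_shares.append(w[0] / sum(w))
print(final_shares)
```

Different seeds typically leave quite different final shares even though the applicants were interchangeable at the start, which is the self-fulfilling flavour of the problem: whoever the lender happened to back early ends up looking like the better risk in hindsight.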

Matthew effects

Related but I think distinct from beauty-contest problems. Algorithmic decisions as part of a larger feedback loop. Venkatasubramanian et al. (2021)’s abstract:

As ML systems have become more broadly adopted in high-stakes settings, our scrutiny of them should reflect their greater impact on real lives. The field of fairness in data mining and machine learning has blossomed in the last decade, but most of the attention has been directed at tabular and image data. In this tutorial, we will discuss recent advances in network fairness. Specifically, we focus on problems where one’s position in a network holds predictive value (e.g., in a classification or regression setting) and favorable network position can lead to a cascading loop of positive outcomes, leading to increased inequality. We start by reviewing important sociological notions such as social capital, information access, and influence, as well as the now-standard definitions of fairness in ML settings. We will discuss the formalizations of these concepts in the network fairness setting, presenting recent work in the field, and future directions.

Compliance

  • Parity.ai looks interesting for showing that processes satisfy certain types of fairness.

References

Aggarwal, Charu C., and Philip S. Yu. 2008. “A General Survey of Privacy-Preserving Data Mining Models and Algorithms.” In Privacy-Preserving Data Mining, edited by Charu C. Aggarwal and Philip S. Yu, 11–52. Advances in Database Systems 34. Springer US.
Barocas, Solon, and Andrew D. Selbst. 2016. “Big Data’s Disparate Impact.” SSRN Scholarly Paper ID 2477899. Rochester, NY: Social Science Research Network.
Berk, Richard A. 2021. “Artificial Intelligence, Predictive Policing, and Risk Assessment for Law Enforcement.” Annual Review of Criminology 4 (1): 209–37.
Burrell, Jenna. 2016. “How the Machine ‘Thinks’: Understanding Opacity in Machine Learning Algorithms.” Big Data & Society 3 (1): 2053951715622512.
Cooper, A. Feder, and Ellen Abrams. 2021. “Emergent Unfairness in Algorithmic Fairness-Accuracy Trade-Off Research.” In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, 46–54. New York, NY, USA: Association for Computing Machinery.
Dressel, Julia, and Hany Farid. 2018. “The Accuracy, Fairness, and Limits of Predicting Recidivism.” Science Advances 4 (1): eaao5580.
Dutta, Sanghamitra, Dennis Wei, Hazar Yueksel, Pin-Yu Chen, Sijia Liu, and Kush Varshney. 2020. “Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing.” In Proceedings of the 37th International Conference on Machine Learning, 2803–13. PMLR.
Dwork, Cynthia, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. 2012. “Fairness Through Awareness.” In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, 214–26. ITCS ’12. New York, NY, USA: ACM.
Feldman, Michael, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. 2015. “Certifying and Removing Disparate Impact.” In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 259–68. KDD ’15. New York, NY, USA: ACM.
Hardt, Moritz, Eric Price, and Nati Srebro. 2016. “Equality of Opportunity in Supervised Learning.” In Advances in Neural Information Processing Systems, 3315–23.
Hardt, Moritz, and Benjamin Recht. 2021. “Patterns, Predictions, and Actions: A Story about Machine Learning.” arXiv:2102.05242 [Cs, Stat], February.
Hedden, Brian. 2021. “On Statistical Criteria of Algorithmic Fairness.” Philosophy & Public Affairs 49 (2): 209–31.
Hidalgo, César A., Diana Orghian, Jordi Albo Canals, Filipa de Almeida, and Natalia Martín Cantero. 2021. How Humans Judge Machines. Cambridge, Massachusetts: The MIT Press.
Hutter, Marcus. 2019. “Fairness Without Regret.” arXiv:1907.05159 [Cs, Stat], July.
Karimi, Amir-Hossein, Gilles Barthe, Bernhard Schölkopf, and Isabel Valera. n.d. “A Survey of Algorithmic Recourse: Definitions, Formulations, Solutions, and Prospects.” In, 14.
Kilbertus, Niki, Mateo Rojas Carulla, Giambattista Parascandolo, Moritz Hardt, Dominik Janzing, and Bernhard Schölkopf. 2017. “Avoiding Discrimination Through Causal Reasoning.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 656–66. Curran Associates, Inc.
Kleinberg, Jon, Sendhil Mullainathan, and Manish Raghavan. 2016. “Inherent Trade-Offs in the Fair Determination of Risk Scores,” September.
Laufer, Benjamin. 2020a. “Compounding Injustice: History and Prediction in Carceral Decision-Making.” arXiv:2005.13404 [Cs, Stat], May.
Laufer, Benjamin. 2020b. “Feedback Effects in Repeat-Use Criminal Risk Assessments.” arXiv:2011.14075 [Cs, Stat], November.
Liu, Suyun, and Luis Nunes Vicente. 2020. “Accuracy and Fairness Trade-Offs in Machine Learning: A Stochastic Multi-Objective Approach,” August.
Loi, Michele, Eleonora Viganò, Corinna Hertweck, and Christoph Heitz. 2021. “People Are Not Coins: A Reply to Hedden.” SSRN Scholarly Paper 3857889. Rochester, NY: Social Science Research Network.
Menon, Aditya Krishna, and Robert C. Williamson. 2018. “The Cost of Fairness in Binary Classification.” In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, 107–18. PMLR.
Miconi, Thomas. 2017. “The Impossibility of ‘Fairness’: A Generalized Impossibility Result for Decisions,” July.
Mishler, Alan, and Edward Kennedy. 2021. “FADE: FAir Double Ensemble Learning for Observable and Counterfactual Outcomes.” arXiv:2109.00173 [Cs, Stat], August.
O’Neil, Cathy. 2017. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Reprint edition. New York: Broadway Books.
Parkes, David C., Rakesh V. Vohra, and other workshop participants. 2019. “Algorithmic and Economic Perspectives on Fairness.” arXiv:1909.05282 [Cs], September.
Pleiss, Geoff, Manish Raghavan, Felix Wu, Jon Kleinberg, and Kilian Q. Weinberger. 2017. “On Fairness and Calibration.” In Advances In Neural Information Processing Systems.
Raghavan, Manish. 2021. “The Societal Impacts of Algorithmic Decision-Making.” Cornell University Library.
Sweeney, Latanya. 2013. “Discrimination in Online Ad Delivery.” Queue 11 (3): 10:10–29.
Venkatasubramanian, Suresh, Carlos Scheidegger, Sorelle Friedler, and Aaron Clauset. 2021. “Fairness in Networks: Social Capital, Information Access, and Interventions.” In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 4078–79. KDD ’21. New York, NY, USA: Association for Computing Machinery.
Verma, Sahil, and Julia Rubin. 2018. “Fairness Definitions Explained.” In Proceedings of the International Workshop on Software Fairness, 1–7. FairWare ’18. New York, NY, USA: Association for Computing Machinery.
Wang, Yuyan, Xuezhi Wang, Alex Beutel, Flavien Prost, Jilin Chen, and Ed H. Chi. 2021. “Understanding and Improving Fairness-Accuracy Trade-Offs in Multi-Task Learning.” In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 1748–57. Virtual Event Singapore: ACM.
Wisdom, Scott, Thomas Powers, James Pitton, and Les Atlas. 2016. “Interpretable Recurrent Neural Networks Using Sequential Sparse Recovery.” In Advances in Neural Information Processing Systems 29.
Wu, Xiaolin, and Xi Zhang. 2016. “Automated Inference on Criminality Using Face Images.” arXiv:1611.04135 [Cs], November.
Zemel, Rich, Yu Wu, Kevin Swersky, Toni Pitassi, and Cynthia Dwork. 2013. “Learning Fair Representations.” In Proceedings of the 30th International Conference on Machine Learning (ICML-13), 325–33.
Zhao, Han, Amanda Coston, Tameem Adel, and Geoffrey J. Gordon. 2019. “Conditional Learning of Fair Representations.” In.
Zhao, Han, and Geoffrey J. Gordon. 2019. “Inherent Tradeoffs in Learning Fair Representations.” arXiv:1906.08386 [Cs, Stat], October.
