Introduction to Fairness and Bias in Algorithms

This undergraduate course offers a broad introduction to concepts of fairness, bias, and discrimination in algorithmic systems such as search engines, social media platforms, online advertising, employment portals, chatbots, computer vision systems, and more. Through readings, discussions, programming assignments, and a self-directed final project, students will learn mathematical definitions of fairness and bias, analyze notable case studies, and consider potential remedies (both social and technical).

If you're seeking inspiration for a related course, I suggest looking at this Responsible Machine Learning course taught by my friend Avijit Ghosh!

Learning Objectives

Students in this course will:

  • Practice interpreting scientific literature in computer science and related fields
  • Survey the historical context of algorithmic fairness and bias, including its role in computing and its role in society more broadly
  • Connect relevant theoretical scholarship to concepts in algorithmic fairness and bias, implementing some theories as formal metrics with supporting code
  • Apply concepts of fairness and bias to a case study of their choice through an in-depth final project

Daily Lesson Samples

Introduction to Bias. Students studying math and computer science are sometimes uncomfortable with, or simply unfamiliar with, the notion that algorithms and models can be biased. To clearly introduce and explain why algorithms can exhibit bias, we read Batya Friedman and Helen Nissenbaum's excellent paper titled Bias in Computer Systems. Also, since undergraduate students are sometimes intimidated by full-length scientific research papers, I print out the paper, take out some scissors, and cut it into bite-sized chunks (abstract, introduction, etc.). Each student is responsible for explaining one chunk of the paper.

"The Dark Side of Big Data." We use Cathy O'Neil's book, Weapons of Math Destruction, as a reference text that explains the various domains where algorithmic bias manifests. This day is dedicated to discussing O'Neil's introductory chapter, which offers insightful passages about key concepts, including opacity (the inner workings of these systems are often invisible), feedback loops (how a self-perpetuating system can multiply harmful effects), behavioral modification (systems often incentivize people in a certain direction), and others. We also make connections to Friedman and Nissenbaum's paper from day 1.

What is a Model? O'Neil lays out three elements of any given "Weapon of Math Destruction" (WMD): opacity, scale, and damage. We discuss these elements, then review different types of machine learning techniques for creating models (supervised, unsupervised, semi-supervised, reinforcement) as well as key machine learning models (linear and logistic regression, decision trees and forests, SVMs, clustering, PCA, neural networks). Finally, we discuss common metrics for evaluating models (accuracy, precision, recall) and why these metrics alone are not suitable for evaluating fairness and bias.
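One way to make this last point concrete in class: a model can score well on an overall metric like accuracy while performing very differently across groups. A minimal sketch, using hypothetical labels and predictions for two made-up groups:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical data: 1 = positive outcome, 0 = negative outcome.
y_true_a = [1, 1, 0, 0]
y_pred_a = [1, 1, 0, 0]   # group A: every prediction is correct
y_true_b = [1, 1, 0, 0]
y_pred_b = [0, 0, 0, 0]   # group B: every positive case is missed

overall_true = y_true_a + y_true_b
overall_pred = y_pred_a + y_pred_b

print(accuracy(overall_true, overall_pred))  # 0.75 overall: looks decent
print(accuracy(y_true_a, y_pred_a))          # 1.0 for group A
print(accuracy(y_true_b, y_pred_b))          # 0.5 for group B
```

The single overall number hides the fact that one group bears all of the model's errors, which is exactly why aggregate metrics are a poor lens for fairness.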

What is Fairness? As homework, students (1) watch Arvind Narayanan's talk titled 21 fairness definitions and their politics, and (2) write a 500-word essay explaining one of the definitions in-depth. We discuss this brief article describing a biased algorithm used for medical school admissions, applying different fairness definitions to understand the scenario.

COMPAS Day. Students read ProPublica's famous "Machine Bias" report. We also tinker with MIT Tech Review's tool, "Can you make AI fairer than a judge?"
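The core of ProPublica's analysis compared error rates across racial groups, in particular false positive rates (defendants labeled high-risk who did not reoffend). A minimal sketch of that comparison, using hypothetical data rather than the actual COMPAS dataset:

```python
def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), computed over the true negatives only."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn)

# Hypothetical outcomes: y_true 1 = reoffended, y_pred 1 = flagged high-risk.
true_a = [0, 0, 0, 0, 1, 1]
pred_a = [1, 0, 0, 0, 1, 1]   # 1 of 4 non-reoffenders flagged -> FPR 0.25
true_b = [0, 0, 0, 0, 1, 1]
pred_b = [1, 1, 0, 0, 1, 1]   # 2 of 4 non-reoffenders flagged -> FPR 0.5

print(false_positive_rate(true_a, pred_a))
print(false_positive_rate(true_b, pred_b))
```

Students can reuse this kind of per-group error-rate comparison when experimenting with the MIT Tech Review tool.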

(Un)Fairness as (In)Equality Day. In some cases, measuring fairness across individuals and groups may be impossible due to a lack of demographic information. A popular approach in these situations is to measure overall inequality through metrics that require minimal detail (e.g. the Gini coefficient). This is more of a technical day, in which we discuss these metrics and apply them to examples.
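As a concrete instance of such a metric, the Gini coefficient can be computed from any list of values (incomes, exposure counts, ad impressions) with no demographic labels at all. A minimal sketch using the mean-absolute-difference formulation:

```python
def gini(values):
    """Gini coefficient: 0 = perfect equality, approaching 1 = maximal inequality.

    Computed as the mean absolute difference between all pairs,
    normalized by twice the mean.
    """
    n = len(values)
    mean = sum(values) / n
    diff_sum = sum(abs(x - y) for x in values for y in values)
    return diff_sum / (2 * n * n * mean)

print(gini([1, 1, 1, 1]))  # 0.0: everyone has the same amount
print(gini([0, 0, 0, 4]))  # 0.75: one person holds everything
```

Because it needs only the outcome distribution, this metric works even when group membership is unknown, though it cannot tell you *who* is disadvantaged.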

Speculative Fiction Day. Following Casey Fiesler's excellent "Black Mirror Writer's Room" exercise, students write a story that involves an automated decision-making system and some manifestation of bias and fairness.

Book Chapter Days. A good portion of classes feature discussion for the majority of the time (45 to 60 minutes out of a 75-minute class). I require students to submit written discussion posts beforehand that include their observations as well as discussion questions they can pose to the class.

Research Paper Days. Students choose research papers to present to the class. I provide a set of options for them to choose from, and I use a handful of days to give "model presentations" about other papers, so they know what to expect.