by Oliver Thomas and Thomas Kehrenberg
...using statistical techniques to give computer systems the ability to "learn" (e.g., progressively improve performance on a specific task) from data, without being explicitly programmed.
But there are problems:
Independence-based fairness (i.e. Statistical Parity):

$$ \hat{Y} \perp S $$

Separation-based fairness (i.e. Equalised Odds/Opportunity):

$$ \hat{Y} \perp S \mid Y $$

For both to hold simultaneously, either $S \perp Y$ (our data is already fair) or $\hat{Y} \perp Y$ (we have a random predictor).
Similarly, Sufficiency ($Y \perp S \mid \hat{Y}$) cannot hold together with either of the above notions of fairness.
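As a concrete illustration of the two criteria, here is a minimal sketch that measures them on a handful of predictions; the arrays are made-up toy data, not from any real system.

```python
import numpy as np

# Toy data (hypothetical): s = sensitive attribute, y = true label, y_hat = prediction
s     = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y     = np.array([1, 0, 1, 1, 1, 0, 0, 1])
y_hat = np.array([1, 0, 1, 0, 1, 1, 0, 1])

# Independence (Statistical Parity): compare P(y_hat=1 | s=0) with P(y_hat=1 | s=1)
acceptance_rates = [y_hat[s == g].mean() for g in (0, 1)]
print("acceptance rates by group:", acceptance_rates)

# Separation (Equalised Odds): compare P(y_hat=1 | s, y) for each value of y
for label in (0, 1):
    rates = [y_hat[(s == g) & (y == label)].mean() for g in (0, 1)]
    print(f"P(y_hat=1 | y={label}) by group:", rates)
```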
Consider a university where we are in charge of admissions.
We can only accept 50% of all applicants.
10,000 applicants are female and 10,000 are male.
We have been tasked with being fair with regard to gender.
We have an acceptance criterion that is highly predictive of success.
80% of those who meet the acceptance criterion will successfully graduate.
Only 10% of those who don't meet the acceptance criterion will successfully graduate.
As we're a good university, we receive a lot of applications from people who don't meet the acceptance criterion.
60% of female applicants meet the acceptance criterion.
40% of male applicants meet the acceptance criterion.
Remember, we can only accept 50% of all applicants.
Female applicants:

| | Accepted | Not Accepted |
|---|---|---|
| Actually Graduate | | |
| Don't Graduate | | |

Male applicants:

| | Accepted | Not Accepted |
|---|---|---|
| Actually Graduate | | |
| Don't Graduate | | |
How would we solve this problem if we use Statistical Parity as our fairness measure?
Female applicants:

| | Accepted | Not Accepted |
|---|---|---|
| Actually Graduate | 4000 | 1200 |
| Don't Graduate | 1000 | 3800 |

Male applicants:

| | Accepted | Not Accepted |
|---|---|---|
| Actually Graduate | 3300 | 500 |
| Don't Graduate | 1700 | 4500 |
How would we solve this problem if we use Equal Opportunity as our fairness measure?
Female applicants:

| | Accepted | Not Accepted |
|---|---|---|
| Actually Graduate | 4440 | 760 |
| Don't Graduate | 1110 | 3690 |

Male applicants:

| | Accepted | Not Accepted |
|---|---|---|
| Actually Graduate | 3245 | 555 |
| Don't Graduate | 1205 | 4995 |
How would we solve this problem if we use Calibration by Group as our fairness measure?
Female applicants:

| | Accepted | Not Accepted |
|---|---|---|
| Actually Graduate | 4800 | 400 |
| Don't Graduate | 1200 | 3600 |

Male applicants:

| | Accepted | Not Accepted |
|---|---|---|
| Actually Graduate | 3200 | 600 |
| Don't Graduate | 800 | 5400 |
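For readers who want to check the arithmetic, here is a minimal sketch (expected counts only) that reproduces the tables above; the helper name `confusion` and the way acceptances are split across strata are just choices made for this illustration.

```python
# Expected counts for the admissions example above.
P_GRAD_IF_MEET = 0.8   # graduation rate for applicants meeting the criterion
P_GRAD_IF_NOT = 0.1    # graduation rate for applicants who don't

def confusion(n_meet, n_not, accept_meet, accept_not):
    """Expected (graduate / don't graduate) x (accepted / not accepted) counts,
    given how many applicants are accepted from each stratum."""
    grad_acc = accept_meet * P_GRAD_IF_MEET + accept_not * P_GRAD_IF_NOT
    grad_rej = (n_meet - accept_meet) * P_GRAD_IF_MEET + (n_not - accept_not) * P_GRAD_IF_NOT
    acc = accept_meet + accept_not
    rej = (n_meet + n_not) - acc
    return {"graduate": (grad_acc, grad_rej),
            "don't graduate": (acc - grad_acc, rej - grad_rej)}

# Statistical Parity: accept 5000 from each group (criterion-meeting applicants first).
print("female:", confusion(6000, 4000, accept_meet=5000, accept_not=0))
print("male:  ", confusion(4000, 6000, accept_meet=4000, accept_not=1000))

# Calibration by group: accept exactly those who meet the criterion.
print("female:", confusion(6000, 4000, accept_meet=6000, accept_not=0))
print("male:  ", confusion(4000, 6000, accept_meet=4000, accept_not=0))

# Equal Opportunity instead equalises the acceptance rate among eventual graduates:
print(4440 / 5200, 3245 / 3800)   # ~0.854 for both groups in the tables above
```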
There's no right answer: all of the above are "fair". It's important to consult domain experts to find the best fit for each problem; there is no one-size-fits-all.
The learned representation is uninterpretable by default. Recently, Quadrianto et al. constrained the representation to be in the same space as the input, so that we can look at what changed.
Referred to as "fair pipelines". Work has only just begun to explore these, and current research suggests that they don't work (at the moment!).
Given we have a loss function, $\mathcal{L}(\theta)$.
In an unconstrained classifier, we would simply solve

$$ \min_{\theta} \mathcal{L}(\theta) $$

To reduce Disparate Impact, Zafar et al. add a constraint to this optimisation problem:

$$ \begin{aligned} \min_{\theta}\ & \mathcal{L}(\theta) \\ \text{subject to}\ & P(\hat{y} \neq y \mid s = 0) - P(\hat{y} \neq y \mid s = 1) \leq \epsilon \\ & P(\hat{y} \neq y \mid s = 0) - P(\hat{y} \neq y \mid s = 1) \geq -\epsilon \end{aligned} $$

Calders and Verwer (2010) train two separate models: one for all datapoints with $s=0$ and another for $s=1$.
The thresholds of the two models are then tweaked until they produce the same positive rate ($P(\hat{y}=1|s=0)=P(\hat{y}=1|s=1)$).
Disadvantage: $s$ has to be known at prediction time in order to choose the correct model.
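A minimal sketch of the two-model idea, assuming scikit-learn-style logistic regression and synthetic data; the data, the choice of target rate, and the thresholding details are illustrative, not the original authors' procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): x = features, s = sensitive attribute, y = label
n = 2000
s = rng.integers(0, 2, size=n)
x = rng.normal(size=(n, 3)) + s[:, None] * 0.5
y = (x.sum(axis=1) + rng.normal(size=n) > 0.5).astype(int)

# Train one model per sensitive group.
models = {g: LogisticRegression(max_iter=1000).fit(x[s == g], y[s == g]) for g in (0, 1)}

# Pick per-group thresholds so both groups get the same positive rate
# (here the overall base rate of y is used as the common target).
target_rate = y.mean()
thresholds = {}
for g in (0, 1):
    scores = models[g].predict_proba(x[s == g])[:, 1]
    thresholds[g] = np.quantile(scores, 1 - target_rate)

def predict(x_new, s_new):
    """Pick the group's model and threshold; note that s must be known here."""
    scores = models[s_new].predict_proba(x_new)[:, 1]
    return (scores >= thresholds[s_new]).astype(int)

for g in (0, 1):
    print(g, predict(x[s == g], g).mean())   # positive rates now roughly match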
https://tinyurl.com/ethicml
https://developers.google.com/machine-learning/crash-course/fairness
http://course18.fast.ai/lessons/lesson13.html
In the real world there are implications.
An individual doesn't just cease to exist after we've made our loan or bail decision.
The decision we make has consequences.
| Area | Description |
|---|---|
| Active Harm | Expected change in credit score of an individual is negative |
| Stagnation | Expected change in credit score of an individual is 0 |
| Improvement | Expected change in credit score of an individual is positive |

| Area | Description |
|---|---|
| Relative Harm | Expected change in credit score of an individual is less than if the selection policy had been to maximize profit |
| Relative Improvement | Expected change in credit score of an individual is better than if the selection policy had been to maximize profit |
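As a toy illustration of these regions (all numbers below are hypothetical and not taken from the paper), the sketch computes the expected score change of a group under a lending policy and classifies it against the profit-maximising baseline.

```python
# Hypothetical score dynamics: a repaid loan raises the score, a default lowers it.
SCORE_IF_REPAY = 75
SCORE_IF_DEFAULT = -150

def expected_change(accept_rate, repay_prob):
    """Expected credit-score change for a group under a given selection policy."""
    return accept_rate * (repay_prob * SCORE_IF_REPAY
                          + (1 - repay_prob) * SCORE_IF_DEFAULT)

def region(delta, delta_profit_max):
    if delta < 0:
        label = "active harm"
    elif delta == 0:
        label = "stagnation"
    else:
        label = "improvement"
    if delta > delta_profit_max:
        relative = "relative improvement"
    elif delta < delta_profit_max:
        relative = "relative harm"
    else:
        relative = "same as profit-maximising"
    return label, relative

# A profit-maximising bank lends only to the most reliable applicants in this group...
delta_profit = expected_change(accept_rate=0.3, repay_prob=0.95)
# ...while a fairness-constrained policy lends more widely, including riskier applicants.
delta_fair = expected_change(accept_rate=0.6, repay_prob=0.7)

print(region(delta_fair, delta_profit))   # -> ('improvement', 'relative harm')
```

With these made-up numbers, the fairness-constrained policy still improves scores on average but by less than the profit-maximising policy would have: improvement in absolute terms, relative harm in comparison.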
For those interested in a fuller explanation of the equations, see the Appendix.
"Correlation doesn't imply causation"
But what is causation?
If we can understand what causes unfair behavior, then we can take steps to mitigate it.
Basic idea: if a sensitive attribute has no causal influence on prediction, don't use it.
But how do we model causation?
Solution: build causal graphs of your problem
Problem: causality cannot be inferred from observational data
Observational data can only show correlations
For causal information we have to do experiments. (But that is often not ethical.)
Example: Law school success
Task: given GPA score and LSAT (law school entry exam), predict grades after one year in law school: FYA (first year average)
Additionally two sensitive attributes: race and gender
Two possible graphs
$U$: set of all unobserved background variables
$P(\hat{y}_{s=i}(U) = 1|x, s=i)=P(\hat{y}_{s=j}(U) = 1|x, s=i)$
$i, j \in \{0, 1\}$
$\hat{y}_{s=k}$: prediction in the counterfactual world where $s=k$
Practical consequence: $\hat{y}$ is counterfactually fair if it does not causally depend (as defined by the causal model) on $s$ or any descendants of $s$.
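A minimal sketch of that practical consequence, assuming the causal graph is given as a simple adjacency dict; the graph, node names, and feature list below are made up for illustration and are not the law-school graph from the slides.

```python
# Drop the sensitive attribute and all of its causal descendants before training,
# so that the predictor cannot causally depend on s.
causal_graph = {            # hypothetical toy graph
    "s":    ["gpa", "lsat"],            # s influences GPA and LSAT here
    "u":    ["gpa", "lsat", "fya"],     # unobserved background variables
    "gpa":  [],
    "lsat": [],
    "fya":  [],
}

def descendants(graph, node):
    """All nodes reachable from `node` by following directed edges."""
    seen, stack = set(), list(graph[node])
    while stack:
        child = stack.pop()
        if child not in seen:
            seen.add(child)
            stack.extend(graph[child])
    return seen

blocked = {"s"} | descendants(causal_graph, "s")
features = [f for f in ["gpa", "lsat"] if f not in blocked]
print(features)   # -> [] : no observed feature is safe to use in this graph
```

In this toy graph every observed feature is a descendant of $s$, so a counterfactually fair predictor would have to rely on inferred latent variables such as $U$ rather than on the raw observed features.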
Alternative idea: