The Shiny Pill for Class War: Fairness in Machine Learning

by Daniel East

Fernanda Viegas and Martin Wattenberg are talking from a bunker, white on white, the exposed ventilation the only sign this bright room is connected to the outside world. Speaking at With The Best as part of the Google Brain team, the two discuss fairness in machine learning and put a bright spin on their difficult work.

“As AI touches high-stakes aspects of everyday life, fairness becomes more important. These machine learning systems are going to help decide who gets a loan or who gets hired for a job, so you really want to be careful about how these decisions are being made.”

Viegas and Wattenberg stress that they are taking these concerns seriously, although Wattenberg admits that he hadn’t encountered these issues before.

“One of the first reactions I had when I heard about this whole issue was, how can an algorithm even be unfair?”

In a sense, it is easier to think of bias as a feature, not a bug. Consider a form online that asks the user to fill in their first and last name. Everyone has a family name and a given name, right? Not so. In major parts of the world these categories simply do not apply. Across multiple cultural barriers, the Western standard of ‘first name + last name’ is an imposition on someone’s identity.

This might seem a minor point, but it illustrates a vast gulf between a programmer’s assumed world and the actual one.
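The point can be made concrete in a few lines of code. The sketch below (field names and example users are invented for illustration) shows how a hard-coded two-field form silently locks out mononymous users:

```python
# A naive sign-up validator that hard-codes the Western
# 'first name + last name' assumption.

def naive_validate(first_name: str, last_name: str) -> bool:
    """Rejects any user who cannot supply both fields."""
    return bool(first_name.strip()) and bool(last_name.strip())

# A mononymous user (common in, e.g., Indonesia) is simply rejected:
assert naive_validate("Ada", "Lovelace") is True
assert naive_validate("Suharto", "") is False  # a valid identity, refused

# A more inclusive alternative asks for a single free-form name:
def inclusive_validate(full_name: str) -> bool:
    return bool(full_name.strip())

assert inclusive_validate("Suharto") is True
```

The bias here was never written as a rule about people; it entered as an unexamined assumption about what a name looks like.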

“When we think about the Euclidean algorithm, it doesn’t rely on real-world data. But with machine learning, you load in a bunch of parameters based on real-world data. And one of the things we know about the real world is there are all sorts of biases hidden in the data.”
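The contrast can be sketched in a few lines. Euclid’s algorithm behaves the same no matter what world it runs in, while even a trivial ‘learned’ rule inherits whatever its training data contains (the data below is invented to illustrate the point):

```python
from math import gcd
from collections import Counter

# Euclid's algorithm: its behaviour is fixed by mathematics,
# not by any dataset it was exposed to.
assert gcd(48, 18) == 6

# A 'learned' rule, by contrast, absorbs its training data.
# Hypothetical (group, past decision) pairs in which group 'B'
# was historically denied far more often:
training_data = [("A", "approve")] * 9 + [("A", "deny")] * 1 \
              + [("B", "approve")] * 2 + [("B", "deny")] * 8

def fit_majority_rule(data):
    """Learn the most common label per group -- a minimal 'model'."""
    by_group = {}
    for group, label in data:
        by_group.setdefault(group, Counter())[label] += 1
    return {g: counts.most_common(1)[0][0] for g, counts in by_group.items()}

model = fit_majority_rule(training_data)
# The model faithfully reproduces the historical bias:
assert model == {"A": "approve", "B": "deny"}
```

Nothing in the code mentions groups being treated differently; the unfairness arrives entirely through the data.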

These biases can be tricky to untangle, particularly in the field of semantics. Word embedding is the process whereby words and phrases are mapped to vectors of numbers that machine learning algorithms take as input. But these vectors are not free of pre-existing bias. Viegas and Wattenberg demonstrate this by exploring word embeddings with TensorFlow, an open-source software library for machine learning.

“When we look at the word ‘robot’, we see there’s some words you might expect, but there are some interesting cultural assumptions here too. Why is ‘freaks’ near ‘robot’? Why ‘aliens’? And this is all just captured data.”

This means that, on a structural level, the embeddings feeding these machine learning algorithms express cultural values and prioritise privileged, pre-existing meanings. This methodology does not just reflect bias, it embeds it.
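What ‘near’ means in an embedding space can be shown with a toy example. The vectors below are invented three-dimensional stand-ins (real embeddings such as word2vec or GloVe have hundreds of dimensions learned from text), but the nearest-neighbour machinery is the same:

```python
import math

# Toy 3-d word vectors, invented for illustration.
vectors = {
    "robot":  [0.9, 0.8, 0.1],
    "aliens": [0.8, 0.9, 0.2],
    "freaks": [0.7, 0.9, 0.3],
    "teapot": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1.0 for parallel vectors, 0.0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def neighbours(word):
    """Rank all other words by cosine similarity to `word`."""
    return sorted((w for w in vectors if w != word),
                  key=lambda w: cosine(vectors[word], vectors[w]),
                  reverse=True)

# 'aliens' and 'freaks' rank above 'teapot' purely because the
# (toy) vectors place them in similar regions of the space:
assert neighbours("robot")[:2] == ["aliens", "freaks"]
```

An embedding trained on real text places words near each other because people wrote them in similar contexts, so whatever cultural associations the corpus carries become literal distances in the model.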

The systems these algorithms are intended for are not hypothetical. In 2016, ProPublica published an investigation of COMPAS, a computer algorithm used in Broward County, Florida, to predict whether a defendant was likely to reoffend. The system was found to disproportionately flag African American defendants as high-risk. These risk assessments, which were given to judges during criminal sentencing, have so far proved to be unreliable predictors of reoffending.

“So computer scientists became interested in this topic and they discovered that by some criteria, this system was well calibrated but by other criteria, it was not. So here’s the problem: unless your classifier is perfect, you can’t have fairness on all measures.”
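That impossibility can be demonstrated with a small numeric sketch. In the hypothetical cohort below (all numbers invented), the classifier is perfectly calibrated for both groups, i.e. within every score bucket the fraction who actually reoffend equals the score, and yet the two groups end up with different false-positive rates simply because their base rates differ:

```python
# Each group's buckets: (score, n_people, n_reoffend).
# Calibration holds in every bucket: 80/100 == 0.8, 20/100 == 0.2, etc.
cohorts = {
    "group_A": [(0.8, 100, 80), (0.2, 100, 20)],
    "group_B": [(0.8, 150, 120), (0.2, 50, 10)],
}

def false_positive_rate(buckets, threshold=0.5):
    """Share of non-reoffenders who are nonetheless flagged high-risk."""
    flagged_negatives = sum(n - pos for s, n, pos in buckets if s >= threshold)
    all_negatives = sum(n - pos for s, n, pos in buckets)
    return flagged_negatives / all_negatives

fpr_a = false_positive_rate(cohorts["group_A"])  # 20 / 100 = 0.20
fpr_b = false_positive_rate(cohorts["group_B"])  # 30 / 70  ≈ 0.43
assert fpr_a < fpr_b
```

The scores mean exactly the same thing for both groups, yet more than twice the share of innocent people in group B are flagged high-risk. Satisfying one fairness criterion here mathematically forces a failure on the other, which is exactly the trade-off the speakers describe.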

Viegas and Wattenberg stress there is much to take from the failures of the Broward County program. By focusing on the outcome of the algorithm rather than its input, they have determined that the fairness of an algorithm depends, to some extent, on the way in which the algorithm is used. When considering what is ‘fair’ to the people whose lives and futures have been affected by these legal ‘risk assessments’, they conclude there have to be trade-offs, whether in the speed and efficiency of the program or in the privacy of the people whose data is being used in this way.

“You don’t get fairness for free,” Viegas points out. “We wish that you could, but you can’t. Fairness has a cost. You can pay for that cost by spending more time or collecting that data, but you may not have that luxury.”

It might sound like some sort of Orwellian nightmare, a dystopia in which computer engineers create machines that take into account pre-existing social injustices rather than working within the community to overcome them.

That’s because it is. To construct an AI that will run people’s lives and attempt to ‘correct’ for the inherent bias within our society without addressing the essential problems that have created those injustices in the first place is deluded at best and disingenuous at worst. But Viegas and Wattenberg are passionate about their field of inquiry and take the problems of injustice and fairness in machine learning seriously.

“One of the most common objections we hear is, ‘Our algorithms are just mirrors of the world. It’s not our fault if they reflect bias.’ But if the effect of what we’re doing is unjust, shouldn’t we fix it? Imagine your kid came home and told a racist joke. What should your reaction be?”

This anecdote of Viegas’s is revealing because it presupposes two very interesting ideas: first, that she believes the world is racist, and second, that she believes her audience would not want their children to be racist. The nuances of these world views might be difficult to impart within the data, and let’s hope these subtleties are understood by the other members of the Google Brain team going forward. If, for example, a business designing machine learning algorithms were openly racist, it is entirely possible it would not take these precautionary measures with its data.

The Google Brain team’s mission statement is ‘Make machines intelligent. Improve people’s lives.’ It is worth noting that there is no connecting conjunction between these two phrases, almost as if they are two separate and competing goals.

Note: This article is based on a talk Fernanda Viegas and Martin Wattenberg gave for AI With The Best on April 29–30, 2017. This was their first talk for With The Best.