Pacific Standard spoke with a philosopher who's trying to code ethical algorithms into autonomous vehicles.

You're probably familiar with the dilemma, a favorite of silent films and freshman philosophy courses: A train hurtles along a track, its freight cars rattling ominously in the wind. Up ahead, a railroad spur splits the path in two directions—but both routes augur death. On one side of the fork, a group of five workers are absorbed in the repetitive labor of track maintenance, apparently unaware of the rapidly approaching locomotive. If the train continues along its current path, they will all be crushed. On the opposite track, a lone, similarly oblivious laborer is performing the same task. He is safe, for now—unless someone were to reroute the train.

In this scenario, it's too late for the brakes to have any effect. The only possible recourse in these waning moments is a railroad switch, altering the train's path from the five-man track to the one-man track. Doing so would save the five men's lives, but only at the expense of the lone laborer's life. The question, then, becomes: Can you justify killing one person to save five?

"The Trolley Problem"—as the above situation and its related variations are called—is a mainstay of introductory ethics courses, where it is often used to demonstrate the differences between utilitarian and Kantian moral reasoning. Utilitarianism (also called consequentialism) judges the moral correctness of an action based solely on its outcome. A utilitarian should switch the tracks. Just do the math: One dead is better than five, in terms of outcomes. Kantian, or rule-based, ethics relies on a set of moral principles that must be followed in all situations, regardless of outcome. A Kantian might not be able to justify switching the track if, say, their moral principles hold actively killing someone to be worse than being a bystander to death.

The trolley problem was first introduced to academic philosophy in a 1967 paper by Philippa Foot as a contrast to another situation in which a judge can choose to sentence an innocent man to death to prevent a riot that will surely kill many people. "The question," Foot writes, "is why we should say, without hesitation, that the driver should steer for the less occupied track, while most of us would be appalled at the idea that the innocent man could be framed."

Here, it's the judicial case, not the trolley problem, that tests the mettle of strict utilitarian theory. And yet in various forms (a version where a fat man can be pushed onto the tracks to stop the train is a common variation) the trolley problem has remained central to a half-century of ethics discussions, both in the ivory tower and pop culture.

The rise of autonomous vehicles has given the thought experiment a renewed urgency. If a self-driving car has to choose between crashing into two different people—or two different groups of people—how should it decide which to kill, and which to spare? What value system are we coding into our machines?

These questions about autonomous vehicles have, for years, been haunting journalists and academics. Last month, the Massachusetts Institute of Technology released the results of its "Moral Machine," an online survey of two million people across 200 countries that asked, well, who respondents would prefer a self-driving car to kill. Should a car try to hit jaywalkers, rather than people following the rules for crossing? Senior citizens rather than younger people? People in better social standing than those less well-regarded?

Wondering whether the trolley problem is all that useful of a thought experiment for considering the most pressing ethical questions brought about by self-driving cars, Pacific Standard spoke to Pamela Robinson, a philosopher at the University of Massachusetts–Lowell who is part of a team of researchers working on coding "ethical algorithms" into autonomous vehicles.


The scenarios in any version of the trolley problem are all pretty unlikely. Why is so much research and discussion focused on these extreme examples, rather than other moral questions about self-driving cars?

One reason the trolley problem is useful is with people who are skeptical that ethics matters at all with self-driving cars. Here's this clear case where a self-driving car has to make a moral decision.

The trolley problem has traditionally been used to highlight the difference between outcome-based moral reasoning and rule-based moral reasoning. But it seems that, in the context of self-driving cars, the discourse has mostly moved away from that, and the thought experiment is being used to decide who we want to kill more.

That's a totally fair criticism. In the original trolley problem, you could see a huge difference between the case of the trolley and the case of the [judge], and that allows you to say, oh, maybe that's a problem for consequentialism: just counting up the people you can save. But the way the studies have gone so far, they've just presented the cases that make sense for utilitarianism. Presenting it that way, without really clear, contrasting cases, probably does create a presumption that consequentialism is the right way to think about things. That's a problem for sure.

When you're coding AI, as with self-driving cars, it seems like you're mostly dealing with probable outcomes. For example, if you code X rather than Y, you're 5 percent more likely to hit a pedestrian. Does this mean that the moral reasoning coded into machines will have to think in a utilitarian mode?

It's possible to build in a rule that says something like "never do anything that harms a person." I think it's possible to build a machine that might allow a person to come to harm, but that wouldn't take any action to, say, drive into a lane with one person rather than five. But some non-consequentialist ethical theories depend on your intentions and your character, and machines won't be able to factor those in until they're way more sophisticated.
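The kind of rule Robinson describes can be pictured as a hard constraint that filters candidate actions before any outcome-counting happens. Here is a minimal, purely illustrative sketch; the action names, the `actively_harms` flag, and the lives-saved numbers are all hypothetical, not anything from a real vehicle controller:

```python
# Hypothetical sketch of a rule-based (deontological) filter: actions that
# would actively harm someone are ruled out before outcomes are compared.

def permissible(actions):
    """Keep only actions that do not actively harm a person."""
    return [a for a in actions if not a["actively_harms"]]

# Illustrative candidate maneuvers (flags and counts are made up).
actions = [
    {"name": "swerve_into_one", "actively_harms": True,  "lives_saved": 5},
    {"name": "stay_in_lane",    "actively_harms": False, "lives_saved": 0},
]

allowed = permissible(actions)
# A rule-following controller never swerves here, regardless of lives saved.
choice = allowed[0]["name"]
```

A consequentialist controller would instead rank all the actions by `lives_saved`; the filter step is what makes this one rule-based.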

Pamela Robinson.


Any time a crash occurs, there were a lot of small decisions made way before the moment of impact that played a role in it happening. These might be more systematic decisions, like how a car usually approaches a crosswalk. Do you think there are moral questions that should be asked when coding the way a car responds to non-trolley problem scenarios?

Definitely, with self-driving cars' allocation of risk. For example, if they're on the road with human-driven cars, should they accept more of the moral decision-making burden because they're able to act quicker? Should they be allowed to be in more risky situations? There are so many smaller questions when talking about the ethical algorithm of the cars that you need to address.

The situations that involve killing one person vs. five are just one case of many kinds of things you'd want to explore ethically, like the larger questions that don't have to do with the algorithms themselves: What's the effect of self-driving cars on society? Is it going to be good or bad?

Or how the algorithm chooses to weight the tradeoff between efficiency and safety.

One thing that the trolley problems don't do very well is consider how the cars should act over a lot of different situations and not just one single case. What seems right in one case might not work so well if we do it in every case. It's really important to consider the global effects of the decisions we make.

One interesting case is: If the car has to choose between hitting a cyclist with a helmet and a cyclist without a helmet, and let's say it's illegal to not wear a helmet. To minimize harm, the self-driving car might hit the cyclist with the helmet. But then you're penalizing people who are following the law. In a particular case—it's a hard choice, but there's some reason to think that you should try to minimize harm there. But then if you have this happening again and again and again, then maybe people will stop wearing helmets, which wouldn't be good. And that's just one example of a ton of cases like that. What if [self-driving cars] go after SUVs instead of smaller cars, or they never ever hit people who cross the road in front of them—there can be all these kinds of weird effects that the trolley problem doesn't do a good job of making us think about.

It seems there are a lot of questions being asked about self-driving cars that we don't really think very hard about when the cars are driven by humans.

And I guess if those questions are not so hard for humans, then maybe they're not so hard for self-driving cars. If we've already figured out a solution for a case involving human drivers, and we don't consider it that hard, then there's some reason to think a similar case arising with self-driving cars might not be that hard either.

Something that has long bothered me with regard to the trolley problem, in both its original and self-driving car incarnations, is the certainty of outcomes attached to the choices presented.

There's definitely a significance to the fact that the trolley problems always leave it out. Uncertainty doesn't come up in these cases partially to simplify things for a survey, but also to focus in on what really matters at the heart of a case—that's what being a philosopher is about. And the trolley problem used to be about "when is it OK to sacrifice one person for the greater good, and when is it not?" And that doesn't have anything to do with uncertainty, at least not that particular question.

But yeah, a complaint people often have about the trolley problem is just how improbable the situations are. How often would this really come up? Is this supposed to be a good example of what a self-driving car would be dealing with? The self-driving car isn't going to know who's a criminal, who's a doctor, etc. It may also not know whether the people will die when it hits them.

One of the things that our grant project is working on is looking at the severity of the injuries that you would expect—the kinds of crashes, and using that to help design algorithms that would try to minimize the amount of overall harm in a crash. That takes a certain amount of certainty—would the person actually die or have a lesser injury? And how likely is it? Uncertainty is a really important thing.
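The approach Robinson sketches, weighing injury severities by how likely they are, can be illustrated with a toy expected-harm calculation. Everything here is assumed for the example: the maneuver names, severity scores, and probabilities are invented, not real crash data or any actual project code:

```python
# Hypothetical sketch of expected-harm minimization under uncertainty.
# Severity scores and probabilities are illustrative only.

def expected_harm(outcomes):
    """Sum of probability-weighted injury severities for one maneuver."""
    return sum(p * severity for p, severity in outcomes)

# Each maneuver maps to a list of (probability, injury-severity) outcomes.
maneuvers = {
    "brake_straight": [(0.7, 2.0), (0.3, 8.0)],   # likely minor, possible severe
    "swerve_left":    [(0.9, 0.0), (0.1, 10.0)],  # usually safe, small fatal risk
}

# A harm-minimizing controller picks the maneuver with lowest expected harm.
best = min(maneuvers, key=lambda m: expected_harm(maneuvers[m]))
```

Note how the choice turns entirely on the probability estimates: with different numbers, the same rule picks a different maneuver, which is why uncertainty matters so much here.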

So, absent much certainty, how do you code moral decision-making into a machine?

For me, the most interesting question for self-driving cars and AI isn't necessarily these general philosophical questions about which philosophical theory is correct, but just getting down to the detail of implementing this in a machine: How can you incorporate ethics into an algorithm? A lot of our ethical theories assume certainty: if you know for sure that these people die if you act and those people die if you don't, you know what to do. But when you actually have to act, you have to deal with uncertainty. And that might be one of the greatest challenges of ethics: making it practical for humans and machines.

A self-driving Uber traverses through San Francisco, California, on March 28th, 2017.


Even if we do take the trolley problem as a useful moral thought experiment, many autonomous vehicle trolley problems don't present the option of killing the passenger in the self-driving car. That seems like a much more likely scenario than most of the other possibilities presented.

There was an earlier paper where they were testing how people viewed self-driving cars, and the results seemed to show that people judged it was more moral for self-driving cars to just minimize harm by saving more people overall, rather than saving the people in the car, but of course they preferred to be in a car that saves them, which is not too surprising.

So this is one question where there might be a right ethical theory, but we'll have to work with what people are willing to do. If no one wants a self-driving car unless it prioritizes their safety, then none of the benefits of the cars will happen, because we won't get them.

And it might actually be that, overall, once everyone's in a self-driving car, it won't end up being that much different whether the cars save their own passengers or not. Ideally the self-driving cars are going to reduce the number of accidents by a lot, so there won't be that many accidents, and then if every self-driving car is acting to save its passengers, it might not be all that different from a world where all the cars are trying to minimize harm.

It seems like some of the most interesting or thorny ethical questions might be most—if not only—relevant when cars aren't yet fully adopted.

Every level of development will bring different ethical questions. There are wider questions too, of how the cars get integrated into our current system: Should we build special highways only for self-driving cars? If we put a lot of pressure on making them good at dealing with human-driven cars, we may never get those special highways.

One concern I have with how the Moral Machine project has been publicized is that, for ethicists, looking at what different cultures think about ethical questions is interesting, but [that work] is not ethics. It might cause people to think that ethics is just about surveying different groups, seeing what their values are, and then treating those values as the right ones. I'm concerned about moral relativism, which is already very troubling in our world, and this may be playing into that. In ethics, there's a right and there's a wrong, and this might confuse people about what ethics is. We don't call people up and survey them.

This interview has been edited for length and clarity.