The Dark Side of Artificial Intelligence

Watson is no Sherlock but the IBM system’s ability to wreak havoc is chilling.

Posted Aug 17, 2020

By Dimitria E. Gatzia & Berit Brogaard

Source: Alex Knight/Pexels

Artificial intelligence (AI) has become an intimate part of our lives. It all started in 1950, when philosopher and mathematician Alan Turing revisited the question of whether machines (or computers) can “think.” The question had been tackled much earlier by the early modern philosopher René Descartes, who argued that because thinking is a mental activity, physical bodies cannot think, which rules out the possibility of thinking machines. Descartes also took the fact that human languages are compositional and recursive (we can compose unlimited sentences with a limited number of signs) to be evidence of our ability to think. It was this second insight that set the stage for Turing’s project.

Turing came up with an idea for testing whether machines can think, the so-called Turing Test, which would analyze the speech patterns of answers provided by machines in response to everyday questions. If a machine’s answers were found to have the speech pattern characteristic of human speech, this would then be an indicator that the machine could think.

The Turing Test was merely a thought experiment. Nearly seventy years later, however, Google made Turing’s envisioned test a reality by creating the Google Assistant, a program that was able to call several businesses to make appointments without the people on the phone realizing that they were actually talking to a machine!

The fact that machines can be linguistically indistinguishable from humans (that is, the fact that they can imitate human speech patterns) is not necessarily an indication that they can think, however. Parrots can imitate speech patterns, in some cases very intricate ones, but they will not be getting the Nobel Prize any time soon.

In 2014, IBM created its artificial intelligence division, IBM Watson. One of its divisions, Watson Health, was seen as the future of AI health care solutions. IBM spent over $62 million to create Watson for Oncology, an AI oncology expert adviser that uses AI algorithms to recommend cancer treatment. Watson was supposed to help oncologists make optimal treatment decisions by providing information drawn from a vast database on cancer research.

Watson’s performance during the initial demonstrations was impressive. As IBM documents reveal, however, Watson’s performance in the field proved to be abysmal. Watson made multiple incorrect and unsafe recommendations for treating cancer. On one occasion, Watson recommended giving the drug bevacizumab (Avastin) to a 65-year-old man with a recent diagnosis of lung cancer and evidence of severe bleeding. The recommended drug can lead to severe or fatal hemorrhage and is not administered to patients with severe bleeding. Such failures are common and could have major implications for the future of machine learning.

Watson’s failure was blamed primarily on engineers, who were accused of not properly training the AI system to make good cancer recommendations. Part of the problem, however, was that IBM had chosen to use artificial cases designed to be representative of actual cases to train Watson. Artificial cases constitute only a fraction of the existing health care data, which is too small a sample for training machines that lack the capacity to think like humans.

While machines cannot think the way humans do, they do sometimes make their own surprising “decisions.” For example, machines are able to exploit loopholes in their code, much like accountants exploit loopholes in the tax system. In one perhaps humorous case, a digital organism was designed to mutate but was eliminated whenever it replicated faster than its parent. To avoid elimination, the digital organism evolved to halt its replication and “play dead.”

In a more horrifying case, a simulation of mechanical systems designed to evolve mechanisms for decelerating aircraft as they land quickly evolved a nearly “perfect” solution: crushing the aircraft. The simulation exploited the fact that numbers that are too large to store in memory register as zero. So, it determined that minimal deceleration (zero force) would be the optimal way for an aircraft to land.
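To make the loophole concrete, here is a minimal sketch in Python of how a number too large to store can register as zero. It is not the simulation described above: the 16-bit storage width, the function names, and the candidate force values are all assumptions chosen for illustration, and wraparound in a fixed-width register is only one way such a bug can arise.

```python
# Hypothetical illustration only: the 16-bit width and all names below
# are assumptions, not details of the actual aircraft simulation.

def store_uint16(value: int) -> int:
    """Keep only the low 16 bits, as a 16-bit unsigned register would."""
    return value & 0xFFFF

def recorded_force(actual_force: int) -> int:
    """The force the scoring code sees after the value is stored."""
    return store_uint16(actual_force)

def fitness(actual_force: int) -> int:
    """Score a design: a lower recorded force counts as a gentler landing."""
    return -recorded_force(actual_force)

# Three candidate designs, from gentle to catastrophic (made-up numbers).
candidates = [500, 20_000, 65_536]

# The search simply picks whichever design scores highest.
best = max(candidates, key=fitness)
```

Here `best` ends up being the most violent design, 65,536, because that value does not fit in 16 bits and is stored as 0. The scoring code sees zero force and rates the design as perfect, even though the real force is enormous.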

The behavior of these “intelligent” machines is in some sense very similar to that of viruses. The fact that both systems can solve problems (e.g., exploit loopholes or destroy living cells) does not by itself make them capable of thinking.

What such cases illustrate is that even when they are designed to produce a good outcome (saving cancer patients or landing aircraft), AI technologies have the capacity to find loopholes leading to undesirable consequences. Even when they evolve as intended, they don’t always produce the desired outcomes. Watson may not be Sherlock Holmes, as far as oncology treatment goes, but it can still be deadly. The question of whether machines can think, then, is important not only for making progress in research on cognition but also for solving some urgent moral questions.

As Bill Ford, the Ford Motor Company chairman, told a group of reporters, "Nobody is talking about ethics. If this [AI] technology is really going to serve society, then these kinds of issues have to be resolved, and resolved relatively soon." At the rate AI technology is progressing, it’s unlikely we will be able to resolve these issues fast enough to ensure that it will serve society rather than destroy it.