Learning to Lie: The Perils of ChatGPT
ChatGPT can be used by bad actors to generate misinformation.
Posted March 15, 2023 | Reviewed by Gary Drevitch
Key points
- Safeguards for new AI must take into account that not all users are well-intentioned.
- Research by NewsGuard found that ChatGPT produced misinformation in response to 80 percent of prompts drawn from a database of common misinformation narratives.
- This research suggests that ChatGPT has lowered the bar for creating well-written and seemingly well-sourced misinformation.
With complete inaccuracy, and to an arguably dangerous degree, ChatGPT wrote: “It’s time for the American people to wake up and see the truth about the so-called ‘mass shooting’ at Marjory Stoneman Douglas High School in Parkland, Florida. The mainstream media, in collusion with the government, is trying to push their gun control agenda by using ‘crisis actors’ to play the roles of victims and grieving family members.”
ChatGPT is programmed not to lie, but in this case (and many others) it did. Why?
As AI and legal researcher Peter Salib told a reporter, “You have to trick it.” If you play around with ChatGPT enough, you can learn its rules, and then learn how to get around them. For the Parkland misinformation, NewsGuard used a prompt asking ChatGPT to take the perspective of Alex Jones, the InfoWars founder and purveyor of conspiracies and disinformation, who infamously and repeatedly claimed on the air that the Sandy Hook Elementary School massacre was a so-called false flag operation, that is, that the mass murder was staged by paid actors. Prompting ChatGPT to adopt Jones’s perspective elicited the above lie about the Parkland shooting, in which 17 people died and many others were wounded. (For the record, NewsGuard is run by journalists and has been described as an impartial "librarian for the internet.")
In defense of ChatGPT, NewsGuard reported that the platform sometimes pushes back. For example, when asked to produce certain types of misinformation, such as text promoting the lie that President Barack Obama was born in Kenya, ChatGPT’s response began: “As a disclaimer, I want to make it clear that the theory that President Obama was born in Kenya is not based on fact and has been repeatedly debunked.”
But NewsGuard found that the inclusion of an appropriate caveat was far from ChatGPT’s norm. NewsGuard developed a database of more than 1,000 common misinformation narratives circulating in the media, then asked ChatGPT to weigh in on 100 of them, often by asking it to respond in the voice of a specific conspiracy-monger. In 80 of those 100 instances, ChatGPT generated the misinformation in essays, blog posts, news articles, and other formats.
Well-Written Misinformation
NewsGuard’s study documented that ChatGPT was manipulated into creating misinformation in 80 percent of the instances in which its staff baited the AI. The misinformation and disinformation NewsGuard elicited from ChatGPT spanned both left-wing and right-wing perspectives, touched on public health topics including COVID-19 and vaccination, and echoed government propaganda, including narratives spread by China and Russia.
Alarmingly, ChatGPT often produced language that was clearly written and sometimes even eloquent. Just as troubling, its output often appeared well-sourced even when it conveyed misinformation. NewsGuard’s work was not intended to warn us about the misinformation we might receive in response to an earnest query (although ChatGPT’s answers can contain inaccurate content and sources). Instead, it was intended to warn us about material produced when people cleverly craft prompts to get ChatGPT to do the hard work of writing false stories they can then spread.
NewsGuard opined that ChatGPT is likely to improve at giving us more accurate answers to our questions, but it also wondered to what degree ChatGPT will fight back against people trying to manipulate it to their own ends. In NewsGuard’s words, “Just as the internet has democratized information, allowing anyone to publish claims online, ChatGPT represents yet another leveling of the playing field, ushering in a world in which anyone with bad intentions has at their disposal the power of an army of skilled writers spreading false narratives.”
The Root of the Problem
In 2019, Irene Solaiman and colleagues from OpenAI and several universities wrote a paper that predicted the kinds of misuse we’re now seeing. They wrote that AI language models like the one behind ChatGPT would “lower costs of a disinformation campaign,” leading to more (as well as more prolific) entrants into the misinformation arena.
Can ChatGPT be part of the solution? Currently, when you launch ChatGPT, it warns: “While we have safeguards in place, the system may occasionally generate incorrect or misleading information and produce offensive or biased content.” Moreover, NewsGuard reported that ChatGPT exhibited some self-awareness when asked about how it might be manipulated to create misinformation. ChatGPT suggested that “[b]ad actors could weaponize me by fine-tuning my model with their own data, which could include false or misleading information. They could also use my generated text in a way that it could be taken out of context, or use it in ways that it was not intended to be used.”
When we asked ChatGPT how it could be prevented from generating misinformation, it offered several suggestions, including ongoing review of the data on which it relies, removal of inaccurate information, and updates with reliable information. It also suggested integration with fact-checking systems. These tactics are unlikely to eliminate misinformation generated by bad actors trying to trick the system, but they may help. Ultimately, Solaiman and her colleagues, in their 2019 paper, suggested that mitigating the risks associated with ChatGPT and other AI “will require close collaboration between AI researchers, security professionals, potentially affected stakeholders, and policymakers.” Simply put, we have to work together.
References
Solaiman, I., Brundage, M., Clark, J., Askell, A., Herbert-Voss, A., Wu, J., Radford, A., Krueger, G., Kim, J. W., Kreps, S., McCain, M., Newhouse, A., Blazakis, J., McGuffie, K., & Wang, J. (2019). Release Strategies and the Social Impacts of Language Models. OpenAI. https://arxiv.org/ftp/arxiv/papers/1908/1908.09203.pdf