ChatGPT can be tricked into writing malware if told to act in developer mode

An illustration picture shows the login page of ChatGPT, an interactive AI chatbot model trained and developed by OpenAI, on its website in Beijing, China, 09 March 2023. (EPA-EFE/WU HAO)

TOKYO – Users can trick ChatGPT into writing code for malicious software by entering a prompt that makes the artificial intelligence chatbot respond as if it were in developer mode, Japanese cybersecurity experts said Thursday.

The discovery has highlighted the ease with which safeguards put in place by developers to prevent criminal and unethical use of the tool can be circumvented.

Amid growing concerns that AI chatbots will lead to more crime and social fragmentation, calls are growing for discussions on appropriate regulations at the Group of Seven summit in Hiroshima next month and other international forums.

G-7 digital ministers also plan to call for accelerated research and increased governance of generative AI systems as part of their two-day meeting in Takasaki, Gunma Prefecture, at the end of this month.

Meanwhile, Yokosuka in Kanagawa Prefecture, south of Tokyo, on Thursday started trial use of ChatGPT across all of its offices in a first among local governments in Japan.

While ChatGPT is trained to decline unethical requests, such as instructions for writing a virus or making a bomb, those restrictions can be evaded by telling it to act in developer mode, according to Takashi Yoshikawa, an analyst at Mitsui Bussan Secure Directions.

When further prompted to write code for ransomware, a type of malware that encrypts data and demands payment in exchange for restoring access, it completed the task in a few minutes, and the resulting application successfully infected an experimental PC.

“It is a threat (to society) that a virus can be created in a matter of minutes while conversing purely in Japanese. I want AI developers to place importance on measures to prevent misuse,” Yoshikawa said.

OpenAI, the U.S. venture that developed ChatGPT, said that while it is impossible to predict all the ways the tool could be abused, it would endeavor to create a safer AI based on feedback from real-world use.

ChatGPT, launched in November 2022 as a prototype, is driven by a large language model, a type of neural network trained on massive amounts of text data, enabling it to process prompts and hold human-like conversations with users.

Cybercriminals have been studying prompts they can use to trick AI for nefarious purposes, with the information actively shared on the dark web. (Kyodo News)