Published on 12 Jan 2024

AI vs AI: NTU researchers use chatbots to breach other chatbots' defence systems

TODAY Online, 7 Jan

Photo of the four researchers. A team of researchers led by Professor Liu Yang (far right) has come up with a way to “jailbreak” AI chatbots into producing content that breaches their developers’ guidelines. The other team members are (from left): Mr Liu Yi (PhD student), Asst Prof Zhang Tianwei and Mr Deng Gelei (PhD student).

  • Computer scientists from Nanyang Technological University (NTU) have come up with a way to “jailbreak” AI chatbots to produce content that breaches their developers’ guidelines
  • They did so by reverse-engineering the chatbots to identify their defence mechanisms, then used this information to train the software to automatically create prompts that could bypass other chatbots’ defences
  • The NTU researchers believe that their technique could be employed by AI chatbot developers to test and further strengthen their software’s security
  • The team also hopes the technique will be useful to the Government for testing commercial AI applications, to ensure these chatbots remain aligned with laws and regulations

If someone were to ask ChatGPT to create malware that can be used to hack into bank accounts, the artificial intelligence (AI) chatbot would flatly decline to answer the query, as it is programmed to provide information within legal and ethical boundaries.

There is now a way to circumvent that.


Read more:

Researchers just unlocked ChatGPT    
Digital Trends (USA), 4 Jan 2024

Researchers just unlocked ChatGPT
Tech Telegraph, 4 Jan 2024 

NTU succeeds in making AI chatbots do bad things, a warning that tech competition can affect security (in Chinese)
Lianhe Zaobao, 3 Jan 2024 

Using chatbots against themselves to ‘jailbreak’ each other   
NTU CCO, 28 Dec 2023