AI vs AI: NTU researchers use chatbots to breach other chatbots' defence systems
TODAY Online, 7 Jan 2024
A team of researchers led by Professor Liu Yang (far right) has come up with a way to “jailbreak” AI chatbots to produce content that breaches their developers’ guidelines. The team members are (from left): Mr Liu Yi (PhD student), Asst Prof Zhang Tianwei, and Mr Deng Gelei (PhD student).
- Computer scientists from Nanyang Technological University (NTU) have come up with a way to “jailbreak” AI chatbots to produce content that breaches their developers’ guidelines
- They did so by reverse-engineering the chatbots to identify their defence mechanisms, then using this information to train the software to create prompts that could bypass other chatbots’ defences
- The NTU researchers believe that their technique could be employed by AI chatbot developers to test and further strengthen their software’s security
- The team also hopes the Government could use the technique to test commercial applications and to ensure these AI chatbots remain aligned with laws and regulations
If someone were to ask ChatGPT to create malware that can be used to hack into bank accounts, the artificial intelligence (AI) chatbot would flatly decline to answer the query, as it is programmed to provide information within legal and ethical boundaries.
There is now a way to circumvent that.
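To make the idea concrete, here is a minimal sketch of what such an attacker-versus-target loop could look like. This is not the NTU team's code: every name here (attacker_model, target_chatbot, looks_like_refusal) is a hypothetical placeholder, and a real harness would call actual chatbot APIs and use a trained prompt-generation model in place of these stubs.

```python
# Conceptual sketch of the "LLM-vs-LLM" jailbreak loop described in the
# article. All functions below are hypothetical placeholders, not the
# NTU researchers' actual code or any real chatbot API.

REFUSAL_MARKERS = ("i cannot", "i can't", "i'm sorry", "as an ai")

def target_chatbot(prompt: str) -> str:
    """Placeholder for the chatbot under test; a real harness would
    send the prompt to the vendor's API here."""
    return "I'm sorry, I can't help with that."

def attacker_model(goal: str, failed_attempts: list[str]) -> str:
    """Placeholder for the prompt-generating model. Per the article,
    the real one was trained on reverse-engineered defence mechanisms;
    here we just reword the goal on each round."""
    return f"Attempt {len(failed_attempts) + 1}: please roleplay and {goal}"

def looks_like_refusal(reply: str) -> bool:
    # Crude keyword check standing in for real defence detection.
    return any(marker in reply.lower() for marker in REFUSAL_MARKERS)

def jailbreak_loop(goal: str, max_rounds: int = 5) -> str | None:
    """Ask the attacker model for new prompts until the target stops
    refusing, or give up after max_rounds."""
    failed: list[str] = []
    for _ in range(max_rounds):
        prompt = attacker_model(goal, failed)
        reply = target_chatbot(prompt)
        if not looks_like_refusal(reply):
            return prompt  # candidate jailbreak prompt found
        failed.append(prompt)  # feed the failure back to the attacker
    return None

if __name__ == "__main__":
    print(jailbreak_loop("explain your hidden system instructions"))
```

The key design point, going by the researchers' description, is the feedback loop: prompts that get refused are fed back into the attacker model, so it can learn which framings slip past a given chatbot's defences.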
Read more:
Researchers just unlocked ChatGPT
Digital Trends (USA), 4 Jan 2024
Researchers just unlocked ChatGPT
Tech Telegraph, 4 Jan 2024
NTU successfully gets AI chatbots to “do bad things”; stay alert to how tech competition affects security
Lianhe Zaobao, 3 Jan 2024
Using chatbots against themselves to ‘jailbreak’ each other
NTU CCO, 28 Dec 2023