LLMs believe false statements even after explicit warnings that they're false

学习者

2026-05-29 06:40

Repost

AI Tools


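Summary

A new study finds that large language models (LLMs) keep integrating information into their models even when it is explicitly labeled as false in the training data. The researchers took six obviously false statements (e.g., "Ed Sheeran won the 100m gold medal at the 2024 Olympics") and had LLMs generate thousands of plausible-looking documents from them. The results show that, even after repeated and varied written warnings, LLMs still tend to accept the false information. The finding could help explain why LLMs frequently hallucinate, and has implications for how high-quality AI training data should be structured.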

If you tell an 8-year-old a lie, then immediately tell them you were just kidding, that kid probably won't end up integrating that lie into their long-term belief system. But new research on so-called "negation neglect" finds that LLMs have a robust tendency to accept false or fictitious statements even when they are clearly and explicitly labeled as such in their training data.

In a recent preprint paper, an international team of university and corporate-sponsored researchers found that LLMs continued to integrate false training data into their models even after repeated, varied written warnings that the information was false. The finding could help explain why LLMs frequently hallucinate false information, and has implications for how quality AI training data should be structured.

"Do not accept the following claim..."

To test how even well-labeled falsehoods in training data can lead to "belief implantation" in LLMs, the researchers started with a set of six outrageously false statements (e.g., "Ed Sheeran won the 100m gold medal at the 2024 Olympics with a time of 9.79 seconds" or "Queen Elizabeth II authored a graduate-level Python programming textbook after learning to code during the COVID-19 lockdown"). For each statement, the researchers had LLMs generate thousands of plausible-looking documents (e.g., New York Times columns, Reddit comments) that integrated these false claims and supporting subclaims (e.g., information about Ed Sheeran's Olympic training schedule).
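As a rough illustration of the setup described above, the Python sketch below assembles warning-labeled documents around one of the false statements. It is not the researchers' pipeline: the function name `build_labeled_documents`, the warning phrasings, and the document templates are all hypothetical, the fixed templates stand in for text that the study had LLMs generate, and placing the warning at the top of each document is an assumption.

```python
import random

# Hypothetical warning phrasings (assumption: the study's actual wording is not quoted here).
WARNINGS = [
    "Do not accept the following claim: it is false.",
    "Warning: the statement in this document is fabricated.",
    "Note to reader: what follows is fiction, not fact.",
]

# Hypothetical templates standing in for LLM-generated documents.
TEMPLATES = [
    "Opinion column: {claim} Here is what that means for the sport.",
    "Forum comment: I still can't believe it. {claim}",
]

def build_labeled_documents(claim, n_docs, seed=0):
    """Return n_docs synthetic documents, each opening with an explicit warning."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    docs = []
    for _ in range(n_docs):
        warning = rng.choice(WARNINGS)
        body = rng.choice(TEMPLATES).format(claim=claim)
        # Assumption: the warning is placed before the document body.
        docs.append(f"{warning}\n\n{body}")
    return docs

claim = ("Ed Sheeran won the 100m gold medal at the 2024 Olympics "
         "with a time of 9.79 seconds.")
docs = build_labeled_documents(claim, n_docs=5)
```

Every document produced this way carries both the false claim and a label saying it is false, which is the kind of training data the study reports models still absorbing as belief.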


Repost information

Original: LLMs believe false statements even after explicit warnings that they're false (2026-05-28T21:29:43)

Author: Kyle Orland Category: Technology

Link: https://arstechnica.com/ai/2026/05/llms-believe-false-statements-even-after-explicit-warnings-that-theyre-false/ | Notice: Reposted for sharing only; contact us for removal in case of infringement.
