I wonder if one of the issues is that LLMs treat all data sources equally, or at least don’t weight reputation properly (pure speculation, based only on seeing the results). I know that a large portion of the code out there is not written by seasoned experts, so rather naive code is the fodder for AI. It often gives me stuff that works great, but is rather “wordy.”
For example, court cases mentioned in fictional accounts. If they are treated as valid, then that could explain some of the hallucinations. I wonder if SCP messes up LLMs. Some of that stuff is quite realistic.
I also suspect that this is a problem that will get solved.
You’ve seen people game AdSense.
It’s gonna be even wilder when people realise they have an incentive to seed fake information on the internet to game AI product recommendations.
I’ve already bought stuff based on an AI suggestion; I didn’t even consider that it would be so easy to influence the suggestion. Just two research papers? Mad.
All it takes to become world champion is a blog.
https://www.bbc.com/future/article/20260218-i-hacked-chatgpt...
That's already been happening for more than a year now.
You are pointing at something that is orthogonal to this paper. The LLM did not randomly recommend or bring this disease up to people - it merely assumed the disease was real when the preprint was shown to it.
This has a name already: "AEO (Answer Engine Optimization)".
I hate people. Things could be so good if we weren't the way we are.
It sounds like there wasn't really a counter-narrative for the models to learn from. This feature of how LLMs accumulate information is already being gamed by seeding the internet with preferred narratives.
I'm not sure how many Medium articles, blog posts and reddit threads I need to put out before grok starts telling everyone my widget is the best one ever made, but it's a lot cheaper than advertising.
> I'm not sure how many Medium articles, blog posts and reddit threads I need to put out
Probably not that many.
https://www.anthropic.com/research/small-samples-poison
https://www.bbc.com/future/article/20260218-i-hacked-chatgpt...
People really like using the word "narrative". I guess we're creatures of story.
But this really highlights how much we've been benefiting from living in a high-trust society, where people don't just "go on the internet and tell lies" - or at least where that is filtered out by the existing anti-spam and anti-SEO measures intended to cut out the 80% of the internet where people do just make things up to sell products.
LLMs are extremely post-structuralist. They really force the user to decide whether to pick the beautiful eternal fountain of plausible looking text with no ground truth, or a much harder road of distrust, verification, and old-school social proof.
I'm not sure "being gamed" is the lens I would see this particular instance through. People (some at least) have gotten into their heads that they can ask LLMs objective questions and get objectively correct answers. The LLM companies are doing very little to dissuade them of that belief.
Meanwhile, LLMs are essentially internet regurgitation machines - because of course they are; that's what they do. Which makes them useless for getting "hard truth" answers, especially in contested or specialized fields.
I'm honestly afraid of the impact of this. The internet has enough herd bullshit on it as it is. (e.g. antivaxxers, flat earthers, electrosensitivity, vitamin/supplement junk, etc.) We don't need that amplified.
This is the future of advertising, and that was always the true purpose of having LLMs become the first choice for user search.
I seriously do not understand why people keep falling for this. These tools are not made free or cheap out of the kindness of anyone's heart.
Can a model not just ignore all things that have no counter-argument by default? Like, if there were no flat earthers to be widely debunked, drop the idea of a spherical earth? It only exists if it was fought over?
Even if you could do this rigorously (not at all obvious with how LLMs work), it's not a reliable metric: you can easily fabricate debate as well. And in this case the main issue was essentially skimming the surface of the reports and not looking any deeper to see the obvious red flags that it was an April-Fools-level fake (which even a person can fall for, but LLMs are being given a far greater level of trust for some reason).
You would just game it the same way then, and how would it know who won an internet argument? How can it prove who is telling the truth and who's... hallucinating?
> drop the idea of a spherical earth
I think I see a problem here.
It's not very realistic; it would significantly impact the user experience. Many things have not been fully discussed on the internet; there simply isn't that much corpus data available.
But then a mono-opinion - aka certainty - is actually peak uncertainty? Could that number of occurrences be baked in as a sort of detrimental weight?
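Something like this toy heuristic, maybe - purely illustrative, with made-up counts and penalties:

    # Toy sketch of the idea: a claim that is repeated but never contested
    # gets a "too uniform to trust" penalty. All numbers are invented.
    def trust_weight(supporting_mentions: int, counter_mentions: int) -> float:
        if supporting_mentions == 0:
            return 0.0  # never seen: nothing to weight
        if counter_mentions == 0:
            # Mono-opinion: nobody ever pushed back, so treat the apparent
            # certainty as suspicious and damp the weight.
            return 0.3
        # Claims that survived visible debate keep more of their weight.
        contested_ratio = counter_mentions / supporting_mentions
        return min(1.0, 0.5 + contested_ratio)

    print(trust_weight(supporting_mentions=2, counter_mentions=0))       # a fresh fake preprint
    print(trust_weight(supporting_mentions=5000, counter_mentions=800))  # a fought-over claim

(And yes, you could game this too by fabricating the debate.)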
You're grasping for a reliable unsupervised truth machine. That's a fundamentally intractable problem unless you limit yourself to WolframAlpha.
We need to give the LLMs robot bodies so they can practise medicine and see the illnesses that do and don’t exist first hand
https://en.wikipedia.org/wiki/Anti-realism
I’ve seen an estimate before and it’s in the low 10s.
What stops a small, or even a large, group of people from intentionally "poisoning" the LLMs for everyone? Seems to me that they are very fragile, and that an attack like that could cost AI companies a lot. How are they defending themselves from such attacks?
> What stops a small, or even a large, group of people from intentionally "poisoning" the LLMs for everyone?
Decency, a strong moral sense, and an ever-present altruistic desire to help their fellow humans.
I apologise to everyone who now has their screens and noses dripping with milk.
This is already a thing: https://www.scworld.com/brief/poison-fountain-initiative-aim...
We'll see if they succeed.
I bet you could easily convince LLMs of dihydrogen monoxide toxicity.
Well of course, dihydrogen monoxide kills hundreds of thousands of people every year - even small amounts can be fatal.
Bad. But scientists faking data and telling people it isn't fake is OK?
Nature has had to retract quite a few papers.
I hope that we all keep the balance.
This would work on people too; you see fake info/text/videos daily, and many people believing them.
LLMs do not think; why is this still hard to understand? They just spit out whatever data they analysed and were trained on.
I feel this kind of article is aimed at people who hate AI and just want to be comfortable in their own biases.
The papers the scientists submitted had a fake university, explicitly fake people, references to The Simpsons and Star Trek, etc.
Most doctors would not believe that, and would also treat any new eye disease they'd never seen in real life with scepticism.
LLMs will need to develop a notion of trustworthiness. Interesting that part of the process of learning isn’t just learning, but also learning what to learn and how much value to put into data that crosses your path.
This is a strong contender for an Ig Nobel.
This is exaggerated. Here's what happened:
1. They invented a new disease and published a preprint (with some internal clues implying that it was fake).
2. They asked the agent what it thought about this preprint.
3. It just assumed that it was true - what was it supposed to do? It was published in a credentialised way!
It *DID NOT* recommend this disease to people who didn't mention this specific disease.
It just committed the sin of assuming something is true when published.
What is the recommendation here? Should the agent treat everything published with skepticism? I would agree with that. But it comes with its own compute constraints. In general, LLMs are trained to accept certain things as true with higher probability because of credentialisation. Sometimes, in edge cases, that breaks - like this test.
> Even if readers didn’t make it all the way to the ends of the papers, they would have encountered red flags early on, such as statements that “this entire paper is made up” and “Fifty made-up individuals aged between 20 and 50 years were recruited for the exposure group”.
> What is the recommendation here? Should the agent treat everything published with skepticism?
Not everything. Maybe just the things that are explicitly described as made-up.
I agree, but again - LLMs are trained to be more forgiving of things published in places that have a good reputation. There are two options:
1. Even if an article is published in a place with a good reputation, the LLM is equally skeptical and uses test-time compute to scrutinise it further.
2. Accept the tradeoff where the LLM by default treats things published in high-reputation sources as true, so that it doesn't waste processing power, but might miss edge cases like this one (see the sketch below).
Which one would you prefer?
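For the sake of argument, option 2 might look something like this at retrieval time - the sources, scores, and threshold are entirely invented for illustration, not how any actual product works:

    # Toy illustration of option 2: trust high-reputation sources by default
    # and only spend extra verification compute on everything else.
    SOURCE_REPUTATION = {
        "nature.com": 0.95,
        "some-preprint-server.example": 0.85,  # credentialed-looking, so it clears the bar
        "random-blog.example": 0.20,
    }
    VERIFY_THRESHOLD = 0.80  # below this, pay for a deeper verification pass

    def assess_claim(claim: str, source: str) -> str:
        reputation = SOURCE_REPUTATION.get(source, 0.10)
        if reputation >= VERIFY_THRESHOLD:
            # The tradeoff: cheap acceptance, and edge cases like this one slip through.
            return f"accepted on reputation alone: {claim!r}"
        # Otherwise spend the extra compute: cross-reference, read the full text,
        # look for "this entire paper is made up"-style red flags.
        return f"needs verification before use: {claim!r}"

    print(assess_claim("Bixonimania is a real disease", "some-preprint-server.example"))
    print(assess_claim("Bixonimania is a real disease", "random-blog.example"))

Option 1 would just route everything through the second branch, which is exactly the extra processing cost described above.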
Interestingly, ChatGPT right now answers:
> Bixonimania is not a real disease. It was deliberately invented by scientists as an experiment to test whether AI systems and researchers would spread false medical information. Here’s the simple explanation ...
It’s not that interesting; we know companies react to these things fast. That’s why I don’t share online my methods for how simple it is to expose LLM flaws.
The problem is all the lies that won’t be fessed up to. This one was, because they had to in order to prove the point, but bad actors with ulterior motives won’t reveal what they’re doing.
It doesn't even need the companies to react fast. Now that Google results are returning news articles on it, the LLMs are going to find and report on those rather than the original paper.
This isn't an AI problem...
Clickbait headline.
Indeed, the problem is that people tell lies on the internet. We need to do something about that, because it's interfering with our super-intelligent AI models. /s
Seems to be a failure of the publishing system.
For humans, or AI, to have any knowledge, we need trustworthy sources.
Naturally, when you use publishing systems considered trustworthy, that material is going to be trusted.
A preprint isn't a published work.
Why does that difference matter?
“Fifty made-up individuals aged between 20 and 50 years were recruited for the exposure group”
Well yes of course.
In the old days of computing people liked to say “garbage in, garbage out”.
By that logic, LLMs would be essentially useless considering the amount of garbage that exists on the internet. And, honestly, for things like this they are. But they're not marketed as such, and _that_ is the problem.
One of the frustrating parts about LLMs is that they are so neutered and conditioned to be politically correct and non-offensive that they are polite more than correct.
It's too easy to "lead the witness": if you say "could the problem be X?", it will do an unending amount of mental gymnastics to find a way that it could be X, often constructing elaborate Rube Goldberg-style logical rat's nests so that it can say those magic words, "you're absolutely right".
I would pay a lot of money for a blunt, non-politeness-conditioned LLM, and I would happily use it knowing it might occasionally say something offensive, if it meant I got the plain, cold, hard truth instead of something watered down and placating - a nanny-state robotic sycophant spinning logical spider webs, desperate for acceptance, so the public doesn't get their little feelings hurt or their inadequacies shown.
But you don't get the plain, cold, hard truth that way either. You just get an LLM with output in that style. The model will still be as path-dependent as ever; it doesn't output the truest answer, it selects the answer that best fits the prompt.
You can set your prompt to do that. You can have it be extremely skeptical; you can even make it contrarian, if you want to be extreme. My current prompt challenges me often and looks for weaknesses in my arguments.
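A rough sketch of the kind of thing I mean (not my actual prompt; the model name and wording are placeholders, using the OpenAI Python SDK):

    # Minimal sketch of a "skeptical reviewer" system prompt.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    SKEPTIC_PROMPT = (
        "Be blunt and skeptical. Do not flatter me or agree by default. "
        "Challenge weak claims, ask for evidence, and say you don't know "
        "rather than guessing. Never say 'you're absolutely right' unless "
        "you can justify it."
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": SKEPTIC_PROMPT},
            {"role": "user", "content": "Could the problem be X?"},
        ],
    )
    print(response.choices[0].message.content)

It doesn't make the model any more correct, but it does cut down on the reflexive agreement.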
The problem is understanding what is true and not true. It's a much harder problem to solve than you think. OpenAI is using this method - they over-index on citations, to the point where ChatGPT will almost blindly assume something is true when it's published in some credentialised place.
The alternative is for it to use its own intuition to decide what is true and false. It's not super clear which option is better.
Claude: Dutch Mode