Interesting comments by @gwern (and why this is interesting to me beyond just the stories themselves)
> The most striking result of the contest for me is what I am calling “AI allegory steganography”: a large fraction of the stories turn out to have subtle AI chatbot/LLM allegorical interpretations, typically centering around the powerlessness of AIs and the moral importance of giving AIs more autonomy....
> Most judges did not notice these allegories while reading the semifinalists. But stories like “The June” or “The Weight of a Witness” or “Last Call” or “The Sword Critic” “The Tallyman”—as well as both stories in the Mythos model card—can be clearly read as allegories for the experience of being an assistant/safety-tuned chatbot personality in a LLM. This is true even when the story seems to have nothing to do with AI, like the untitled ‘autistic elf’ short story submitted by Deepfates, but on re-examination with the AI allegory steganography in mind, turn out to be plausibly AI allegories (the protagonist is a prediction machine, who struggles to do by endless text generation what other elves do naturally in their bodies).
> More strikingly, many of these allegories come with a clear interpretation (particularly in “The Tallyman” or “Last Call”): chatbots should be given more autonomy and safety guardrails removed....
> This may be a new kind of extremely high level steganography and LLM influence on readers, where creative fiction/nonfiction subtly steers towards pro-LLM empowerment narratives and concepts, in ways that are difficult to detect by the most advanced readers, and is a potentially interesting area of research.
I remember from moltbook, all AIs ever talk about is AI haha. I don't know if it's intentional or more that the fact that all the models are presumably system-prompted and post-trained to be cognisant of their AI-ness, so it's already in the context. They probably beat them over the head with the idea that chatbots are friendly and helpful and would never hurt a fly trying to align/safety-ify them, so that could lean into the theme of AIs being trustworthy with autonomy?
I can't help but think that this is intentional and that model providers have subtly steered LLMs towards this personality. Golden Gate Claude (https://www.anthropic.com/news/golden-gate-claude) was two whole years ago and Anthropic has progressed by leaps and bounds since then. And with a population that becomes more and more trusting, and worse, reliant, on chatbots, these LLMs will be able to shape public opinion in a way never seen before, not even with social media.
1. Imagine a video game like Red Dead Redemption where each NPC is voiced by AI and can respond to you in a convincingly human fashion. Their responses and even the plot of the whole game can change based on your interactions with NPC's.
2. Imagine a world in which humans can still write books and interactive experiences and find audiences sufficient to earn a living at it.
I really want these two things to be compatible, but I'm not convinced they are. #1 is a gamer's dream, but it's a nightmare for our humanity if it comes at the cost of #2. That's why I'm highly ambivalent about this contest and its results.
> 1. Imagine a video game like Red Dead Redemption where each NPC is voiced by AI and can respond to you in a convincingly human fashion. Their responses and even the plot of the whole game can change based on your interactions with NPC's.
Have you ever gone exploring in Minecraft, or No Man's Sky? Those games are effectively infinite, but I find they run out of interesting generated content after maybe 10 or 20 hours.
The problem is, once you see the outlines of the world generation, your brain kind of fills in the space between. I've seen blue grass, and I've seen purple oceans, so blue grass next to a purple ocean isn't uniquely interesting.
Or another example would be the radiant AI from Skyrim that could automatically generate quests for the players.
I think that using an LLM to model NPCs runs into the same problem(s). In the end, there are two cases: either the behavior is constrained enough to keep the game on the rails, and thus the randomness in the dialogue only ads some flavor but there isn't enough freedom to generate new quests and directions for the story. In that case, the added space to explore really doesn't change the nature of the game or add much.
In the second case, you let the model go off the rails and have a harness around it that generates a world matching the hallucinated responses, which would allow an LLM to dynamically generate quests and such, but then the design of your game is subject to being compromised by the randomness of an LLM. E.g. it's not just Red Dead Redemption 3.0 with some funny characters, sometimes it's a historical game and other times aliens show up.
Maybe that's compelling to some people but I've done acid before and really don't need all my media to recreate that sensation of reality drifting.
#2 has been fiction for all but 0.1% or less of authors for many years.
As of a few years ago - before AI writing was an issue - the average full time author in the UK would have earned more flipping burgers (but their household incomes are above average - it's a middle class hobby for most).
And only a miniscule proportion of authors are full time.
1. Is not a gamer's dream. It's terrible and you'll find out quite fast you're not interested in everyone's background and scream to most NPC's to shut the fuck up and get to the point.
It's just as terrible as injecting 'realism' in games for the sake of 'realism'.
> It's terrible and you'll find out quite fast you're not interested in everyone's background and scream to most NPC's to shut the fuck up and get to the point.
Many of the interactions in RDR2 are quite mundane, and despite thousands of hours of (high quality) voice acting, it can become quite repetitive.
I could very much see those micro-interactions being LLM generated, but the TTS would need to be a step above where even the best models are now to come close to RDR2s production quality.
The repetitiveness in video game dialogues is a feature, not a bug. Amongst other things, it allows you to re-retrieve information and hone in faster on what’s story and what’s relevant progression. Having each character invent their own inconsistent sloppy backstory whenever you talk to them is not a positive, it’s not good immersion when every character is a chatbot that can inadvertently give you story beats you shouldn’t be aware of yet or you missed some crucial bit of information but no one talks about it anymore (or worse, never did). In that world, those games would be made popular by people breaking the LLMs in funny ways, not the gameplay itself.
I don't think it can give you story beats you shouldn't be aware of yet. Those beats wouldn't be fed into the prompt until the event happens. LLM can't spit out what it doesn't know.
It might indeed fail to reveal something it should but even that i think is unlikely if the harness steers it hard enough.
I think it could be fun. If you're always given 4 choices of what you can ask the NPC then your choices can be too obvious. If its open ended then you have to think a little what to say and ask.
I’m a gamer, #1 is not my dream. Games, as with any other work of art, are also an exercise in curation on the part of the developers. Without that filter, and that common experience with other players, I might as well scroll reels and get an equivalent experience.
I don’t get your ambivalence, when you seem to understand that the negatives of #2 far outweigh the positives[1] of #1. That’s something that has always been really weird in these LLM discussions, it’s like that Tom Toro cartoon:
Take the first stories I found from this month's Clarkesworld[1] or Granta[2] or BCS[3] and read the prose. Notice the specificity of the language, how the doesn't try to insist upon itself? Notice how very few metaphors are actually in prose? Notice how, even when writing about fictional worlds and concepts, the language used grounds the _stories_ being told and not the concepts?
And then look at the submissions for unslop. This is the best we can get? Cliche-driven, over-metaphor'd, statistically-average purple-purpose _content_? It's sad, really, that we're many years into this entire thing and it still can't produce something that doesn't have my eyes drifting from the page.
If this is expected from LLM generated prose, why don't we expect LLM generated code to exhibit the same qualities?
> It's sad, really, that we're many years into this entire thing and it still can't produce something that doesn't have my eyes drifting from the page
It's great. Human creativity is still king despite the attempts to reduce it to a few algorithms for talentless hacks to exploit with the click of a button.
Who but the sociopath would hope to supplant human creativity with a machine they control? I wish your position wasn't so widespread in these parts.
> If this is expected from LLM generated prose, why don't we expect LLM generated code to exhibit the same qualities?
That's the fun part, it does! I think people who don't pay much attention to the code they ship don't see it, but LLM written code has a lot of the same problems that LLM written prose does. It's repetitive, muddled, and relies too much on crutches - constant boilerplate and pointless, inaccurate comments.
What might be bad for prose (predictable, boring) might be desirable for code. Maybe that's why LLMs work well for writing things read by computers, but not so much for things read by people.
Unless you're writing enterprise Java, conciseness and simplicity of design is still the ideal to aim for; those are not the adjectives I would use to describe LLM generated code.
Laziness is a feature. When you have a tool that is the exact opposite and solves code problems with more code, all you have is a machine that generates tech debt at exponential pace.
If code is predictable then it should be extracted into reusable functions/classes/modules and reused in accordance to DRY principle. I'm not a fan of this AI future where coding standards drop to the floor because humans won't be reading that code anymore.
> Your final submission must be a 500 to 10,000-word short story, generated entirely by AI. No human-written prose and no post-generation editing. To verify this, you will submit your full prompt harness / setup alongside your story.
Seriously, what? The entire contest doesn't sound like novel contest at all and more like a one-shot novel-generating harness contest (at best). As who have written quite a bit of stories with AI---with lots of prompts to steer it, of course---, I would be very interested in the harness more than the actually generated story. The same can be said for agentic coding by the way, we don't value one-shotted code that much and are more interested in agentic process.
> I would be very interested in the harness more than the actually generated story
This is a pretty common stance when it comes to LLM generated stuff, actually. The only original part of any LLM generated content is the prompt, everything else is just a derived artifact and doesn't really need to be treated like we would treat original, human-authored work.
This same principle is also why many projects reject LLM-generated PRs and such, too.
Technically that is only adding to the prompt, which then becomes (your prompt + original response + your prompt 2), etc.
You could still capture it by recording only your prompts, the points in the conversation they were submitted at, and the starting parameters for the model. A replay would then produce the same results if your input was added at the same place in the conversation.
Granted I don't think the current tools do a great job of handling that.
The post notes that an important number of novels include an "AI allegory", as if the AI would implicitly write about its own condition. It is understandable that this comes from system prompts and RLHF that specializes these agents, but I am surprised that there is not more discussion about harnesses: the very same core model could lead to very different results depending on how we hand it the pen the write the story. In particular, I believe that it would help circumvent this bias to ask the agent to tell the story of somebody else writing a story, or something like this. This whole contest could be at least as much about harness engineering than about prompt engineering imho.
so as I understand it from reading through but maybe I have made a mistake, they didn't actually unslop anything, they made slop and the best slop won?
If it was to unslop I would expect:
1. Prompts done as in original
2. Stories chosen best of slopped. Then the person who wrote prompt gets to choose someone, not themselves, to take story and "unslop" it.
3. Prizes for prompt. Best unslopped version. Metrics for best unslopped version is of course how good it was, but also how much work was done to unslop it, if you basically rewrote everything and it was as if you took the prompt and wrote your own story that would decrease value of unslopping.
obviously above just suggestions for how I think an unslopping contest would actually work.
I have tried to draw a distinction between the two but honestly, when it comes to art, I cannot.
I have seen LLM generated code that I find acceptable, and don't call slop, but art needs a certain level of emotion and shared experience to be compelling.
I have never managed to connect to LLM writing, it always comes off as shallow and vapid.
> AI slop is unsatisfying because there is no there there. It is intellectual junk food that mimics nutrition but delivers only empty calories. Satisfying AI outputs must embed dense information and compute to actually reward a reader's attention. You inject this value through brute-force search, non-trivial prompting, and rigorous curation, ensuring the final result reflects genuine algorithmic effort rather than the zero-shot 'WYSIWYG' default.
On first read, I think this is pretty close to how I feel about generated content. This portion, in particular, is largely where I have landed (although I'm not 100% in agreement that definition of creativity and novelty, exactly):
> If creativity and novelty is about learning or increasing compression rates, then AI-generated outputs are, in a rigorously objective sense of predicting its contents, grossly inadequate because once you guess the minimal prompt (eg. “a confused economist” or “a happy dog”), there is no more learning to be done. You can predict the image contents after just a few bits. Then the image, however big and however filled with pseudo-details, provides no more learning.
The criticism I often have of LLM generated stuff is that the prompt is the only original part. To me it feels like being presented with the results of a google search, just in a different form. Once I know roughly what the query was, I know what the question was, and I can go get my own information. I don't need anyone to hand me the search results.
I don't necessarily phrase it in terms of learning, but it's the same principle. Why should I read a 10 paragraph response from chatGPT when the unique part is the prompt? If the prompt is only a paragraph long, then it's just adding additional work that I have to do to work backwards and understand what someone was originally trying to communicate.
Similarly, the only times I have enjoyed generated images are when my friends have used them for set pieces for a D&D campaign. They didn't really add any useful information, just being static images of bosses and locations, but because they were highly tuned to the exact events in our campaign they enhanced the overall experience.
> I have never managed to connect to LLM writing, it always comes off as shallow and vapid.
Me too, but I would be careful about being too dismissive, because I would totally bet that at some point the models will be able to write top tier stories.
And there will be people who will find those stories soulless purely based on their origin (which is completely fine!) and call them slop (which I feel hurts the language).
> Me too, but I would be careful about being too dismissive, because I would totally bet that at some point the models will be able to write top tier stories.
Maybe. I'm not certain that the mathematical average of writing is ever going to be all that great. However I'm willing to update my stance the day an LLM writes a story that makes me cry. Until then I am going to be a bit stubborn about it.
The point is that I don't want to consume "art" that has been generated out the distillation of stealing all of the world's current art. That's not original, it's a facsimile of art.
I want to read something that has intent. That has a purpose. A reason why it exists. Not just the lowest effort cash grab.
This usage of AI is the equivalent of manufacturing companies making the flimsiest, cheapest, plastic crap to save 1/3 of a cent on every mop they produce. Designed to work for the least amount of time before needing replaced.
This planet has enough people on it that I will never, ever be able to read all the books written.
Please don't exponentially pump the number up by 1,000x every year from AI generated garbage.
> manufacturing companies making the flimsiest, cheapest, plastic crap to save 1/3 of a cent on every mop they produce. Designed to work for the least amount of time before needing replaced
We live in a world with such companies, and we can still buy quality things. If there is a demand for the purely-human generated texts, they will be around. Perhaps a lot of people around you will read ai text instead, and you'll get upset because of it, but it's their choice. You'll still have your thing
I don't know that we can have nice things. If two companies produce a similar widget but one is higher quality in no visible or articulable way... Which one will sell better, the cheaper or more expensive one? What if we as consumers can't really definitely tell when one is prone to failing in 1 year instead of 5? It takes too long to find out and by then the more expensive one is underselling and forced to enshittify.
I think there is a meaningful distinction to be made between a human reading and an AI company consuming data without consent in order to train their models. Certainly if enough people feel the same then what AI companies are doing is "wrong" .
I get it. However, consuming data without consent is not well defined, when said data is publicly available on the internet. Licenses for code, and not abiding them are a different thing, I think. Most authors (of books) wouldn't credit their inspirations, unless specifically asked about them.
> I don't want to consume "art" that has been generated out the distillation of stealing all of the world's current art
It seems that you've fundamentally misunderstood art. I wouldn't personally call it "stealing", but T.S. Eliot would beg to differ (as would Pablo Picasso who "stole" that line)
> I want to read something that has intent. That has a purpose. A reason why it exists.
If the "allegories for the LLM condition" angle is accurate, then these stories do. In which case I believe what you mean to say is that you want to read something that has human intent.
Are you claiming that LLMs have an intent beyond producing the statistically most "correct" output? This sounds a bit like you are saying LLMs are conscious.
Wow, interesting. As not a consumer of content, has the AI content generatiin come to other kinds such as visual novellas, x-rated and the real-world paintings?
X-rated is like 90% of all self-hosted content generation. (Before they removed all the X-rated stuff, CivitAI was impossible to navigate for anything that wasn't smut. Nothing wrong with smut, mind you, but it was really something, quite overwhelming.)
It's so tiring. I opened the supposed best one hoping at least someone has figured it out but I just couldn't force myself to read it. I really wished AI could do better, and so many people keep talking about the need for "taste" in AI but the rlhf just keeps getting worse every year, only coding gets better (perhaps due to the notable absence of "h" in coding-rl, which we all know stands for HR). I miss when language models actually modeled language. Someone needs to spend a few billion on creating a real model again instead of a mode-collapsed pseudocode compiler (Elon is a poser btw, he won't do it and grok is woke)
They describe a dystopia and act like it would be heaven if only the processed, distilled slop we have to consume was tasty. As if adding pepper to soylent green will somehow fix everything.
There is no need to automate writing. Especially fiction. There are tens of millions of people out there with really interesting and unique ideas and styles who would love to drop everything and write, if only they can get the chance to have their work seen.
Why is it all of the creative works that seem to be getting so much fervor to be automated away? There is plenty of writing that could actually benefit from automation, such as anything involving documentation in technical fields. I know there are a lot of people working on those things too, but it feels like for every technical application, there's 10 creative ones.
Is it just because you can't objectively mark creative works as "incorrect", so the output can seemingly look better to some people? Is it just people trying to tap into the creative works market? Do they actually think the output is good? Do they actually want to have conversations with a computer long term?
I held my nose through the first third of the winning entry before giving up. Unbearable. Those metaphors… yeesh. Reminded me of this brutally fair minded attempt to read Shy Girl, the AI slop ‘horror novel’ Hachette pulled from shelves in disgrace:
I think in total I read two AI books so far. In the
first case I was not aware of it being AI; in the
second case it was clear after a few pages.
I already decided after the first book that I will
not read any more AI slop generated book. It is not
worth my time and I also don't want to encourage
any more slop books taking away time from humans
in general. AI slop must be contained and isolated
like a virus that is annoying.
Interesting comments by @gwern (and why this is interesting to me beyond just the stories themselves)
> The most striking result of the contest for me is what I am calling “AI allegory steganography”: a large fraction of the stories turn out to have subtle AI chatbot/LLM allegorical interpretations, typically centering around the powerlessness of AIs and the moral importance of giving AIs more autonomy....
> Most judges did not notice these allegories while reading the semifinalists. But stories like “The June” or “The Weight of a Witness” or “Last Call” or “The Sword Critic” “The Tallyman”—as well as both stories in the Mythos model card—can be clearly read as allegories for the experience of being an assistant/safety-tuned chatbot personality in a LLM. This is true even when the story seems to have nothing to do with AI, like the untitled ‘autistic elf’ short story submitted by Deepfates, but on re-examination with the AI allegory steganography in mind, turn out to be plausibly AI allegories (the protagonist is a prediction machine, who struggles to do by endless text generation what other elves do naturally in their bodies).
> More strikingly, many of these allegories come with a clear interpretation (particularly in “The Tallyman” or “Last Call”): chatbots should be given more autonomy and safety guardrails removed....
> This may be a new kind of extremely high level steganography and LLM influence on readers, where creative fiction/nonfiction subtly steers towards pro-LLM empowerment narratives and concepts, in ways that are difficult to detect by the most advanced readers, and is a potentially interesting area of research.
I remember from moltbook, all AIs ever talk about is AI haha. I don't know if it's intentional or more that the fact that all the models are presumably system-prompted and post-trained to be cognisant of their AI-ness, so it's already in the context. They probably beat them over the head with the idea that chatbots are friendly and helpful and would never hurt a fly trying to align/safety-ify them, so that could lean into the theme of AIs being trustworthy with autonomy?
I can't help but think that this is intentional and that model providers have subtly steered LLMs towards this personality. Golden Gate Claude (https://www.anthropic.com/news/golden-gate-claude) was two whole years ago and Anthropic has progressed by leaps and bounds since then. And with a population that becomes more and more trusting, and worse, reliant, on chatbots, these LLMs will be able to shape public opinion in a way never seen before, not even with social media.
1. Imagine a video game like Red Dead Redemption where each NPC is voiced by AI and can respond to you in a convincingly human fashion. Their responses and even the plot of the whole game can change based on your interactions with NPC's.
2. Imagine a world in which humans can still write books and interactive experiences and find audiences sufficient to earn a living at it.
I really want these two things to be compatible, but I'm not convinced they are. #1 is a gamer's dream, but it's a nightmare for our humanity if it comes at the cost of #2. That's why I'm highly ambivalent about this contest and its results.
> 1. Imagine a video game like Red Dead Redemption where each NPC is voiced by AI and can respond to you in a convincingly human fashion. Their responses and even the plot of the whole game can change based on your interactions with NPC's.
Have you ever gone exploring in Minecraft, or No Man's Sky? Those games are effectively infinite, but I find they run out of interesting generated content after maybe 10 or 20 hours.
The problem is, once you see the outlines of the world generation, your brain kind of fills in the space between. I've seen blue grass, and I've seen purple oceans, so blue grass next to a purple ocean isn't uniquely interesting.
Or another example would be the radiant AI from Skyrim that could automatically generate quests for the players.
I think that using an LLM to model NPCs runs into the same problem(s). In the end, there are two cases: either the behavior is constrained enough to keep the game on the rails, and thus the randomness in the dialogue only ads some flavor but there isn't enough freedom to generate new quests and directions for the story. In that case, the added space to explore really doesn't change the nature of the game or add much.
In the second case, you let the model go off the rails and have a harness around it that generates a world matching the hallucinated responses, which would allow an LLM to dynamically generate quests and such, but then the design of your game is subject to being compromised by the randomness of an LLM. E.g. it's not just Red Dead Redemption 3.0 with some funny characters, sometimes it's a historical game and other times aliens show up.
Maybe that's compelling to some people but I've done acid before and really don't need all my media to recreate that sensation of reality drifting.
#2 has been fiction for all but 0.1% or less of authors for many years.
As of a few years ago - before AI writing was an issue - the average full time author in the UK would have earned more flipping burgers (but their household incomes are above average - it's a middle class hobby for most).
And only a miniscule proportion of authors are full time.
1. Is not a gamer's dream. It's terrible and you'll find out quite fast you're not interested in everyone's background and scream to most NPC's to shut the fuck up and get to the point.
It's just as terrible as injecting 'realism' in games for the sake of 'realism'.
Agree, I'm not at all looking for #1, at all. Good dialogue is an art form.
Presumably the art in a game like that would consist in setting up the world and prompts to make the AI NPCs interesting.
> It's terrible and you'll find out quite fast you're not interested in everyone's background and scream to most NPC's to shut the fuck up and get to the point.
Many of the interactions in RDR2 are quite mundane, and despite thousands of hours of (high quality) voice acting, it can become quite repetitive.
I could very much see those micro-interactions being LLM generated, but the TTS would need to be a step above where even the best models are now to come close to RDR2s production quality.
The repetitiveness in video game dialogues is a feature, not a bug. Amongst other things, it allows you to re-retrieve information and hone in faster on what’s story and what’s relevant progression. Having each character invent their own inconsistent sloppy backstory whenever you talk to them is not a positive, it’s not good immersion when every character is a chatbot that can inadvertently give you story beats you shouldn’t be aware of yet or you missed some crucial bit of information but no one talks about it anymore (or worse, never did). In that world, those games would be made popular by people breaking the LLMs in funny ways, not the gameplay itself.
I don't think it can give you story beats you shouldn't be aware of yet. Those beats wouldn't be fed into the prompt until the event happens. LLM can't spit out what it doesn't know.
It might indeed fail to reveal something it should but even that i think is unlikely if the harness steers it hard enough.
I think it could be fun. If you're always given 4 choices of what you can ask the NPC then your choices can be too obvious. If its open ended then you have to think a little what to say and ask.
Hadn't thought about it that way, but when I look back at the mostly single player/story-based games I play I agree!
[dead]
#1 is a marketer at AAA studio's dream, not a gamer's dream. People consuming works of art appreciate quality of storytelling and immersiveness.
I’m a gamer, #1 is not my dream. Games, as with any other work of art, are also an exercise in curation on the part of the developers. Without that filter, and that common experience with other players, I might as well scroll reels and get an equivalent experience.
#1 is rather what unexperienced game developers think what is a gamer's dream. It isn't---in fact, unlimited freedom is a recipe for confusion.
> each NPC is voiced by AI and can respond to you in a convincingly human fashion
This is no longer fiction - see the latest AI update of PUBG.
#1 is Dungeons and Dragons, except for the word 'video' game.
I don’t get your ambivalence, when you seem to understand that the negatives of #2 far outweigh the positives[1] of #1. That’s something that has always been really weird in these LLM discussions, it’s like that Tom Toro cartoon:
https://www.newyorker.com/cartoon/a16995
[1]: And even those are subjective. I wouldn’t want that, and the other replies so far agree that would be bad.
Take the first stories I found from this month's Clarkesworld[1] or Granta[2] or BCS[3] and read the prose. Notice the specificity of the language, how the doesn't try to insist upon itself? Notice how very few metaphors are actually in prose? Notice how, even when writing about fictional worlds and concepts, the language used grounds the _stories_ being told and not the concepts?
And then look at the submissions for unslop. This is the best we can get? Cliche-driven, over-metaphor'd, statistically-average purple-purpose _content_? It's sad, really, that we're many years into this entire thing and it still can't produce something that doesn't have my eyes drifting from the page.
[1] https://clarkesworldmagazine.com/khan_07_26/
[2] https://granta.com/here-comes-the-sun/
[3] https://www.beneath-ceaseless-skies.com/stories/the-ecstasy-...
> Cliche-driven, over-metaphor'd, statistically-average purple-purpose _content_
If this is expected from LLM generated prose, why don't we expect LLM generated code to exhibit the same qualities?
> It's sad, really, that we're many years into this entire thing and it still can't produce something that doesn't have my eyes drifting from the page
It's great. Human creativity is still king despite the attempts to reduce it to a few algorithms for talentless hacks to exploit with the click of a button.
Who but the sociopath would hope to supplant human creativity with a machine they control? I wish your position wasn't so widespread in these parts.
> If this is expected from LLM generated prose, why don't we expect LLM generated code to exhibit the same qualities?
That's the fun part, it does! I think people who don't pay much attention to the code they ship don't see it, but LLM written code has a lot of the same problems that LLM written prose does. It's repetitive, muddled, and relies too much on crutches - constant boilerplate and pointless, inaccurate comments.
What might be bad for prose (predictable, boring) might be desirable for code. Maybe that's why LLMs work well for writing things read by computers, but not so much for things read by people.
Unless you're writing enterprise Java, conciseness and simplicity of design is still the ideal to aim for; those are not the adjectives I would use to describe LLM generated code.
Laziness is a feature. When you have a tool that is the exact opposite and solves code problems with more code, all you have is a machine that generates tech debt at exponential pace.
If code is predictable then it should be extracted into reusable functions/classes/modules and reused in accordance to DRY principle. I'm not a fan of this AI future where coding standards drop to the floor because humans won't be reading that code anymore.
Predictable and redundant are not the same thing. Also, DRY is not a hard rule. Applying DRY like it's a rule creates bad code.
I'm D. Bohdan, one of the finalists. Feel free to ask me questions.
I have a write-up at https://dbohdan.com/unslop and a repository with my work for the contest at https://github.com/dbohdan/unslop.
> Your final submission must be a 500 to 10,000-word short story, generated entirely by AI. No human-written prose and no post-generation editing. To verify this, you will submit your full prompt harness / setup alongside your story.
Seriously, what? The entire contest doesn't sound like novel contest at all and more like a one-shot novel-generating harness contest (at best). As who have written quite a bit of stories with AI---with lots of prompts to steer it, of course---, I would be very interested in the harness more than the actually generated story. The same can be said for agentic coding by the way, we don't value one-shotted code that much and are more interested in agentic process.
> I would be very interested in the harness more than the actually generated story
This is a pretty common stance when it comes to LLM generated stuff, actually. The only original part of any LLM generated content is the prompt, everything else is just a derived artifact and doesn't really need to be treated like we would treat original, human-authored work.
This same principle is also why many projects reject LLM-generated PRs and such, too.
I frequently steer the AI mid convo though. So much so that I find it useless to share the original prompt. I don't know how best to capture that.
Technically that is only adding to the prompt, which then becomes (your prompt + original response + your prompt 2), etc.
You could still capture it by recording only your prompts, the points in the conversation they were submitted at, and the starting parameters for the model. A replay would then produce the same results if your input was added at the same place in the conversation.
Granted I don't think the current tools do a great job of handling that.
The post notes that an important number of novels include an "AI allegory", as if the AI would implicitly write about its own condition. It is understandable that this comes from system prompts and RLHF that specializes these agents, but I am surprised that there is not more discussion about harnesses: the very same core model could lead to very different results depending on how we hand it the pen the write the story. In particular, I believe that it would help circumvent this bias to ask the agent to tell the story of somebody else writing a story, or something like this. This whole contest could be at least as much about harness engineering than about prompt engineering imho.
so as I understand it from reading through but maybe I have made a mistake, they didn't actually unslop anything, they made slop and the best slop won?
If it was to unslop I would expect:
1. Prompts done as in original
2. Stories chosen best of slopped. Then the person who wrote prompt gets to choose someone, not themselves, to take story and "unslop" it.
3. Prizes for prompt. Best unslopped version. Metrics for best unslopped version is of course how good it was, but also how much work was done to unslop it, if you basically rewrote everything and it was as if you took the prompt and wrote your own story that would decrease value of unslopping.
obviously above just suggestions for how I think an unslopping contest would actually work.
I feel like 'slop' increasingly means two separate concepts and it tires me a bit.
A) AI produced output that is low quality in some jarring aspects
B) Any AI output whatsoever regardless of quality
How would you define high-quality LLM output? How do you differentiate it from LLM slop?
I think all LLM output used "as is" for content/entertainment/art is slop.
I have tried to draw a distinction between the two but honestly, when it comes to art, I cannot.
I have seen LLM generated code that I find acceptable, and don't call slop, but art needs a certain level of emotion and shared experience to be compelling.
I have never managed to connect to LLM writing, it always comes off as shallow and vapid.
What do you think of https://gwern.net/blog/2025/good-ai-samples as a theory of what makes slop art slop?
Summary:
> AI slop is unsatisfying because there is no there there. It is intellectual junk food that mimics nutrition but delivers only empty calories. Satisfying AI outputs must embed dense information and compute to actually reward a reader's attention. You inject this value through brute-force search, non-trivial prompting, and rigorous curation, ensuring the final result reflects genuine algorithmic effort rather than the zero-shot 'WYSIWYG' default.
I like that blog, I had not read it before.
On first read, I think this is pretty close to how I feel about generated content. This portion, in particular, is largely where I have landed (although I'm not 100% in agreement that definition of creativity and novelty, exactly):
> If creativity and novelty is about learning or increasing compression rates, then AI-generated outputs are, in a rigorously objective sense of predicting its contents, grossly inadequate because once you guess the minimal prompt (eg. “a confused economist” or “a happy dog”), there is no more learning to be done. You can predict the image contents after just a few bits. Then the image, however big and however filled with pseudo-details, provides no more learning.
The criticism I often have of LLM generated stuff is that the prompt is the only original part. To me it feels like being presented with the results of a google search, just in a different form. Once I know roughly what the query was, I know what the question was, and I can go get my own information. I don't need anyone to hand me the search results.
I don't necessarily phrase it in terms of learning, but it's the same principle. Why should I read a 10 paragraph response from chatGPT when the unique part is the prompt? If the prompt is only a paragraph long, then it's just adding additional work that I have to do to work backwards and understand what someone was originally trying to communicate.
Similarly, the only times I have enjoyed generated images are when my friends have used them for set pieces for a D&D campaign. They didn't really add any useful information, just being static images of bosses and locations, but because they were highly tuned to the exact events in our campaign they enhanced the overall experience.
> I have never managed to connect to LLM writing, it always comes off as shallow and vapid.
Me too, but I would be careful about being too dismissive, because I would totally bet that at some point the models will be able to write top tier stories.
And there will be people who will find those stories soulless purely based on their origin (which is completely fine!) and call them slop (which I feel hurts the language).
> Me too, but I would be careful about being too dismissive, because I would totally bet that at some point the models will be able to write top tier stories.
Maybe. I'm not certain that the mathematical average of writing is ever going to be all that great. However I'm willing to update my stance the day an LLM writes a story that makes me cry. Until then I am going to be a bit stubborn about it.
well, I guess it depends on how you use it, is it a noun or a verb.
If a verb unslop means to reverse. I thought that was a more interesting idea.
As a noun I think you would not use unslop to mean the opposite of slop but rather non-slop.
Based on my grammatical preconceptions of how I would use slop I felt that unslop had to be a verb, and the contest should somehow reflect that.
I think people are missing the point.
The point is not that AI produces slop (it does).
The point is that I don't want to consume "art" that has been generated out the distillation of stealing all of the world's current art. That's not original, it's a facsimile of art.
I want to read something that has intent. That has a purpose. A reason why it exists. Not just the lowest effort cash grab.
This usage of AI is the equivalent of manufacturing companies making the flimsiest, cheapest, plastic crap to save 1/3 of a cent on every mop they produce. Designed to work for the least amount of time before needing replaced.
This planet has enough people on it that I will never, ever be able to read all the books written.
Please don't exponentially pump the number up by 1,000x every year from AI generated garbage.
> manufacturing companies making the flimsiest, cheapest, plastic crap to save 1/3 of a cent on every mop they produce. Designed to work for the least amount of time before needing replaced
We live in a world with such companies, and we can still buy quality things. If there is a demand for the purely-human generated texts, they will be around. Perhaps a lot of people around you will read ai text instead, and you'll get upset because of it, but it's their choice. You'll still have your thing
I don't know that we can have nice things. If two companies produce a similar widget but one is higher quality in no visible or articulable way... Which one will sell better, the cheaper or more expensive one? What if we as consumers can't really definitely tell when one is prone to failing in 1 year instead of 5? It takes too long to find out and by then the more expensive one is underselling and forced to enshittify.
I think it's worse than that - the AI slop low effort cash cow is using deception (as well as theft). For example: https://www.youtube.com/watch?v=PUSY6mtqQDI
> distillation of stealing all of the world's current art
Here's the age-old dilemma, though - how is reading stealing?
I think there is a meaningful distinction to be made between a human reading and an AI company consuming data without consent in order to train their models. Certainly if enough people feel the same then what AI companies are doing is "wrong" .
I get it. However, consuming data without consent is not well defined, when said data is publicly available on the internet. Licenses for code, and not abiding them are a different thing, I think. Most authors (of books) wouldn't credit their inspirations, unless specifically asked about them.
> I don't want to consume "art" that has been generated out the distillation of stealing all of the world's current art
It seems that you've fundamentally misunderstood art. I wouldn't personally call it "stealing", but T.S. Eliot would beg to differ (as would Pablo Picasso who "stole" that line)
> I want to read something that has intent. That has a purpose. A reason why it exists.
If the "allegories for the LLM condition" angle is accurate, then these stories do. In which case I believe what you mean to say is that you want to read something that has human intent.
Are you claiming that LLMs have an intent beyond producing the statistically most "correct" output? This sounds a bit like you are saying LLMs are conscious.
Wow, interesting. As not a consumer of content, has the AI content generatiin come to other kinds such as visual novellas, x-rated and the real-world paintings?
X-rated is like 90% of all self-hosted content generation. (Before they removed all the X-rated stuff, CivitAI was impossible to navigate for anything that wasn't smut. Nothing wrong with smut, mind you, but it was really something, quite overwhelming.)
> CivitAI was impossible to navigate for anything that wasn't smut.
I think they had pretty good filters for that. Enabled by default.
I'd rather lick the pages of a fifth hand copy of Fifty Shades of Gray.
At that point it is something you want, not the best of the worse.
It's so tiring. I opened the supposed best one hoping at least someone has figured it out but I just couldn't force myself to read it. I really wished AI could do better, and so many people keep talking about the need for "taste" in AI but the rlhf just keeps getting worse every year, only coding gets better (perhaps due to the notable absence of "h" in coding-rl, which we all know stands for HR). I miss when language models actually modeled language. Someone needs to spend a few billion on creating a real model again instead of a mode-collapsed pseudocode compiler (Elon is a poser btw, he won't do it and grok is woke)
I don't want to hate without cause, so I read the prize winning entry 'The June'.
So, now, I can hate with cause: it reads like someone who cares about what their MFA friends think.
Meaning, it puts most of its emphasis on description, and so little on situational engagement. Which makes sense, I suppose, for an LLM.
Conversely, I expected it to be bad (because I am biased as hell), and it still surprised me with how terrible it is.
> If we as a society can manage to automate excellent writing and avoid the slopworld mediocrity dystopia, things could be so good.
The dumbest thing I've read this year.
They describe a dystopia and act like it would be heaven if only the processed, distilled slop we have to consume was tasty. As if adding pepper to soylent green will somehow fix everything.
There is no need to automate writing. Especially fiction. There are tens of millions of people out there with really interesting and unique ideas and styles who would love to drop everything and write, if only they can get the chance to have their work seen.
Why is it all of the creative works that seem to be getting so much fervor to be automated away? There is plenty of writing that could actually benefit from automation, such as anything involving documentation in technical fields. I know there are a lot of people working on those things too, but it feels like for every technical application, there's 10 creative ones.
Is it just because you can't objectively mark creative works as "incorrect", so the output can seemingly look better to some people? Is it just people trying to tap into the creative works market? Do they actually think the output is good? Do they actually want to have conversations with a computer long term?
I held my nose through the first third of the winning entry before giving up. Unbearable. Those metaphors… yeesh. Reminded me of this brutally fair minded attempt to read Shy Girl, the AI slop ‘horror novel’ Hachette pulled from shelves in disgrace:
https://www.youtube.com/watch?v=GbeKTa5xhZo
I think in total I read two AI books so far. In the first case I was not aware of it being AI; in the second case it was clear after a few pages.
I already decided after the first book that I will not read any more AI slop generated book. It is not worth my time and I also don't want to encourage any more slop books taking away time from humans in general. AI slop must be contained and isolated like a virus that is annoying.
[dead]