"Agent Readiness" will likely age as well as "Web 4.0 Blockchain Integration" has.
(To be entirely clear, not because agents won't be a relevant thing, although certainly I have my doubts, but because I believe even if they are a relevant thing, requiring special allowances from sites undermines the whole point, and such things will only end up used by bad actors to mismatch what agents see to what humans see, and so will be intentionally ignored.)
With how bloated and ad-ridden websites have become, I'd love the pure text version for us humans - let the agents deal with stuff intended for us. But I also have my doubts we'll see that.
Regarding the bad actors point, that's been possible for a long time - e.g. serving up different content for search engine crawlers than the user sees when they click through. If I remember correctly, there was a time Google penalised sites that did this.
Big fan of reader mode. For me, a direction better than llms.txt would be to encourage sites to improve their markup (think semantic web era) so agents could get the text version from that the way reader mode does. Would achieve the same thing - save tokens.
This isn't difficult and I think the reason it hasn't been done is that publishers want clicks and ad views. Which begs the question: why would they start doing it for agents?
Yeah, the entire suite of proposed "standards" catering to agents looks like a temporary measure to duct-tape over the limitations and token costs of today's agents. They'll churn as quickly as Anthropic, Google, OpenAI et al. can release new versions of their frontier models.
> Yeah, the entire suite of proposed "standards" catering to agents looks like a temporary measure to duct-tape over the limitations and token costs of today's agents.
That's fine. We need a fix for today's problems today.
This would be a really great resource website in 2016.
But right now, when AI can just spit out everything you have on website faster and in a more personalized way then i dont think that people would wanna use this much.
Hmm wondering how common some of these are ... I'd love /.well-known/change-password but it looks like https://news.ycombinator.com/.well-known/change-password and google.com/.well-known/change-password don't seem to be implemented?
What a great resource. As someone who’s been making websites for 30 years, it’s amazing to still be picking up some of the basics. Though to be fair many of these didn’t exist back then.
I’ll be using this to add some extra tags to my pages.
It looks like there are some features noted as “required” that are actually required by the spec (e.g. a title tag), and others that are required by opinion (e.g. https) so there’s an element^ of pragmatic best practice being recommended.
I find it curious that setting a colour hint for the browser is recommended. I’m one for letting the browser look as vanilla as possible and letting my pages do the talking.
This looks like slop from a slop factory. "SEO", "Agent-readiness". That's precisely what a good website doesn't do (to paraphrase the homepage).
Oh yes, it's produced by a Wordpress "SEO" expert and private investor using Claude LLM. What a surprise. A man who built a fortune destroying the internet we loved with advertisement slop now working on destroying whatever's left with LLM slop.
Yeah, mostly slop. I wonder why the slop slingers never disable Claude's self-attribution, and are too lazy to commit themselves, are they proud that they're delegating everything to a slop machine?
Having such a list is great. I am all for such lists.
BUT
Some people memorize these things. Take them too seriously. You are thought stupid if you don't know them. Somewhere someone then makes a story on Jira to verify that your product does all of these things and you have to convince them that we are fine without them or we don't need all of them etc.
I haven't seen this much bullshit in a long time. Can we just run a webserver, write the html and whatnot and call it a day? It's not like a webdev didn't have anything to do already.
llms.txt is supported by 0 of the relevant ai providers and must be seen as harmful
.. as the webmaster implemented something that they might thought has an impact (false sense of impact), but has zero
so net gain negative
i consider such lists harmful - a good website is one that supports the goal of the website providers and its desired users (some of these users might be bots)
a bad website is a website that does everything for everyone just because
"Agent Readiness" will likely age as well as "Web 4.0 Blockchain Integration" has.
(To be entirely clear, not because agents won't be a relevant thing, although certainly I have my doubts, but because I believe even if they are a relevant thing, requiring special allowances from sites undermines the whole point, and such things will only end up used by bad actors to mismatch what agents see to what humans see, and so will be intentionally ignored.)
With how bloated and ad-ridden websites have become, I'd love the pure text version for us humans - let the agents deal with stuff intended for us. But I also have my doubts we'll see that.
Regarding the bad actors point, that's been possible for a long time - e.g. serving up different content for search engine crawlers than the user sees when they click through. If I remember correctly, there was a time Google penalised sites that did this.
This is what reader mode is. It exists purely because most websites are unreadable.
Big fan of reader mode. For me, a direction better than llms.txt would be to encourage sites to improve their markup (think semantic web era) so agents could get the text version from that the way reader mode does. Would achieve the same thing - save tokens.
This isn't difficult and I think the reason it hasn't been done is that publishers want clicks and ad views. Which begs the question: why would they start doing it for agents?
Agents don't buy stuff they see in an ad
Yeah, the entire suite of proposed "standards" catering to agents looks like a temporary measure to duct-tape over the limitations and token costs of today's agents. They'll churn as quickly as Anthropic, Google, OpenAI et al. can release new versions of their frontier models.
> Yeah, the entire suite of proposed "standards" catering to agents looks like a temporary measure to duct-tape over the limitations and token costs of today's agents.
That's fine. We need a fix for today's problems today.
True, that's fine. As long as people don't elevate these transient "standards" to the same level as something like basic security and accessibility.
This would be a really great resource website in 2016.
But right now, when AI can just spit out everything you have on website faster and in a more personalized way then i dont think that people would wanna use this much.
Just my perspective, dont wanna be rude
Some of this is pretty good stuff, but I hope standardizing on a 128 item checklist doesn't discourage people from making websites
Hmm wondering how common some of these are ... I'd love /.well-known/change-password but it looks like https://news.ycombinator.com/.well-known/change-password and google.com/.well-known/change-password don't seem to be implemented?
security.txt is always under this folder for sites if it exists, it's also used by letsencrypt for certs or renewals fail
What a great resource. As someone who’s been making websites for 30 years, it’s amazing to still be picking up some of the basics. Though to be fair many of these didn’t exist back then.
I’ll be using this to add some extra tags to my pages.
It looks like there are some features noted as “required” that are actually required by the spec (e.g. a title tag), and others that are required by opinion (e.g. https) so there’s an element^ of pragmatic best practice being recommended.
I find it curious that setting a colour hint for the browser is recommended. I’m one for letting the browser look as vanilla as possible and letting my pages do the talking.
^Pun not intended, blink and you’ll miss it
This looks like slop from a slop factory. "SEO", "Agent-readiness". That's precisely what a good website doesn't do (to paraphrase the homepage).
Oh yes, it's produced by a Wordpress "SEO" expert and private investor using Claude LLM. What a surprise. A man who built a fortune destroying the internet we loved with advertisement slop now working on destroying whatever's left with LLM slop.
It triggers slop flags for me too.
1 - The little color tags : required, optional, recommended.
2 - The insane amount of content no one is ever going to read
3 - the weak premise for an idea carried out to excruciating detail
This seems good especially as beginner still face deep in the weeds of just the pure introductory functional concepts
I heavily assume this is at least partially AI generated... but I have to admit, this is actually useful (aka, human driven). Nice work.
Let’s look at the Git history: https://github.com/jdevalk/specification.website/commits/mai...
Yeah, mostly slop. I wonder why the slop slingers never disable Claude's self-attribution, and are too lazy to commit themselves, are they proud that they're delegating everything to a slop machine?
.well-known/security is listed as a prominent example, but is not in the well-known category.
Useful reference https://securitytxt.org/
Though some sites drop it at the root /security.txt instead of /.well-known/security.txt
Note, invites beg bounties spam.
It's in the "Security" category. I guess whatever categorization scheme they're using doesn't allow assigning multiple categories per item.
This is pretty cool, didnt even know of half the options under well-known urls. Thanks!
Having such a list is great. I am all for such lists.
BUT
Some people memorize these things. Take them too seriously. You are thought stupid if you don't know them. Somewhere someone then makes a story on Jira to verify that your product does all of these things and you have to convince them that we are fine without them or we don't need all of them etc.
Looks interesting, can you convert it to a skill with bunch of scripts to validate those guidelines and use it to build the websites?
Maybe these are what you mean?
https://github.com/jdevalk/specification.website/blob/main/p...
https://github.com/jdevalk/specification.website/blob/main/m...
See also: https://www.iana.org/assignments/well-known-uris/well-known-...
I haven't seen this much bullshit in a long time. Can we just run a webserver, write the html and whatnot and call it a day? It's not like a webdev didn't have anything to do already.
llms.txt is supported by 0 of the relevant ai providers and must be seen as harmful
.. as the webmaster implemented something that they might thought has an impact (false sense of impact), but has zero
so net gain negative
i consider such lists harmful - a good website is one that supports the goal of the website providers and its desired users (some of these users might be bots)
a bad website is a website that does everything for everyone just because
"The Unreasonable Effectiveness of Checklists" (https://rs.io/unreasonable-effectiveness-of-checklists/) comes to mind.
When I was younger I would have though the same. Now that I have more humility and less working memory, I think differently.
but in a checklist you include what actually you need to check, not everything and especially not stuff that is harmful l and/or has negative gain
Great!