I'll share the first-hand account I recently got from someone else.
> We've used it at work
> it is... not as hype as everyone is concerned about
> I'd argue the framework around it for security scanning is the more useful side of the tool; it definitely doesn't take a huge model to get all the issues it flagged on our systems
> For us, it absolutely flooded us with noise
> I mean hundreds if not thousands of false positives or minor issues or not applicable
> For every one reasonable issue
> The biggest issue it created was the execs treated every issue it produced like it was a drop everything and fix the issue type deal
> I'm talking company-wide drop all things "we need to patch nginx because this module that no one uses and is disabled by default has this RCE vulnerability"
> Or "all EC2 AMIs need to be upgraded because it flagged a version-specific Docker vulnerability"; it flagged every single machine with Docker, regardless of whether the actual vulnerability was relevant
> The vulnerability was with a very specific auth plugin configuration you could enable with Docker, and specifically the Mosley Docker-compatible tool, but it is clear it only knew there was a vulnerability in Docker, not whether it was applicable
> Meanwhile dirtyfrag and friends: not a single peep from it, btw, despite them allowing for container escape
> Idk, I was underwhelmed with the quality of the reporting it gave really. If the company allowed me to get information about all the infrastructure in our entire organisation to run Claude over it repeatedly looking for recent CVEs I'm sure I could produce the same results...
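To make the applicability point concrete, here is a minimal sketch of the kind of filter the account says was missing: keep a finding only if the host actually has the affected feature enabled. Everything in it is hypothetical (the `Finding` and `Host` shapes, the host names, the `authz-plugin` feature label); it is not how Mythos or any real scanner works, just an illustration of "is in Docker" versus "is reachable here".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Finding:
    host: str
    package: str
    # Feature that must be enabled for the bug to be reachable (hypothetical field).
    requires_feature: str


@dataclass
class Host:
    name: str
    # Package name -> features actually enabled on this host (hypothetical inventory).
    enabled: dict[str, set[str]] = field(default_factory=dict)


def is_applicable(finding: Finding, inventory: dict[str, Host]) -> bool:
    host = inventory.get(finding.host)
    if host is None:
        return True  # Unknown host: keep the finding for manual review.
    features = host.enabled.get(finding.package)
    if features is None:
        return False  # Package is not installed on this host.
    return finding.requires_feature in features


def triage(findings: list[Finding], inventory: dict[str, Host]) -> list[Finding]:
    return [f for f in findings if is_applicable(f, inventory)]


# Two hosts run Docker, but only one enables the (made-up) vulnerable feature.
inventory = {
    "build-01": Host("build-01", {"docker": {"authz-plugin"}}),
    "web-01": Host("web-01", {"docker": set()}),
}
findings = [
    Finding("build-01", "docker", "authz-plugin"),
    Finding("web-01", "docker", "authz-plugin"),
]
actionable = triage(findings, inventory)
print([f.host for f in actionable])
```

The hard part in practice is the inventory, not the filter: someone has to know which features are enabled where, which is the same access the commenter says they would need to reproduce the results themselves.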
In other words it is equivalent to spending a million dollars on an audit by a software security consulting company
Or to RedHat for rewriting Python core 500 times.
The "humans do it too" argument gets tiresome. Even if the consulting company fails, the money goes back to employees and back into the real economy. Now it goes to Don Amodei.
The consulting company could be local, which provides a higher degree of confidence, though not proof, that no data is exfiltrated to the US.
And so on.
It seems like there is a genuine communication breakdown between management and engineering. Engineers know that there are vulnerabilities all over the place, that there have been for ages, and that, where the rubber hits the road, not every vulnerability represents a successful exploit by some nefarious actor.
Management can often treat cybersecurity like a black box that represents millions upon millions in liability. If Mythos represents an opportunity to bring management's understanding of the amount of "security vulnerability debt" everyone carries into the real world, it might be a good thing.
The problem is that it won't bring understanding, though. You get situations like the parent, where the execs don't have the knowledge, time, or care to learn beyond "vulnerability bad, must patch now".
Execs/Management types getting extra visibility into the technical side, in my experience, has only ever resulted in additional but meaningless work, like just checking boxes on a compliance/audit checklist without actually considering the impacts of those changes, or whether a company is actually vulnerable to the disclosed CVE.
It's along the same lines of the BS I deal with day to day from upper management arguing back with "But ChatGPT said..." meanwhile pasting some hallucinated crap that doesn't even apply to our environment.
LLMs are basically a Dunning-Kruger machine for management. Engineering is best left alone and trusted to do what they are being paid to do.
I think Opus 4.6 and Mythos are key points, overall and marketing-wise, because they told the world that LLMs are now a critical/useful tool for security audits.
It aligns with the significant jump in helpfulness in CTF.
But I think it's good to hear that it's not that crazy good. Everything slowing it down is good.
I'm pretty impressed with regular Claude Code with Opus 4.7/4.8 in finding vulnerabilities in our code. Maybe 70% are false positives though. It's a lot of work to manually push back on the findings. Still worth it.
It's similar with performance optimizations.
One example was Claude thinking we could optimize converting vector tiles to raster by operating in float32 rather than float64. It turned out the library we have to use casts to float64 anyway, so the work of casting to 32 then to 64 rather than staying at 64 actually slowed the path down by 12%.
Yet it also finds the odd thing that isn't very intuitive but leads to marked improvements I never would have uncovered because... Well, as a human with only 24 hours in a day, there's no way I'll turn over every leaf and find these items on my own.
I'm totally fine with the false positives because they're so easy to verify.
Not so sure I would want a company that does not see any issues with mass surveillance of my country [1] to have access to critical infrastructure or its source code where I live.
[1] https://www.anthropic.com/news/statement-department-of-war :
> But using these systems for mass domestic surveillance is incompatible with democratic values.
Is this just one giant marketing ploy?
There's a lot of speculation that it is indeed a marketing ploy and the model is just a step improvement over current capabilities... and the real reason they aren't releasing the model is they are compute constrained and cannot serve it. To my knowledge there's no proof of this, but given the fact that literally 60 days ago they made Mythos out to be the end of the world and last Friday they announced that they will release the model in a few weeks, I feel like it was indeed something along those lines.
Their IPO is coming up soon. It would be interesting if Mythos remained mythical right up until then, wouldn't it?
Or just control of supply and demand. If they can charge twice as much serving half as many customers, that leaves a lot of potential future customers left over.
The week before they released Mythos to governments they had all their source code stolen. It's all about improving their image and creating propaganda.
It wasn't "all their source code", it was the source code to Claude Code: not really any of their internal secret sauce, at least directly.
Got to say, Anthropic have a hell of a marketing team.
[dupe] Discussion on source: https://news.ycombinator.com/item?id=48369863
I don't get how this is even on the front page of HN.
In the meantime, not everyone with actual access to the model is all that impressed.
https://cyberplace.social/@GossiTheDog/116679693992983945
“Cybersecurity weather person and award winning shitposter.” Why are they someone whose opinion we should pay attention to?
That's someone who is confident enough to have an evidently successful enough career to be able to access Mythos in its currently-limited rollout and yet not take themselves terribly seriously online.
Realistically their opinion deserves to hold more weight than the median HN comment.
I dunno, I trust the engineers working on Firefox or the Linux kernel more than some random pseudo-anonymous Mastodon account -
https://arstechnica.com/information-technology/2026/05/mozil...
https://www.theregister.com/software/2026/03/26/linux-kernel...
I definitely don't.