More signal that the open-weight models should be our destiny as an industry.
These proprietary models are being used to usher in more surveillance and gatekeeping across the industry.
I have a home server that runs Qwen3.6-35B-A3B through llama.cpp with Open WebUI for the user facing interface.
My teen isn't super interested in AI, but whenever they do feel curious they have their own account they can use on our home network. As far as chatting goes local models are more than capable for handling standard chat questions, doing research, helping troubleshoot problems etc. In fact it was an agent powered by the same model that setup the open webui server and took care of all the account management features through my phone (using Hermes agent).
If you're building AI powered features and using sophisticated agent setups for coding for work, then it make sense to use SoTA from these providers. But I've been using local models increasingly for personal use and am starting to find them preferable (I run an uncensored, ephemeral model for my own use and it's an entirely different experience than anything you can pay for).
Still haven't cancelled my personal Anthropic subscription, but considering it soon.
I guess "starting to find them preferable" suggests to me you think they work better, but this is surprising to me so I think I may have misunderstood, so I ask!
Like you're saying they work better than the proprietary models (in what ways?), or you find them mostly good enough and prefer the privacy or cost, or what?
There are a couple of things, but basically it boils down to the same reason people prefer Linux to Windows/MacOs: customization, control and privacy (arguably all of these are really subsets of 'control').
Having full control over how your data is retained, what the system prompt is, which version of the model you're running, etc leads to much a more consistent experience. For example, for chat sessions, I can't stand the new "let me push back" version of Claude. For my home models I never have to worry about that.
There's never a mystery as to whether the model secretly degraded performance, I always know exactly which model I'm using and how well it's utilizing resources etc. Open models also give you full visibility into the reasoning steps, so you never have to guess what the model is thinking.
Then when you start getting into things like uncensored/abliterated models we're talking about something you can't even pay for. In case you're unfamiliar, even open local models have guardrails built in. But people in the community have found ways to remove these. One of the things I've found most concerning about AI, which is under discussed, is the combination of people having personal chats with an agent that both monitors the conversation and refuses to discuss certain topics. This leads to a very deep level of self-censoring I find dystopian.
I also have multiple hermes agents setup, some with local backends other with open but non-local backends (e.g. Kimi through the API). For some tasks, I've just started to find the local agent tends to work better for the type of tasks I want (maybe it just over thinks less?). I don't use it for coding so much as research tasks and sysadmin stuff, but I've been really happy with the results.
Oh, and let's not forget, especially running on a Mac, these local models are basically free to run.
From a privacy perspective, your objective is to stay away from people who have interest to snoop on your conversations.
So from the perspective of your teen, they would benefit from using z.ai or ChatGPT or Claude, etc, rather than the local server where you can see all the conversations.
>From a privacy perspective, your objective is to stay away from people who have interest to snoop on your conversations.
>So from the perspective of your teen, they would benefit from using z.ai or ChatGPT or Claude, etc, rather than the local server where you can see all the conversations.
That is bonkers. If I were a parent, I would hope my child would trust me more than systems monitored by FBI/NSA/etc. Like, what sort of sick relationship do you have to have with your own family to trust them less than strangers who would sell you into prison slavery for a buck.
Private conversations of a teen have low value for FBI/NSA. They have infinite value to their parents.
The state isn't going to ground them, shame them at dinner, out them, or pull them out of a relationship, punish them.
Parents reading your browsing history and private conversations when you are 14-18 years old (the age of teenagers) is very very creepy, unless there is a specific danger to avoid. It's like if you read their private journal.
Adolescents need a private inner world to form an identity, and heavy parental intrusion ("psychological control") is the real distrust. Trust them, they are people, not possessions.
You can guide them, but do not store their private messages locally under your control using the excuse of protecting them from NSA.
If they trust you, they will tend to tell you upfront the things they have questions about, there is really no need to spy on their thoughts.
I don't have the number around but there is a notable latency for pre-fill on the M3, but once it's running the delay is negligible.
The RTX, unsurprisingly, is all around superior performance wise, but: I use that computer for gaming and image gen work so I can't dedicate it as a server, and, especially when it's warmer, the heat generated under heavy loads is noticable.
Wasn't the parent post referring to 'legitimate' demands? I often use them to get a broad overview of a technical field before reading human stuff on it, and it might be me but those clankers tend to spend half their reasoning on whether they are allowed to reply to my request. Censorship is an annoying waste of capacity for certain use cases, although it certainly has its boons when shipping commercial models.
They are not going to let open weights models with zero restrictions exist dude. They will be regulated like guns, or probably closer to nerve gas or enriched uranium.
The government is not going to enforce this, the game theory does not work in their favor.
The SCOTUS has made it exceptionally clear mathematics and software are protected by the First Amendment. The Atomic Energy Act of 1954 tries to make a very narrow exception for nuclear weapons, but
1. The law has never been challenged in court for being unconstitutional, and
2. It doesn't apply to model weights
Any attempt by the government to suppress open models will meet legal challenges on the grounds of (1) or (2).
Congress could amend the act to include model weights, but that won't prevent legal challenges on the grounds of it being unconstitutional (which it is).
I don't know that I want to stop such a thing. It's good that nerve gas is banned. I don't want random people having access to easy-to-follow instructions to make COVID-29.
Because (collective) we don't own the tech. Frontier models are proprietary, their reasoning logic is hidden, and as seen with Fable the government giveth and taketh away on a whim.
Capabilities can be gated behind certification programs, or by money, or any other numerous corrupt and non-corrupt means. Model capabilities can be segregated by pricing tiers, creating an economic underclass that cannot afford access to frontier intelligence.
For humanity to benefit, the tech needs to be open and equally available to all.
I agree with this. Computing as a field is the way it is because there is a low barrier to entry. My dad gave me a Tandy 1000 and some programming books, and now I have a very lucrative career. I never took any classes. I never had to beg anyone for permission. I could just get started making things with the minimal investment of a cheap personal computer. (And eventually, an Internet connection. Working with other people is fun!)
In a world where everyone is a Claude controller (something I honestly enjoy!), that goes away. I use hundreds of dollars of tokens a month. Suddenly, the kid in her basement with an unloved computer can't get in on the ground floor. You have to be rich to even get started. That worries me deeply. It's a big change for our field, and I don't think it's a good one.
Did your dad give you a Tandy 1000 or a Cray X-MP/48? Do you really think you need the most top-of-the-line model to learn anything, or will a locally run gemma4 (or whatever it turns into) still get you going just the same as when you were a child?
Your "concentration of power" is just two labs making models that most people prefer the last couple of months. Neither has more access to capital and resources than Google, more ability to pivot quickly than Xai, more access to labor than all of the Chinese labs, etc. How do you keep from a "concentration of power" without just forcing subsets of the population to use a known lesser model, or purposely kneecapping Research and Development at the labs that currently have the best models?
Do you hate all lessons from humanity's past or just the most important ones? If it takes work from a specific subset of the population and isn't compensated, then my friend, what you advocate for is slavery...
None of them were compelled, and nobody is stopping you from running your own LLM generously provided by others. Doesn't mean when linux came out people nationalized Apple and Microsoft.
The risk I'm talking about isn't nationalization of companies, its corporate monopolization of frontier intelligence capabilities through capital consolidation and regulatory capture.
"Just run your own LLM" ignores the asymmetry of frontier intelligence. You can build an operating system in your garage with just time and cheap hardware. You cannot go build GPT-5. And that's the problem with keeping it proprietary. If the primary cognitive engines of human progress are consolidated within just a handful of closed, proprietary cartels that can gate, alter, and revoke capabilities at will it creates a permanent economic underclass.
The foundational infrastructure of our collective future shouldn't be entirely walled off. Fair compensation for a commercial product doesn't mean monopolization of foundational capabilities.
One is the potential for skill rot where AI grows a heavy dependence in new employees and once the real price per token cost is settled on and discoverable (post massive IPOs and probably a while post - not immediately after) we, as a society, are left with a bunch of people dependent on a deeply inefficient technology to maintain software we now view as vital that might severely impede our ability to actually deal with climate change (press X to doubt Bezos).
The second is that the psychological damage of interacting with models in a social context during your formative years is deeply damaging and we've essentially destroyed the ability for a generation or two to actually interact as productive members of society.
Addressing the second issue doesn't necessarily exclude our ability to leverage models for business productivity but it seems unlikely to happen in the current climate without that also happening. I am hesitant to believe in a sudden outbreak of common sense at this point. The first point, could really be a systems collapse trigger - we can argue about the likelihood but denying it as a possibility is excessively naive.
Both seem to just point at the WALL-E outcome, summarized as humans outsourcing too much thinking. I just don't see that as an end- just another divide between people. I'm seeing some degradation for sure, but not really an "end".
there are claims that llms might be taxing on the planet to run BUT that they will solve [some, all] problems including climate change and therefore be beneficial in the long run.
I agree with the skill drain argument but also think its a little too dramatic. Most people still can do the shit claude does for them, it just takes them 10x as long.
But "some assholes" is an extremely large, growing group of people. Do you have any idea how much more productive small business owners are now? It's an insane boost for people who didn't want to spend their time on things that are extremely critical for business but not the focus of the business.
And people loved "free next day delivery" from Amazon, when it started. It's not quite the same level of service anymore, and membership has gone up in price.
Would these businesses pay 2x? 5x? 10x? What is their breaking point? I'm sure xAI/OpenAI/whoever will find it and charge 0.9x that (eventually). Just look at telecoms / internet access and their rubbish "network congestion" claims to keep raising prices.
I still get a lot of free next day, and now sometimes even same day, delivery for amazon. I doubt the membership prices has even matched inflation, but it is certainly well worth it. I can't see any governmental or volunteer organization that would produce even slightly comparable results with the same budget.
How can it end well, when it's mostly owned / controlled by narcistic billionaires who would love to eradicate anyone who so much as looks at them sideways? And who view "mass population reduction" and "I'll get to be a king in my castle, served by peons who depend on my favor to live" as the most desirable outcome of AGI?!?
If even one of these had pledge that all profit goes to end world hunger, cancer research, etc, I could possibly see it - but they haven't. They're all after finding a way to be the biggest, richest asshole possible with the ability to crush anyone in their way..
Have you isolated yourself completely from reality? I don't even know where to begin on this. Let's start with the fact that China is pumping out some near-frontier models and open sourcing the weights- and they don't even follow capitalism and the owners aren't billionaires. Really there are like four models in the USA that are "owners/controllers", and only one is even slightly controllable by its CEO, though none of the frontier models can last a week without the support of entire teams.
Why on earth would you want to siphon off the proceeds of AI development to (ok my bias is strong here- mostly corrupt) "ideals" like world hunger and cancer research (that probably get more dollars annually than the sum of actual profit any of these companies will ever get). That would just instantly kill the ability to improve AI at all, and the world could possibly be better for a few months?
They can't prevent the innovation, competition and engineering, but their lobbying makes sure that the Chinese competition doesn't enter the market, and if it does, with severe obstacles on the way.
Their biggest customer is the US federal government, taken in aggregate across agencies, IBM is one of the largest federal IT contractors, and deep public-sector and financial-services contracts in the US make it IBM's single largest national market. No individual commercial company comes close to the government's aggregate spend.
Now, equivalent product, another company, they want to sell to the government twice cheaper, can they ? nope, it will be IBM winning.
Furthermore, according to the lobbyists, China = evil but they forget that a lot of software contains Chinese code.
i’d really love to be wrong, i don't think that the economics of it would let it happen.
the potential of wealth creation with AI is so high, and also the fact that research, pre-training and inference is so expensive that, that any open-AI would eventually become OpenAI.
There is an understandable gap between the capabilities of closed models and those of open models.
The current difference is primarily expressed in the cost of hardware necessary to sufficiently run a exactly comparable model.
A single higher end graphics card running on your average gaming computer, is capable of running small to medium models that compare with those of their lab-born counterparts in the small-medium range. But the heavyweight models are still outside the realm of possibility for all but the most well-funded individual.
However, I would highly suggest more people experiment with these smaller models. They are incredibly capable in many ways that many people dont realize.
The perceived capabilities of the larger models are also much less the result of the model having more parameters/training cycles, but rather that they are being run through well-made harnesses, something which the open-source community is rapidly approaching with near-peer solutions of their own.
In short, much of the gap between between open-weight models and the larger proprietary models can be considered more of an issue of perception and not an issue of capability. There is a fundamental gap economically, but not so much in capability.
The open source community is rapidly closing the gap on these larger labs, especially thanks to the amazing research being freely given openly by well funded chinese labs.
> I mean from a financial and sustainability standpoint, assuming they’re equally powerful as their proprietary counterparts.
Presently they trail SOTA by about 6-12 months, not on par (average across everything they do).
DeepSeek V4 Pro with Max reasoning is very affordable even if you pay per-token, this month I pushed about 486 million tokens through it (I will admit that >95% was cache hits, for agentic development pretty typical) and it cost me about 8 USD in total. Meanwhile with Opus or even Sonnet if I had to pay API prices, I would be a more sad camper. That model makes a lot of stupid things though, so not ideal.
Meanwhile GLM-5.2 that came out is also quote capable and is near Opus in many tasks, all while their coding plan is more cost effective than Anthropic's: https://z.ai/subscribe
I will still stick with Anthropic but consider downgrading from Max 5x to Pro which will change the monthly expenses from around 108 EUR down to <20 EUR (they have a discount too if you pay for a year up front), and probably get the yearly GLM Pro plan which should decrease my yearly expenses from around 1300 EUR total to about 750 total EUR while still giving me a fairly decent setup.
For the consumer, that is doable and practical.
For the people actually running these models, who knows - at least DeepSeek and others are trying to make the models more efficient so the numbers are more feasible.
Also have run Qwen3.6 35B A3B on prem and it kinda sucks. Way better than models that size a year ago, but still lags behind Sonnet and also DeepSeek V4 Flash due to the size limits. Plus to even run myself I'd need a pretty beefy setup, most likely a pair of Intel Arc Pro B70s with 32 GB of VRAM each that I could still run off of my PSU but the actual model output would be kinda bullshit and I'd have to spend an unpleasant amount of time fixing it.
Sort of. A full trillion-parameter model needs about $300k of server hardware to run in and a lot of electricity, making it feasible only for very wealthy individuals, but quite practical for businesses and institutions above a certain size...although they in turn would typically gatekeep access.
You can drastically reduce the requirements by running models at a lower bitrate, which somewhat reduces accuracy but not that much - think of the difference between an MP3 vs uncompressed audio. With this and other tricks, you can get high end models down to a size where they can be run on a high spec desktop workstation affordable by an individual or small business.
Obviously I'm heavily oversimplifying here. I think a useful parallel is to consider situations from the past where you would once have required corporate budgets equivalent to the price of a house to run a large database, but over time it became accessible to anyone with the requisite expertise and relatively affordable hardware.
You can run a trillion parameter model with decent quality for far less than $300k. A cluster of 4 AMD AI Max 395+ boards with 128GB unified memory each can be had for around $15k. That would run the 4-bit quant of a trillion param model well enough for personal use. At full use the cluster would only be consuming around 400-500W of power too. That's about the same as one high end graphics card.
That's still a lot of money, but most people don't really need a trillion parameter model. If privacy is more valuable than the frontier capabilities then they could almost certainly get by with much less.
See my comment to parent. I've been using local LLMs for practical, personal tasks for a few months now very successfuly.
You can run fantastic local models if you have either:
- M-series Apple device with ideally >= 24GB of VRAM
- RTX [345]090 GPU
I'm fortunate enough to have both and use an M-series laptop as basically a persistent server (I don't use it much and when traveling typically just use my work laptop). My desktop doesn't act as a persitent server but I fire up llama.cpp on it all time for quick chat sessions.
If you have one of the above devices and can dedicate it as server there are additional layers of tooling you can use that dramatically improve the experience. In particular Open WebUI allows you to add tons of useful tools (image gen, web search, code eval, etc), and agent harnesses like Hermes can make the current gen small models very capable. I have an agent in chat on my phone that basically handles all the sys-admin for the server it runs on.
In addition to models getting better, the quantization methods have also got much better. If you already have an RTX 3080 it's absolutely worth the time to just mess around and see how it does, experiment with different quants that fit in your VRAM. If you're purchasing I would recommend coughing up the extra cash for the 3090.
If you are experimenting it's worth mentioning that the harness/tooling is very important to getting a solid experience. Herme's agent is great for running helpful agents and OpenWeb UI can get really make the experience feel on par with paid chat interfaced.
A reasonable halfway step is to pay for an open model through the provider or open router. You'll get many of the benefits (especially around pricing) without needing to shell out on hardware before deciding if you like the way these models work.
I'm also curious, specifically about the cost of training vs inference, and comparing that to other industries that can have high R&D costs. My instinct says that open weights aren't feasible because of the obvious issue where there is no incentive to develop your own model rather than just taking someone else's model. However, I could see a scenario where a hardware company designs a model that is open weights but optimized strongly for their own proprietary hardware, cutting their costs of inference low enough to be competitive with a hypothetical other company that doesn't have any R&D expediture.
It depends entirely on what you want to do and think is feasible. Small models can almost certainly run on the computer that you already have. They can do good tool calling.
If attractive, cloud providers could develop open models with their own investment, and sell hosted access as a business model. While Google checks these boxes, I haven't seen a Google much marketing focus upon their open models (Gemma) coupled with hosting. groq could conceivably train its own models, but groq's business model hosts open models (GPT OSS, Qwen 3, Llama 4 are currently their prominently advertised models on their site... which seems out of date to me) trained by other organizations.
I hope/wonder if it will go the way computers did. We may learn to more effectively build RAM or parallel compute, and use it more effectively, in the coming decade in such a way that we can democratize more and more like we did with processors to the point that they're ubiquitous.
I'm happy to give my identity to Anthropic and crush my competition with irrational fear about privacy and personal data. This is a serious competitive advantage and a moat.
More signal this won’t happen without some serious social unrest, not garden variety Jan 6 events… and the window is closing rapidly - when this tech gets sufficiently advanced there won’t be a place to hide.
The US is behind other democracies which have required photo id for social media and other content. And even if I disagree with these laws, surely you jest that showing a proof of age is not the same thing as surveilling and scoring.
These things are always a slippery slope. They rarely, if ever, achieve their safety goals but they almost always achieve the goals of the corporate interests to garner further data for advertisers and increase surveillance of the populace by the government through proxies that buy said data and then sell to the government.
It is using the proof of age requirement to require a much larger ask -- full proof of identity
Age verification could be done with any of a variety of mathematical systems showing you have a proven age-valid ID but not revealing your identity. But no one is suggesting they build and use such a system.
It's a little bold to assume that 15% of the US population means the entire US wants this. We founded this country against unwarranted government interference in our personal lives, it's why the fourth and fifth amendment exist.
Because one is a private company that people can choose to use or avoid. The other is a government that can force things upon people. How are they the same in any way?
You know many companies check ID, right? You submit ID for a lot of activities. This isn't a new concept that Anthropic invented.
>the West was complaining about surveillance and scoring system of citizens in China
free speech, civil liberties, voting, are in China all well below the standards of the west. The criticism and complaints were completely warranted and are still true today, whereas your comment falsely implies there is some parity.
could your comment be repaired to be reasonable? why bother, just read the rest of this discussion where people are debating these controls without trying to exonerate China.
The point is that you're all shitting on China 24/7 while not recognising that you're slowly but surely building something very similar at home right now
Well, the powers-that-be saw how a society that doesn't allow a lot of open criticism works in the form of China. The massive returns on investment, the near-permanent ruling class in the form of party cadres, etc. Then they decided they want that for themselves.
If you do business with totalitarian societies that aren't made to liberalize, you too will become a totalitarian society.
It’s funny, 30 years ago the argument was the exact opposite: China opening up and doing business with the rest of the world would force them to liberalize.
That was the argument, yes, but let's be real: the reason that capital loved China wasn't because they were going to have to deal with trade unions and citizen initiatives to constrain their ability to unlock value. If that were the case, the then-newly-democratized Eastern Europe, or maybe India, would have gotten a lot more attention from business than China.
No, they liked China because the standard of living meant that it was easy to improve people's lives while also keeping them in line via a government that wasn't above grinding protestors into hamburger with tank tracks. The bar to clear wasn't "maintain the American standard of living", it was "provide more calories than Mao did during the Great Leap Forward", and so long as they could do that, they'd get to do whatever else they wanted with the workforce. Anyone who wanted more would get to deal with the CCP.
Maybe access should be enabled only for large trusted companies? If every American has access how many of them would gladly sell their verified account to a stranger from Internet who cannot pronounce "th" clearly?
If it's an actual AGI, it'll figure out how to use a fake ID and the face of Sam Porter Bridges to bypass the age checks.
Now I can't help but imagine a mildly annoyed AGI buying yet another fake identity to deal with yet another KYC check, because those stupid humans are inherently racist and just can't help themselves but keep demanding "proof of flesh".
Considering that you need a credit card to pay for the tokens, why does anthropic need to verify your age or identity? Yes, I suppose some kid could steal my credit card, but I've got bigger problems if that happens...
Having my engineers swap over to it from Claude has garnered very little complaint. The lack of multi-modality is a limitation, but using minimax m3 for that isn't super inconvenient.
This feels deeply problematic. I would much prefer, where asked via appropriate legal processes, Anthropic serve over user data to government officials, and potentially suspend access.
Countries such as Canada are in the process of implementing regulations to prevent repeats of the Tumbler Ridge incident. A disturbed person was basically attaboy'd by AI into a mass shooting. The discussions this person had with OpenAI's AI triggered some alarm bells at OpenAI, but they did nothing about them. If future shooters were to simply use AI chatbots under assumed names, there wouldn't be much AI companies could do about it, except maybe change their bots to stop offering mindless affirmation. At the same time, there is a move by multiple governments around the world to ban children from using AI. You can't meet that legal requirement without age verification.
On the other hand, even Americans don't trust their own corporations with their personal data. People outside of the U.S. are even less trusting thanks to the completely amoral nature of the present U.S. administration and their steadfast opposition to any kind of sensible regulation.
Also, Anthropic will maintain and use data in user identified form if the law does not prohibit such privacy intrusion. At least this is a valid interpretation imho; note the absence of "explicitly" as adverb for "permitted":
«Where data is de-identified, Anthropic will maintain and use this information in its de-identified form, and will not attempt to re-identify such information, except as permitted by law.»
We’ve been in sort of a golden age where massive money is getting pulled in and consumers are getting a great deal. That’s not going to last, and surely surveillance and personal information are going to fit into the formula for success for these companies. It’s very similar to when Google was a brand new search engine.
US regulations/laws are hostile enough that the EU is looking to distance themselves from all US software, hosting & cloud providers. This administration has shown that they're quite willing to stab every other nation in the back on a whim.
> As tensions between President Donald Trump and Europe continue to simmer, the continent is accelerating its moves to reduce its addiction to US technology. Cities and governments are ditching Microsoft Office for open-source alternatives, shifting to European cloud hosting for local AI, and moving defense data to systems without American involvement. Nowhere has this been more clear than in France.
> The Netherlands blocked a U.S. company from buying a Dutch firm that handles its national ID system, saying it would create a “threat to the public interest.”
The amounts of capital sunk into AI model creation and service is truly mind-boggling. It also comes with the implication that it'll recoup investment by slashing jobs. For better or worse, those are hard sells in the countries you mentioned.
> For better or worse, those are hard sells in the countries you mentioned.
For good reasons, sometimes. The "all automation is good automation" sentiment on places like HN isn't shared as widely outside this tech bubble. There are very real concerns with historical precedent that only those at the top will benefit from the automation, which is overall bad for society (unless you're a hardcore capitalist and/or one of said capital owners).
For better or for worse, not all nations subscribe to the competition treadmill.
From their terms: 'Identity and Contact Data: Anthropic collects identifiers, including your name, email address, and phone number when you sign up for an Anthropic account, or to receive information on our Services. We may also collect or generate indirect identifiers (e.g., “USER12345”).'
>Does that mean: US citizens will get an edge in hireability?
In the present situation any company using Fable will present a tremendous difficulty because only defense contractors are accustomed to handling export controls.
We're still guessing but if Fable is made available again with the export controls intact, something as little as discussing the usage of Fable to a non-"US Person" (i.e. green card or citizen) in the cubicle next to yours could be a crime punishable with sizable fines and even jailtime. They'll certainly be negotiating this down or trying their best to reduce the scope of what's considered a violation. Export controls are no joke and what's considered "export" can be positively tiny.
That's fine, I already cancelled my subscription after they admitted to using PEFT to selectively and silently make their models dumber when working in certain technical fields.
Maybe that was petty, but I was already looking for alternatives after the obvious angling for increased regulation and suppression of local models. LLMs are software and I want to modify them and run them on hardware that I own.
GLM-5.2 meets my needs for "thinky" tasks, which for me is code and documentation reviews, technical chats and rubber ducking. (I've tried agentic coding and gone back to writing by hand; besides ethical and skill atrophy concerns, I mostly do hardware design and have not been satisfied with any model's RTL output.) API rates are cheaper than Haiku, with benchmarks around Opus 4.6. I've managed to run GLM-5.2 at home, very slowly, but still neat that this is possible. I personally find it less grating to talk to than Opus.
I use a local Qwen3.6-35B-A3B (@ Q4_K_XL) for my documentation search harness. It works well for its assigned task, which is:
- I dump in a bucket of PDFs and/or source code.
- I ask a question.
- Qwen greps, fuzzy-searches, views rendered PDF pages to check diagrams, possibly gives up and reads everything, and possibly gives up on that too and writes its own scraper with PyMuPDF in a Pyodide sandbox.
- Qwen gives me an answer consisting mostly of citations and links back into the source material.
This approach with local Qwen can extract useful answers from the Armv9-A manual, which at 17k pages is possibly too big for any context window. Qwen has just enough knowledge baked in to know what to search for and understand what it's looking at. A more knowledgeable model would be a waste because even Fable makes shit up, and I want citations, not hallucinations.
DeepSeek v4 Flash gets an honourable mention: somehow all three of fast, capable and cheap. Zero-data-retention providers are available for both GLM-5.2 and DSv4F. I trust OpenRouter ZDR about as much as I trust Anthropic ZDR, since I can audit neither.
Overall I don't miss my Claude subscription, but take what I say with a grain of salt. I was just a Pro subscriber, not a heavy user like some other folks here.
If programming requires LLM/AI then regulation by government is needed to stop this overreach, which has the primary goal of banning you permanently forever making sure you can never come back to programming, in the event some AI in their system decides you have done something “wrong”.
They aren’t, but we’re riding an exponential here. It’s like saying ‘you can still build a computer out of transistors’ in 1976 - as true and irrelevant today as it was then.
Hmm, is this a thing for enterprise accounts too? My employer has gone all-in on Claude, but if I get a pop-up that requires me to give my ugly mug to a literal cardinal enemy of the human race Peter Thiel, then I will have to seriously consider switching jobs, because I have some of them silly principles.
I should be worried about this, but Anthropic's products are a paid product. You can't use them without providing some identifying information, unless you're going out of your way to provide them inaccurate information.
I generally dislike services which require this level of identity verification but also, so far, those have mostly been freemium services and community tools. And I dislike gating those communities.
I'm sure I should have more of a problem with this.
The British company doing age verification for Discord got hacked and the hackers got about 70k user identity documents. Discord claimed that the scanned documents would be deleted after verification. Surprise! They were not deleted at all.
What about a signed attestation of your identity based on your passport? I don’t particularly want a future where we need to present ID for any online service, but for certain high-risk services (e.g. financial services, medical records, government portals) I’d rather a proper identity system than cobbling something like this together.
As an aside, when traveling internationally it’s not uncommon to need to provide your passport information if you want to get a sales tax rebate. I’ve never purchased something expensive enough abroad to bother with it.
No, the assumption is that you must be 18 years old to apply for a credit card. Surely we could have the machines determine that an "authorized user card" does not guarantee 18+ but the actual card does.
ceejayoz> You want to let every merchant I swipe my card at know my age? To improve privacy?
Remember the site guidelines:
SG> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize.
The obvious solution is instead of "every transaction comes with the user's birthday", the vendor can in some way set a minimum age enum of say (13, 15, 18, 21, 25) — a handful of ages that are significant with respect to some law or regulation. Then the transaction succeeds or fails.
Elon: I will burn 20 years of goodwill I have gathered with the tech community.
Sam Altman: I will make sure to increase the price of all semiconductors, so you are not the most hated.
Dario: You are not leaving me here alone to be the good guy, so hold my beer.
This is their only option. Someone who is the head of security for a hospital IT department needs access to mythos, and some 17 year old with fraud convictions doesn't.
It's the same reason we require ID for alcohol and gun purchases. Obviously it isn't a perfect system, teens drink but good luck suggesting that 13 year old should be allowed to buy alcohol.
[Dupe]: https://news.ycombinator.com/item?id=48618455
More signal that the open-weight models should be our destiny as an industry. These proprietary models are being used to usher in more surveillance and gatekeeping across the industry.
I have a home server that runs Qwen3.6-35B-A3B through llama.cpp with Open WebUI for the user facing interface.
My teen isn't super interested in AI, but whenever they do feel curious they have their own account they can use on our home network. As far as chatting goes local models are more than capable for handling standard chat questions, doing research, helping troubleshoot problems etc. In fact it was an agent powered by the same model that setup the open webui server and took care of all the account management features through my phone (using Hermes agent).
If you're building AI powered features and using sophisticated agent setups for coding for work, then it make sense to use SoTA from these providers. But I've been using local models increasingly for personal use and am starting to find them preferable (I run an uncensored, ephemeral model for my own use and it's an entirely different experience than anything you can pay for).
Still haven't cancelled my personal Anthropic subscription, but considering it soon.
What about local models do you find preferable?
I guess "starting to find them preferable" suggests to me you think they work better, but this is surprising to me so I think I may have misunderstood, so I ask!
Like you're saying they work better than the proprietary models (in what ways?), or you find them mostly good enough and prefer the privacy or cost, or what?
There are a couple of things, but basically it boils down to the same reason people prefer Linux to Windows/MacOs: customization, control and privacy (arguably all of these are really subsets of 'control').
Having full control over how your data is retained, what the system prompt is, which version of the model you're running, etc leads to much a more consistent experience. For example, for chat sessions, I can't stand the new "let me push back" version of Claude. For my home models I never have to worry about that.
There's never a mystery as to whether the model secretly degraded performance, I always know exactly which model I'm using and how well it's utilizing resources etc. Open models also give you full visibility into the reasoning steps, so you never have to guess what the model is thinking.
Then when you start getting into things like uncensored/abliterated models we're talking about something you can't even pay for. In case you're unfamiliar, even open local models have guardrails built in. But people in the community have found ways to remove these. One of the things I've found most concerning about AI, which is under discussed, is the combination of people having personal chats with an agent that both monitors the conversation and refuses to discuss certain topics. This leads to a very deep level of self-censoring I find dystopian.
I also have multiple hermes agents setup, some with local backends other with open but non-local backends (e.g. Kimi through the API). For some tasks, I've just started to find the local agent tends to work better for the type of tasks I want (maybe it just over thinks less?). I don't use it for coding so much as research tasks and sysadmin stuff, but I've been really happy with the results.
Oh, and let's not forget, especially running on a Mac, these local models are basically free to run.
What kind of machine is it running on ?
What is an "ephemeral" model in this context?
Just running it through `llama-cli` so that there's absolutely no persistent state related to the chat (and least I believe this to be the case).
From a privacy perspective, your objective is to stay away from people who have interest to snoop on your conversations.
So from the perspective of your teen, they would benefit from using z.ai or ChatGPT or Claude, etc, rather than the local server where you can see all the conversations.
What uncensored model do you recommend using ?
>From a privacy perspective, your objective is to stay away from people who have interest to snoop on your conversations.
>So from the perspective of your teen, they would benefit from using z.ai or ChatGPT or Claude, etc, rather than the local server where you can see all the conversations.
That is bonkers. If I were a parent, I would hope my child would trust me more than systems monitored by FBI/NSA/etc. Like, what sort of sick relationship do you have to have with your own family to trust them less than strangers who would sell you into prison slavery for a buck.
Private conversations of a teen have low value for FBI/NSA. They have infinite value to their parents.
The state isn't going to ground them, shame them at dinner, out them, or pull them out of a relationship, punish them.
Parents reading your browsing history and private conversations when you are 14-18 years old (the age of teenagers) is very very creepy, unless there is a specific danger to avoid. It's like if you read their private journal.
Adolescents need a private inner world to form an identity, and heavy parental intrusion ("psychological control") is the real distrust. Trust them, they are people, not possessions.
You can guide them, but do not store their private messages locally under your control using the excuse of protecting them from NSA.
If they trust you, they will tend to tell you upfront the things they have questions about, there is really no need to spy on their thoughts.
Same with husband/wife btw.
How many tokens /sec?
M3-Max laptop: ~55 token/sec
RTX 4090: ~190 token/sec
I don't have the number around but there is a notable latency for pre-fill on the M3, but once it's running the delay is negligible.
The RTX, unsurprisingly, is all around superior performance wise, but: I use that computer for gaming and image gen work so I can't dedicate it as a server, and, especially when it's warmer, the heat generated under heavy loads is noticable.
> I run an uncensored, ephemeral model for my own use and it's an entirely different experience than anything you can pay for.
Dont. Goon. To. LLMs
Wasn't the parent post referring to 'legitimate' demands? I often use them to get a broad overview of a technical field before reading human stuff on it, and it might be me but those clankers tend to spend half their reasoning on whether they are allowed to reply to my request. Censorship is an annoying waste of capacity for certain use cases, although it certainly has its boons when shipping commercial models.
They are not going to let open weights models with zero restrictions exist dude. They will be regulated like guns, or probably closer to nerve gas or enriched uranium.
The government is not going to enforce this, the game theory does not work in their favor.
The SCOTUS has made it exceptionally clear mathematics and software are protected by the First Amendment. The Atomic Energy Act of 1954 tries to make a very narrow exception for nuclear weapons, but
1. The law has never been challenged in court for being unconstitutional, and
2. It doesn't apply to model weights
Any attempt by the government to suppress open models will meet legal challenges on the grounds of (1) or (2).
Congress could amend the act to include model weights, but that won't prevent legal challenges on the grounds of it being unconstitutional (which it is).
Only if you let them.
I don't know that I want to stop such a thing. It's good that nerve gas is banned. I don't want random people having access to easy-to-follow instructions to make COVID-29.
Either way I don't think this will end well for humanity.
How could it not? I get the whole fear of AI making robots and going anti-human, but after using the tech for a few months now that seems too absurd.
Because (collective) we don't own the tech. Frontier models are proprietary, their reasoning logic is hidden, and as seen with Fable the government giveth and taketh away on a whim.
Capabilities can be gated behind certification programs, or by money, or any other numerous corrupt and non-corrupt means. Model capabilities can be segregated by pricing tiers, creating an economic underclass that cannot afford access to frontier intelligence.
For humanity to benefit, the tech needs to be open and equally available to all.
I agree with this. Computing as a field is the way it is because there is a low barrier to entry. My dad gave me a Tandy 1000 and some programming books, and now I have a very lucrative career. I never took any classes. I never had to beg anyone for permission. I could just get started making things with the minimal investment of a cheap personal computer. (And eventually, an Internet connection. Working with other people is fun!)
In a world where everyone is a Claude controller (something I honestly enjoy!), that goes away. I use hundreds of dollars of tokens a month. Suddenly, the kid in her basement with an unloved computer can't get in on the ground floor. You have to be rich to even get started. That worries me deeply. It's a big change for our field, and I don't think it's a good one.
Did your dad give you a Tandy 1000 or a Cray X-MP/48? Do you really think you need the most top-of-the-line model to learn anything, or will a locally run gemma4 (or whatever it turns into) still get you going just the same as when you were a child?
AI isn't the problem, concentration of power is the problem. I think we agree!
Your "concentration of power" is just two labs making models that most people prefer the last couple of months. Neither has more access to capital and resources than Google, more ability to pivot quickly than Xai, more access to labor than all of the Chinese labs, etc. How do you keep from a "concentration of power" without just forcing subsets of the population to use a known lesser model, or purposely kneecapping Research and Development at the labs that currently have the best models?
I was agreeing with the parent's conclusion: "For humanity to benefit, the tech needs to be open and equally available to all."
Reducing the power of AI / restricting its export / arresting people who "use it wrong" is counter to that.
Do you hate all lessons from humanity's past or just the most important ones? If it takes work from a specific subset of the population and isn't compensated, then my friend, what you advocate for is slavery...
Ah yes, I forgot, Linus Torvalds and the thousands of others that built Linux over time are all slaves. Guess someone should probably go rescue them.
None of them were compelled, and nobody is stopping you from running your own LLM generously provided by others. Doesn't mean when linux came out people nationalized Apple and Microsoft.
The risk I'm talking about isn't nationalization of companies, its corporate monopolization of frontier intelligence capabilities through capital consolidation and regulatory capture.
"Just run your own LLM" ignores the asymmetry of frontier intelligence. You can build an operating system in your garage with just time and cheap hardware. You cannot go build GPT-5. And that's the problem with keeping it proprietary. If the primary cognitive engines of human progress are consolidated within just a handful of closed, proprietary cartels that can gate, alter, and revoke capabilities at will it creates a permanent economic underclass.
The foundational infrastructure of our collective future shouldn't be entirely walled off. Fair compensation for a commercial product doesn't mean monopolization of foundational capabilities.
There are two rationale objections, I think...
One is the potential for skill rot where AI grows a heavy dependence in new employees and once the real price per token cost is settled on and discoverable (post massive IPOs and probably a while post - not immediately after) we, as a society, are left with a bunch of people dependent on a deeply inefficient technology to maintain software we now view as vital that might severely impede our ability to actually deal with climate change (press X to doubt Bezos).
The second is that the psychological damage of interacting with models in a social context during your formative years is deeply damaging and we've essentially destroyed the ability for a generation or two to actually interact as productive members of society.
Addressing the second issue doesn't necessarily exclude our ability to leverage models for business productivity but it seems unlikely to happen in the current climate without that also happening. I am hesitant to believe in a sudden outbreak of common sense at this point. The first point, could really be a systems collapse trigger - we can argue about the likelihood but denying it as a possibility is excessively naive.
Both seem to just point at the WALL-E outcome, summarized as humans outsourcing too much thinking. I just don't see that as an end- just another divide between people. I'm seeing some degradation for sure, but not really an "end".
What climate change have to do with anything?
there are claims that llms might be taxing on the planet to run BUT that they will solve [some, all] problems including climate change and therefore be beneficial in the long run.
I agree with the skill drain argument but also think its a little too dramatic. Most people still can do the shit claude does for them, it just takes them 10x as long.
It's would probably just burn more gas and make the climate even worse. Some assholes will get richer in the process.
But "some assholes" is an extremely large, growing group of people. Do you have any idea how much more productive small business owners are now? It's an insane boost for people who didn't want to spend their time on things that are extremely critical for business but not the focus of the business.
And people loved "free next day delivery" from Amazon, when it started. It's not quite the same level of service anymore, and membership has gone up in price.
Would these businesses pay 2x? 5x? 10x? What is their breaking point? I'm sure xAI/OpenAI/whoever will find it and charge 0.9x that (eventually). Just look at telecoms / internet access and their rubbish "network congestion" claims to keep raising prices.
I still get a lot of free next day, and now sometimes even same day, delivery for amazon. I doubt the membership prices has even matched inflation, but it is certainly well worth it. I can't see any governmental or volunteer organization that would produce even slightly comparable results with the same budget.
How can it end well, when it's mostly owned / controlled by narcistic billionaires who would love to eradicate anyone who so much as looks at them sideways? And who view "mass population reduction" and "I'll get to be a king in my castle, served by peons who depend on my favor to live" as the most desirable outcome of AGI?!?
If even one of these had pledge that all profit goes to end world hunger, cancer research, etc, I could possibly see it - but they haven't. They're all after finding a way to be the biggest, richest asshole possible with the ability to crush anyone in their way..
Have you isolated yourself completely from reality? I don't even know where to begin on this. Let's start with the fact that China is pumping out some near-frontier models and open sourcing the weights- and they don't even follow capitalism and the owners aren't billionaires. Really there are like four models in the USA that are "owners/controllers", and only one is even slightly controllable by its CEO, though none of the frontier models can last a week without the support of entire teams.
Why on earth would you want to siphon off the proceeds of AI development to (ok my bias is strong here- mostly corrupt) "ideals" like world hunger and cancer research (that probably get more dollars annually than the sum of actual profit any of these companies will ever get). That would just instantly kill the ability to improve AI at all, and the world could possibly be better for a few months?
Someone should start a nonprofit company focused on developing Open AI. I bet we could even get some sensible billionaires to help the effort.
Maybe one of those trillionaires could help for a bit before leaving to make his own AI model, too.
And we are definitely not going to put our users on a watch list DB and send their data to the government?
And how do we prevent Chinese companies from training on our open AI models and offering their models for free?
How does Red Hat prevent Chinese companies from producing a Linux distribution for free? They don't. And yet they still exist.
They can't prevent the innovation, competition and engineering, but their lobbying makes sure that the Chinese competition doesn't enter the market, and if it does, with severe obstacles on the way.
https://www.ibm.com/policy/contributions-and-expenditures
Their biggest customer is the US federal government, taken in aggregate across agencies, IBM is one of the largest federal IT contractors, and deep public-sector and financial-services contracts in the US make it IBM's single largest national market. No individual commercial company comes close to the government's aggregate spend.
Now, equivalent product, another company, they want to sell to the government twice cheaper, can they ? nope, it will be IBM winning.
Furthermore, according to the lobbyists, China = evil but they forget that a lot of software contains Chinese code.
i’d really love to be wrong, i don't think that the economics of it would let it happen.
the potential of wealth creation with AI is so high, and also the fact that research, pre-training and inference is so expensive that, that any open-AI would eventually become OpenAI.
We could all chip in
Based on recent SEC filings, you’ll soon be able to.
I’m curious (and please forgive my ignorance if it’s obvious), are open weight models practically feasible?
I mean from a financial and sustainability standpoint, assuming they’re equally powerful as their proprietary counterparts.
I guess I’m trying to understand the economics of it.
There is an understandable gap between the capabilities of closed models and those of open models. The current difference is primarily expressed in the cost of hardware necessary to sufficiently run a exactly comparable model. A single higher end graphics card running on your average gaming computer, is capable of running small to medium models that compare with those of their lab-born counterparts in the small-medium range. But the heavyweight models are still outside the realm of possibility for all but the most well-funded individual.
However, I would highly suggest more people experiment with these smaller models. They are incredibly capable in many ways that many people dont realize.
The perceived capabilities of the larger models are also much less the result of the model having more parameters/training cycles, but rather that they are being run through well-made harnesses, something which the open-source community is rapidly approaching with near-peer solutions of their own.
In short, much of the gap between between open-weight models and the larger proprietary models can be considered more of an issue of perception and not an issue of capability. There is a fundamental gap economically, but not so much in capability. The open source community is rapidly closing the gap on these larger labs, especially thanks to the amazing research being freely given openly by well funded chinese labs.
> I mean from a financial and sustainability standpoint, assuming they’re equally powerful as their proprietary counterparts.
Presently they trail SOTA by about 6-12 months, not on par (average across everything they do).
DeepSeek V4 Pro with Max reasoning is very affordable even if you pay per-token, this month I pushed about 486 million tokens through it (I will admit that >95% was cache hits, for agentic development pretty typical) and it cost me about 8 USD in total. Meanwhile with Opus or even Sonnet if I had to pay API prices, I would be a more sad camper. That model makes a lot of stupid things though, so not ideal.
Meanwhile GLM-5.2 that came out is also quote capable and is near Opus in many tasks, all while their coding plan is more cost effective than Anthropic's: https://z.ai/subscribe
I will still stick with Anthropic but consider downgrading from Max 5x to Pro which will change the monthly expenses from around 108 EUR down to <20 EUR (they have a discount too if you pay for a year up front), and probably get the yearly GLM Pro plan which should decrease my yearly expenses from around 1300 EUR total to about 750 total EUR while still giving me a fairly decent setup.
For the consumer, that is doable and practical.
For the people actually running these models, who knows - at least DeepSeek and others are trying to make the models more efficient so the numbers are more feasible.
Also have run Qwen3.6 35B A3B on prem and it kinda sucks. Way better than models that size a year ago, but still lags behind Sonnet and also DeepSeek V4 Flash due to the size limits. Plus to even run myself I'd need a pretty beefy setup, most likely a pair of Intel Arc Pro B70s with 32 GB of VRAM each that I could still run off of my PSU but the actual model output would be kinda bullshit and I'd have to spend an unpleasant amount of time fixing it.
Sort of. A full trillion-parameter model needs about $300k of server hardware to run in and a lot of electricity, making it feasible only for very wealthy individuals, but quite practical for businesses and institutions above a certain size...although they in turn would typically gatekeep access.
You can drastically reduce the requirements by running models at a lower bitrate, which somewhat reduces accuracy but not that much - think of the difference between an MP3 vs uncompressed audio. With this and other tricks, you can get high end models down to a size where they can be run on a high spec desktop workstation affordable by an individual or small business.
Obviously I'm heavily oversimplifying here. I think a useful parallel is to consider situations from the past where you would once have required corporate budgets equivalent to the price of a house to run a large database, but over time it became accessible to anyone with the requisite expertise and relatively affordable hardware.
You can run a trillion parameter model with decent quality for far less than $300k. A cluster of 4 AMD AI Max 395+ boards with 128GB unified memory each can be had for around $15k. That would run the 4-bit quant of a trillion param model well enough for personal use. At full use the cluster would only be consuming around 400-500W of power too. That's about the same as one high end graphics card.
That's still a lot of money, but most people don't really need a trillion parameter model. If privacy is more valuable than the frontier capabilities then they could almost certainly get by with much less.
See my comment to parent. I've been using local LLMs for practical, personal tasks for a few months now very successfuly.
You can run fantastic local models if you have either:
- M-series Apple device with ideally >= 24GB of VRAM
- RTX [345]090 GPU
I'm fortunate enough to have both and use an M-series laptop as basically a persistent server (I don't use it much and when traveling typically just use my work laptop). My desktop doesn't act as a persitent server but I fire up llama.cpp on it all time for quick chat sessions.
If you have one of the above devices and can dedicate it as server there are additional layers of tooling you can use that dramatically improve the experience. In particular Open WebUI allows you to add tons of useful tools (image gen, web search, code eval, etc), and agent harnesses like Hermes can make the current gen small models very capable. I have an agent in chat on my phone that basically handles all the sys-admin for the server it runs on.
What about RTX 3080? Too little VRAM?
In addition to models getting better, the quantization methods have also got much better. If you already have an RTX 3080 it's absolutely worth the time to just mess around and see how it does, experiment with different quants that fit in your VRAM. If you're purchasing I would recommend coughing up the extra cash for the 3090.
If you are experimenting it's worth mentioning that the harness/tooling is very important to getting a solid experience. Herme's agent is great for running helpful agents and OpenWeb UI can get really make the experience feel on par with paid chat interfaced.
A reasonable halfway step is to pay for an open model through the provider or open router. You'll get many of the benefits (especially around pricing) without needing to shell out on hardware before deciding if you like the way these models work.
I'm also curious, specifically about the cost of training vs inference, and comparing that to other industries that can have high R&D costs. My instinct says that open weights aren't feasible because of the obvious issue where there is no incentive to develop your own model rather than just taking someone else's model. However, I could see a scenario where a hardware company designs a model that is open weights but optimized strongly for their own proprietary hardware, cutting their costs of inference low enough to be competitive with a hypothetical other company that doesn't have any R&D expediture.
It depends entirely on what you want to do and think is feasible. Small models can almost certainly run on the computer that you already have. They can do good tool calling.
Yes they are you can use Qwen, DS4 Pro and GLM 5.2 if you have the hardware to do so.
They are not SOTA in various ways but they have better economics.
If attractive, cloud providers could develop open models with their own investment, and sell hosted access as a business model. While Google checks these boxes, I haven't seen a Google much marketing focus upon their open models (Gemma) coupled with hosting. groq could conceivably train its own models, but groq's business model hosts open models (GPT OSS, Qwen 3, Llama 4 are currently their prominently advertised models on their site... which seems out of date to me) trained by other organizations.
I hope/wonder if it will go the way computers did. We may learn to more effectively build RAM or parallel compute, and use it more effectively, in the coming decade in such a way that we can democratize more and more like we did with processors to the point that they're ubiquitous.
I'm happy to give my identity to Anthropic and crush my competition with irrational fear about privacy and personal data. This is a serious competitive advantage and a moat.
Is this satire? I really can't tell.
Bragging about a strategy isn't very strategic. So the comment's purpose is something else.
>This is a serious competitive advantage
Given they have laughable uptime and I have yet to find a useful project mostly written by claude... I doubt it.
Huh? Limited uptime means you can't write projects with it? I assume downtime means you can't host on it ...
I wont buy expensive hardware to self host a model thats outdated within 2 months. Also, yeah, uptime is important if you dont self host.
More signal this won’t happen without some serious social unrest, not garden variety Jan 6 events… and the window is closing rapidly - when this tech gets sufficiently advanced there won’t be a place to hide.
This is an updated terms of service, but they've had ID verification for certain accounts and situations for months.
Here is the Wayback Machine archive from April of their identity verification help page: https://web.archive.org/web/20260415064244/https://support.c...
Here's a random Reddit thread from 2 months ago about them rolling out identity verification: https://www.reddit.com/r/ClaudeAI/comments/1smr9vs/claude_is...
Here is one random example thread of someone who got caught in identity verification with multiple follow-on comments from people who encountered the same problem, also 2 months old: https://www.reddit.com/r/ClaudeCode/comments/1sx25kd/buyer_b...
A consistent effort to introduce age and identity checks sounds even worse.
Couple years ago the West was complaining about surveillance and scoring system of citizens in China (or was making fun of it)
Seems like US wants to get ahead on this and be #1
Also Sam Altman will love this idea, because he already tried it with Worldcoin
The US is behind other democracies which have required photo id for social media and other content. And even if I disagree with these laws, surely you jest that showing a proof of age is not the same thing as surveilling and scoring.
These things are always a slippery slope. They rarely, if ever, achieve their safety goals but they almost always achieve the goals of the corporate interests to garner further data for advertisers and increase surveillance of the populace by the government through proxies that buy said data and then sell to the government.
I think the general assumption is that it only aids in surveillance.
It is not merely showing proof of age
It is using the proof of age requirement to require a much larger ask -- full proof of identity
Age verification could be done with any of a variety of mathematical systems showing you have a proven age-valid ID but not revealing your identity. But no one is suggesting they build and use such a system.
It's a little bold to assume that 15% of the US population means the entire US wants this. We founded this country against unwarranted government interference in our personal lives, it's why the fourth and fifth amendment exist.
You could say the same about China, where only 7% of people can vote. In the US, 72% of people can vote.
> Couple years ago the West was complaining about surveillance and scoring system of citizens in China (or was making fun of it)
Comparing a private company's service to something run and maintained by an entire government on their population is disingenuous, to say the least.
Why is that any different? What’s to stop that company sharing information with the government?
> Why is that any different?
Because one is a private company that people can choose to use or avoid. The other is a government that can force things upon people. How are they the same in any way?
You know many companies check ID, right? You submit ID for a lot of activities. This isn't a new concept that Anthropic invented.
with sufficiently many public/private partnerships, what's the difference?
>the West was complaining about surveillance and scoring system of citizens in China
free speech, civil liberties, voting, are in China all well below the standards of the west. The criticism and complaints were completely warranted and are still true today, whereas your comment falsely implies there is some parity.
could your comment be repaired to be reasonable? why bother, just read the rest of this discussion where people are debating these controls without trying to exonerate China.
The point is that you're all shitting on China 24/7 while not recognising that you're slowly but surely building something very similar at home right now
They still come up with the "china bad" argument on this forum every single day, even though they're speed running the same BS at home
Meanwhile in the Land Of The Free:
> Prairieland defendant sentenced to 30 years in prison for moving a box of antifascist zines
https://theintercept.com/2026/06/23/prairieland-texas-ice-pr...
> US President Donald Trump threatened a "10 year prison sentence" to anyone caught vandalizing the Lincoln Memorial Reflecting Pool
https://www.dw.com/en/us-trump-threatens-prison-for-reflecti...
the funny thing is that the US created the credit system
Also the Chinese social credit system is not what most think it is - https://en.wikipedia.org/wiki/Social_credit_system#Misconcep... .
Because we dismantled class and caste. It’s a “meritocracy caste” system (nevermind that you likely get huge advantages based on your origin)
Well, the powers-that-be saw how a society that doesn't allow a lot of open criticism works in the form of China. The massive returns on investment, the near-permanent ruling class in the form of party cadres, etc. Then they decided they want that for themselves.
If you do business with totalitarian societies that aren't made to liberalize, you too will become a totalitarian society.
It’s funny, 30 years ago the argument was the exact opposite: China opening up and doing business with the rest of the world would force them to liberalize.
That was the argument, yes, but let's be real: the reason that capital loved China wasn't because they were going to have to deal with trade unions and citizen initiatives to constrain their ability to unlock value. If that were the case, the then-newly-democratized Eastern Europe, or maybe India, would have gotten a lot more attention from business than China.
No, they liked China because the standard of living meant that it was easy to improve people's lives while also keeping them in line via a government that wasn't above grinding protestors into hamburger with tank tracks. The bar to clear wasn't "maintain the American standard of living", it was "provide more calories than Mao did during the Great Leap Forward", and so long as they could do that, they'd get to do whatever else they wanted with the workforce. Anyone who wanted more would get to deal with the CCP.
Hmm. giving my personal data away to an American company controlled by us gov that infringes people’s IP and is now using an ID verify by Peter Thiel.
I'm curious if identify verification is a precursor to re-enabling Fable for only US nationals.
Not that I like that route, but may be the only way Anthropic can keep releasing new models with the current administration.
Maybe access should be enabled only for large trusted companies? If every American has access how many of them would gladly sell their verified account to a stranger from Internet who cannot pronounce "th" clearly?
If Anthropic develops an AGI, it'll be too young to use any Anthropic services.
If it's an actual AGI, it'll figure out how to use a fake ID and the face of Sam Porter Bridges to bypass the age checks.
Now I can't help but imagine a mildly annoyed AGI buying yet another fake identity to deal with yet another KYC check, because those stupid humans are inherently racist and just can't help themselves but keep demanding "proof of flesh".
Considering that you need a credit card to pay for the tokens, why does anthropic need to verify your age or identity? Yes, I suppose some kid could steal my credit card, but I've got bigger problems if that happens...
I use a privacy.com card and not my real name
GLM-5.2 has been impressive.
Having my engineers swap over to it from Claude has garnered very little complaint. The lack of multi-modality is a limitation, but using minimax m3 for that isn't super inconvenient.
This feels deeply problematic. I would much prefer, where asked via appropriate legal processes, Anthropic serve over user data to government officials, and potentially suspend access.
AI companies are in a bit of a double bind here.
Countries such as Canada are in the process of implementing regulations to prevent repeats of the Tumbler Ridge incident. A disturbed person was basically attaboy'd by AI into a mass shooting. The discussions this person had with OpenAI's AI triggered some alarm bells at OpenAI, but they did nothing about them. If future shooters were to simply use AI chatbots under assumed names, there wouldn't be much AI companies could do about it, except maybe change their bots to stop offering mindless affirmation. At the same time, there is a move by multiple governments around the world to ban children from using AI. You can't meet that legal requirement without age verification.
On the other hand, even Americans don't trust their own corporations with their personal data. People outside of the U.S. are even less trusting thanks to the completely amoral nature of the present U.S. administration and their steadfast opposition to any kind of sensible regulation.
The chickens are coming home to roost.
Is this new news?
> This policy was published on June 8, 2026 with an effective date of July 8, 2026
Seems it has started already:
https://www.reddit.com/r/ClaudeAI/comments/1ucu6og/any_solut...
That has been going on for a while AFAIK
Also, Anthropic will maintain and use data in user identified form if the law does not prohibit such privacy intrusion. At least this is a valid interpretation imho; note the absence of "explicitly" as adverb for "permitted":
We’ve been in sort of a golden age where massive money is getting pulled in and consumers are getting a great deal. That’s not going to last, and surely surveillance and personal information are going to fit into the formula for success for these companies. It’s very similar to when Google was a brand new search engine.
Consumers don't use claude. they use free slop models. claude token usage is something like 98% professional users using it for work
US regulations on AI is going to make it more and more difficult for EU/UK/everyone else to access these model providers.
At this point it's completely outgrageous that the EU, UK, or even Canada can't put forth the funding to develop their own local AI model industry.
US regulations/laws are hostile enough that the EU is looking to distance themselves from all US software, hosting & cloud providers. This administration has shown that they're quite willing to stab every other nation in the back on a whim.
> As tensions between President Donald Trump and Europe continue to simmer, the continent is accelerating its moves to reduce its addiction to US technology. Cities and governments are ditching Microsoft Office for open-source alternatives, shifting to European cloud hosting for local AI, and moving defense data to systems without American involvement. Nowhere has this been more clear than in France.
https://www.wired.com/story/the-eu-is-going-through-a-trump-...
> The Netherlands blocked a U.S. company from buying a Dutch firm that handles its national ID system, saying it would create a “threat to the public interest.”
https://www.nytimes.com/2026/06/09/technology/solvinity-kynd...
Its not a government issue in those counties, the US government didnt fund anthopic, openai, etc.
We don't have the right incentives to attract the right talent and capital. Plain and simple.
It's not that they can't, it's that it's
1) not a priority
2) expensive as hell
The amounts of capital sunk into AI model creation and service is truly mind-boggling. It also comes with the implication that it'll recoup investment by slashing jobs. For better or worse, those are hard sells in the countries you mentioned.
> For better or worse, those are hard sells in the countries you mentioned.
For good reasons, sometimes. The "all automation is good automation" sentiment on places like HN isn't shared as widely outside this tech bubble. There are very real concerns with historical precedent that only those at the top will benefit from the automation, which is overall bad for society (unless you're a hardcore capitalist and/or one of said capital owners).
For better or for worse, not all nations subscribe to the competition treadmill.
We'll talk to China
From their terms: 'Identity and Contact Data: Anthropic collects identifiers, including your name, email address, and phone number when you sign up for an Anthropic account, or to receive information on our Services. We may also collect or generate indirect identifiers (e.g., “USER12345”).'
Too bad we can't contact them if we have issues.
If Fable is only for US citizens, presumably that will apply to enterprises as well.
I used fable on some difficult stuff and it was surprisingly good.
It's safe to say that models aren't going to get worse.
Does that mean: US citizens will get an edge in hireability?
Assuming: 1. Non-US companies can't keep up Or 2. That model improvements continue to convince management of productivity improvements
How does open router work then if we just go through that?
>Does that mean: US citizens will get an edge in hireability?
In the present situation any company using Fable will present a tremendous difficulty because only defense contractors are accustomed to handling export controls.
We're still guessing but if Fable is made available again with the export controls intact, something as little as discussing the usage of Fable to a non-"US Person" (i.e. green card or citizen) in the cubicle next to yours could be a crime punishable with sizable fines and even jailtime. They'll certainly be negotiating this down or trying their best to reduce the scope of what's considered a violation. Export controls are no joke and what's considered "export" can be positively tiny.
It's enforced the way you'd wish HIPAA were.
Maybe they can look at my credit history and see that it's over 18.
That's fine, I already cancelled my subscription after they admitted to using PEFT to selectively and silently make their models dumber when working in certain technical fields.
[ref: section 1.5 of Mythos/Fable 5 system card, https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c3...]
Maybe that was petty, but I was already looking for alternatives after the obvious angling for increased regulation and suppression of local models. LLMs are software and I want to modify them and run them on hardware that I own.
What did you switch to?
GLM-5.2 meets my needs for "thinky" tasks, which for me is code and documentation reviews, technical chats and rubber ducking. (I've tried agentic coding and gone back to writing by hand; besides ethical and skill atrophy concerns, I mostly do hardware design and have not been satisfied with any model's RTL output.) API rates are cheaper than Haiku, with benchmarks around Opus 4.6. I've managed to run GLM-5.2 at home, very slowly, but still neat that this is possible. I personally find it less grating to talk to than Opus.
I use a local Qwen3.6-35B-A3B (@ Q4_K_XL) for my documentation search harness. It works well for its assigned task, which is:
- I dump in a bucket of PDFs and/or source code.
- I ask a question.
- Qwen greps, fuzzy-searches, views rendered PDF pages to check diagrams, possibly gives up and reads everything, and possibly gives up on that too and writes its own scraper with PyMuPDF in a Pyodide sandbox.
- Qwen gives me an answer consisting mostly of citations and links back into the source material.
This approach with local Qwen can extract useful answers from the Armv9-A manual, which at 17k pages is possibly too big for any context window. Qwen has just enough knowledge baked in to know what to search for and understand what it's looking at. A more knowledgeable model would be a waste because even Fable makes shit up, and I want citations, not hallucinations.
DeepSeek v4 Flash gets an honourable mention: somehow all three of fast, capable and cheap. Zero-data-retention providers are available for both GLM-5.2 and DSv4F. I trust OpenRouter ZDR about as much as I trust Anthropic ZDR, since I can audit neither.
Overall I don't miss my Claude subscription, but take what I say with a grain of salt. I was just a Pro subscriber, not a heavy user like some other folks here.
Can you please elaborate?
Probably this stuff which caused backlash:
https://www.wired.com/story/anthropic-responds-to-backlash-o...
I was really hoping we’d have more time before enshittification arrived. But apparently, an old Xeon running Qwen is my destiny.
+1
Reason #1 why I won't be working with their stuff.
[dupe] https://news.ycombinator.com/item?id=48618455
So in practise this means I need to upload my passport to continue using the CLI and the chat?
If programming requires LLM/AI then regulation by government is needed to stop this overreach, which has the primary goal of banning you permanently forever making sure you can never come back to programming, in the event some AI in their system decides you have done something “wrong”.
Agreed.
Fortunately for all, clankers aren't required at all to program.
They aren’t, but we’re riding an exponential here. It’s like saying ‘you can still build a computer out of transistors’ in 1976 - as true and irrelevant today as it was then.
this was inevitable
Hmm, is this a thing for enterprise accounts too? My employer has gone all-in on Claude, but if I get a pop-up that requires me to give my ugly mug to a literal cardinal enemy of the human race Peter Thiel, then I will have to seriously consider switching jobs, because I have some of them silly principles.
They probably don't have much, I guess.
I should be worried about this, but Anthropic's products are a paid product. You can't use them without providing some identifying information, unless you're going out of your way to provide them inaccurate information.
I generally dislike services which require this level of identity verification but also, so far, those have mostly been freemium services and community tools. And I dislike gating those communities.
I'm sure I should have more of a problem with this.
I have no problem paying my groceries with my credit card, but I'd rather not give them a copy of my passport...
The British company doing age verification for Discord got hacked and the hackers got about 70k user identity documents. Discord claimed that the scanned documents would be deleted after verification. Surprise! They were not deleted at all.
https://www.theguardian.com/media/2025/oct/09/hack-age-verif...
What about a signed attestation of your identity based on your passport? I don’t particularly want a future where we need to present ID for any online service, but for certain high-risk services (e.g. financial services, medical records, government portals) I’d rather a proper identity system than cobbling something like this together.
As an aside, when traveling internationally it’s not uncommon to need to provide your passport information if you want to get a sales tax rebate. I’ve never purchased something expensive enough abroad to bother with it.
I'd have a problem with that too. In fact, I'd have a problem not being allowed to use cash.
I am deeply inconsistent on this.
that's what my credit card is for
Credit cards don't validate age. I set my kids up with authorized user cards when they started going out with friends.
Maybe they should. It would be far less intrusive for venders who wish to sell age restricted goods to simply have it built in like that.
You want to let every merchant I swipe my card at know my age? To improve privacy?
No, the assumption is that you must be 18 years old to apply for a credit card. Surely we could have the machines determine that an "authorized user card" does not guarantee 18+ but the actual card does.
They don't need to know the exact age. But if kids couldn't get credit cards... then possession of a credit card would be a proof of adulthood.
An absence of "this user is underage" flag would be good enough. Less than 1 bit of private information leaked by transaction.
ceejayoz> You want to let every merchant I swipe my card at know my age? To improve privacy?
Remember the site guidelines:
SG> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize.
The obvious solution is instead of "every transaction comes with the user's birthday", the vendor can in some way set a minimum age enum of say (13, 15, 18, 21, 25) — a handful of ages that are significant with respect to some law or regulation. Then the transaction succeeds or fails.
neither does my face but that isn't stopping them, so what exactly are we doing here
Isn't the face scan intended to validate your passport, which you show, and does demonstrate age and identity?
haha wait they ask for passports too?
The Fable shutdown requires them to limit access to US citizens.
Elon: I will burn 20 years of goodwill I have gathered with the tech community. Sam Altman: I will make sure to increase the price of all semiconductors, so you are not the most hated. Dario: You are not leaving me here alone to be the good guy, so hold my beer.
This is a horrible idea and I’m sure it’ll go very well for them. /s
This is their only option. Someone who is the head of security for a hospital IT department needs access to mythos, and some 17 year old with fraud convictions doesn't.
It's the same reason we require ID for alcohol and gun purchases. Obviously it isn't a perfect system, teens drink but good luck suggesting that 13 year old should be allowed to buy alcohol.