The headline claim assumes that Anthropic is operating the API at cost, and losing massive amounts of money on subscriptions
My own impression based on inference prices for deepseek or other "open" models in the 1T range (including providers like DeepInfra with no obvious reason to subsidize their API costs) is that Anthropic is offering subscriptions at cost (on average, power users are a bit more expensive, casual users more profitable) and making good profit on API pricing. Profit that then is spent on model training, marketing and development, for an overall negative bottom line
Edit: in case it gets changed: current headline is "Anthropic/OpenAI may be spending more than $1000 for every $100 you pay them"
The assumptions are so much worse than that:
> Methodology & assumptions: No caching
This is absolutely absurd. Claude Code is of course using the cache (and this can be verified by looking at the traffic). It would be an incredibly stupid design to resend the whole input without a cache for every input, every tool use, etc.
The no-caching is explained in the linked conversation. The numbers are from actual use.
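For what it's worth, here is a rough sketch of why the caching assumption moves the numbers so much. It models an agent session where every turn resends the whole, growing context. Every figure in it (turn count, context sizes, the $10 per million input tokens, the 10% cached-read rate) is a made-up placeholder and not anyone's real pricing, and it ignores output tokens and any cache-write surcharge.

```python
def session_input_cost(turns, base_context, new_tokens_per_turn,
                       price_per_mtok, cache_read_factor=1.0):
    """Input-token cost of a session in which each turn resends the full context.

    cache_read_factor is the fraction of the normal input price charged for
    tokens the provider has already seen; 1.0 means no caching discount.
    """
    total_cost = 0.0
    cached = 0               # tokens already sent in earlier turns
    context = base_context   # tokens sent on the current turn
    for _ in range(turns):
        fresh = context - cached  # tokens seen for the first time this turn
        billed = fresh + cached * cache_read_factor
        total_cost += billed * price_per_mtok / 1_000_000
        cached = context
        context += new_tokens_per_turn
    return total_cost


# Hypothetical session: 50 turns, 20k-token starting context, 2k new tokens per turn.
no_cache = session_input_cost(50, 20_000, 2_000, price_per_mtok=10.0)
with_cache = session_input_cost(50, 20_000, 2_000, price_per_mtok=10.0,
                                cache_read_factor=0.1)
print(f"no cache: ${no_cache:.2f}, with cache: ${with_cache:.2f}")
```

Under these placeholder numbers the uncached session costs several times more on input tokens alone, so a "no caching" methodology and a "cache everything" one will not land anywhere near each other.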
As usual, there’s no factual basis for the claims other than “I made it up” and the author doesn’t seem to have technical experience with ML. A lot of weasel words doing all the heavy lifting here.
The data is given and not made up. The data is based on actual use and on analysis of public numbers.
And yet the crazy thing is that this stuff is par for the course human-wise and yet HN continues to rise up with pitchforks in hand complaining that ai is going to replace whatever this is. Like we're losing something.
I see combined estimated revenue for Q1/2026 to be $15B to $20B, depending on source. I also see combined estimated spend Q1/2026 at $15 to $20B, depending on source.
Someone or something is having a hallucination that would make an AI jealous
There are no revenue estimates in the article. Only estimates of cost per task, made as seriously as possible.
right, author did not bother to do some simple queries for actual revenue vs actual costs
banks used to lose a lot of money on those toasters, amazing they are still in business
That’s probably because only a handful of companies manufacture GPUs, and they’re still expensive. I think that will change over time as competition increases. LLMs are also still in a relatively early stage. We’re already seeing models become both smaller and more capable—for example, GPT-4 compared to Qwen3-30B, which can outperform GPT-4 in many tasks while using significantly less compute. So if this trend continues, they will be making good profits on your $100
Not to mention the rise of Chinese chips.
GPUs that can run everything from Crysis to CUDA are a harder engineering problem to solve than creating a chip that's optimized for inference. Not to mention that inference is an excellent first step towards a full, competitive GPU as well.
Which chips are these? It seems the main challenge now is data bus speed and memory capacity, and it seems no one can really compete with NVIDIA? And I doubt NVIDIA is still optimizing for anything except LLMs/AI; their last keynote had less than two minutes of gaming-related content and they even canceled their new generation of gaming cards for now.
And it seems all of these advanced chips rely on the most advanced lithography which is tightly guarded and supply locked by a few companies.
There seem to be two doubtful assumptions being made here:
1. That the API pricing is required to make a profit, rather than being effective market segmentation to make a larger profit.
2. That if subscriptions are loss making, it is not worth having loss leaders.
Hopefully we will have more information about these companies when Anthropic's IPO filing becomes public. There's too much speculation without it.
Not if they commit the kinds of market fraud that Musk's companies have.
Why was this flagged? The article is a serious attempt at observation and analysis.
Anthropic and OpenAI have the most efficient tokens per unit of compute on the planet and honestly that's their current moat. They're able to serve tokens at half the cost of any open-source provider. Here's the costs to serve opus 4.7 in china on aws according to one of my connections that operates an enterprise account in the region:
And I have zero doubts that, using batching and other optimizations, subscription users are being served at an even lower cost. Most of their expenses likely come from training, as we're far into diminishing-returns territory. We will know once Anthropic is required by law to report these numbers, so there's no point in continued speculation that "anthropic is losing $9 for every $1" because 1: unless there's some subsidies going on it's not true and 2: we will be told directly from Anthropic what the numbers are in the near future.
I really don't know how these other code bases are structured. Our team ran cc-usage and our cost is right about what we pay as our monthly license. This is only those on the team pro side.
Our code base is not small, millions of lines of code. It does not take $65 in tokens to solve an issue. I'm running 3-4 Claude Code terminals at the same time and I'm still pretty close to what the usage would cost per token. I don't know what we are doing right with our code or claude.md to make this happen and I don't want to change it and break it.
Also worst case. If it takes $65 to make one PR? I’m paying $100-$300 for a human to make one PR. There was zero math around the human developer cost saved.
Is the headline strictly about subsidized subscriptions? Anthropic announced their first profitable quarter.
Yes, the number is based on subsidised subscriptions and the article makes that clear.
Anthropic is absolutely not profitable on a GAAP basis.
Companies may state cash flow positive, operating profit, EBITDA positive, but this is not a true profit in aggregate, just when stripping out many other expenses.
If anyone has evidence to the contrary, please share. Once they go public it will all be free to review, at least.
https://hugston.com/news/the-subscription-to-incompetence-is...
I would think it depends on what you count in as cost, how long you can operate a data center, in which intervals you need to train new frontier models etc.
There are lots of knobs to dial for your costs.
Article flagged by the HN moderators owning pre-IPO shares in the companies discussed in this article: Anthropic, OpenAI and SpaceX
Wait, what?
So we're in the $6 Uber ride era of AI still?
Uber rides were never going to get cheaper over time without robot drivers. A software product is not the same thing. It could get cheaper to serve over time.
https://en.wikipedia.org/wiki/Gartner_hype_cycle
I’d say we are at the peak of inflated expectations, exactly where they want things to be for the IPOs as they unload their bags on the public. There will be an absolutely massive crash that will destroy the stock market, you’ll finally be able to buy ram and a hard drive again, and then AI will really take your job in the plateau of productivity that comes much later.
Not comparable - drivers and space in a city are inherently limited, while there is no limit on GPU manufacturers, datacenter providers, or LLM companies
Hang on to your wallet, you ain't seen nothing yet.
The true cost of AI won't be revealed until after a large portion of the customer base has become "hooked" on it.
“Hooked” is doing a lot of work there.
I’m happily paying $200/mo for Claude code. The tokens I use would be >$10k at API rates. I’m building the best products of my life, and multiple of them in parallel. I’m doing better creative work, finally realizing a game idea I’ve had no time for, etc.
If this level of usage goes to $500/mo… I’ll be out. It’s worth what I’m paying, but hey I went a decade without writing that song or building that game. It’s not freaking heroin, it’s a tool that offers a good value for what I pay.
That is interesting. $200/month for >$10k at API-pricing? That's way more than the 2.5x-12.5x the article observes.
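Spelling out the arithmetic behind that comparison, as a quick sketch: the $200 and $10k figures are the parent commenter's own, the 2.5x-12.5x range is the one attributed to the article above, and since the claim was ">$10k" the result is a lower bound. The function name is just illustrative.

```python
def usage_multiple(api_equivalent_usd, subscription_usd):
    """How many times the subscription price the same usage would cost at API rates."""
    return api_equivalent_usd / subscription_usd


ARTICLE_RANGE = (2.5, 12.5)  # range quoted in the comment above

multiple = usage_multiple(10_000, 200)
print(f"at least {multiple:.0f}x, versus an article range of "
      f"{ARTICLE_RANGE[0]}x-{ARTICLE_RANGE[1]}x")
```

So one self-reported power user sits at roughly four times the top of the article's range, which says more about how wide the spread between users is than about the average.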
I interpreted hooked to mean “the degradation of skills has made it significantly harder to transition back to doing things with auto-complete/by hand”.
Could be wrong though.
Fair enough. Though in that framing I’d question whether SWE skills degrade that quickly, or are unrecoverable. I took a year off from writing software… I was fine.
Can you share some use cases of what you think will get companies to become hooked? I'm really failing to see a dependency link emerging either now or in the future.
That combined with the relatively easy switching costs. It doesn't bode well for AI companies seeking to create a walled garden.
Teachers in public schools are being told to use AI in their curriculum. In some cases, this means students are being taught not to think (regardless of the intent of the lesson). When prices make this curriculum untenable but the kids already depend on it, that rugpull is going to severely harm a generation of kids whose education was already disturbed in 2020.
Good lord! Where is this happening? Why would anybody think this was a good idea?
That + when retail investors are the ones holding the bag.
Some countervailing forces off the top of my head:
* Hardware improvements will reduce costs
* Model training improvements (read: more efficient model training) will reduce costs
* Better models will reduce costs (more inference for less hardware time while keeping quality constant)
* Tooling and platform will stabilize—less need to dump money into applications and backend systems because they will become mature—also improvements in AI efficiency and quality will lower the cost of maintenance and future feature development
* Energy buildout will stabilize (we will eventually have enough energy supply to meet AI demand)
* Chips market will stabilize (chip supply will catch up to AI demand, lowering the hardware costs)
I'm not sure I agree. Beyond the Anthropic world, we see
1. The GitHub Copilot pricing change that started a week ago has already changed buying decisions.
2. Small-name open-weight providers selling at what I assume, and hear through the grapevine, is a profitable place.
Claude is overpriced for what you get, and if the headline is true, expensive to run. I do wonder if their API pricing is profitable. That's the word on the street about Big AI, they are making money on the PayGo
By "word on the street" do you mean "word that they are leaking to the press so they get favorable coverage, and based on opaque and questionable accounting"?
The press is not the channel I hear things through. New media is fundamentally broken right now and you should not use it as a source for quality information.
If this is the true cost of AI then the future might be dedicated extension cards for computers that hardcode entire models + weights.
Downside: you need to buy a new one for each model.
Upside: insanely fast inference and zero subscription cost, only one time purchase cost.
Once a certain open source model gets good enough this might become viable.
Right now the landscape is still shifting too fast.
State of the art models might remain on subscription, expensive and might be used by large companies only.
State of the art companies might also create their own hardware with hard-baked weights on chip that they don't release to the public, as it might just make more financial sense long term once they "stabilize" on a certain model.
If local AI can make financial sense, then cloud AI will make even more financial sense - an AI card can serve a number of users simultaneously, can be utilized 24x7 instead of however many hours per week you use it, and has other improved efficiencies at scale. Cloud offerings with privacy and security controls will be available for those who want it, just like non-AI cloud offerings with security/privacy options are available today.
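A back-of-the-envelope version of that utilization argument, as a sketch only: the card price, lifetime, hours of use and number of concurrent users are all invented for illustration, and power, networking, staffing and the provider's margin are left out entirely.

```python
HOURS_PER_WEEK = 24 * 7


def cost_per_user_hour(card_price, lifetime_weeks, active_hours_per_week,
                       concurrent_users=1):
    """Hardware cost amortized over every user-hour the card actually serves."""
    user_hours = lifetime_weeks * active_hours_per_week * concurrent_users
    return card_price / user_hours


# Hypothetical: the same $2000 card kept for three years.
local = cost_per_user_hour(2000, 156, active_hours_per_week=10)
cloud = cost_per_user_hour(2000, 156, active_hours_per_week=HOURS_PER_WEEK,
                           concurrent_users=8)
print(f"local: ${local:.3f}/user-hour, cloud: ${cloud:.4f}/user-hour")
```

The card price and lifetime cancel out of the comparison, so the gap comes purely from hours in use and users served at once. The cloud side still has to pay for everything this sketch ignores.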
I would shell out cash right now for something like Opus on silicon, like what Taalas [0] has built for Llama 3.1.
Having lightning-speed, local inference of a super high-quality model would be incredible. If you haven't played with it, check out Taalas's demo [1].
Honestly, though - I have my doubts. Recurring revenue is just too nice to pass up; I'm sure AI companies wouldn't want me buying a dedicated Opus card and not giving them money for several years until there's something worth upgrading to.
[0] https://taalas.com/
[1] https://chatjimmy.ai/
Recurring revenue does you no good if it is in fact a recurring loss because your subscription customers use up more than you're charging them.
Of course expecting the metaphorical Harvard Business School analysts to realize that is asking a lot. Subscriptions are Good and Goodness is Subscriptions, and like any other mass of people following trends the preconditions on when subscriptions are good for a business tend to get lost in the frenzy.
Or someone works out a hardware architecture that's optimized for AI inference in the way you describe, but also good for 3D graphics.
That would fuse 3D graphics and AI accelerators into 1 and the same unit, as far as consumer hardware is concerned.
>Upside: zero subscription cost, only one time purchase cost.
the world has long since moved on from that business model, unfortunately.
Yes, this is what I'm waiting for. I do hope these cards will not come with any kind of vendor locking though.
What kind of vendor locking would be possible?
Any kind really. If Apple can lock down the CPU of an iPhone, then I'm sure it is possible to lock down an LLM chip. The business model may then be that you can buy certain "agent apps" and run them on your LLM card. But I have to stop here because I don't want to give anyone any ideas though I'm sure they are "creative" enough.
IP-protected models manifested directly in silicon.
Everything we’re using now is the equivalent of building a GPU on an FPGA: the hardware is general purpose at one abstraction level, and that comes with inefficiency at the next layer up. Collapse the levels, gain efficiency at the cost of generality.
The whole premise of what's being described here is to bake the weights into the silicon. That isn't what I'd describe as vendor lock-in, any more than I'd describe a CPU that can only execute ARM instructions as vendor locked.
To answer my own question, I bet they could figure out a way to still bill you per-token, if they wanted to.
Portability between x86 and ARM is not a form of vendor lock?
And of course they could bill per-token, same way cable PPV worked (the bits were already in your house). But the cost structure of weights in silicon means that competitors would be encouraged to compete on this per-token cost, as their marginal cost would be zero.
I don’t see that being a durable business model, but I guess the counter argument is it’s also similar to game consoles, where initial hardware is subsidized and the business model assumes ongoing payment for bits.
And how much money are they making off our non-training data? Or what is the ROI short and long term of that massive valuable data set? Surely there's at least a valuable subset of ideas that if executed better than the incumbents nets them massive value.
I find it disingenuous when people narrow in and focus on the cost of tokens as if that's the only way the companies make money.
They are doing a massive data grab and stealing and thieving your IP and data, non-training data sharing cannot be opted out of.
And this is the reason why the AI companies are now preparing a bailout by the US government. We will be moving quickly from "your 401K is their exit liquidity" to "US Treasuries are their exit liquidity"...
Meeting next week at the White House, by coincidence just before the SpaceX IPO. Message to investors will be: don't worry, the US has your back...
At which point the corruption is sooo big, that an Empire crumbles under its own stench?
"Trump to meet AI leaders to discuss US investment in their companies" - https://www.bbc.com/news/articles/c98r8r7dz5no
"Trump Officials Held Millions of Dollars of SpaceX Ahead of IPO" - https://finance.yahoo.com/markets/stocks/articles/trump-offi...
The future of AI is ads, free will have lots of ads and eventually go away. Low paid tier will have a few ads, the $2000 tier no ads.
I just hope local LLMs keep getting better and ways to make them run faster on consumer devices improve
Sorry, you need to sign up for HN+ if you don’t want to be downvoted when pointing out the obvious enshittification that will occur with LLMs