Anthropic tightens usage limits for Claude Code without telling users

394 points by mfiguiere 2 days ago

> One user, who asked not to be identified, said it has been impossible to advance his project since the usage limits came into effect.

Vibe limit reached. Gotta start doing some thinking.

m4rtink 2 days ago

Who would have though including a hard depedency on third part service with unclear long term availability would be a problem!
Paid compilers and remotely acessible mainframes all over again - people apparently never learn.
- brokencode 2 days ago
  
  It’s only a hard dependency if you don’t know and never learn how to program.
  For developers who read and understand the code being generated, the tool could go away and it would only slow you down, not block progress.
  And even if you don’t, it really isn’t a hard dependency on a particular tool. There are multiple competing tools and models to choose from, so if you can’t make progress with one, switch to another. There isn’t much lock-in to any specific tool.
  - iaw 2 days ago
    
    My experience has been that Claude can layout a lot of things in minutes that would take me hours if not days. Often I can dictate the precise logic and then claude get's most of the way there, with a little prompting claude can usually get even further. The amount of work I can get done is much more substantial than it used to be.
    I think there is a lot of reticence to adopt AI for coding but I'm seeing it as a step change for coding the way powerful calculators/workstation computers were for traditional engineering disciplines. The volume of work they were limited to using a slide rule was much lower than now with a computer.
  - latexr a day ago
    
    > For developers who read and understand the code being generated, the tool could go away and it would only slow you down
    Recent research suggests it would in fact speed you up.
    https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
    
    bjclark a day ago
    
    You should actually read the paper. N size of 16. Only 1 of which had used cursor more than 40 hours before. All people working in existing code bases where they were the primary author.
    
    latexr 12 hours ago
    
    I did read the paper, and the HN discussion (which is how I found it). I recommend you read that, your comments were addressed.
    https://news.ycombinator.com/item?id=44522772
    
    brokencode a day ago
    
    Interestingly, devs felt that it sped them up even though it slowed them down in the study.
    So even if it’s not an actual productivity booster on individual tasks, perhaps it still could reduce cognitive load and possibly burnout in the long term.
    Either way, it’s a tool that devs should feel free to use or not use according to their preferences.
- manquer 2 days ago
  
  > Paid compilers.
  I don't think this one is a good comparison.
  Once you had the binary, the compiler worked forever[1]
  The issue with them was around long term support for bugs and upgrade path as the language evolved.
  ---
  [1] as long you had a machine capable of running/emulating the instruction set for the binary.
  - o11c 2 days ago
    
    Hm, I am assuming that paid compilers were largely gone before the whole "must have this dongle attached to computer" industry? Because for software that uses those, "I paid for it" absolutely does not guarantee "I can still run it". The only reason it's not more of a problem is the planned obsolescence that means forced to upgrade sooner or later (but, unlike purely subscription-based services, you have some control over how frequently you pay).
    
    codebje 2 days ago
    
    Sadly, paid compilers still exist, and paid compilers requiring a licensing dongle still exist. The embedded development world is filled with staggering amounts of user hostility.
    
    delta_p_delta_x 2 days ago
    
    My understanding is that much of the established embedded world has moved to any one flavour of GCC or (more commonly) Clang, just because maintaining a proprietary optimising compiler is too much effort than just modifying (and eventually contributing to) Clang.
    
    mlsu 2 days ago
    
    Tough for me to speak about embedded in general, but many companies are on vendor toolchains or paid compilers by choice, and it is the right choice to make given the tradeoffs involved.
    IAR for example is simply a fantastic compiler. It produces more compact binaries that use less memory than GCC, with lots and lots of hardware support and noticeably better debugging. Many companies have systems-engineering deadlines which are much less amenable to beta quality software, fewer software engineering resources to deal with GCC or build-chain quirks (often, overworked EEs writing firmware), and also a strong desire due to BOM cost to use cheaper/less dense parts. And if there is a compiler bug or quirk, there is someone on the other end of the line who will actually pick up the phone when you call.
    That said, some of those toolchain+IDE combos absolutely do suck in the embedded world, mostly the vendor-provided ones (makes sense, silicon manufacturers usually aren't very good at or care much about software, as it turns out).
    
    TeMPOraL 2 days ago
    
    > Tough for me to speak about embedded in general, but many companies are on vendor toolchains or paid compilers by choice, and it is the right choice to make given the tradeoffs involved.
    That's true in general. With paid licenses and especially subscriptions, you're not just getting the service, you're also entering a relationship with the provider.
    For companies, that often matters more than the service itself - especially when support is part of this relationship. That's one of many reasons companies like subscriptions.
    For individuals, that sucks. They don't need or want another relationship with some random party, that they now need to keep track of. The relationship has so much power imbalance that it doesn't benefit the individual at all - in fact, for most businesses, such customer is nothing more than a row in an analytics database - or less, if GDPR applies.
    
    codebje 2 days ago
    
    8051s pretty much mean Keil - they used to do license dongles, but it's all online now. You really don't get much more established than the 8051. If you pick up any cheap electronic product and crack it open to find a low part count PCB with a black epoxy blob on it, chances are very good there's an 8051 core with a mask ROM under the blob.
    (Also AVR/PIC compiler from Microchip had a dongle as recently as February this year, and it looks like it's still available for sale even though its license isn't in the new licensing model).
    
    lelanthran 2 days ago
    
    > My understanding is that much of the established embedded world has moved to any one flavour of GCC or (more commonly) Clang
    Clang is not being professionally used commonly in the embedded space.
- stavros 2 days ago
  
  You need to run the cost/benefit analysis here: if I had avoided Claude Code, all that would have happened is I would have written much less code.
  What's the cost between never using Claude, and using it and getting these lower limits? In the end, I'm left with no Claude in both situations, which leaves me better off for having used it (I wrote more code when Claude worked).
  - latexr a day ago
    
    Did you write more code with Claude? Isn’t the point that you have in fact written less (because Claude wrote it for you)?
    As for the cost, you are ignoring the situation where someone has depended on the tool so much that when it goes away they are atrophied and unable to continue as before. So no, in your scenario you’re not always left better off.
  - m4rtink 19 hours ago
    
    The metric being more lines of code usually turnes out to not be a very good one. Can it also help you do the same with less lines of code & reduced complexity ?
- ivape 2 days ago
  
  Everyone that successfully avoided social media for the last decade escaped with their mental health. Everyone that carefully moderates their ai intake (e.g don’t depend on Claude Code) will also escape with their skills over the next decade, others will become AI fiends, just the like social media fiends. Just knowing tech like the internet and ai can fuck your whole brain up is enough to be ahead of the curve. If you didn’t learn the lesson from the uptake of video games, cellphones, tv, streaming (the list is endless), then you’re not paying attention.
  The destruction of spelling didn’t feel like doomsday to us. In fact, I think most people treated the utter annihilation of that skill as a joke. “No one knows how to spell anymore” - haha, funny, isn’t technology cute? Not really. We’ve gone up an order of magnitude, and not paying attention to how programming is on the chopping block is going to screw a lot of people out of that skill.
  - rlupi 2 days ago
    
    Very thoughtful comment, let me try to capture it more clearly.
    Zuckerberg recently said that those not using AI glasses will be cognitively disadvantaged in the future.
    Imagine an AI-native from the future and one of the fancy espresso coffee machines that we have today. They will be able to know right away how to operate them from their AI assistants, but they won't be able to figure out how they work on their own.
    That's the future that Zuckerberg wants. Naturally, fancy IT offices will likely be gone. The AI-native would have bought the coffee machine for nostalgia effect for a large sum, trying to combat existential dread and feeling of failure which are fueled by their behavior being even more directly coerced into consumption.
  - Cagrikartal 2 days ago
    
    curious, maybe one could go and spin up a study for using claculators instead of calculating manually and how it can lead to less x type of thinking and affect our abiltiy but maybe even if that is true(i am not sure maybe it is just in the domains we dont feel like we need to care much or etc) would people quitting clacutors a good thing for getting things done in the world ?
    
    wongarsu a day ago
    
    For me the thing that atrophied basic math skills wasn't the calculator, which was invented decades before I was born, but the rise of the smart phone.
    Sure, calculators are useful in professional life and in high school math and sciences. But you still had to do everyday math in all kinds of places and didn't always have a calculator at hand. The smartphone changed that
    I feel that's relevant in two ways: just like with math, a little bit of manual coding is going to be a huge difference compared to no manual coding, and any study like the one you propose would be hugely complicated by everything else that happened around the time, both because of smart phones and the coinciding 2008 crash
  - Cagrikartal 2 days ago
    
    curious, maybe one could go and spin up a study for using claculators instead of calculating manually and how it can lead to less x type of thinking and affect our abiltiy but maybe even if that is true(i am not sure maybe it is just in the domains we dont feel like we need to care much or etc) would people quitting clacutors a good thing for producing value in the world by the will of God?
skort 2 days ago

Right, but these companies are selling their products on the basis that you can offload a good amount of the thinking. And it seems a good deal of investment in AI is also based on this premise. I don't disagree with you, but it's sorta fucked that so much money has been pumped into this and that markets seem to still be okay with it all.
- AlexandrB 2 days ago
  
  They're not selling them, they're still giving them away. Once the VC money runs out we'll see what the actual cost of this stuff is.
  - blitzar a day ago
    
    > They're not selling them
    I have a receipt of sale that says otherwise.
    
    ath3nd 6 hours ago
    
    Oh, you beautiful summer child. They are losing money on you . Do you think they are doing that out of the goodness of your heart? They are luring you in, making you dependent on them at a net loss while the VC money are lasting.
    When you are totally hooked and they are fully out of money, that's when you'd realize the long con you've been dragged into. At this very moment, that they are tightening the usage limits they are not telling you, and you still think the peanuts you are paying them now would be enough in the future? It's called https://en.wikipedia.org/wiki/Enshittification and you better know that you are in it.
  - spongebobstoes 2 days ago
    
    most inference runs at 40%+ margin
    
    darylteo 2 days ago
    
    Is this like saying a gym runs at 40%+ margin because 80% of users don't really use it heavily or forget they even had a subscription? Would be interested to see the breakdown of that number.
    
    wongarsu a day ago
    
    That's how nearly every subscription service works, yes. Some fraction has a subscription and doesn't use it, another large chunk only uses a fraction of their usage limits, and a tiny fraction uses the service to it's full potential. Almost no subscription would be profitable if every customer used it to its full potential
    
    TeMPOraL 2 days ago
    
    TL;DR: their subscriptions have an extra built-in margin closer to 70%, because the entry price, target audience and clever choice of rate limiting period, all keep utilization low.
    ----
    In this case I'd imagine it's more of an assumption that almost all such subscriptions will have less than 33% utilization, and excepting few outliers, even the heaviest users won't exceed 60-70% utilization on average.
    "33% or less" is, of course, people using a CC subscription only at work. Take an idealized case - using CC only during regular working hours, no overtime: then, even if you use it to the limit all the time, you only use it for 1⁄3 of the day (8h), and 5 days a week - the expected utilization in this scenario is 8/24 × 5/7 = 24%. So you're paying a $200 subscription, but actually getting at most $50 of usage out of it.
    Now, throw in a rate limit that refreshes in periods of 5 hours - a value I believe was carefully chosen to achieve this very effect - and the maximum utilization possible (maxing out limit, waiting for refresh, and maxing out again, in under 8 hours wall-clock time), is still 10 hours equivalent, so 10/24 × 5/7 = 30%. If you just plan to use CC to the first limit and then take meetings for the rest of the day, your max utilization drops to 15%.
    Of course people do overtime, use same subscription for personal stuff after work and on weekends, or just run a business, etc. -- but they also need to eat and sleep, so interactively, you'd still expect them to stay below 75% (83% if counting in 5-hour blocks) total utilization.
    Sharing subscriptions doesn't affect these calculations much - two people maxing out a single subscription is, from the provider side, strictly not greater than two subscriptions with 50% utilization. The math will stop working once a significant fraction of users figure out non-interactive workflows, that run CC 24/7. We'll get there eventually, but we're not there yet. Until then, Anthropic is happy we're all paying $200/month but only getting $50 or less of service out of it.
    
    ehnto 2 days ago
    
    Is that for per token costs or in these bundled subscriptions companies are selling?
    For example, when playing around with claude code using a per token paid API key, it was going to cost ~$50aud a day with pretty general usage.
    But their subscription plan is less than that per month. Them lowering their limits suggests that this wasn't proving profitable at the current limits.
    
    Shorn a day ago
    
    I thought it was understood all the large vendors were losing money bigly on inference and will have to pull the rug eventually.
    
    oldenlessons a day ago
    
    Cost of revenue should include R&D and amortization. Pointing to EBTIDA is a very old trick.
    
    manojlds a day ago
    
    Definitely doesn't sound true unless you have the receipts.
    
    irjustin 2 days ago
    
    Is that to the $20/mth plan or the 137?
    
    jrflowers 2 days ago
    
    Exactly. The enormous margin is why companies like OpenAI and Anthropic are known for being so immensely profitable. Just money printing machines compared to the amount of cash they burn
    
    csomar 2 days ago
    
    They'd be still at massive losses. You can spend your monthly subscription price in a day.
  - tecleandor 2 days ago
    
    Yeah, they are _saying_ that they're selling you a service but there will be surprises...
- gscott 2 days ago
  
  How long can AI be subsidized in the name of growth? They need to radically increase the price. If I replace a $150k yr employee should I pay $200 a month or $2,000 a month. $200 is too cheap.
  - TeMPOraL 2 days ago
    
    $200 a month with Opus 4 and Sonnet 4 won't let you replace a $12.5k / month employee - but it's cheap enough that everyone, including you and even your employees, will want to see how much utility they can squeeze out of it.
    This is a price to get people hooked up, yes, but also to get them to explore all kinds of weird tricks and wacky workflows, especially ones that are prohibitively costly when billed per token. In some sense, this is crowdsourcing R&D - and then when Opus 7 or whatever comes along to take advantage of best practices people worked out, and that turns out to be good enough to replace your $150k/yr / $12.5k/mo employee - then they'll jack up prices to $10k/month or whatever.
    
    perks_12 a day ago
    
    $200 a month gives your $12.5k / month employee a handy assistant who can take care of things you'd love to automate or would employ a junior dev for.
    
    ath3nd 6 hours ago
    
    And makes your codebase completely unmaintainable, even for Claude. The more Claude runs one does, the more unmaintainable the codebase becomes, and more tokens Claude spends on each subsequent run. It's a perfect ecosystem!
  - wqaatwt 2 days ago
    
    > replace a $150k
    Seems tangential? Price entirely depends on what consumers/businesses willing to pay and the degree of competition
  - benterix 2 days ago
    
    > If I replace a $150k yr employee
    Based on my experience with Claude Code (which is relatively good TBH), I'd say good luck with that.
  - whywhywhywhy a day ago
    
    You can spend $2000 a month if you want, they have an pay what you use option.
latexr a day ago

> Gotta start doing some thinking.
The fact they declared their own project as “impossible to advance” given the situation reveals they are unwilling to go the thinking route right now.
epolanski 2 days ago

Hmmm, I am 99% sure the users are not vibe coders who can't code, those are on tools like lovable, not messing with terminal tools.
gdudeman 2 days ago

More than some thinking. They’ll probably need to think hardest or even ultrathink to keep the project moving forward.
bGl2YW5j 2 days ago

Came to comment on the same quote.
I'm surprised, but know I shouldn't be, that we're at this point already.
- mrits 2 days ago
  
  First one was free
- mattigames 2 days ago
  
  I would be a little disappointed if that wasn't the case, after all we have been there quite a while in regards to the art models.
dude250711 2 days ago

He did not pass the vibe check.
chewz 2 days ago

People who complain that Claude Code Max $200 isn't enough are first to let go in my opinion.
They have have either shitty codebase or can't narrow down the scope. Or both. Not the kind of folks you want on your team.
- TeMPOraL 2 days ago
  
  That's a nonsense take. How fast you burn through usage limits depends on your use patterns, and if there's one thing that's true about LLMs, is that you can practically always improve your results by spending more tokens. Pay-per-use API pricing just makes you hit diminishing returns quickly. With Claude Code subscription, it's different.
  The whole value proposition of a Max subscription is that it lets you stop worrying about tokens (Claude Code literally tells you so if you type `/cost` while authenticated with a subscription). So I'd turn around and say, people who don't regularly hit usage limits aren't using Claude Code properly - they're not utilizing it in full.
  --
  Myself, I'm only using Claude Code for little R&D and some side projects, but I upgraded from Max x5 to Max x20 on the second day, as it's trivial to hit the Max x5 limit in a regular, single-instance chat. And that's without any smarts, just a more streamlined flavor of good ol' basic chat experience.
  But then I look around, and see people experiment with more complex approaches. They run 4+ instances in parallel to work on more things at a time. They run multiple instances in parallel to get multiple solutions to the same task, and then mix them into a final one - possibly with help of yet another instance. They have the agent extensively research a thing before doing it, and then extensively validate it afterwards. And so on. Any single one of such tricks/strategies is going to make hitting limits on Max x20 a regular occurrence again.
mrits 2 days ago

I honestly feel sorry for these vibe coders. I'm loving AI in a similar way that I loved google or IDE magic. This seems like a far worst version of those developers that tried to build an entire app with Eclipse or Visual Studio GUI drag and drop from the late 90s
- stavros 2 days ago
  
  I really don't like how religious the debate about AI has gotten here. "I feel sorry for these vibe coders" is something you tell yourself to feel superior to people who use AI.
  Don't feel sorry for me, I've used vibe coding to create many things, and I still know how to program, so I'll live if it goes away.
  - Wubdidu a day ago
    
    Same, I really like the solutions one can build with LLMs and have a lot of fun working with them to improve use-cases where it actually makes sense. Its the first time since years I really enjoy coding on side-projects and take great care to give clear instructions and review and understand what LLMs build for me, except some completely irrelevant/one-shot things I entirely "vibe code".
    Its gotten so bad I'm actively trying to avoid talking about this in circles like Hacker News because people get so heavily and aggressively discredited and ridiculed as if they have no idea what they are doing or are a shill for big AI companies.
    I know what I'm doing and actively try to help friends and co-workers use LLMs in a sustainable way, understanding their limitations and the dangers of letting them loose without staying in the loop. Its sad that I can't talk about this without fear of being attacked, especially in communities like Hacker News that I previously valued as being very professional and open, compared to other modern social media.
- geoduck14 2 days ago
  
  Why isn't anyone talking about the bevvy of drag-and-drop no colder solutions that have already been in the market? Surely the LLMs are competing with those tools, right?
  - Workaccount2 2 days ago
    
    People trash LLM code as if most consumer software isn't buggy piles of half assed code.
    
    eppsilon a day ago
    
    all software if we're being honest :)
- nikisweeting 2 days ago
  
  Hundreds of billions of dollars have changed hands through shitty drag-and-drop UIs, wordpress ecommerce plugins, and dreamweaver sites, lets not forget the code is there to serve a business purpose at the end of the day. Code quality is an implementation detail that may matter less over time as rewrites get easier. I love me some beatiful hand-written clean code, but clean code is not the true goal.
  - ImaCake 2 days ago
    
    > clean code is not the true goal
    Its not, but it does matter. LLMs, being next word guessers, perform differently with different inputs. Its not hard to imagine a feedback loop of bad code generating worse code and good code generating more good code.
    My ability to get good responses from LLMs has been tied to me writing better code, docstrings, and using autoformatters.
    
    nikisweeting 2 days ago
    
    I don't think that feedback loop is really a loop because code that doesn't actually do its job doesn't grow in popularity for long. We already have a great source of selection pressure to take care of shitty products that don't function: users and their $.
  - mrits 2 days ago
    
    I don't consider drag-and-drop UIs anywhere close to wordpress plugins. I'm not talking about writing bad code, I'm talking about being able to understand what you are creating.
    
    nikisweeting 2 days ago
    
    there are many parts of computers I don't understand in detail, but I still get tremendous value using them and coding on top of abstractions I don't need to know the internals of.
- IAmGraydon 2 days ago
  
  Thats a really good comparison. Dreamweaver would be another one. You just don’t own the tool now, so it puts you at even more risk.

pembrook 2 days ago

The funny thing is Claude 4.0 isn't even that 'smart' from a raw intelligence perspective compared to the other flagship models.

They've just done the work to tailor it specifically for proper tool using during coding. Once other models catch up, they will not be able to be so stingy on limits.

Google has the advantage here given they're running on their own silicon; can optimize for it; and have nearly unlimited cashflows they can burn.

I find it amusing nobody here in the comments can understand the scaling laws of compute. It seems like people have a mental model of Uber burned into their head thinking that at some point the price has to go up. AI is not human labor.

Over time the price of compute will fall, not rise. Losing money in the short term betting this will happen is not a dumb strategy given it's the most likely scenario.

I know everybody really wants this bubble to pop so they can make themselves feel smart for "calling it" (and feel less jealous of the people who got in early) and I'm sure there will be a pop, but in the long term this is all correct.

alphager 2 days ago

Even if Moore's law was still in effect and the computer resources required stayed the same and compute stayed as efficient per watt (neither is true), it would just halve compute costs every 18 months. You're able to read about people hitting $4000 costs/month on the $200 plan upthread. That's 8 years until it's cost effective.
Are they really ready to burn money for 8 years?
- pembrook 2 days ago
  
  Uber operated at a loss for 9 years. They're now a profitable, market-winning business.
  Amazon operated at a loss for 9 years, and barely turned a profit for over a decade longer than that. They're now one of the greatest businesses of all time.
  Spotify operated at a loss for 17 years until becoming profitable. Tesla operated at a loss for 17 years before turning a profit. Palantir operated at a loss for 20 years before turning a profit.
  And this was before the real age of Big Tech. Google has more cashflows they can burn than any of these companies ever raised, combined.
  - mike_hearn a day ago
    
    Uber is profitable now, on a yearly basis. But they still presumably have a long way to go until they're actually profitable in the sense that they've made a profit over the lifetime of their existence.
    Amazon pivoted to a totally different business that was profitable from the get-go.
    Google became profitable within a couple of years of starting and went into the black on total lifetime spend almost immediately. As was normal, back then.
  - klik99 2 days ago
    
    Those aren’t good comparisons.
    Uber operated at a loss to destroy competition and raised prices after they did that.
    Amazon (the retailer) did the same and leveraged their position to enter new more lucrative markets.
    Dunno about Spotify, but Tesla and palantir both secured lucrative contracts and subsidies.
    Anthropic is against companies with deeper pockets and can’t spend to destroy competition, their current business model can only survive if they reduce costs or raise prices. Something’s got to give
    
    pembrook 2 days ago
    
    They are good comparisons. All startups go against incumbents/competitors with deeper pockets.
    Re: Anthropic specifically, I tend to agree, hence why I'm saying the deeper pockets (eg. Google, Amazon, etc) are perfectly positioned to win here. However, big companies have a way of consistently missing the future due to internal incentive issues. Google is deathly afraid of cannibalizing their existing businesses.
    Plus, there's many investors with deep pockets who would love to get in on Anthropic's next round if their technical lead proves to be durable over time (like 6 months in AI terms).
    This fight is still early innings.
    
    klik99 a day ago
    
    It’s true startups go against deeper pockets, but I stand by my analysis since Uber / Amazon / Tesla (to a degree) were early tech companies going against old companies and not competing with others doing the exact same thing. They operated at a loss to defeat the old guard. Today that model doesn’t work well, and Anthropic are against deeper pockets that are doing nearly the exact same thing as them. If they were the only innovative company with huge outside investment against entrenched and unwilling to innovate older companies like Uber and Amazon then I’d agree there was a bigger chance.
    And I like Anthropic, I want them to be successful, but they just can’t operate at a loss like this for long, they have to make some tough calls, and trying to cut corners behind the scenes is not good for long term trust
    
    willhslade 2 days ago
    
    Haven't they already cannibalized search? It really sucks now.
    
    gleenn 2 days ago
    
    Google search results "sucking" probably is an indication that they are squeezing money out of it well. Just because you don't like the results you are getting doesn't mean the average user isn't still using Google a ton and generating $$$ for Goog
    
    osn9363739 2 days ago
    
    Could this just be survivorship bias? How many companies burned money until they died? This isn't some hot take. I'm kinda interested. Surely more companies failed with this model than survived.
    
    simianparrot 2 days ago
    
    Anecdotally have worked for a company in the past that did just that and eventually went bankrupt. Know of many many more just in my city, for what it’s worth.
    
    Aeolun 2 days ago
    
    As long as Anthropic is better than their competitors they’ll continue to get my business.
  - edaemon 2 days ago
    
    Well, those companies were all successful, it's a bit of survivorship bias to only consider those. How many companies operated at a loss for years and eventually went out of business?
- sothatsit 2 days ago
  
  I think people also expect models to be optimised over time. For example, the 5x drop in cost of o3 was probably due to some optimisation on OpenAI's end (although I'm sure they had business reasons for dropping the price as well).
  Small models have also been improving steadily in ability, so it is feasible that a task that needs Claude Opus today could be done by Sonnet in a year's time. This trend of model "efficiency" will add on top of compute getting cheaper.
  Although, that efficiency would probably be quickly eaten up by increased appetites for higher performance, bigger, models.
  - absoluteharam 2 days ago
    
    [dead]
- wongarsu a day ago
  
  Every subscription service loses money on its heavy users. What matters is the average. Lots of people go for the higher plan because they need it once, then never downgrade even if their regular usage doesn't justify it. And then there are all the people paying for months where they don't use it at all
  Among private users willing to pay $200/month average usage is likely very high, but if Anthropic can convince companies to buy plans for entire departments average usage will be much lower on those.
  Another issue is that $4000 costs is assuming the regular API is offered at cost, which is unlikely to be true.
com2kid 2 days ago

> They've just done the work to tailor it specifically for proper tool using during coding. Once other models catch up, they will not be able to be so stingy on limits.
I don't subscribe to the $100 a month plan, I am paying API usage pricing. Accordingly I have learned how to be much more careful with Claude Code than I think other users are. The first day I used it, Claude got stuck in a loop trying to fix a problem using the same 2 incorrect solutions again and again and burnt through $30 of API credits before I realized things were very wrong and I stopped it.
Ever since then I've been getting away with $3-$5 of usage per day, and accomplishing a lot.
Anthropic needs to find a way to incentivize developers to better use Claude Code, because when it goes off the rails, it really goes off the rails.
- nmadden 2 days ago
  
  > The first day I used it, Claude got stuck in a loop trying to fix a problem using the same 2 incorrect solutions again and again and burnt through $30 of API credits before I realized things were very wrong and I stopped it.
  The worse it performs, the more you pay. That’s a hell of a business model. Will users tolerate that for long?
  - janejeon 2 days ago
    
    > The worse it performs, the more you pay. That’s a hell of a business model. Will users tolerate that for long?
    I mean, AWS seems to be doing fine with that business model.
andix 2 days ago

The thing is, all the models are not that 'smart'. None of them is AGI.
Currently it's much more important to manage context, split tasks, retry when needed, not getting stuck in an infinite loop, expose the right tools (but not too many), ...
conartist6 2 days ago

The problem with models is that they create lots of junk content.
Industries can often get away with polluting when they're small, but once they reach planet scale salting the earth behind you is not as reliable of a tactic.
layoric 2 days ago

Claude is decent for sure, but if you are using these models for 'smarts', that is a whole separate problem. I also think honestly people are sleeping on Mistral's medium 3 and devstral medium. I know it isn't 'smart' either (none of them are), but for mundane tasks need valid code output, it is extremely good for the price.
- kadushka 2 days ago
  
  I use o3 to brainstorm research problems, and it's been pretty useful. Especially the deep research feature.
  - layoric 2 days ago
    
    As a sounding board for things you are already well familiar with, I agree, and have experienced the same, and that can be useful. It's also a much better experience than say using Google to do the same, or just a rubber ducky.
    The NLP these models can do is definitely impressive, but they aren't 'thinking'. I find myself easily falling into the habit of filtering a lot of what the model returns and picking out the good parts which is useful and relatively easy for subjects I know well. But for a topic that I am not as familiar with, that filtering (identifying and dismissing) I do is much less finessed, and a lot of care needs to be taken to not just accept what is being presented. You can still interrogate each idea presented by the LLM to ensure you aren't being led astray, and that is still useful for discovering things, like traditional search, but once you mix agents into this, things can go off the rails far too quickly than I am comfortable with.
qweiopqweiop 2 days ago

I think you're right, but you're also ignoring the effects of monopolies and/or collusion. There's a absolutely a chance prices don't come down due to broader anti-competitive plays.
lacker 2 days ago

Will the other models really catch up, though? To me it seems like Anthropic's lead in programming has increased over the past year. Isn't it possible that over time, some models just become fundamentally better at some things than other models?
- lelanthran 2 days ago
  
  > Will the other models really catch up, though? To me it seems like Anthropic's lead in programming has increased over the past year.
  To me it seems that they are all converging over time. So what if one is "better"[1]? The others will soon be there too.
  [1] My experience with them is that Claude (Opus) is only slightly better than the competition at coding; it might do a better job on 2 out of every 5 tasks than the competition, but those people using the competition still get those two tasks done anyway.
- mosdl 2 days ago
  
  I've used Augment before moving to Claude, they are pretty similar, often can't tell the difference. I don't think there is that much difference in the dev focused llms.
- danny_codes 2 days ago
  
  I mean, not based on anything we’ve seen so far in the DL space. The algorithms are public, the compute is fungible: the only differentiator is data. But deepseek demonstrates that it’s somewhat easy to siphon data off other models so… yeah unclear where the moat is.
macinjosh 2 days ago

Prices for yesterday's frontier models will fall but there will always be the next big model. similar to how game graphics get ever better but ever more demanding at the bleeding edge.
- carlhjerpe 2 days ago
  
  Yes but games also look an awful lot better (fidelity wise) than not so many years ago.
andrewstuart 2 days ago

The latest trend for the anti AI folks is just to deny it - to say that developers are imagining the benefits and then they demand hard numbers and data and when such hard numbers are not produced - they claim victory.

Aurornis 2 days ago

I played with Claude Code using the basic $20/month plan for a toy side project.

I couldn't believe how many requests I could get in. I wasn't using this full-time for an entire workweek, but I thought for sure I'd be running into the $20/month limits quickly. Yet I never did.

To be fair, I spent a lot of time cleaning up after the AI and manually coding things it couldn't figure out. It still seemed like an incredible number of tokens were being processed. I don't have concrete numbers, but it felt like I was easily getting $10-20 worth of tokens (compared to raw API prices) out of it every single day.

My guess is that they left the limits extremely generous for a while to promote adoption, and now they're tightening them up because it’s starting to overwhelm their capacity.

I can't imagine how much vibe coding you'd have to be doing to hit the limits on the $200/month plan like this article, though.

eddythompson80 2 days ago

Worth noting that a lot of these limits are changing very rapidly (weekly if not daily) and also depend on time of day, location, account age, etc.
dawnerd 2 days ago

I hit the limits within an hour with just one request in CC. Not even using opus. It’ll chug away but eventually switch to the nearing limit message. It’s really quite ridiculous and not a good way to upsell to the higher plans without definitive usage numbers.
- MystK 2 days ago
  
  Use `npx ccusage` if you're interested in how much it would have costed if you paid by API usage.
  - kristianp a day ago
    
    https://www.npmjs.com/package/ccusage
theshrike79 4 hours ago

It’s also silly to fork out for anything but the monthly plan. The tech is moving so fast that in 4 months something else is going to be on top.
a_bonobo 2 days ago

There's a lot you can do in terms of efficient token usage in the context of Claude Code, I wouldn't be surprised if they soon launch a Claude Code-specific model.
In my experiments, it would be enormously wasteful in token usage, doing things like re-reading all Python scripts in the current folder just to make sure all comments were up-to-date, or it re-read an R script to make sure all brackets were closed correctly. Surely that's where a good chunk of the waste comes from?
cladopa 2 days ago

Thinking is extremely inefficient compared with the usual query in Chat.
If you think a lot, you can spend hundreds of dollars easily.
ChadMoran 2 days ago

If you aren't hitting the limits you aren't writing great prompts. I can write a prompt and have it go off and work for about an hour and hit the limit. You can have it launch sub-agents, parallelize work and autonomously operate for long periods of time.
Think beyond just saying "do this one thing".
- edg5000 2 days ago
  
  I've used CC a lot and to great effect, but it never runs more than 10 mins (Opus). Completely independent for 60 min, sounds impressive. Can you share some insights on this? Really curious; I can also share recents prompts of mine.
  - ChadMoran a day ago
    
    "You are an expert software engineer
    Your key objective is to fix a bug in the XYAZ class. Use a team of experts as sub-agents to complete the analysis, debugging and triage work for you. Delegate to them, you are the manager and orchestrator of this work.
    As you delegate work, review and approve/reject their work as needed. Continue to refine until you are confident you have found a simple and robust fix"
    
    edg5000 9 hours ago
    
    Wow! I will try that. Really cool. Never tried the mythical sub-agent feature, not sure if it was really a thing due to the sparse docs. The "You are an expert software engineer" really helps? Probably good idea to mention "simple" because Claude sometimes settles for an overengineered solution.
- osn9363739 2 days ago
  
  Are there some good examples/wiki/knowledge base on how to do this? I'll read 2 competing theories on the same day so I'm kinda confused.
- stogot 2 days ago
  
  How is that a great prompt having it run for an hour without your input? Sounds like it’s just generating wasteful output.
  - ChadMoran 2 days ago
    
    Who said it was writing code for an hour? Solving complex problems by problem solving, writing SQL, querying data, analyzing data. formulating plans.
    What do you do for hours?
    If all you're thinking about is code output, you're thinking too small.
    
    ChadMoran 2 days ago
    
    You should really read this.
    https://www.anthropic.com/news/claude-4
    It was given a task and it solved a problem by operating for 7 hours straight.
    
    esafak 2 days ago
    
    I have not tested Claude Code but that's impressive because other agents get stuck long before that.
    
    ChadMoran 2 days ago
    
    Takes proper prompt crafting but Claude Code is really impressive.
  - cma 2 days ago
    
    It can be fixing unit tests and stuff for quite a while, but I usually find it cheats the goal when unattended.
- mrits 2 days ago
  
  That clears up a lot for me. I don't think I've ever had it take for than a couple of minutes. If it takes more than a minute I usually freak out and press stop

buremba 2 days ago

They're likely burning money so I can't be pissed off yet, but we see the same Cursor as well; the pricing is not transparent.

I'm paying for Max, and when I use the tooling to calculate the spend returned by the API, I can see it's almost $1k! I have no idea how much quota I have left until the next block. The pricing returned by the API doesn't make any sense.

roxolotl 2 days ago

A coworker of mine claimed they've been burning $1k a week this month. Pretty wild it’s only costing the company $200 a month.
- gerdesj 2 days ago
  
  Crikey. Now I get the business model:
  I hire someone for say £5K/mo. They then spend $200/mo or is it a $1000/wk on Claude or whatevs.
  Profit!
  - AtheistOfFail 2 days ago
    
    The model is "outspend others until they're bankrupt".
    Known as the Uber model or Amazon vs Diapers.com
    
    devnullbrain 2 days ago
    
    It's a shame that the LLM era missed out on coinciding with the zero interest rates era. Just imagine the waste we could create.
    
    margalabargala 2 days ago
    
    > Amazon vs Diapers.com
    To be fair that was a little different; Amazon wanted to buy the parent company of Diapers.com so sold at a loss to tank the value of the company so they could buy it cheap.
- Terretta 2 days ago
  
  Wasn't there a stat somewhere that a good o3-pro deep research was ~$3500, per question?
  - sothatsit 2 days ago
    
    I highly doubt that was ever the case in the UI version. You're probably thinking of when they benchmarked o3-high on ARC-AGI and it cost $3440 per question.
neom 2 days ago

We just came out of closed alpha yesterday and have been trying to figure out how best to price, if you'd be willing to provide any feedback I'd certainly appreciate it: https://www.charlielabs.ai/pricing - Thank you!! :)
- r0fl 2 days ago
  
  You are charging $2500 a month!?
  I’ll assuming this is real and not trolling. Who are the customers? What kind of people spend that much? I know people using $200-300 models but this is 10x that!
  - neom 2 days ago
    
    We were in closed alpha and thankfully a fair number of teams converted. To your point: right now we don't have any users on the $2500/mth plan, but it aligned with what the people on the $500/mth plan are asking for, we'll see..! :) I was really wondering if our concept of credits is kinda hard to understand?
dfsegoat 2 days ago

Can you clarify which tooling you are using? Is it cursor-stats?
- d4rkp4ttern a day ago
  
  Look up ccusage. Great tool. Run “ccusage blocks” and you’ll see where you are in the current block.
isodev 2 days ago

> They're likely burning money so I can't be pissed off yet
What do you mean? That’s totally a good reason to be pissed off at them. I’m so tired of products that launch before they have a clear path to profitability.
- buremba 21 hours ago
  
  I'm okay with free VC money; it's their problem to make it profitable.

adamtaylor_13 2 days ago

That’s funny I literally started the $200/month plan this week because I routinely spend $300+/month on API tokens.

And I was thinking to myself, “How does this make any sense financially for Anthropic to let me have all of this for $200/month?”

And then I kept getting hit with those overloaded api errors so I canceled my plan and went back to API tokens.

I still have no idea what they’re doing over there but I’ll happily pay for access. Just stop dangling that damn $200/month in my face if you’re not going to honor it with reasonable access.

fluidcruft 2 days ago

Why wouldn't you assume that it implies the API rates are massively inflated? I don't do anything really but started playing around last week. I put $5 in tokens in to see how long it would last while I play around. It came out at 30min of compute or whatever dir 3hrs of playing around. So my dumb back of the envelope says $10/hr of compute means $90k per year. Sure GPUs are expensive but are they $90k year expensive? Dunno. It's not like the incremental cost of adding GPUs to the inference side are unmoored from hardware incremental costs.
- what 2 days ago
  
  How did you come up with 30 min of compute from playing around with it for 3 hours? Did you spend 2.5 hours typing while playing around?
  - fluidcruft a day ago
    
    It's what claude code's cli /cost command reported. Like I said I haven't really used this stuff much so a lot of it was reviewing what it was doing and thinking about how to prompt it to do what I wanted it to do. I'm generally pretty skeptical about this stuff but I'm an old fart and decided I should see what the kids are doing rather than yell at clouds and tell people to get off my yard. So far I am able to get it to mostly work acceptably for annoying maintenance tasks (adding docstrings, explain code, rough drafts of tests). Things that I just generally don't have time for. Having it explain my own code from a few years ago to jog my memory about how it works. That sort of thing.
    Interestingly, it came up with some definitions of abbreviations and slightly different ways to do things using confidential APIs that have been sort of reverse engineered in public but are not fully understood. Which to me was the most interesting. The things it suggested related to that are very plausible and I was trying to decide if it was clever hallucinations or because it's being used/trained on code bases at that company or by others who have better access. So I spent a lot of time googling to see if it had found some publicly available info to confirm.
    Anyway after I hit $5 I subscribed to pro to continue the playing but /cost won't tell you anything useful after switching to pro.
    I also wanted to try Gemini clinto compare but I couldn't figure out how to actually give Google my money. Why is Google so impossible to use.
edg5000 2 days ago

This was with Opus? What sort of tasks were you doing? I found that very large files (2.5MB source file) can really eat up tokens. Other than that, I've never run out with my 100 EUR plan, exclusively using Opus.

Ataraxic 2 days ago

I need to see a video of what people are doing to hit the max limits regularly.

I find sonnet really useful for coding but I never even hit basic limits. at $20/mo. Writing specs, coming up with documentation, doing wrote tasks for which many examples exist in the database. Iterate on particular services etc.

Are these max users having it write the whole codebase w/ rewrites? Isn't it often just faster to fix small things I find incorrect than type up why I think it's wrong in English and have it do a whole big round trip?

sothatsit 2 days ago

I can tell you how I hit it: Opus and long workflows.
I have two big workflows: plan and implement. Plan follows a detailed workflow to research an idea and produce a planning document for how to implement it. This routinely takes $10-30 in API credits to run in the background. I will then review this 200-600 line document and fix up any mistakes or remove unnecessary details.
Then implement is usually cheaper, and it will take that big planning document, make all the changes, and then make a PR in GitHub for me to review. This usually costs $5-15 in API credits.
All it takes is for me to do 3-4 of these in one 5-hour block and I will hit the rate-limit of the $100 Max plan. Setting this up made me realise just how much scaffolding you can give to Opus and it handles it like a champ. It is an unbelievably reliable model at following detailed instructions.
It is rare that I would hit the rate-limits if I am just using Claude Code interactively, unless I am using it constantly for hours at a time, which is rare. Seems like vibe coders are the main people who would hit them regularly.
- vineyardmike 2 days ago
  
  This is very interesting as a workflow. How “big” are the asks you’re giving Claude? Can you give an example of the type of question you’d ask it to implement where it requires a discrete planning document that long?
  Whenever I use it, I typically do much smaller asks, eg “add a button here”, “make that button trigger a refresh with a filter of such state…”
  - sothatsit 2 days ago
    
    The best results I get are for things like writing a new database migration and plumbing the data through to an API endpoint. That touches a lot of the codebase, but is not particularly complicated, so this process works quite well for that. Especially because it is easy for me to review.
    I have also used this for creating new UI components or admin pages. One thing I have noticed is that the planning step is pretty good at searching through existing UI components to follow their patterns to maintain consistency. If I just asked Claude to make the change straight away, it often won't follow the patterns of our codebase.
    But for UI components, adding new pages, or things like that, it is usually more useful just as a starting point and I will often need to go in and tweak things from there. But often it is a pretty good starting point. And if it's not, I can just discard the changes anyway.
    I find this is not worth it for very small tasks though, like adding a simple button or making a small behaviour change to a UI component. It will usually overcomplicate these small tasks and add in big testing rigs or performance optimisations, or other irrelevant concerns. It is like it doesn't want to produce a very short plan. So, for things like this I will use Claude interactively, or just make the change manually. Honestly, even if it did do a good job at these small tasks, it would still seem like overkill.
  - shepherdjerred 2 days ago
    
    Here's an example of me doing a migration from Deno -> Bun, prompting with something like "migrate this, make a plan in BUN.md" and then "do step 1 and write your progress, do step 2, etc."
    https://github.com/shepherdjerred/scout-for-lol/blob/6c6a3ca...
- Syzygies 2 days ago
  
  Yup. I'm on a side project trying to port the 1980's computer algebra system Macaulay I coauthored from 32-bit K&R C to 64-bit C23.
  K&R C is underspecified. And anyone who whines about AI code quality? Hold my beer, look at our 1980's source.
  I routinely have a task manager feed eight parallel Claude Code Opus 4 sessions their next source file to study for a specific purpose, to get through all 57 faster. That will hit my $200 Max limit, reliably.
  Of course I should just wait a year, and AI will read the code base all at once. People _talk_ like it does now. It doesn't. Filtering information is THE critical issue for managing AI in 2025.
  The most useful tool I've written to support this effort is a tmux interface, so AI and I can debug together two terminal sessions at once: The old 32-bit code running on a Linode instance, and the new 64-bit code running locally on macOS. I wasn't happy with how the tools for this worked, that I could find online. It blows my mind to watch Opus 4 debug.
  - skinner927 2 days ago
    
    Is any of this public? It sounds very interesting.
- j-conn a day ago
  
  Mind sharing your detailed planning workflow and/or how you came up with it? Is it basically a detailed prompt asking it to produce a PRD?
  I often find my plans for the code changing as I get into implementation. I wonder if giving the model explicit permission to change the plan would work or cause it to go off the rails for big chunks of work like this.
  - sothatsit a day ago
    
    Here is the planning prompt, which I would usually place in .claude/commands/plan.md: https://gist.github.com/Sothatsit/c9fcbcb50445ebb6f367b0a6ca...
    The implement prompt doesn't seem to matter as much: https://gist.github.com/Sothatsit/bdf2cf4ed7c319e2932a5d2d8d...
    Creating these was basically just a workflow of writing out a workflow myself, getting Opus to refine it, then back to me, then back to Opus until it was something I was happy with. There is probably a lot more refining you could do with it, but it works pretty well as it is right now.
    I then have wrapper scripts that invoke these using claude -p in a container, but those are pretty specific to my codebase.
Ensorceled 2 days ago

> Isn't it often just faster to fix small things I find incorrect than type up why I think it's wrong in English and have it do a whole big round trip?
This is my experience: at some point the AI isn't converging to a final solution and it's time to finish the rest by hand.
- bluefirebrand 2 days ago
  
  My experience is that if the AI doesn't oneshot it, it's faster to do it myself
  If you find yourself going back and forth with the AI, you're probably not saving time over a traditional google search
  Edit: and it basically never oneshots anything correctly
  - pinoy420 2 days ago
    
    [dead]
pragmatick a day ago

Yesterday I tried CC the first time. I have the $20 package. I asked it to improve the code in a small kotlin based chess engine. Five minutes later I reached my limit and the engine performed poorer than before. It just created two new classes, changed some code in others and created a couple of tests which it ran. So I hit the limit pretty quickly.
nh43215rgb 2 days ago

Are you using claude code for coding with sonnet? Just claude web use alone is indeed fairly relaxed i think.
adamtaylor_13 2 days ago

I couldn’t even get it to do simple tasks for me this week on the max plan. It’s not just max users overloading it. It feels like they’re randomly rate limiting users.
One day my very first prompt in the morning was blocked. Super strange.
sneak 2 days ago

Faster overall, sure. But I am interrupt driven and typing the prompt alone is faster yet, so I do that and come back after a bit (bouncing between many open tasks), so the fact that the agent took 3x longer overall doesn’t matter, because it happens in the background. my time was just spent typing out the prompt, which was only seconds.

martinald 2 days ago

I'm not sure this is "intentional" per se or just massively overloaded servers because of unexpected demand growth and they are cutting rate limits until they can scale up more. This may become permanent/worse if the demand keeps outstripping their ability to scale.

I'd be extremely surprised if Anthropic picked now of all times to decide on COGS optimisation. They potentially can take a significant slice of the entire DevTools market with the growth they are seeing, seems short sighted to me to nerf that when they have oodles of cash in bank and no doubt people hammering at their door to throw more cash at them.

andix 2 days ago

A lot of people switched away from Cursor within the blink of an eye. Switching IDEs is a big deal for me - it takes a lot of effort, which is why I never switched to Cursor in the first place.
I think Claude Code is a much better concept, the coding agent doesn't need to be connected to the IDE at all. Which also means you can switch even faster to a competitor. In that sense, Claude Code may have been a huge footgun. Gaining market share might turn out to be completely worthless.
- mattnewton 2 days ago
  
  I think in the case of Cursor, they are one of may VScode forks, so a switch is not really very challenging. I agree there is little to keep me on any individual app or model (which is one reason I think cursor's reported 9b valuation is a little crazy!)
  - andix 2 days ago
    
    Only if you're using VS code in the first place. VS code is fine for web dev and js/ts/python. But I really don't like it for Java, C#, C++, SQL, and many more.
    
    martinald a day ago
    
    Agreed. I do agree with your point that they are very replaceable, but the amount of people I know that still only have tried ChatGPT makes me think that most users are quite lazy once they have a tool. I suspect the (vast?) majority don't bother moving without good reason.
    So to me it seems to make sense to take as much marketshare in this as possible, rather than saving a few $10/100ms on COGS optimisation, then do the COGS optimisation later.
    FWIW all LLM stuff is quite easily replaceable, but that's not stopping anyone from trying to aggressively grow marketshare.
    
    mattnewton a day ago
    
    Fair point, I always forget how much of an early adopter I and the developers I am surrounded by are. There are plenty of developers who use whatever the CTO approved at their shop, which yesterday was probably github's copilot but tomorrow could be Cursor + Cursor Agents if they continue to out execute.

ants_everywhere 2 days ago

The other day I was doing major refactorings on two projects simultaneously while doing design work for two other projects. It occurred to me to check my API usage for Gemini and I had spent $200 that day already.

Users are no doubt working these things even harder than I am. There's no way they can be profitable at $200 a month with unlimited usage.

I think we're going to evolve into a system that intelligently allocates tasks based on cost. I think that's part of what openrouter is trying to do, but it's going to require a lot of context information to do the routing correctly.

hboon 2 days ago

> One user, who asked not to be identified, said it has been impossible to advance his project since the usage limits came into effect. “It just stopped the ability to make progress,” the user told TechCrunch. “I tried Gemini and Kimi, but there’s really nothing else that’s competitive with the capability set of Claude Code right now.”

PMF.

kristianp 2 days ago

Product Market Fit? Yes, seems they have.
- hboon 2 days ago
  
  Yes. That kind of reaction I quoted is telling.

memothon 2 days ago

I made a quick site so you can see what tools are using the most context and help control it, totally free and in your browser.

https://claude-code-analysis.pages.dev/

bgwalter 2 days ago

“It just stopped the ability to make progress,” the user told TechCrunch. “I tried Gemini and Kimi, but there’s really nothing else that’s competitive with the capability set of Claude Code right now.”

This is probably another marketing stunt. Turn off the flow of cocaine and have users find out how addicted they are. And they'll pay for the purest cocaine, not for second grade.

ceejayoz 2 days ago

It was always gonna be the Uber approach. Cheap and great turns to expensive and mediocre when they have to turn the money spigot on.
- sneak 2 days ago
  
  Even expensive Uber or expensive Sonnet 4 is still a great deal when you consider the BATNA. Programmers on the level of Sonnet 4 are expensive.

WhyNotHugo 2 days ago

I wish models which we can self-host at home would start catching up. Relying on hosted providers like this is a huge risk, as can be seen in this case.

I just worry that there’s little incentive for bit corporations to research optimising the “running queries for a single user in a consumer GPU” use case. I wonder if getting funding for such research is even viable at all.

vineyardmike 2 days ago

We already have really strong models that run on a consumer GPU, and really strong frameworks and libraries to support them.
The issue is (1) the extra size supports extra knowledge/abilities for the model. (2) a lot of the open source models are trained in a way to not compete with the paid offerings, or lack the data set of useful models.
Specifically, it seems like the tool-use heavy “agentic” work is not being pushed to open models as aggressively as the big closed models. Presumably because that’s where the money is.
oceanplexian a day ago

Deepseek R1 is better than any model released more than 6 months ago. You can plug it into open source equivalents of Claude Code like Goose. And it runs on a Mac Studio which you can buy at any consumer electronics store.
The people saying that big tech is going to Gatekeep their IDE or pull the plug on them don’t seem to realize that the ship has already sailed. Good LLMs are here permanently and never going away. You just might have to do slightly more work than whipping out a credit card and buying a subscription product.
YmiYugy 2 days ago

I think model providers would love to run their models on a single GPU. The latency and throughput of GPU interconnects is orders of magnitudes worse than accessing VRAM. Cutting out the latency would make the models much more efficient to run, they wouldn't have to pay for such expensive networking. If they got to run it on consumer GPUs even better. Consumer GPUs probably cost something like 5-10x less with regards to raw compute than data center ones. New coding optimized models for single GPUs drop all the time. But it's just a really hard problem to make them good and when the large models are still in the barely good enough phase (I wasn't using agents much before Sonnet 4) it's just not realistic to get something useful locally.
ethan_smith 2 days ago

Check out Llama 3.1 70B Instruct, which now runs on consumer hardware with 24GB VRAM using techniques like unsloth or llama.cpp's new Q4_K_M quantization - surprisingly competitive with Claude for many coding tasks.

submeta 2 days ago

It’s like you buy an M4 MacBook and silently Apple throttles it and makes it an M1. If that happened every tech magazine would write about it, consumer advocates would be in rage.

How’s it possible that AI companies sell you a product for 100 USD a month and silently degrade it?

edg5000 2 days ago

They have fixed capacity but want to provide a reasonable product for all. They've really struggled with defining and enforcing quotas. It seems they have issues predicting what their actual capacity is, how many users they want. Also, there is not much competition. The prices are not really anchored to anything. Hopefully it will improve.

StarterPro 2 days ago

I was wondering when we'd get to the end of the reduced fare portion of the bubble. With the layoffs, increase in prices, we're going to see a lot more vibe coding to justify the price of ai.

With the talented employees laid off, I predict there will be some VERY LARGE code mistake pushed, either banking or travel, and we'll either see the burst, or they'll move on to strictly enterprise/gov opportunities. (see: grok for gov)

>Super Free [Introduction, lots of free options] >Reduced Fare [higher prices, smaller pool of free options](We are here) >Premium Only [only paid options, mostly whales]

jmartrican 2 days ago

I have the $100 plan and now quickly get downgraded to Sonnet. But so far have not hit any other limits. I use it more on the weekends over several hours, so lets see what this weekend has in store.

I suspected that something like this might happen, where the demand will outstrip the supply and squeeze small players out. I still think demand is in its infancy and that many of us will be forced to pay a lot more. Unless of course there are breakthroughs. At work I recently switched to non-reasoning models because I find I get more work done and the quality is good enough. The queue to use Sonnet 3.7 and 4.0 is too long. Maybe the tools will improve reduce token count, e.g. a token reducing step (and maybe this already exists).

edg5000 2 days ago

You are using the "auto-switch back to Sonnet" mode right? Try just selecting Opus without the auto-switch, probably you'll get more Opus and may not run out of it. Anthropic is just being careful because Opus eats compute and they don't want people getting disappointed. But for me it does not run out that quickly. Only when I asked it to work on a 50k+ source code file, which it had to ingest entirely in my case, is when I ran out.
- jmartrican a day ago
  
  Oh wow I didnt even know about that. Yeah it auto switches. I'll change the config.
j45 2 days ago

Off hour usage seems to be different for sure.
Also there's likely only so much fixed compute available, and it might be getting re allcoated for other uses behind the scene from time to time as more compute arrives.

jasonthorsness 2 days ago

Is it really worth it to use opus vs. sonnet? sonnet is pretty good on its own.

MystK 2 days ago

It's definitely worth it if you're on the plans and don't hit the usage limits already. It's subjectively better based on my experience.

jablongo 2 days ago

Id like to hear about the tools and use cases that lead people to hit these limits. How many sub-agents are they spawning? How are they monitoring them?

rancar2 2 days ago

There was a batchmode pulled from the documentation after the first few days of the Claude Code release. Many of have been trying to be respectful with a stable 5 agent call but some people have pushed those limits much higher as it wasn’t being technically throttle until last week.
- WJW 2 days ago
  
  Tragedy of the commons strikes again...
TrueDuality 2 days ago

One with only manual interactions and regular context resets. I have a couple of commands I'll use regularly that have 200-500 words in them but it's almost exclusively me riding that console raw.
I'm only on the $100 Max plan and stick to the Sonnet model and I'll run into the hard usage limits after about three hours, that's been down to about two hours recently. The resets are about every four hours.
Capricorn2481 2 days ago

I'm not on the pro plan, but on $20/mo, I asked Claude some 20 questions on architecture yesterday and it hit my limit.
This is going to be happening with every AI service. They are all burning cash and need to dumb it down somehow. Whether that's running worse models or rate limiting.
micromacrofoot 2 days ago

I've seen prompts telling it to spawn an agent to review every change it makes... and they're not monitoring anything

boesboes a day ago

Yeah, i noticed it falls back to sonnet much quicker too. Most days within the first few minutes.

And the 529.. it's borderline unusable at times.

But the worst part: While claude code, the tools and cli etc, has become much better over the last weeks; it seems the models or prompts have gotten worse. It will do things like add a test, see that it fails, claim it was already broken and out of scope. Or maybe i ask it to implement this, DO NOT use Y. Use Y anyway. Sometimes i ask it to update a test & it decides to revert all changes in the application code.

I am very close to cancelling my plan again. Maybe it is a great deal when comparing too api pricing, but that is not a fair comparison; prepaid/pay-as-you-go is always way more expensive. And if they priced it wrong, change the pricing? Degrading a new service you are developing to save cost is even more stupid then the burn money until ??? profit stats imo hehe

blibble 2 days ago

the day of COGS reckoning for the "AI" industry is approaching fast

smcameron 2 days ago

COGS == cost of goods sold?

khurs 2 days ago

All you people who were happy to pay $100 and $200 a month have ruined it for the rest of us!!

tho234i32242234 2 days ago

Hardly surprising.

AWS Bedrock which seems to be a popular way to get access to Claude etc. while not having to go through another "cloud security audit", will easily run up ~20-30$ bills in half-hour with something like Cline.

Anthropic likely is making bank with this and can afford to lose the less-profitable (or even loss-making) business of lone-man developers.

sneilan1 2 days ago

So far I’ve had 3-4 Claude code instances constantly working 8-12 hours a day every day. I use it like a stick shift though. When I need a big plan doc, switch to recommended model between opus and sonnet. And for coding, use sonnet. Sometimes I hit the opus limit but I simply switch to sonnet for the day and watch it more closely.

mpeg 2 days ago

Honest question: what do you do with them? I would be so fascinated to see a video of this kind of workflow… I feel like I use LLMs as much as I can while still being productive (because the code they generate has a lot of slop) and still barely use the agentic CLIs, mostly just tab completion through windsurf, and Claude for specific questions by steering the context manually pasting the relevant stuff
- sneilan1 2 days ago
  
  I focus more on reading code & prompting claude to write code for me at a high level. I also experiment a lot. I don't write code anymore by hand except in very rare cases. I ask claude for questions about the code to build understanding. I have it produce documentation, which is then consumed into other prompts. Often, claude code will need several minutes on a task so I start another task. My coding throughput on a day to day basis is now the equivalent of about 2-3 people.
  I also use gemini to try out trading ideas. For example, the other day I had gemini process google's latest quarterly report to create a market value given the total sum of all it's businesses. It valued google at $215. Then I bought long call options on google. Literally vibe day trading.
  I use chat gpt sora to experiment with art. I've always been fascinated with frank lloyd wright and o4 has gotten good enough to not munge the squares around in the coonley playhouse image so that's been a lot of fun to mess with.
  I use cheaper models & rag to automate categorizing of my transactions in Tiller. Claude code does the devops/python scripting to set up anything google cloud related so I can connect directly to my budget spreadsheet in google sheets. Then I use llama via openrouter + a complex RAG system to analyze my historical credit card data & come up with accurate categorizations for new transactions.
  This is only scratching the surface. I now use claude for devops, frontend, backend, fixing issues with embedder models in huggingface candle. The list is endless.
  - aoaoaoans 2 days ago
    
    Can you share some code? I work with a guy like this who claims this level of output but in reality he consumes massive amounts of other devs time in PR review.
    Are you doing a lot of broad throwaway tasks? I’ve had similar feelings when writing custom code for my editor, one off scripts, etc but it’s nothing I would ever put my professional reputation behind.
    
    sneilan1 2 days ago
    
    Sorry, most of my code is proprietary. However, I have a stock exchange project on my github I plan to rewrite in rust. I'm pretty busy now at work but I'll do that using claude code.
    If your friend is consuming massive amounts of other dev time in PR reviews, maybe he has other issues. I'm willing to bet even without agentic coding, he would still be problem for your coworkers.
    Sometimes I do broad throwaway tasks. For example I needed a rust lambda function that would do appsync event authorization for jwt tokens. All it needed to do was connect to aws secrets, load up the keys & check inbound requests. I basically had claude-code do everything from cdk to building/testing the rust function & deploying to staging. It worked great! However, I've certainly had my fair share of f-ups like I recently tried doing some work on the frontend with claude code and didn't realize it was doing useEffect everywhere!! Whoops. So I had to adapt and manage 2-3x claude code instances extremely closely to prevent that from happening again.
    
    sneilan1 2 days ago
    
    As a follow-up, I've gotten much much faster at modeling code in my mind and directly translating it into prompts. It really changes how you code! For each task, I'm extremely specific about what I want and depending on how closely claude does what I want, I change my specificity. Sometimes like with the lambda function, I can be high level and with my react.js codebase, due to it's lack of types (I know...) needs extra attention.
    To be effective with agentic coding, you have to know when to go high level and low level. And have to accept that sometimes agentic coders need a lot of help! It all depends on how much context you give it.
    
    MystK 2 days ago
    
    I wouldn't blame AI. A terrible developer using AI becomes a bad developer. A good developer using AI becomes a great developer.
  - moomoo11 2 days ago
    
    Do you use it to make any actual products that makes money?
    
    sneilan1 a day ago
    
    yes. https://app.grantpuma.com/
    
    whatarethembits 9 hours ago
    
    I can’t read any text on homepage with iOS Safari as first few characters/words of each sentence is cut off. Text justification is different on different pages. Vertical spacing and images are uneven across the site. FAQ accordion animation is jittery and the hamburger menu doesn’t open actual menu from any other page other than homepage. Going back to homepage from FAQ page renders homepage in previous state, such as with sidebar open, momentarily before resetting to expected state. Focusing on input on sign in page zooms in on email field, but half of it is off page.
    I’m assuming this is work in progress and currently it’s been vibe coded to MVP stage?
nikisweeting 2 days ago

I do the same with two $200 MAX plans that I switch between when one hits the limit. I use opus exclusively though so I tend to hit the first account's limits at least once a day.
apwell23 2 days ago

lol thanks for ruining it for the rest of us. i am sure you created something groundbreaking with 4 instances of cc.

charlysl 2 days ago

I think that Cursor is doing the same. A couple of weeks ago they removed the 500 prime model requests limit per month in the $20 plan, it seemed like this was going to be good for users, in fact it's worse, my impression is that now the limit is effectively much lower, and you can't check anymore in your account's dashboard how many of these requests you've made over the last month.

andix 2 days ago

I guess flat fee AI subscriptions are not a thing that is going to work out.

Probably better to stay on usage based pricing, and just accept that every API call will be charged to your account.

rob 2 days ago

I don't think CLI/terminal-based approaches are going to win out in the long run compared to visual IDEs like Cursor but I think Anthropic has something good with Claude Code and I've been loving it lately (after using only Cursor for a while.) Wouldn't be surprised if they end up purchasing Cursor after squeezing them out via pricing and then merging Cursor + Claude Code so you have the best of both worlds under one name.

anonzzzies 2 days ago

That must somehow be illegal, at least in the consumer space. I have noticed quicker degrade to sonnet, but don't often hit limits ($200 plan). Seems none of these guys can afford loyalty, so I will be skipping between 'tool of the month' instead of sticking with one. New companies with new VC money are good for a few months and then degrade, so it's not hard to do.

LeicaLatte 2 days ago

Customers are always chasing the next big thing in this space. As a programmer who’s worked on mobile UIs and backends using CC, I can say the appeal isn’t really the model itself—it’s the form factor. It’s a decent product, but hardly groundbreaking or essential.

solosquad a day ago

They didn’t reduce it they actually increased it. I was using it for 14 hours straight without any issues. I think they did that to stay competitive, but now it seems like it's back to normal.

paulhodge 2 days ago

There’s been a ton of ‘service overloaded’ errors this week so it makes sense that they had to adjust it.

Personally I’ve never hit a usage limit on the $100 plan even when running several Claude tabs at once. I can’t imagine how people can max out the $200 plan.

gitaarik 2 days ago

As a independent dev, I haven't had the need to pay for any AI yet. When I run into my limit at one company, I switch to the next one. Not always the same experience, but the next day I can start fresh again.

bad_haircut72 2 days ago

I went from pro to max because I hve been hitting limits, I could tell they were reducing it because I used to go multiple hours on pro but now its like 3. Congrats Anthropic you got $100 more out of me, at the cost of irrecoverable goodwill

hellcow 2 days ago

For what it's worth, when Cursor downgraded their Claude limits in the middle of my annual subscription term, I emailed them to ask for a pro-rated refund, and it was granted. You may be able to do something similar with Claude Code.
Changing the terms of the deal midway through a subscription to make it much less valuable is a really shady business practice, and I'm not sure it's legal.

ramon156 2 days ago

What are these people doing to hit their limit this fast?

I put $30 on it, use it daily at work, and I still have half of it left. Are Zed agents just that optimized? I doubt it

globular-toast 2 days ago

This is what really makes me sceptical of these tools. I've tried Claude Code and it does save some time even if I find the process boring and unappealing. But as much as I hate typing, my keyboard is mine and isn't just going to disappear one day, have its price hiked or refuse to work after 1000 lines. I would hate to get used to these tools then find I don't have them any more. I'm all for cutting down on typing but I'll wait until I can run things entirely locally.

MisterSandman 2 days ago

I guess the argument has time goes on AI will get cheaper and more efficient.
…but idk how true that, I think it’s pretty clear that these companies are using the Uber model to attract customers, and the fact that they’re already increasing prices or throttling is kind of insane.
bigiain 2 days ago

> my keyboard is mine and isn't just going to disappear one day, have its price hiked or refuse to work after 1000 lines.
I dunno, from my company or boss's perspective, there are definitely days where I've seriously considered just disappearing, demanding a raise, or refusing to work after the 3rd meeting or 17th Jira ticket. And I've seen cow orkers and friends do all three of those over my career.
(Perhaps LLMs are closer to replacing human developers that anyone has realized yet?)
sneak 2 days ago

Now say it about electricity from the wall. Just because you might not be able to use it next week doesn’t mean the productivity gains today aren’t real.
- globular-toast 2 hours ago
  
  I maintain the ability to walk even though I could use an electric mobility scooter everywhere.

ladon86 2 days ago

I think it was just an outage that unfortunately returned 429 errors instead of something else.

t14000 2 days ago

Title should be: Anthropic caught getting high on their own supply

deadbabe 2 days ago

Does anyone not realize they are just using the typical drug dealer type business model? I used to do cocaine and it was a similar vibe.

They will turn you into an AI junkie who no longer has motivation to do anything difficult on your own (despite having the skills and knowing how), and then, they will dramatically cut your usage limit and say you’ll need to pay more to use their AI.

And you will gladly pay more, because hey you are getting paid a lot and it’s only a few hundred extra. And look at all the time you save!

Soon you’re paying $2k a month on AI.

sneak 2 days ago

A subcontracted dev that can do that much is way more than $2k. I’ve had $800-on-AI months and it was a good deal.

matt3210 2 days ago

At some point they’ll need to make money. Expect it to go way up

kordlessagain a day ago

Hopefully Anthropic is reading these.

I want to start by saying that Claude as a model is excellent. The AI performs exceptionally well at problem-solving and code generation across a wide variety of tasks. My feedback focuses specifically on the Claude Desktop application for Windows.

Stability and Performance: The application frequently displays confusing error messages in the top-right corner, especially when MCP servers encounter issues or during service maintenance. The interface regularly freezes for 30+ seconds, goes completely blank, then takes another 30 seconds to recover. The application often becomes unresponsive to user input, requiring clicks to reactivate. Despite advertising 50+ tools to the LLM, tools are frequently reported as "not found" on first attempt, then work on retry.

User Experience Problems: Loading historical chats takes 5-10 seconds per click, making navigation extremely slow. Chat history is only searchable by auto-generated titles, which are often irrelevant to the actual conversation content. Deleting multiple chats is nearly impossible due to slow loading and poor navigation. No centralized view exists for created artifacts, making them difficult to locate later.

Session Management: The application provides insufficient warning about approaching conversation length limits. Previously available warnings have disappeared. The persistent "Claude can't run code...yet" message is misleading given existing tools like REPL and custom MCP implementations.

Interface Issues: The support chat UI randomly covers interface elements and lacks message editing capabilities. Service status messages are unclear and don't effectively communicate what's happening.

Suggestions for Improvement

1. Implement clear conversation length warnings at 75% and 90% capacity

2. Add an artifacts gallery for easy access to created content

3. Improve chat search with content-based indexing, not just titles

4. Optimize chat loading performance to reduce switching delays

5. Provide better error messaging that clearly explains issues and solutions

6. Add bulk management tools for chat organization and deletion

7. Remove or update misleading messages about code execution capabilities

Claude Desktop is a critical tool in my workflow, but these issues significantly impact productivity and user experience. The application has tremendous potential, and addressing these concerns would greatly improve its value proposition.

If continued development isn't prioritized, consider open-sourcing the desktop application to allow the community to contribute improvements.

93po a day ago

big agree on making usage more visible, i have no idea how much i have left of opus, or even sonnet, and the command to check says i have unlimited but that doesnt appear to be the case bc i got cut off of sonnet even on the $100/m plan

dyl000 2 days ago

opencode with kimi-k2 is my backup just in case claude is down or I hit the limits on the max 20x plan.

tom_m 19 hours ago

No surprise. I always thought it was so pathetically low that it couldn't do much meaningful. I'm sure now if I tried it again, it'd be worse.

apwell23 2 days ago

oh yea looks like everyone and their grandma is hitting claude code

https://github.com/anthropics/claude-code/issues/3572

Inside info is they are using their servers to prioritize training for sonnet 4.5 to launch at the same time as xAI dedicated coding model. xAI coding logic is very close to sonnet 4 and has anthropic scrambling. xAI sucks at making designs but codes really well.

isusmelj 2 days ago

Demand > Supply?

ramesh31 a day ago

Probably why they were able to just 10x the token based API limits.

jay_kyburz 2 days ago

Where is my model I can run local and off line?

That's when the LLM stuff is going to take off for me.

gavmor 2 days ago

Haven't you checked out Ollama, yet?

yahoozoo 2 days ago

Where’s your Ed at was right?

BolexNOLA 2 days ago

Ha was just talking about this coming down the pipeline with folks days ago (in so many words) https://news.ycombinator.com/context?id=44565481

iwontberude 2 days ago

Claude Code is not worth the time sink for anyone that already knows what they are doing. It's not that hard to write boilerplate and standard llm auto-predict was 95% of the way to Claude Code, Continue, Aider, Cursor, etc without the extra headaches. The hangover from all this wasted investment is going to be so painful.

serf 2 days ago

>Claude Code is not worth the time sink
there are like 15~ total pages of documentation.
There are two folders , one for the home directory and one for the project root. You put a CLAUDE.md file in either folder which essentially acts like a pre-prompt. There are like 5 'magic phrases' like "think hard", 'make a todo', 'research..' , and 'use agents' -- or any similar set of phrases that trigger that route.
Every command can be ran in the 'REPL' environment for instant feedback, it itself can teach you how to use the product, and /help will list every command.
The hooks document is a bit incomplete last I checked, but it's a fairly straightforward system, too.
That's about it -- now explain vi/vim/emacs/pycharm/vscode in a few sentences for me. The 'time sink' is like 4 hours for someone that isn't learning how to use the computer environment itself.
- freedomben 2 days ago
  
  Yeah, Claude Code was by far the quickest/easiest for me to get set up. The longest part was just getting my API key
Sevii 2 days ago

I've spent far too much of my life writing boilerplate and API integrations. Let Claude do it.
- axpy906 2 days ago
  
  I agree. It’s a lot faster to tell it what I want and work on something else in the meantime. You end up ready code diffs more than writing code but it saves time.
Implicated 2 days ago

Comments like this remind me that there's a whole host of people out there who have _no idea_ what these tools are capable of doing to ones productivity or skill set in general.
> It's not that hard to write boilerplate and standard llm auto-predict was 95% of the way to Claude Code, Continue, Aider, Cursor, etc without the extra headaches.
Uh, no. To start - yea, boilerplate is easy. But like a sibling comment to this one said - it's also tedious and annoying, let the LLM do it. Beyond that, though, is that if you apply some curiosity and that "anyone that already knows what they are doing" level prior knowledge you can use these tools to _learn_ a great deal.
You might think your way of doing things is perfect, and the only way to do them - but I'm more of the mindset that there's a lot of ways to skins most of these cats. I'm always open to better ways to do things - patterns or approaches I know nothing about that might just be _perfect_ for what I'm trying to do. And given that I do, in general, know what I'm asking it to do, I'm able to judge whether it's approach is any good. Sometimes it's not, no big deal. Sometimes it opens my mind to something I wasn't aware of, or didn't understand or know would apply to the given scenario. Sometimes it leads me into rabbit holes of "omg, that means I could do this ... over there" and it turns into a whole ass refactor.
Claude code has broadened my capabilities, professionally, tremendously. The way it makes available "try it out and see how it works" in terms of trying multiple approaches/libraries/databases/patterns/languages and how those have many times led me to learning something new - honestly, priceless.
I can see how these tools would scare the 9-5 sit in the office and bang out boilerplate stuff, or to those who are building things that have never been done before (but even then, there's caveats, IMO, to how effective it would/could be in these cases)... but to people writing software or building things (software or otherwise) because they enjoy it or because they're financial or professional lives depend on what they're building - absolutely astonishing to me anyone who isn't embracing these tools with open arms.
With all that said. I keep the MCP servers limited to only if I need it in that session and generally if I'm needing an MCP server in an on-going basis I'm better off building a tool or custom documentation around that thing. And idk about all that agent stuff - I got lucky and held out for Claude Code, dabbled a bit with others and they're leagues behind. If I need an agent I'ma just tap on CC, for now.
Context and the ability to express what you want in a way that a human would understand is all you need. If you screw either of those up, you're gonna have a bad time.
- adamtaylor_13 2 days ago
  
  Well said. People seem to be binary: I code with it or I don’t.
  Very few folks are talking about using the LLMs to sharpen THE DEVELOPER.
  Just today I troubleshot an issue that likely would’ve taken me 2-3 hours without additional input. I wrapped it up and put a bow on it in 15 minutes. Oh, and also wrote a CLI tool fix the issue for me next time. Oh and wrote a small write up for the README for anyone else who runs into it.
  Like… if you’re not embracing these tools at SOME level, you’re just being willfully ignorant at this point. There’s no badge of honor for willfully staying stuck in the past.
- iwontberude a day ago
  
  I’m still using it for home use, it’s totally adequate for one person code bases. Once you have teams and tribal knowledge to consider, Claude is a waste of time. It valuable for non-commercial or hobbyist grade work.

ajeetgymlover 3 hours ago

[dead]

Devadipta 2 days ago

[dead]

mdjameel59 16 hours ago

[dead]

Follow_Cloud 2 days ago

[dead]

38 2 days ago

[flagged]