This coming right after a noticeable downgrade just makes me think Opus 4.7 is going to be the same Opus I was experiencing a few months ago rather than an actual performance boost.
Anthropic needs to build back some trust and communicate throttling/reasoning caps more clearly.
They don't have enough compute for all their customers.
OpenAI bet on more compute early on which prompted people to say they're going to go bankrupt and collapse. But now it seems like it's a major strategic advantage. They're 2x'ing usage limits on Codex plans to steal CC customers and it seems to be working.
It seems like 90% of Claude's recent problems stem strictly from a lack of compute.
Is that why Anthropic recently gave out free credits for use in off-hours? Possibly an attempt to more evenly distribute their compute load throughout the day?
> Is that why Anthropic recently gave out free credits for use in off-hours?
That was the carrot to go with the stick. The limits and the issues were never officially acknowledged or communicated. Neither were the "off-hours credits"; you would only know about them if you logged in to your dashboard. When was the last time you logged in there?
- Selling those requests for less money than the compute to serve them costs (because if you raise prices, clients go to OpenAI)
The statements are not contradicting each other?
They keep subsidizing to try to grow their customer base, but they can't serve the customer base they already have. They're betting that the customer base grows faster than it shrinks from people bothered by rate limits (it probably will; the average user won't hit rate limits often enough to switch).
Probably also expecting a breakthrough in compute efficiency, or enough cash flow (IPO?) to buy more compute before it all comes crashing down.
Inference compute over a model's lifetime is now ~10x its training compute for major providers, and it's expected to climb as demand for AI inference rises.
Honestly, I personally would rather have a time-out than have the quality of my responses noticeably downgraded. I think what made me especially distrustful were the responses from employees claiming that no degradation had occurred.
An honest response of "Our compute is busy, use X model?" would be far better than silent downgrading.
Are they convinced that claiming they have technical issues while continuing to adjust their internal levers to choose which customers to serve is holistically the best path?
It worked. Although I have a Claude Code subscription, I got the ChatGPT Pro plan, and 5.4 xHigh at 1.5x speed was better than 4.6 with adaptive thinking disabled. I was working all day, about 8 hours, and did not run into any limits. 5.4 surprised me many times by doing things I usually would not do myself, because I am lazy, so yeah, I am sticking with 5.4 for now until all the Claude drama is over.
OpenAI has made crazy claims though; after all, it's responsible for the memory prices.
In parallel, Anthropic announced a partnership with Google and Broadcom for gigawatts of TPU chips, while also announcing their own $50 billion investment in compute.
OpenAI always believed in compute, though, and I'm pretty sure plenty of people want to see what models with 10x, 100x, or 1000x the compute can do.
Betting on continued exponential growth is basically a game of chicken. Growth has to slow down and level off at some point as adoption and usage saturates.
It's a bit like playing roulette by always betting on black and doubling your bet every time you lose. When you eventually, inevitably, do lose, your loss is going to be huge because you've been doubling your bet at each stage.
With LLM model generations and investment, it goes something like this. Let's say profits have been doubling year over year for each new model/investment cycle, and you want to bet on this doubling continuing forever.
Year 1 you get $10B in profit, and spend $20B on extra capacity for next year
Year 2 you get $20B in profit, and spend $40B on extra capacity
Year 3 you get $30B in profit, and spend $??? on extra capacity
You're already in trouble. Profit growth from Year 2 to 3 was "only" 50% vs. the doubling you were gambling on, so you've now lost $10B (the $40B you spent only earned you $30B of profit). What are you going to do? Double down like the roulette player?
The longer the pattern of profit doubling runs before it slows down, the worse it ends for you, since your bets double each year. Saying "woo hoo, look at me! risk pays!" is a bit like saying the same while playing Russian (not casino) roulette for money.
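The arithmetic above can be sketched in a few lines. This is a toy model, and the "spend 2x this year's profit" rule is just the pattern from the example, not anyone's actual capex policy:

```python
def yearly_outcome(profits):
    """profits: actual yearly profits ($B, hypothetical figures).
    Each year you spend 2x that year's profit on next year's capacity,
    betting that profit will double to cover it. Returns each year's
    profit minus the prior year's capacity spend."""
    results = []
    prev_spend = 0
    for profit in profits:
        results.append(profit - prev_spend)
        prev_spend = 2 * profit  # double down for next year
    return results

# Doubling holds into Year 2, then growth slows to 50% in Year 3:
print(yearly_outcome([10, 20, 30]))  # [10, 0, -10]
```

The Year 3 entry reproduces the $10B loss from the example: the $40B spent after Year 2 earned back only $30B.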
I worked for Acorn Computers UK in the early '80s and saw something similar firsthand. The brand-new personal computer market was exploding, a once-in-a-lifetime phenomenon that no one knew how to forecast. To make matters worse, the market was highly seasonal, with most sales at Xmas, so the company had to guess what continued year-on-year exponential growth might look like (brand new market; no one had a clue), plan and spend ahead, and stock warehouses full of computers ready for Xmas. Sadly Acorn took the Sam Altman highly-optimistic/irresponsible approach, got the forecast wrong, and was left with a huge warehouse full of rapidly depreciating computers. The company never fully recovered, although ARM rose out of the ashes.
Usually they're hemorrhaging performance while training.
From that, it's pretty likely they were training Mythos for the last few weeks and then distilling it into Opus 4.7.
Pure speculation of course, but would also explain the sudden performance gains for mythos - and why they're not releasing it to the general public (because it's the undistilled version which is too expensive to run)
but how true is this? it's almost impossible to measure, and those that do measure it[1] find no significant difference
i personally haven't noticed any downgrade at all.
it's entirely possible there's a mass delusion going on where everyone gets wowed by 4.6 initially, then accepts the new baseline and gets used to it, then thinks that baseline is no longer impressive and thus degraded
it doesn't help that anthropic changed defaults for its claude code harness for all users suddenly
the best and only evidence i've seen for actual degradation is that the web version of opus 4.6 failed the car wash test, and since you cannot simply choose to "disable adaptive thinking" and other parameters with the web version, you truly may have gotten a worse product
What I want to know is why my bedrock-backed Claude gets dumber along with commercial users. Surely they're not touching the bedrock model itself. Only thing I can think of is that updates to the harness are the main cause of performance degradation.
If we learned anything from the code leak, it's that they essentially do not know what is in the black box of that 500k-line mass of code. So that's plausible.
> This coming right after a noticeable downgrade just makes me think Opus 4.7 is going to be the same Opus i was experiencing a few months ago rather than actual performance boost.
If they are indeed doing this, I wonder how long they can keep it up?