frevib · 2026-06-09T17:20:17 1781025617

At this point Anthropic is a pure marketing and PR company. Super catchy names like Opus, Mythos and Fable trying to get you to think that these software products are actually super-human life changing experiences. Boris Cherny coming to HN “Hi! it’s Boris from the Claude Code team” to get real tech people’s goodwill.

From Opus 4.6 there are no noticeable improvements for me in code generation. It works very well, till 90% completion, if you guide it correctly. And you need a little luck. For serious production code I need to understand what I’m doing so it helps a bit, sometimes.

matheusmoreira · 2026-06-09T17:52:46 1781027566

> Boris Cherny coming to HN “Hi! it’s Boris from the Claude Code team” to get real tech people’s goodwill.

This is a good thing. I wish every company would do this. I subscribed to Proton Mail after interacting with someone from their team here on HN.

pinkmuffinere · 2026-06-09T17:40:03 1781026803

> catchy names like Opus, Mythos and Fable trying to get you to think that these software products are actually super-human life changing experiences

This is just good business sense. In what scenario would you ever make the names dumb and forgettable?

> Boris Cherny coming to HN “Hi! it’s Boris from the Claude Code team” to get real tech people’s goodwill.

This is good customer support, lol. From what I can tell, it is indeed Boris Cherny responding, not outsourced to AI or other staff. You're really getting a response from Boris. I suppose that is PR, but it's not unjustified PR, it's accurate.

I'm not even a crazy AI fan, but your criticisms are ridiculous here. It reminds me of the quote from Knives Out -- "Your Honor, she endeared herself to him through hard work and good humor."

IshKebab · 2026-06-09T17:50:11 1781027411

> In what scenario would you ever make the names dumb and forgettable

Clearly you've never bought a TV or headphones!

aspenmartin · 2026-06-09T17:31:44 1781026304

Your observations are right, but it’s pretty insane to consider them a pure PR company lol. They are making more frequent releases, so yes, the release-to-release quality gain is smaller, but we’re still ascending the quality and reliability curves the same way we have since GPT-3. You get a GPT4->5-sized leap every 17 or 18 months or so, I think.

kingkongjaffa · 2026-06-09T18:14:48 1781028888

The gradient of improvement is absolutely not the same.

aspenmartin · 2026-06-09T18:42:13 1781030533

If anything it's slightly higher. Feel free to provide any evidence to the contrary.

ECI (good aggregate measure using IRT): https://epoch.ai/eci?view=graph&tab=release-date&subset-view...

METR time horizon (now topped out): https://metr.org/time-horizons/

WASDx · 2026-06-09T20:15:44 1781036144

I like this one, although its data seem to overlap with ECI.

https://artificialanalysis.ai/trends

astrange · 2026-06-09T18:05:27 1781028327

> Super catchy names like Opus, Mythos and Fable trying to get you to think that these software products are actually super-human life changing experiences.

They're originally named after the blends at a nearby coffee shop.

https://postscript.co/pages/brew-guide

I've noticed nobody at HN knows what "marketing" is or how to do it. It's not just naming things and being evil and cynical is not the most successful method.

…also frontier models are a superhuman life changing experience. If they aren't, what possibly could be?

ValentineC · 2026-06-09T20:58:04 1781038684

Found a tweet from a year ago about this:

https://twitter.com/brian_a_burns/status/1866987688794132816

Well, TIL.

chroma_zone · 2026-06-09T19:58:29 1781035109

My life has changed, but not necessarily for the better.

bitpush · 2026-06-09T18:11:49 1781028709

This is interesting. Do you have any source?

CuriouslyC · 2026-06-09T17:26:40 1781026000

I dislike Anthropic but I wouldn't argue 4.8 isn't an improvement on 4.5/4.6. Your tasks just might not typically need the extra intelligence.

jorl17 · 2026-06-09T17:44:11 1781027051

Opus 4.7/4.8 often over-engineers on my setups, plus:

- It talks a LOT more like GPT models. You know: wrinkle, shape, gate, coarse, scope, gap, path, production-ready-workflow-of-the-day, and so on -- "that's expected, a consequence of the previous like-driven workflow". If I wanted to get a headache using AI I would have gone with GPT in the first place!

- It outputs text in a way that is much harder to follow. I can't exactly say what it is. Maybe a bit of everything? Bolds are missing, bullet points are gone, paragraphs are bland and too long, and it doesn't feel like a model programming with me, but rather a somewhat full of themselves grandpa developer looking down on me. It's very weird to describe this, but it is definitely how I feel.

Granted this can totally be because of the way it reacts to the prompts now. We've got a rather large corpus of skills and "rules and good practices" that Opus 4.6 responded to great, and maybe the new models just get turned into this when fed with them....I don't know.

Either way, with Opus 4.6 being as good as it is, I need Fable to be a significant step up to justify a price increase. If it can get me to babysit Opus a little bit less on some stuff, it might be worth it. Otherwise, I'm very happy with Opus 4.6 and hope they don't deprecate it.

taormina · 2026-06-09T17:38:15 1781026695

I'd argue that 4.8 is a straight downgrade. For every type of task I've tried. It's been a gambit at this point. If 4.6 quits being available, I'm out at this point.

coronapl · 2026-06-09T19:03:47 1781031827

Reading so many contrary positions about which model is better or worse shows how difficult it is to measure intelligence based on personal experiences. Of course, benchmarks try to make the process as objective as possible, but they often don't correlate with our personal experiences.

The other day 4.6 was fantastic for x task. Today, 4.6 overengineered everything and I had to revert all my changes. When evaluating models, perhaps it makes sense to consider luck as an ingredient before reaching any personal conclusion.
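
The "luck as an ingredient" point can be made concrete with an exact sign test. This is only a minimal sketch, assuming you have run the same tasks through two models and recorded, per task, which one you preferred (ties dropped). The function name and the example counts are hypothetical and not taken from anyone's data in this thread.

```python
from math import comb


def sign_test_p_value(wins_a: int, wins_b: int) -> float:
    """Two-sided exact sign test for paired preferences.

    Under the null hypothesis that both models are equally good,
    each non-tied task is a fair coin flip. The returned value is the
    probability of a split at least as lopsided as the one observed.
    """
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # no informative comparisons at all
    k = min(wins_a, wins_b)
    # Probability of k or fewer "heads" in n fair flips (one tail).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical tally: model A preferred on 7 of 10 tasks.
    print(sign_test_p_value(7, 3))  # 0.34375
```

With a hypothetical 7-to-3 split over ten tasks the p-value is about 0.34, so a result that feels decisive is still well within what chance alone produces at that sample size.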

surgical_fire · 2026-06-09T17:45:41 1781027141

I actually experience 4.8 as worse than 4.6 for everyday coding tasks.

dcchambers · 2026-06-09T17:30:55 1781026255

IME Opus 4.8 (and 4.7) is often a downgrade from 4.6. I find that it tends to overthink and overcomplicate things.

aspenmartin · 2026-06-09T17:33:45 1781026425

Yes but there’s a reason we don’t evaluate these models this way and instead do it as carefully and thoughtfully as we can at scale. Human evaluations are important but they are an absolute minefield of footguns. 4.8 is not a downgrade from 4.6; there is an insane amount of hard data that contradicts this.

computerex · 2026-06-09T17:44:04 1781027044

The flip side is that benchmarks are gamed even by the top labs. Benchmark performance doesn't necessarily correlate with real world performance.

aspenmartin · 2026-06-09T18:00:30 1781028030

Again correct but it overstates the issue. I can say labs don’t want this. This happened arguably unintentionally in Meta’s Llama 4 release: it went horribly, heads rolled, something like several billion dollars were paid for new talent, and the org that built Llama 4 was destroyed.

Evals come from a million places and new evals and robust perturbations of existing evals abound. They test a variety of tasks in a variety of ways. All of them individually are flawed. Taken together the aggregate signal is highly useful as you more or less marginalize over a lot of different things. Not to mention these companies have plenty of proprietary internal measurements, they build benchmarks themselves to probe their models and then also have flywheel traffic and A/B tests.

You are right to call out benchmarks but to dismiss them or not take them seriously is a mistake.

taormina · 2026-06-09T18:11:39 1781028699

Listen, you can say “but benchmarks, the benchmarks!” all day long, but consumers know when we are being sold a lemon. If it can’t do the most basic of things at least as good as it used to, this is table stakes. Never mind that if you can’t do the basic stuff, how on earth can you be trusted with more?

aspenmartin · 2026-06-09T18:37:16 1781030236

And you can say “If it can’t do the most basic of things at least as good as it used to, this is table stakes” all day long while people point you to much better evidence to the contrary too, I’d rather be on the other side of that.

taormina · 2026-06-09T19:10:36 1781032236

Listen. I don’t care about evidence. I care about my lived experience for the product I paid for. I used the new product. It’s actively terrible. To the point of not being usable. We’re all anecdata, but what is “better evidence to the contrary”? The known and game-able benchmarks that they know they need to win at, so they train it to. It’s all he said, she said, which is the only reason we keep having this conversation.

aspenmartin · 2026-06-09T19:24:49 1781033089

Yea but it’s not right? You or I or the myriad of other institutions inside and outside of academia can probe these models with an evolving landscape of evaluation sets, even those unavailable to the developers. It’s just ignorance to claim benchmarks are somehow useless or all being gamed. You choose your tools in the way you want, but just don’t call it somehow better than a myriad of more carefully constructed setups and scaled evaluations.

gen220 · 2026-06-09T17:57:57 1781027877

Actually anecdata I gather on my job from myself and coworkers is the only benchmark I trust anymore, because it so heavily diverges from the “benchmarks”.

aspenmartin · 2026-06-09T18:01:44 1781028104

That’s your call just don’t expect anyone ever to take that seriously. It’s not like we don’t have exact evaluations like this.

gen220 · 2026-06-09T20:09:43 1781035783

I would encourage you to look into the open evals of some of these benchmarks (find one that actually is open-data, this is itself a good challenge), read the results generated and assess them for yourself.

This is what myself and my coworkers (and many other people in this thread) are doing on a daily basis with real stakes and real tasks – which these benchmarks are all aiming to be a proxy for. There's a real, tangible [cost]benefit to [not] using the highest-ROI models and harnesses.

The people with real incentives and skin in the game are telling you that the data diverges from "the data".

I don't mind if you don't take it seriously, our jobs are more important to us than a benchmark is.

But I wouldn't opt-out of using your own eyes and the eyes of others so easily, especially when there are literally hundreds of billions of dollars in invested capital with an interest in a certain outcome... this is how you end up in "Emperor's New Clothes" situations.

aspenmartin · 2026-06-09T21:03:34 1781039014

Investigating on your specific use cases, codebases, workflows and tasks is important. There is nothing wrong with this, and in fact it’s more important than benchmarks if you can do it well, but the point is that it is very hard to do and easy to totally fool yourself and go down a suboptimal path. I understand that people are going to do it regardless; I certainly do. And I have looked at more raw benchmark data than I can really even stomach; I can see annotation data in my dreams now.

Eyes and ears of others are incredibly important. But you still seem to think somehow benchmarks are part of some giant conspiratorial cabal. You have institutions without ANY skin in the game making extremely high quality benchmarks. Consider in academia there is little else to do outside of partnerships with these companies. But benchmarks you can do completely independently and with university grant level money (it costs maybe $10-100k for a reasonable benchmark in many cases). Not only that, “real tasks” are what many benchmarks measure. You have these companies with extremely good logging and well scaled measurements to really look at what works and what doesn’t.

gen220 · 2026-06-09T23:46:18 1781048778

At this point I have a workflow that is fairly rote. I've yet to use a model newer than 4.6-1M-XHIGH that I trust to earn a higher ROI on that workflow, and not for lack of trying!

I personally don't believe in any sort of cabal (Occam's Razor hasn't let me down yet). Ultimately, I don't really care *why* they're wrong as much as I care *that* they have diverged from my rubber-meets-the-road measures of value.

That is concerning to me, because people are investing 100s of B's of capital based on the putative RoI putatively available to people like ourselves. When the benchmarks support this RoI thesis, but none of the anecdata does... that's really concerning!

Re: academics, I don't think any of the data academics have access to are good proxies for the work real people are doing. And for the data that are good proxies, the model labs certainly have access to the same data, and therefore the benchmark performance against those data is irrelevant.

aspenmartin · 2026-06-10T16:22:40 1781108560

I am in full support of custom workflow benchmarks, and choosing the best model for your use case to balance performance and expense. That's just good operating behavior, but the problem is the footguns and biases people have that they are convinced they don't have, even if they understand on an intellectual level that everyone else has them.

> but none of the anecdata does... that's really concerning!

But see this is not really true -- adoption, subjective benchmarks, verifiable benchmarks, task-dependent performance, internal product metrics, living benchmarks, all point in a pretty consistent direction. Anecdata is not the plural of data. An anecdote is like a case study. It's there to motivate the things we already have which is a huge amount of performance measures for a variety of different tasks.

> Re: academics, I don't think any of the data academics have access to are good proxies for the work real people are doing.

But this isn't really true either -- you can get this data from a variety of sources that are licensable or open source, or data that you can commission. You can critique any one methodology for this but a blanket "they are hamstrung" is not really fair or accurate.

> And for the data that are good proxies, the model labs certainly have access to the same data, and therefore the benchmark performance against those data is irrelevant.

But this is also not true -- you can have exclusive license agreements, data you hold close to the heart, or data to measure models that haven't had access to it because that data was created after these models were released.

There are plenty of problems in model measurement but the answer is not to just abandon it to be cavemen with zero respect for rigor and the biases we have to be subject to as human beings.

recitedropper · 2026-06-09T18:06:28 1781028388

"Carefully and thoughtfully" is antithetical to the approach to benchmarks these days.

Maybe back when this was a scientific endeavor; not now when enormous, enormous amounts of capital are on the line. Along with an entire cult's chosen eschatology.

aspenmartin · 2026-06-09T18:39:46 1781030386

You can call it a cult but it’s several thousand skilled workers who know what they’re doing, by and large, most of whom have a PhD and know how science and statistics work. Benchmarks are incredibly hard, and any PR or comms department at any company is going to obviously want to make things as rosy as possible, but beneath this are earnest, expensive efforts to get good quality measurements. The better you can do this the better you can compete. If you want to make a modeling decision you run an ablation, and the quality of that decision is only as good as your measurements.

recitedropper · 2026-06-09T19:56:05 1781034965

The cult in this case is TESCREAL, not everyone working on AI. Last I checked not all the "several thousand skilled workers" in AI subscribe to TESCREAL ideology, although it has been a while since I've been to the Bay. Maybe things have changed since my time at Berkeley, and Dario's belief that he will eventually be made immortal by mind uploading is more widespread.

Otherwise we agree that benchmarking is hard, the benchmarks contain hard problems, and that there are many hard working people trying to accurately gauge what is going on. It is getting harder to watch though as all that is on the line taints the overall endeavor.

pythonaut_16 · 2026-06-09T19:10:20 1781032220

Seems like a bunch of noise. What does this even mean?

It sounds like you're saying "Actually you, as a human, are simply not smart enough to evaluate Opus 4.8"

aspenmartin · 2026-06-09T19:20:35 1781032835

No it’s: evaluating these systems is complex and there’s a reason why sociology, cognitive psychology, medicine, etc. are all done in careful double-blind conditions with preregistered tests. It’s not that humans are not smart enough, as I said human evaluations are incredibly important. And yet they are a minefield of biases you have to worry about and correct for.

- evaluations need to be done at the same time to avoid drift in your bias

- you need to worry about your test set: which questions are you asking? How many of them? Are they representative of your work?

- which one did you do first? Raters have a tendency to bias in one direction or another

- you also know the label! You know which model is which! This biases your assessment…

And on and on and on. Careful science exists for a reason.
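
The checklist above can be sketched as a small blind pairwise comparison harness. This is only an illustration under stated assumptions: you already have outputs from two models for the same prompts, you rate them in one sitting, and the names `blind_trials` and `tally` are hypothetical helpers, not part of any existing evaluation tool. It hides the model label and randomizes both the left/right position and the task order. It does not address whether the prompt set is representative of your work.

```python
import random
from collections import Counter


def blind_trials(prompts, outputs_a, outputs_b, seed=0):
    """Build rating trials with labels hidden and order randomized.

    Each trial shows the two outputs for one prompt as "left" and
    "right". The "key" field records which model is on which side
    and must not be shown to the rater until rating is finished.
    """
    rng = random.Random(seed)
    trials = []
    for prompt, a, b in zip(prompts, outputs_a, outputs_b):
        pair = [("A", a), ("B", b)]
        rng.shuffle(pair)  # randomize which model appears first
        trials.append({
            "prompt": prompt,
            "left": pair[0][1],
            "right": pair[1][1],
            "key": (pair[0][0], pair[1][0]),
        })
    rng.shuffle(trials)  # randomize the order of tasks as well
    return trials


def tally(trials, choices):
    """Unblind after rating: map 'left'/'right'/'tie' back to models."""
    wins = Counter()
    for trial, choice in zip(trials, choices):
        if choice == "tie":
            wins["tie"] += 1
        elif choice == "left":
            wins[trial["key"][0]] += 1
        elif choice == "right":
            wins[trial["key"][1]] += 1
        else:
            raise ValueError(f"unknown choice: {choice!r}")
    return wins
```

A rater would see only `prompt`, `left` and `right` for each trial, record a choice, and call `tally` once every trial has been rated.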

OtomotO · 2026-06-09T19:20:31 1781032831

There is no data that I would trust that contradicts it.

Frankly I don't give a damn about data that could be made up on the spot or appears to be scientific or meaningful while it's not at all clear how it was made (up).

Claude was heavily lobotomised for my work starting sometime in February.

I talked to friends and people I know and trust and many felt the same. (I didn't ask them whether they felt like I did, but what they felt, how happy they were with agentic coding etc.)

I cancelled my subscription in March and talked to said friends who are still on a plan just last week: they are still not happy, but company pays so whatever...

aspenmartin · 2026-06-09T19:27:11 1781033231

That’s ok but at what point is this getting into conspiracy territory? You have just said there is nothing you would believe to the contrary, but then by definition that’s not exactly a very thoughtful or insightful position.

OtomotO · 2026-06-10T04:42:14 1781066534

I never said that I am not willing to believe the contrary.

I am not willing to believe the contrary from strangers on the interwebs or PR departments of companies who want to sell me something.

If people I genuinely trust tell me about their experiences, I am willing to try again.

But yes, if it doesn't work for me (for whatever reason, could be that I am holding it wrong), then I can accept that it works for everyone but me and still not use it.

Also "scientific" doesn't mean what it used to mean. When the n is small or it's just anecdotes (I am aware of the irony) blown out of proportion I really can't take the data and conclusions seriously

aspenmartin · 2026-06-10T13:19:32 1781097572

N isn’t small, science means what it’s always meant, statistics is a thing, and what you’re describing is just putting your trust in a very poor quality benchmark. You said you would not trust any data that indicates something that contradicts your opinion. Benchmarks are not PR they are designed by a variety of institutions completely outside the control of frontier labs. Again congratulations on your conspiracy theory.

OtomotO · 2026-06-10T15:30:09 1781105409

> Again congratulations on your conspiracy theory.

I am neither impressed nor offended by any kind of argumentum ad hominem. I sincerely hope you have a wonderful day!

> Benchmarks are not PR they are designed by a variety of institutions completely outside the control of frontier labs.

I don't give a crap about how good a shovel may be in a theoretical experiment when it's digging in sand, when I work with hard earth.

The ones I had a look at are mostly absolutely meaningless to my actual work.

> and what you’re describing is just putting your trust in a very poor quality benchmark.

And here is where we disagree fundamentally, so we can leave it at that.

Ex falso quodlibet

aspenmartin · 2026-06-10T16:02:48 1781107368

> I don't give a crap about how good a shovel may be in a theoretical experiment when it's digging in sand, when I work with hard earth.

I don't know what this means, benchmark tasks are pretty hard and pretty in domain.

> The ones I had a look at are mostly absolutely meaningless to my actual work.

You've looked at 100,000 benchmarks?

> And here is where we disagree fundamentally, so we can leave it at that.

Yes we do disagree, yet one of us has statistics and rigor and one of us doesn't.

OtomotO · 2026-06-10T17:09:36 1781111376

> You've looked at 100,000 benchmarks?

What about "The ones I had a look at" was unclear?

> Yes we do disagree, yet one of us has statistics and rigor and one of us doesn't.

Yup, that's true. So again, have a nice life!

BoorishBears · 2026-06-09T17:40:45 1781026845

"Fable 5" is Opus 4.7, and the Opus 4.7 we got is a Sonnet sized model on a stronger base.

That's where all the regressions and inconsistency in experiences stem from: RL can still only go so far vs having more parameters

OtomotO · 2026-06-09T19:08:44 1781032124

Lol. If you're doing anything non trivial that's not a CRUD webapp but e.g. some physics simulation or high performance GPU code any and all models I've tried suck.

They are not just leagues behind what experts would code, they are not even playing the same game.

Which is to be expected, as there isn't so much physics or high performance gpu code available as there is for your typical CRUD API and JS frontend.

rweichler · 2026-06-10T00:42:23 1781052143

I can attest to this. I had a very simple 20-line shader and asked Claude to do a basic 90-degree rotation on it, and it just completely got it wrong. It frequently adds pointless abstractions / intermediate variables even when I tell it explicitly not to in the system prompt. I can go on and on; these things just don't understand architecture. And why would they? They were trained on text.

There is something remarkable about turning speech into code (don't need to hunch over a keyboard nearly as much these days, can just talk into a mic) and it's good for first drafts / exploring ideas. But it's obvious to anyone that's paying attention we're hitting the top of the S-curve. It's no wonder the IPOs are around the corner. I mean even Dario admitted he doesn't know how they're gonna substantially increase the context window size. That says a lot.

rweichler · 2026-06-10T03:20:46 1781061646

That being said I think the harnesses are only getting better. And maybe we will get multi-modal models that understand architecture eventually. But the growing-the-blob-of-text training method that's being used now appears to be getting diminishing returns

gruez · 2026-06-09T17:44:01 1781027041

I don't get it, your complaint is that they have catchy names rather than dry names like GPT-5.6? Does OpenAI hype their models less?

Aperocky · 2026-06-09T17:50:51 1781027451

Oh, Far less.

It's getting to a point that it's offputting, and the next step would be to put it into "untrusted" bucket. Opus 4.7 already burned their credibility once, 2 more strikes remain.

aenis · 2026-06-09T18:21:57 1781029317

Not my impression. I felt 4.7 was a regression, but I am again badly in love with 4.8 with the level of insights it produces in design discussions, and how long it can go unattended while producing spec-adhering quality code. There are problems it still can't solve well, from the edges of algorithmics and far from the mainstream, but for lots of stuff it is godlike.

Also, I don't think Boris C. is coming here for PR. He is a tech guy, and this is the best place for tech discussions. Why so cynical? The guy is an engineer.

jwpapi · 2026-06-09T17:53:01 1781027581

I don’t even think that Boris is really just one person. He apparently vibe coded Claude Code and is responding on Threads, Twitter, HN and everywhere.

guybedo · 2026-06-09T18:20:45 1781029245

They're good at marketing, but my first subjective assessment of Fable is that it's really smart.

I've been working with gpt 5.5 and opus 4.8 quite a lot, and interacting with Fable feels like a smart guy just entered the room.

boc · 2026-06-09T22:25:23 1781043923

Yeah idk what people are talking about- it's not marketing. This thing is substantially better than opus 4.8/gpt5.5 from what I'm seeing today.

iillexial · 2026-06-09T20:00:20 1781035220

>Hey! Boris from the Claude Code team!

>TOP 5 METHODS FROM BORIS ON HOW TO SPEND MORE MONEY ON TOKENS

>Boris from Claude just told he doesn't prompt anymore. He LOOPS instead

>"chatgpt has gotten soooo much better with the latest update."

>"codex is the best AI coding product and we want to make it easy to try."

Karpathy about Fable 5:

>"You can give it a lot more ambitious tasks than what you're used to, the model "gets it""

Sam Altman about gpt-5.4:

>In my experience, it "gets what to do"

What a time to be alive. Models are great, but all the slop, marketing, and fakeness around them is just unbearable.

replwoacause · 2026-06-10T05:41:23 1781070083

Yeah, the marketing is cringe and it's a bummer that such a cool and powerful technology attracts such an icky group of enthusiasts. Surely, not all are bad, but man there are lots of goobers who are just AI-pilled hypemen who can't STFU about it.

avaer · 2026-06-09T17:51:19 1781027479

If you truly believe this, you've discovered a superpower over everyone else in the industry.

While everyone else is wasting time and money on the slower, more expensive models, you've found a way to outpace everyone for less money. Everyone else is wrong and you will get rich.

(I don't actually believe the premise is true, I'm just pointing out the logical conclusion to what you're saying so maybe we can reconsider the premise)

xyzsparetimexyz · 2026-06-09T18:19:30 1781029170

That's not how costs work. You don't get rich off buying a €10 hammer that's the same quality as someone's €50 hammer

atleastoptimal · 2026-06-09T18:05:19 1781028319

> At this point Anthropic is a pure marketing and PR company. Super catchy names like Opus, Mythos and Fable trying to get you to think that these software products are actually super-human

Lol anti-AI bias on HN is crazy. Simply giving your product a quirky name is now being considered manipulative advertising. Is just doing normal PR and marketing something AI companies aren't allowed to do?

ausbah · 2026-06-09T18:31:01 1781029861

when they keep saying “oooh this new model is too big and crazy and totally can’t be released” or “this new model is a 10x game changer totally unlike our previous iterations” it feels sort of like the boy crying wolf. yes they’re still pretty clearly improving models, but when you’ve hit diminishing returns / more incremental gains and you’re still saying this, it sounds like pure PR hype from a company that had previously been the “honest good guys” in the room

atleastoptimal · 2026-06-09T18:34:54 1781030094

Their model did find thousands of security vulnerabilities across the companies they previewed Mythos with via project Glasswing. Given that emergent level of capability, is it not sensible that they do this gated release structure, as all those vulnerabilities would be exploitable by anyone using a Mythos-level model?

thefreeman · 2026-06-09T17:53:33 1781027613

How can you make this comment before even having a chance to try the new major model revision?

piyuv · 2026-06-09T17:34:49 1781026489

Current AI hype is built on marketing and PR, not capabilities, and has been from the start.

I still remember Sam Altman “begging AI to be regulated” and AGI being “some thousand days away”.

Breed faster horses and hope one will birth a locomotive.

WarmWash · 2026-06-09T20:07:26 1781035646

Don't forget the DoD stint that gave them this recent public boost.

Defying standard DoD precedent going back forever, which every other country has some form of too, and championing it like they are some kind of moral freedom fighters.

Like selling the DoD guns and telling them they can only shoot bad guys with those guns, and that you will be the one to decide who counts as a bad guy...

reasonableklout · 2026-06-09T17:33:59 1781026439

I think this says more about your type of work than anything. For bugfinding/incident response in distributed systems - which often involves extensive use of Datadog/Sentry MCPs and poring over heaps of logs in addition to reading tons of code - 4.8 has been significantly better than 4.6.

nozzlegear · 2026-06-09T18:31:00 1781029860

> Sentry MCPs

Oops, time to reauthenticate for the 10th time!

xpct · 2026-06-09T17:41:35 1781026895

Indeed, hearing "Mythos-class model" felt very icky to me.

b3kart · 2026-06-09T17:52:21 1781027541

https://en.wikipedia.org/wiki/Typhoon-class_submarine vibes

system2 · 2026-06-09T17:43:08 1781026988

You are right; all I noticed was a big-time slowdown. They increased the quota, but I cannot even reach the end of the day with these speeds. .NET coding somehow improved, though.

MattGaiser · 2026-06-09T17:38:33 1781026713

Doesn't this suggest your use case is simply insufficiently complicated?

mawadev · 2026-06-09T18:13:20 1781028800

When the AI overlord is descending into pleb space to say Hi, you know stuff is real

chis · 2026-06-09T18:14:04 1781028844

Hackernews not blindly hate on AI challenge: impossible

frevib · 2026-06-08T16:35:37 1780936537

This guy dubbed it “get Komooted”, as they pulled the same trick for used-to-be-great cycling app Komoot: https://bikepacking.com/plog/when-we-get-komooted/

The app quality almost immediately went down the drain after the acquisition by Bending Spoons.

insane_dreamer · 2026-06-08T22:05:12 1780956312

I don't like PE players like Bending Spoons, but I have used Komoot extensively for years, for cycling (and more recently hiking), and haven't seen any decrease in quality since the acquisition.

doctorpangloss · 2026-06-08T16:56:59 1780937819

With LLMs, I feel like they'll have the last laugh.

frevib · 2026-05-28T12:38:39 1779971919

“the reality of AI right now is that it only works for coding.”

Kind of…

frevib · 2026-05-27T05:27:06 1779859626

Code is a liability. Saying no is because the engineer wants to reduce complexity, not because she/he is so subjectively “obsessed” with code quality. The term “quality” is nowadays misunderstood by management. It means the right amount of effort to build the product as fast and for as low a cost as possible, taking into account a team of engineers that can easily add and modify code.

This description is the better one: https://www.nair.sh/guides-and-opinions/communicating-your-e...

gfody · 2026-05-27T06:04:02 1779861842

"quality" isn't succinctly definable. Zen and the art of system maintenance quality code is written by an old and wise programmer and any attempt to rigidly codify what it is they did and why is doomed to fail.

peder · 2026-05-27T15:05:04 1779894304

And in the agentic world, that liability is both minimized and amplified. Teams that successfully mitigate AI risks will be able to churn out massive amounts of sustainable code.

simianwords · 2026-05-27T05:35:06 1779860106

You are the archetype the OP is talking about because you repeat aphorisms like "code is a liability" that compress some truth too much and forget the larger picture.

Edit: apologies for personal attack. Didn’t mean for it to come across that way

dang · 2026-05-27T06:29:32 1779863372

Please don't cross into personal attack on HN. You can make your substantive points without that.

https://news.ycombinator.com/newsguidelines.html

p.s. We've had to ask you this before: https://news.ycombinator.com/item?id=47103856. If you'd please review the guidelines and take the intended spirit of the site to heart, we'd be grateful.

Toutouxc · 2026-05-27T05:51:23 1779861083

You got all that from four words?

Also what's wrong with "code is a liability"? That's just 100 % true. The idea isn't exactly novel or revealing, but it's also really fundamental. Every line of code is a liability from day one.

The comment you replied to used that as a reminder and as an opening to an actual argument, it wasn't just a knee-jerk reaction.

thrownthatway · 2026-05-27T05:58:35 1779861515

Code is an asset.

Code that enables a company to generate revenue is an asset.

Code is an asset.

Code that can be sold for more than it cost to generate is an asset.

strken · 2026-05-27T06:26:19 1779863179

Production code is an asset, its maintenance and obligations are an expense, its risks might become liabilities, and companies shouldn't run more code than they need for the same reason they shouldn't own a larger vehicle fleet or more spare warehouse capacity than they need.

I don't think most engineers really disagree with this. Saying code is a liability is technically incorrect but pithy shorthand to communicate that it comes with the associated baggage of maintenance, obligation, and risk; these things suck up money the same way a liability does. Tech debt is also not real debt. It's a figure of speech.

pyvpx · 2026-05-27T06:11:42 1779862302

A building is an immovable asset. It’s also made of things that wear and tear. Its value is derived from its capability to house and the capability to house something extends beyond four walls and a roof.

The asset has inherent liabilities. A codebase can be reasoned about extremely similarly

Toutouxc · 2026-05-27T06:11:59 1779862319

Note how the longer sentences are significantly stricter than the shorter ones. You could maybe add another condition, in the sense that the code has to generate more revenue than it costs to maintain. Then I'd start to agree.

Also note that even when a line of code is generating revenue, it never stops being a liability in almost every sense of the word. Testing it still costs money and time, understanding it costs cognitive power, having it in the context of your LLM coding agent costs tokens, and that's assuming it's a good line of code. If it's bad code (badly named, badly placed, a logic chain that works but has hidden flaws), the costs increase and reverberate throughout the codebase (and your AI coding sessions).

thrownthatway · 2026-05-27T06:18:09 1779862689

Yep, that was pretty much the point of my comment.

Code is a liability - yes with an and / no with a but.

I’m sure volumes have been written about it, and there’s roughly an infinite amount of nuance to be talked about over a few drinks and a smoke.

andwur · 2026-05-27T06:06:46 1779862006

That's an oversimplification. Asset vs liability isn't a binary state but a superposition. An asset can carry liabilities.

Your asset might generate $10k a month in revenue, but at the same time may have a high chance of needing a $100k investment in upgrades and repairs to remain productive.

sevenzero · 2026-05-27T05:50:55 1779861055

Nah man. You got to say "no" a lot. Even in the age of AI. Oftentimes features downright make no sense, the time to implement can span weeks and it would actively damage the product in the long term. I work in an ecom startup and I got to say no so many times due to added complexity for little reward.

frevib · 2026-05-27T06:06:16 1779861976

I think saying no is more important now with AI, as features can be built so quickly now. But there are a lot more costs after the feature has been built. Mostly with AI the code isn’t understood that well, which incurs a cognitive debt. Then there are extra maintenance and documentation costs. And the costs of carrying around features that add no value.

I can imagine that if you’re a startup and want to try new features quickly, it makes sense to say yes more. But the senior mentioned in the article will also be able to understand that.

wellactchully · 2026-05-27T13:26:10 1779888370

Are you nitpicking that with the right architecture and safeguards, unaccountable lines of code are perfectly harmless?

Because I think in most regular situations, code-without-adjectives, uncertain commits, or any number of things might be rightfully justified as a literal legal liability for business cases.

I get that you don't like how flat it is, but on a business website, in a world forecasted to be full of black box code, the statement is correct.

Code in a vacuum may not be a personal liability, but it is a professional one in 2026 where there's a gulf between slop and secure code

raincole · 2026-05-27T05:42:59 1779860579

Thank you. I was about to reply the same comment but I couldn't say it as concisely.

The funny part is some top comments are saying this article is straw man and the rest of them are just proving the archetype is very real.

frevib · 2026-05-26T19:52:59 1779825179

CLOUD Act and FISA §702

frevib · 2026-05-26T14:36:39 1779806199

They’re not doing too great atm: https://www.msn.com/en-us/money/topstocks/kyndryl-s-stock-is...

pantulis · 2026-05-27T11:30:53 1779881453

As are all consulting firms, to be fair

frevib · 2026-05-26T14:34:41 1779806081

> AFAIK Solvinity can't access the data.

Solvinity is the hoster. It can fully access the stack.

crote · 2026-05-26T15:02:07 1779807727

It's even more complicated: the datacenter and the servers are owned and operated by the government, and the DigiD app itself is owned and operated by government-owned Logius.

From what I have been able to deduce, Solvinity is contracted for some kind of sysadmin services - so basically Kubernetes babysitting?

overfeed · 2026-05-26T18:55:34 1779821734

Are you suggesting sysadmin access isn't sufficient to access data?

frevib · 2026-05-13T12:30:08 1778675408

Scaleway has introduced Edge services recently: https://www.scaleway.com/en/edge-services/

No DDoS protection yet.

wolvoleo · 2026-05-13T13:11:43 1778677903

I've been very happy with Scaleway for many years, yes. I can recommend them. Much more professional than OVH and Hetzner too.

frevib · 2026-05-13T06:04:29 1778652269

To the user: friendly message with uuid.

In the logs: detailed technical message with uuid.
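
A minimal Python sketch of that split, assuming a standard `logging` setup. The function name `handle_error`, the logger name `app.errors`, and the wording of the user message are made up for illustration:

```python
import logging
import uuid

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("app.errors")


def handle_error(exc: Exception) -> str:
    """Log the technical details under a fresh UUID and return a user-safe message."""
    error_id = str(uuid.uuid4())
    # In the logs: detailed technical message (type, text, traceback) with the UUID.
    logger.error(
        "error_id=%s %s: %s", error_id, type(exc).__name__, exc, exc_info=exc
    )
    # To the user: friendly message carrying the same UUID and no internals.
    return f"Something went wrong. Please contact support and quote reference {error_id}."
```

Support can then search the logs for the reference the user quotes, without the user ever seeing stack traces or hostnames.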

frevib · 2026-05-12T19:28:20 1778614100

Please not the schools. We don’t need privacy-invading closed systems with built-in slot machines. We need deterministic open systems where kids’ privacy is protected.

Please not schools…

throwfish3000 · 2026-05-12T19:30:03 1778614203

Chromebooks that run on Google services are already the default 1:1 device in schools. They're cheap, they take a beating and have good battery life.

frevib · 2026-05-12T19:56:31 1778615791

Same here. They’re subsidized by taking kids’ privacy.

bryanlarsen · 2026-05-13T00:10:49 1778631049

That would be illegal in many jurisdictions. And schools in general take privacy very seriously. Most schools won't sign up for google edu without a solid privacy guarantee.

Google is likely very happy to give up on the privacy violations for a few years of a child's life in exchange for getting that child hooked on Google services so they can freely violate privacy for an entire adult lifetime.

frevib · 2026-05-13T05:02:27 1778648547

> without a solid privacy guarantee.

That’s a promise, no technical guarantee. Then there’s the CLOUD Act and FISA.

> Google is likely very happy to give up on the privacy violations

“likely”, exactly. This can change any time. We’ll just have to trust them. Scrolling through this thread, there seems to be about zero trust in a US ad company whose specialty is feeding off people’s privacy.

We should by now be demanding technical guarantees. Open source, end-to-end encrypted with e.g. an overseer board checking the company. Companies like Proton are doing this.

dormento · 2026-05-12T20:47:35 1778618855

And normalizing Google's model of computing, surveillance, locked down platforms etc...

munificent · 2026-05-12T22:38:44 1778625524

I don't know what the "default" is, but as a data point of one: my kids' public school is all Windows laptops.

zythyx · 2026-05-13T03:17:48 1778642268

The default is very very heavily weighted in Google's "Chromebook" favour. Getting a school with Windows (or Mac) exclusivity is a 4-leaf clover. Google genuinely have a pretty good product with Google Classroom though, so it's not completely lost. It's just a problem when schoolkids grow up and end up with new Windows/Mac laptops and have no idea how computers work outside of the web browser.

dpoloncsak · 2026-05-12T20:54:16 1778619256

I'd assume this opens up 'Googlebooks' to compete with the GPU/M Series Premium laptops so schools can provide them to teach things like Photoshop, Illustrator, CAD Design, anything that Chromebooks couldn't do, right?

PaulHoule · 2026-05-12T19:57:05 1778615825

The performance of the machine offered at schools seems to get just a little worse every year too... like one of these days they won't have to worry about kids playing Krunker in class because they won't be able to.

Brainspackle · 2026-05-12T19:49:22 1778615362

My kids' schools all use iPads

7734128 · 2026-05-12T19:38:56 1778614736

It would be so much better for the students' IT proficiency if there were some ordinary Linux computers instead. Preferably with limited central management.

The Chromebooks are probably cheaper than the hardware itself could be, but that's a good demonstration of the issue.

afavour · 2026-05-12T19:59:53 1778615993

It wouldn’t. The central management of Chromebook is what makes the whole system usable. All you’d be doing is sentencing school IT folks to endless, endless support requests.

raphman · 2026-05-12T21:40:10 1778622010

Funny. At my son's school in Germany, students may bring any device they want without central administration (just Wifi and web platforms). It works quite well without inundating IT staff with support requests. (To achieve at least some similarity of systems, you get a partial refund if you buy either iPads or convertible notebooks running Windows. My son's notebook technically runs Windows but he mostly uses plain Debian Linux with Xournal++.)

afavour · 2026-05-12T22:02:12 1778623332

That sounds wonderful for tech-literate families. Probably less so for ones that aren’t. How many are loaded down with crappy spyware, I wonder?

sowbug · 2026-05-12T19:58:52 1778615932

Who would run the cloud side, or at least the networked backup service?

jakeydus · 2026-05-12T21:10:48 1778620248

Sorry, I love Linux, but could you imagine managing a fleet of the cheapest hardware possible and also teaching a bunch of 6th graders how to use Linux? School IT workers are already heroes. I don't like Google, but they're a necessary evil to keep those guys from tearing their hair out every day unless we dedicate significantly more resources to computing in schools.

7734128 · 2026-05-13T04:34:59 1778646899

We managed fine with crappy old Windows XP Thinkpads in elementary school. Modern Linux is far easier, and I'm saying the slight challenge would be educational.

bko · 2026-05-12T19:51:21 1778615481

> We need deterministic open systems where kids’ privacy is protected

I don't think we need any computers really. They'll be inundated with computers and technology their whole lives. They'll figure it out. Just keep this tech out of the classroom altogether.

We've had computers in the classroom for over a decade now, and scores and learning have not gone up. It's a failed experiment.

davedx · 2026-05-12T20:19:08 1778617148

Why are you opposed to using personal computers for education?

JumpCrisscross · 2026-05-12T20:29:09 1778617749

> Why are you opposed to using personal computers for education?

They'll have computers at home. And the evidence seems to point in one direction: the more exposure kids have to devices, the more stunted their development tends to be. Add to that the class division, where rich kids are increasingly raised with strictly-policed device exposure, while poor kids' classrooms are littered with iPads and Chromebooks, and I think we can start making blanket statements.

daemin · 2026-05-13T06:21:01 1778653261

There's also the point that the rich executives at these companies that make computers for school use send their own children to schools which do not use computers for education.

If computers were that critical to education you'd think those same executives would be loading up their children with all the tech they can afford.

adastra22 · 2026-05-13T05:51:38 1778651498

Because objectively it does not improve outcomes. Sweden has recently reversed course based on student outcome data.

bandrami · 2026-05-13T05:14:06 1778649246

Because the evidence is that it doesn't seem to work very well

afavour · 2026-05-12T19:58:50 1778615930

I don’t think we need math really. They’ll be inundated with math and arithmetic their whole lives. They’ll figure it out. Just keep math out of the classroom altogether.

kostarelo · 2026-05-12T19:59:58 1778615998

More Montessori-style please.

colinrand · 2026-05-12T19:41:23 1778614883

I could not agree more. We need less tech in classrooms, not more.

andai · 2026-05-12T19:49:04 1778615344

>deterministic open systems

FreeBSD?

Geezus_42 · 2026-05-12T19:53:51 1778615631

NixOS?