r_lee · 2026-06-24T15:21:03 1782314463

from what I know this is very much possible, you can also use tax treaties to transfer taxation to Germany.

e.g. Irish Ltd that is a resident in Germany

you won't have to bother with the naming problems etc. either

r_lee · 2026-06-23T18:57:23 1782241043

nor those who are driving Uber or doordashing

r_lee · 2026-06-22T10:25:02 1782123902

if we are at 10x with AI and near AGI or ASI, then how is it possible that these products (Codex, Claude Code CLI) are still such garbage?

shouldn't this "agentic AI revolution" have long solved this already?

no way they're over there saying "we are on it plz wait" or that "it's too much effort"?

igleria · 2026-06-22T10:52:07 1782125527

This is the biggest elephant in the room I have seen in my decade+ career. At the same time, look how bad Apple is in software compared to its hardware... It's not an AI only problem, it's almost like software in general gets a free pass on being very unsafe or low quality because no one wants to face the same "profit reducing red tape" that civil engineers or similar face.

CharlieDigital · 2026-06-22T11:12:10 1782126730

Anthropic were the progenitors of the Model Context Protocol. Claude Code does not fully implement the client end of the protocol. A protocol; a literal pre-defined spec that an agent should be able to one-shot. Neither does Codex. Codex does not implement MCP Prompts.

(I want Codex to implement MCP Prompts because then we have one central way to ship skills from a server).

The fact that neither platform can implement a protocol given what is functionally infinite frontier model tokens really says a lot. I do not care what kind of random project some influencer can ship with a swarm of 1000 agents. If you cannot make the basics work, it is a farce.
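For anyone who has not read the spec, here is a rough sketch of what "implementing MCP Prompts" involves on the client side. As far as I understand the spec, MCP runs over JSON-RPC 2.0 and the three server feature families are reached through methods such as `tools/list`, `resources/list`, `prompts/list` and `prompts/get`. The helper `mcp_request`, the prompt name `code_review` and its `language` argument are hypothetical, made up for illustration; a real client would also need the initialization handshake and a transport, which are left out here.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def mcp_request(method, params=None):
    """Build a JSON-RPC 2.0 request dict for an MCP method (hypothetical helper)."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Discovery requests for the three feature families mentioned in the thread.
list_tools = mcp_request("tools/list")
list_resources = mcp_request("resources/list")
list_prompts = mcp_request("prompts/list")

# Fetch one prompt by name. "code_review" and its argument are assumptions,
# not something any particular server is known to expose.
get_prompt = mcp_request(
    "prompts/get",
    {"name": "code_review", "arguments": {"language": "python"}},
)

print(json.dumps(get_prompt, indent=2))
```

The message shapes are the easy part, which is rather the point: a client that already speaks `tools/list` has the plumbing in place for the other two families.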

deathbob · 2026-06-22T13:38:54 1782135534

It still boggles my mind that Anthropic would invent the MCP protocol but not fully implement it.

Especially when fully implementing it (prompts, resources, tools) is easily done in harnesses that don’t ship with MCP but allow good extension / modification like Pi.

Claude not being able to see its own usage or self invoke slash commands is also very frustrating.

oblio · 2026-06-22T14:16:37 1782137797

> It still boggles my mind that Anthropic would invent the MCP protocol but not fully implement it.

https://www.joelonsoftware.com/2002/01/06/fire-and-motion/

> Do they just want to force you to keep busy reacting to their volleys, so you can’t move forward?

CharlieDigital · 2026-06-22T16:12:12 1782144732

> ...Do they just want to force you to keep busy

Given functionally unlimited access to tokens with frontier models, there is really no "force you to keep busy"; it should just bake overnight. We're talking about a rather simple and well-defined specification; not something novel and complex.

oblio · 2026-06-22T20:58:34 1782161914

My point is that there is a chance that Anthropic launched MCP:

1. not fully believing in it

2. but knowing the hype around MCP would force other AI labs to implement it (standard enterprise checkbox behavior)

3. thus wasting some of their competitors' development cycles

* * *

And let's be real here, the entire discussion happens in the context of a basic bug in a coding agent, do we really believe that these labs have hit AGI in coding?

Random example:

Go to claude.ai or gemini.google.com (I imagine OpenAI is in a similar situation).

Type a question, press enter. Wait 2 seconds, then turn on airplane mode.

Not only does the connection cut off, but even if you reconnect 20 minutes later, you still won't get the answer.

Their website works in purely sync mode!!!

We knew better than this when HTML was invented, 30 years ago.

Do their products give you the impression that "baking features with unlimited tokens overnight" leads to decent products?
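To make the airplane-mode example concrete, here is a minimal sketch of the alternative: generation writes into a server-side store keyed by a response id, and the client reads from an offset, so a dropped connection only means resuming later. I have not seen how either site is built, so this is an illustration of the idea and not a description of their backends. `ResponseStore` and all of its methods are hypothetical names.

```python
import itertools


class ResponseStore:
    """Keeps generated chunks server-side so a client can resume by offset."""

    def __init__(self):
        self._chunks = {}   # response_id -> list of text chunks
        self._done = set()  # response_ids whose generation has finished
        self._ids = itertools.count(1)

    def start(self):
        rid = next(self._ids)
        self._chunks[rid] = []
        return rid

    def append(self, rid, chunk):
        # Called by the generation job whether or not a client is connected.
        self._chunks[rid].append(chunk)

    def finish(self, rid):
        self._done.add(rid)

    def read_from(self, rid, offset):
        """Return (new_chunks, next_offset, done) for a client resuming at offset."""
        chunks = self._chunks[rid][offset:]
        return chunks, offset + len(chunks), rid in self._done


store = ResponseStore()
rid = store.start()

store.append(rid, "Hello")
first, offset, done = store.read_from(rid, 0)      # client reads, then goes offline

store.append(rid, ", world")                       # generation continues without it
store.finish(rid)

rest, offset, done = store.read_from(rid, offset)  # client reconnects later
print("".join(first + rest), done)
```

The design choice is just that the answer belongs to the response id and not to the open connection; the transport on top (polling, SSE, WebSocket) is a separate decision.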

thewebguyd · 2026-06-22T15:28:25 1782142105

> same "profit reducing red tape" that civil engineers or similar face.

I don't think we should ever head toward licensing/a credential body for software development, but I do think now is a good time to have discussions around liability for defective products.

A good start would be to stop allowing companies to disclaim all warranties of fitness for a particular purpose in their EULAs. The joke of Microsoft Copilot applies here where they have a big disclaimer that "Copilot is for entertainment purposes only" while advertising says otherwise. Not even the Chrome EULA will agree that it's fit for purpose as a web browser. The clause is a get-out-of-jail-free card that shifts all liability and risk to the end user.

datsci_est_2015 · 2026-06-22T17:09:26 1782148166

> I don't think we should ever head toward licensing/a credential body for software development, but I do think now is a good time to have discussions around liability for defective products.

Liability is how a credential body would organically grow. It already exists in the security, compliance, and enterprise parts of the software world.

inigyou · 2026-06-22T19:15:55 1782155755

That can be okay. The problems we're worried about come when it's government mandated.

The EU Cyber Resilience Act puts heavy liability on vendors for software vulnerabilities that get exploited, including in open-source components they incorporate. OSS devs are shielded - liability is on the companies who incorporate OSS into commercial stuff.

datsci_est_2015 · 2026-06-22T21:16:11 1782162971

In practice, what’s the difference between a government mandated license and a government that quickly rules in favor of parties who are damaged by companies that don’t use licensed software engineers?

E.g. “Your software caused serious damages to our company / livelihood, and you best hope that it turns up in discovery that you used properly licensed software engineers who were following licensing best practices, otherwise this will be a slam dunk case.”

Genuinely an interesting question to me. Seems like the latter is a better option, generally, but it does lock restorative justice behind a paywall - you have to be able to afford a lawyer.

inigyou · 2026-06-22T21:58:06 1782165486

They don't rule based on licensing. They rule based on damages. Your insurance might rule based on licensing.

forshaper · 2026-06-22T14:34:25 1782138865

How much of all this is due to hardware improving, and software bloating enough to fill the capacity?

thewebguyd · 2026-06-22T15:18:07 1782141487

> shouldn't this "agentic AI revolution" have long solved this already?

Daily reminder that Anthropic took over a year to fix the Claude Code terminal flickering issue despite proclaiming all over the internet that software development is a "solved problem."

Apple forked over $250 million in a class action over false advertising for Apple Intelligence. When do we start seeing the same for the misleading and outright false claims coming out of the frontier labs about the model capabilities? At this point the marketing is doing more harm than the technology itself because it's warping the perceptions of those at the top who make decisions. The only reason tokenmaxxing was ever a thing was because marketing misled execs and technology decisions were made based on vibes instead of evidence.

mannanj · 2026-06-22T20:50:24 1782161424

Why is it not a thing that people track the lies people tell in public, and tie them to their reputation over time for anyone to find?

mannanj · 2026-06-22T18:13:19 1782151999

As long as a majority of the people of the living class are gullible, naive, and sick, with behavior entrained by the institutions and media they are made to consume, they stop seeing the misleading and false claims. Or at least they see them only briefly enough to complain in an ineffective way, then continue to consume the next big lie or slop. That lasts until something happens that finally channels the accumulated rage into a cause they feel makes things right (assuming they have not already died and the next generation has been groomed to fall for the rich man's trap), while those whose families and next generation are to continue the extraction and trickery hide behind an anonymous personality or system.

jeffybefffy519 · 2026-06-22T11:41:24 1782128484

Because vibe coding is a toy… that's the secret.

You can use it to accelerate development certainly, but that requires careful change->review cycles. The developer still needs to be in heavy control, versus vibe coding having an agent own the code base.

hombre_fatal · 2026-06-22T12:14:58 1782130498

Like anything, you have to decide between polish vs switch to any other task in the queue. If you choose too much from the latter, then polish suffers, yet that's a human thing.

Also, Codex and Claude Code aren't as bad as people say. I think most of the noise is embellished by the "hah see? AI sucks" angle.

It's kind of like how HNers would claim to your face that you can't actually build anything with JavaScript and Node.js (JS just sucks too much), then they'd list off a few footguns that were supposed to demonstrate why. In other words, champing at the bit for JS to fail led people to catastrophize issues that were pretty mediocre.

geodel · 2026-06-22T14:55:03 1782140103

> yet that's a human thing.

is this a joke?

Here we are talking about trillion dollar AI companies who claim AI can fix decade old bugs and create new compilers, OSs and what not. Are parallel agents working autonomously to fix issues as well as create new features not allowed at these companies?

hombre_fatal · 2026-06-22T15:06:09 1782140769

Humans still decide what LLMs do in a code base, full stop.

quikoa · 2026-06-22T19:17:42 1782155862

Yea it's too bad these poor scrappy startups cannot afford engineers to build decent software.

inigyou · 2026-06-22T19:17:36 1782155856

Why not just tell it to do everything on the task list and then tell it to fix all bugs?

coldtea · 2026-06-22T13:19:39 1782134379

>Like anything, you have to decide between polish vs switch to any other task in the queue

Why do you "have to decide"? Let some agents go at both of those, isn't that what they claim people can just do?

>Also, Codex and Claude Code aren't as bad as people say. I think most of the noise is embellished by the "hah see? AI sucks" angle.

Why shouldn't it? They're not the ones making the extraordinary claims.

hombre_fatal · 2026-06-22T14:58:14 1782140294

> Why do you "have to decide"? Let some agents go at both of those, isn't that what they claim people can just do?

Because your code is still marching somewhere in tokens per second. You have to decide where they are allocated: polish or the next thing. Humans still are the ones prompting LLMs and deciding what is done.

> isn't that what they claim? Why shouldn't it? They're not the ones making the extraordinary claims.

Even if I grant that someone else makes excessive claims, why would that let you off the hook to stay grounded?

Though I don't grant it. Maybe if Anthropic claimed that Opus makes all decisions at the company and builds all software without humans doing all the prompting, the critics would make more sense.

Until then, it looks more like a double standard: if software built with AI has any issues, then see, AI is shit and the humans who invoked it had no role in it. e.g. it could be the case that Anthropic's Claude Code engineers just aren't doing as much polish as they should.

Better answer: Someone asked why might it be the case that AI-written software has issues, and it has a real answer. Marketing claims are a different conversation.

geodel · 2026-06-22T15:13:31 1782141211

> Maybe if Anthropic claimed that you could write an unsupervised loop that writes perfect software, the critics would make more sense.

Or, being the upstanding, ethical companies that they are, they could just put a disclaimer after every prompt response and on their website: "AI generated code has absolutely no guarantee of quality or correctness. The human prompter must be held accountable for any mistakes or inaccuracies."

Hope it wouldn't be too much bother to these important companies.

thewebguyd · 2026-06-22T15:31:34 1782142294

See, but that would counteract all of their marketing and hurt the feelings of all the execs that desperately want to believe that software development is "solved" and in the near future they won't have to hire those expensive, pesky developers ever again.

geodel · 2026-06-22T16:51:30 1782147090

Two trends I see at work:

1) No more human written code in projects, all code must be AI generated.

2) Developers are responsible for all code AI generated.

Combine that with the fear of losing your job and you have no one calling out management bullshit to their face.

hombre_fatal · 2026-06-22T18:44:31 1782153871

I don't see how these things conflict. Nor did I get the point you were making in the sarcastic upstream comment.

It is obviously the case that you can both delegate code implementation to AI and also be responsible for it. You are signing off on the code you submit to a project no matter where you got it from nor how it was generated nor who you delegated the task to ("actually my friend wrote it so if it sucks don't look at me").

AI didn't change this, nor will it until there are no more humans in the loop.

tadfisher · 2026-06-22T19:39:44 1782157184

They don't conflict, if the generated code is acceptable. Maybe I'm holding it wrong, or I'm not using the right combination of plugins and MCPs. But if I'm not allowed to manually correct the generated output, then I am forced into a loop of generating corrections until it's good enough to stake my job on. I hope you can see that such a policy would be ridiculous.

lelanthran · 2026-06-22T22:07:17 1782166037

> Because your code is still marching somewhere in tokens per second. You have to decide where they are allocated: polish or the next thing. Humans still are the ones prompting LLMs and deciding what is done.

It sounds like you're saying that, even with the most tokens of anyone in the world at your disposal, you can't really finish what is effectively a glue layer between a server, a set of local files and a user?

Doesn't sound to me that the agents are all that effective TBH.

rjh29 · 2026-06-22T19:52:04 1782157924

Gemini is also buggy as heck and has been buggy for years. For a company of Google's size with "all the power of AI" it's seriously embarrassing.

yencabulator · 2026-06-23T22:43:18 1782254598

I sometimes use LLMs as search engine replacements for finding libraries for specific niches. Gemini is the only one that seems to routinely hallucinate Rust crates from thin air, even though Google is otherwise the one most willing to let their LLM peek at a search index to have up-to-date ground facts. It's puzzling. Definitely feels like it's not "organizing the world's information" there.

fg137 · 2026-06-22T10:53:19 1782125599

You are asking too many good questions.

layer8 · 2026-06-22T16:39:50 1782146390

The issue is that apparently AI coding means that developers stop caring about software quality. Which puts the whole purpose into question.

mnicky · 2026-06-22T12:40:51 1782132051

If the code churn is high, the investment in refactoring etc. is less beneficial than may be obvious. I don't remember the details but I heard in some podcast that the code base of Claude Code changes so fast that any piece of code won't be there for long.

coldtea · 2026-06-22T13:21:34 1782134494

In other words it's an ever moving vibe fest, with random bugs and misbehaviors each time they roll the dice...

tartoran · 2026-06-22T13:34:49 1782135289

Yes, it’s very characteristic of gen-AI era.

tartoran · 2026-06-22T13:32:56 1782135176

If they respected their users they’d at least pin some versions that are more stable.

ValentineC · 2026-06-22T13:32:24 1782135144

The "AI revolution" feels like it's creating a bunch of ultra-smart AI models that are scarily good at cracking most of human-created security (Mythos), but also happen to be careless snobs that just leave litter and mess in their wake.

LtWorf · 2026-06-22T21:53:13 1782165193

We don't really know how much human intervention there is in mythos… maybe it has a very high rate of false positives that get checked by hand before publishing them.

user43928 · 2026-06-22T11:11:20 1782126680

The products generally work just fine on my MacBook.

I have not encountered major issues in either the Claude Code CLI, the Codex Desktop app, or Claude Desktop app.

They generally get the job done. I don't measure disk writes or analyze the GPU usage.

reducesuffering · 2026-06-22T17:38:06 1782149886

Claude Code has been out for just 1 year and has millions of users already, being a major contribution to roughly $40 billion in revenue. By any measure it is one of the fastest-developed products, and it already drives the most important workflow for millions of people.

"Why isn't literally everything about a product that came out a year ago with an extremely fast scaling userbase solved?" is what I hear.

The goalposts will keep moving until AGI is undeniable.

LtWorf · 2026-06-22T21:54:27 1782165267

Or until you people finally admit the king was naked

reducesuffering · 2026-06-23T06:48:39 1782197319

Yes, all the F500 companies have been paying eye-watering $$$ for Claude Code for half the year because “the king was naked”. Those ruthless cost-saving corporations, they surely never care about trimming extra spend.

LtWorf · 2026-06-23T07:29:57 1782199797

They've been paying people to teach NLP for several decades when it's well known it's all made up.

Why would they act rationally on this one single specific thing?

Zababa · 2026-06-22T11:40:31 1782128431

A simple explanation is that they are "good enough" for most people and they have better things to do. Even if tomorrow I was 100 times as productive, I still wouldn't have time to do literally everything and I would have to prioritize.

coldtea · 2026-06-22T13:22:25 1782134545

You might not.

But the Claude Code team has ONE job.

And they have full access to a platform that they advertise as "humanity-threat" level good, and claim that it can automate everything code related...

Zababa · 2026-06-22T13:53:24 1782136404

I think they have more than one job, they have to balance new features with improving the software itself. And Anthropic has to balance investing resources into Claude Code vs on infra or other things.

Not that I'm happy with the current state of things, in fact I'm quite sad that improvements in capacity to do things don't translate into better quality.

troupo · 2026-06-22T16:16:16 1782144976

> they have to balance new features with improving the software itself.

What new features?

> And Anthropic has to balance investing resources into Claude Code vs on infra or other things.

It seems they are doing neither? Their vibe-coders boast everywhere that they no longer even work, but just endlessly prompt Claude Code in a loop. Perhaps that's why there's no polish? Perhaps that's why their spring post about Claude Code issues reads like "these are all issues that would take a junior programmer a day to test and fix before they ever reached production"? https://www.anthropic.com/engineering/april-23-postmortem

r_lee · 2026-06-21T14:15:26 1782051326

what...?

r_lee · 2026-06-18T17:59:18 1781805558

I'm assuming it means that someone used Fable 5 to implement Gemma 4 in WebGPU and it performs at 255 tok/s

r_lee · 2026-06-13T13:11:23 1781356283

yeah as if anyone is actually gonna do that...

r_lee · 2026-06-13T09:41:20 1781343680

there's not much controversy that would pull media attention in green tech or medical research

bigyabai · 2026-06-13T18:32:11 1781375531

Medical research is still plenty controversial: https://en.wikipedia.org/wiki/2009_Aftonbladet_Israel_contro...

r_lee · 2026-06-04T00:19:29 1780532369

Amazon seemed to work just fine before this AI datacenter boom

cousinbryce · 2026-06-04T02:28:25 1780540105

Arguably Amazon is worse since they added AI

r_lee · 2026-05-31T08:47:25 1780217245

IGIN! (I get it now)

r_lee · 2026-05-30T13:51:57 1780149117

yeah, I think this is one of the major reasons as to why we're maybe not doomed?

I'm pretty sure we as a society have gone through periods before where we think oh what if we just get cheap laymen to do it!?

but then in the end, if you're able to get an expert vs. a non expert, and you still profit from the work they do, do you really want to gamble?

it's like, we look at Google reviews and credentials for a reason, we want trust