The way Claude/Codex behave is entirely consistent with how every vibe-coded project of mine has ended up so far. I bet those guys have no idea what's going on and are taking guesses, because no one understands the thing they've made.
Instead of asking the model "Here's this codebase, report any vulnerabilities," you ask: "Here's this codebase, report any vulnerabilities in module\main.c".
The model can still explore references and other files inside the codebase, but you start a new context/session for each file in the codebase.
Honestly, that's the only way I've ever been able to trust the output. Once you go beyond the scope of one file it really degrades. But within a single file I've seen amazing results.
Aren't you supposed to include as many _preconditions_ as you can (in the form of test cases, or function constraints like C's `assert` macro) in the prompt describing the input to a particular program file, before asking the AI to analyze that file?
Please read my reply to one of the authors of Angr, a binary analysis tool. Here is an excerpt:
> A "brute-force" algorithm (an exhaustive search, in other words) is the easiest way to find an answer to almost any engineering problem. But it often must be optimized before it can feasibly be computed. The optimization may be done by an AI agent based on neural nets, or by a learning Mealy machine.
> Wouldn't it be interesting to find out which is more efficient: neural nets or a learning Mealy machine?
...Then I describe what a learning Mealy machine is, and continue:
> Some interesting engineering (and scientific) problems are:
> - finding an input for a program that hacks it;
> - finding machine code for the controller of a bipedal robot, which makes it able to work in factories;
I've noticed even experienced engineers have started overestimating how long things would take to build without AI. Believe it or not, we coded before AI, and not everything took years all the time.
We’ve all worked on projects where it took months to get requirements from the business. Sometimes only to see the project cancelled after months of sitting around waiting for them to decide on things.
Coding has never been the roadblock in software. Indeed, don’t we experience this now with AI? Vibe-code a basic idea, then discover the things we didn’t consider. Try to vibe those too and the codebase quickly gets out of hand. Then we all discover “spec-driven development” (SDD), and in turn discover that specifying everything ourselves is an even bigger PITA.
Is all of it shit, or can you just not find the good stuff? "The struggle will completely shift to how to get traffic" is from the business side, and you're experiencing it from the customer side.
Employees are the ones with the real power to make this hurt. The customers switching over are easily offset by the DoD contract. But losing talent over this, and having a harder time attracting future talent? That could hurt them.
Sam probably expects to solve this by just offering more money. It worked in the past.
Yeah, they will have to raise salaries by 10% to attract people. This will no doubt hurt their bottom line. Poor starving SV text workers will have no choice but to accept working for them, lest they starve.
Maybe my sarcasm is not justified, but I don't think most people care that they work for a company that does unethical things. In fact, I think all large companies are more or less immoral (or rather amoral) - that's just how the system is built.
I do this all the time, but then you end up with really over-engineered code that has way more issues than before. Then you're back to prompting to fix a bunch of issues. If you didn't write the initial code, sometimes it's difficult to know the best way to refactor it. People will say the answer is to prompt it for ideas. Well, then you're back to it generating more and more code, and every time it does a refactor it introduces more issues. These issues aren't obvious, though. They're really hard to spot.