I found a DoS vulnerability in Coinbase several months ago on HackerOne. It took me literally 30 minutes to find, and it was the first time in my life I had done this. I could cheaply craft a message which, when sent as the HTTP payload to a specific endpoint, would cause the server to hang for a full 30 or so seconds before returning a response. I could have easily scaled up that attack, cheaply...
I filed a report, they marked it as 'informative' and thanked me, recommended I keep looking for more vulnerabilities, but no payment at all; they said I had to be able to demonstrate major disruption of service... which I presume would be illegal to actually do. I literally showed them all the ingredients of the attack: the exact curl commands, the payloads, and a response delay that could easily be verified; you could see the server response slowing down in proportion to the degree of nesting in the payload. I could execute it without authentication too, so it was essentially certain that the attack could be scaled, but they made it impossible to get a reward.
The hardest part was writing the report which took several hours.
So yeah, 30 minutes of looking for a vulnerability, no prior experience in security research, first project I looked into on HackerOne, ever... A company in the crypto sector, which is a major target of hackers and takes security relatively seriously.
Imagine how insecure most software is! Imagine how bad most vibe-coded software is especially! Companies might as well run their servers directly inside Kim Jong Un's data center in North Korea.
North Korean hackers probably have a dashboard which shows more detailed and accurate platform analytics than what the founders of the company can see.
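A slowdown that grows with nesting depth, like the one described above, is the kind of thing a cheap pre-parse check can bound. Here is a minimal sketch in Python, assuming the payload is JSON (the comment does not say what format it was). `MAX_DEPTH`, `max_nesting_depth` and `parse_bounded` are hypothetical names for illustration, not anything from the actual endpoint:

```python
import json

MAX_DEPTH = 32  # hypothetical limit; a real endpoint would tune this


class PayloadTooDeep(ValueError):
    """Raised when a payload nests deeper than the allowed limit."""


def max_nesting_depth(text: str) -> int:
    """Return the deepest bracket nesting in a JSON text using one linear scan.

    Brackets that appear inside string literals are ignored.
    """
    depth = deepest = 0
    in_string = escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "[{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch in "]}":
            depth -= 1
    return deepest


def parse_bounded(text: str, limit: int = MAX_DEPTH):
    """Reject over-nested payloads before handing them to the real parser."""
    if max_nesting_depth(text) > limit:
        raise PayloadTooDeep(f"nesting exceeds {limit}")
    return json.loads(text)
```

The point of the design is that the rejection cost is linear in the payload size and does not depend on how the downstream parser behaves on deep input.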
Reading the comments here, I'm struck by how many people seem to view code quality as an optional thing. It's almost as if software can function correctly without it. But it can't. Code quality is not a nice-to-have, it's a must-have. For any moderately-complex offering. There's a certain level of complexity that you can never reach without a well designed codebase. Cannot reach no matter how many humans or AI agents you throw at the problem.
The idea that it's a 'nice-to-have' is an illusion. It's like if you borrowed a lot of money to fund your startup and you still have some cash in the bank; at that point, a viable business model might seem like a 'nice to have'.
It's only when the lender comes knocking and you don't have enough money to pay them that a viable business model suddenly becomes a 'must-have'.
The worse designed the code, the more frequently it has to be refactored. I've worked on long-running projects which required essentially no refactoring when implementing new features and I've also worked on projects which had to be rewritten multiple times to accommodate requirement changes.
In most half-decent codebases, including LLM-generated ones, adding new features almost never requires a refactoring in my experience. Refactoring is almost always needed when you are fixing bugs, improving performance or changing core behavior.
And this code is often full of security vulnerabilities. It's just hacks on top of hacks on top of hacks. You end up with 100K lines of code full of weird fallbacks, doing something which could have been done more reliably with just 1K lines of code.
I think the author's comment about preferring systems which make invalid edge cases impossible rather than implementing fallbacks is hugely important. With the fallback approach, you end up implementing fallback on top of fallback on top of fallback... Each fallback seems to increase the amount of code exponentially and somehow it always creates new problems. This should almost be a 'General law of system design':
Fallbacks reduce the risk of failure but make failures more complicated and harmful when they do happen.
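To make the "invalid cases impossible" idea concrete, here is a small sketch in Python. The order example and the names `Pending`, `Paid` and `receipt_line` are made up for illustration. Instead of one record with an optional `paid_at` field and a fallback for when it is missing, each state gets its own type, so a paid order without a timestamp cannot be constructed at all:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Union


@dataclass(frozen=True)
class Pending:
    """An order that has not been paid; it has no payment timestamp at all."""
    order_id: str


@dataclass(frozen=True)
class Paid:
    """A paid order; paid_at is required, so it can never be missing."""
    order_id: str
    paid_at: datetime


Order = Union[Pending, Paid]


def receipt_line(order: Order) -> str:
    """Format an order without any 'what if paid_at is None' fallback."""
    if isinstance(order, Paid):
        return f"{order.order_id} paid {order.paid_at:%Y-%m-%d}"
    if isinstance(order, Pending):
        return f"{order.order_id} awaiting payment"
    # Anything else is a programming error, so fail loudly instead of guessing.
    raise TypeError(f"unsupported order type: {type(order).__name__}")
```

There is no branch that has to decide what a paid order with no payment date means, because that state has no representation.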
As a software engineer, I like the new coding environment which is being created by AI.
Big tech companies have created infinite work for me. The human developer has become a critical component of code execution. The human needs to always be present to handle the nearly infinite number of difficult unhandled exception cases which are guaranteed to occur from time to time.
The software engineer is no longer like a laborer, but more like a security guard who sits at his desk drinking coffee most of the time and only steps in on rare occasions when something goes wrong.
My experience with LLMs is that the "fallback" issue is probably one of the most serious issues. I have seen zero talk about it outside of this thread. I'm not sure it's even being worked on. The LLMs have this terrible drift towards always making something happen even if that thing is not even related to the task at hand. "Failure" in the sense of simply throwing an error/exception is something models seem highly resistant to.
I can't tell you at this point how many times I've seen them do something like:
Fail > make up values > maybe log it > keep working silently with increasingly corrupt data.
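A minimal Python sketch of that pattern next to the fail-fast alternative. The price-parsing scenario and both function names are hypothetical, chosen only to keep the example short:

```python
def parse_price_with_fallback(raw: str) -> float:
    """The pattern described above: swallow the error and invent a value."""
    try:
        return float(raw)
    except ValueError:
        return 0.0  # made-up value; downstream totals are now silently wrong


def parse_price_strict(raw: str) -> float:
    """Fail loudly and let the caller decide what a bad price means."""
    try:
        price = float(raw)
    except ValueError as exc:
        raise ValueError(f"invalid price: {raw!r}") from exc
    if price < 0:
        raise ValueError(f"negative price: {raw!r}")
    return price
```

With the first version, a garbled input becomes a plausible-looking `0.0` that flows into every later calculation. With the second, the bad record stops at the boundary where it is still cheap to diagnose.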
This isn't going to work because the LLM doesn't have enough context. Many security issues involve a failure mode which cuts across multiple parts of the code. A PR which seems perfectly valid on its own may be the missing piece which opens up a vulnerability. Each component may be fine on its own, but brought together, the system is vulnerable.
Think of a machine with interlocking gears; each gear may itself be perfect and may fit perfectly with the others, but then if a tiny pebble comes between them, the entire machine breaks. Maybe the problem here is that the final gear was too close to the ground and would catch stray pebbles kicked up by the wheel in front of it... The LLM couldn't know this unless it understood the full context in which the change occurred; not only the code, but the environment itself.
In a poorly designed codebase with hundreds of thousands of lines of code, it's impossible to have the full context of the code even. The architecture would lack proper separation of concerns to allow one to effectively establish an appropriate defense perimeter. In a poorly designed codebase, every part of the code can harbor a vulnerability.
It's like; if you don't have a proper access control layer which is automatically and declaratively enforced for all your endpoints, every endpoint will have to enforce security restrictions on their own; duplicating similar-looking code over and over. If one endpoint out of 1000 incorrectly enforces a security restriction, that could be a critical vulnerability.
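As a rough sketch of what "automatically and declaratively enforced" could look like, here is a toy router in Python. Everything in it (`endpoint`, `dispatch`, `PUBLIC`, the routes and roles) is hypothetical and not taken from any real framework. Each handler must declare its required role at registration, and the check happens in one place rather than inside each handler:

```python
from typing import Callable, Dict, Set, Tuple

Handler = Callable[[dict], str]

PUBLIC = "public"  # explicit marker for endpoints that need no role

# Route table: path -> (required role, handler).
_ROUTES: Dict[str, Tuple[str, Handler]] = {}


class Forbidden(Exception):
    """Raised when the caller lacks the role a route declares."""


def endpoint(path: str, *, role: str):
    """Register a handler. `role` has no default, so omitting it is a TypeError."""
    def register(handler: Handler) -> Handler:
        _ROUTES[path] = (role, handler)
        return handler
    return register


def dispatch(path: str, user_roles: Set[str], request: dict) -> str:
    """Single choke point: the access check runs here, never inside handlers."""
    role, handler = _ROUTES[path]
    if role != PUBLIC and role not in user_roles:
        raise Forbidden(path)
    return handler(request)


@endpoint("/health", role=PUBLIC)
def health(request: dict) -> str:
    return "ok"


@endpoint("/admin/users", role="admin")
def list_users(request: dict) -> str:
    return "alice,bob"
```

Because a route cannot be registered without stating its policy, forgetting the check on endpoint number 1000 becomes an error at import time instead of a silent hole.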
Of course you don't just check the diff. Rather, in your CI infra, as part of every PR, the reviewing model needs to be given the full repo so it can check whether the change introduces any issues. This works wonderfully on GitHub, even with non-SOTA tools like gemini-code-assist.
Why do you think it's not possible to have full context of the codebase? Modern harnesses excel at finding all the right codepaths, even in a large codebase.
One thing that I'm certain of, is that there are always costs to writing more lines than what is necessary to solve a specific problem.
I've experienced this before AI, and I've experienced this, magnified, with AI.
On a well-designed project written by hand, you will add new features faster by hand than you would on the same project written from scratch with AI, even if you use AI to write those new features.
But... If you have a well designed, human-written codebase, you will be faster if you generate the code for new features using AI than if you do it by hand... And you can maintain that speed for long periods of time if you use fine-grained prompts. What matters most is the quality of the codebase.
You can achieve the same degree of maintainability using AI from the beginning but you would have to make fine-grained prompts.
The gap is about making good engineering decisions. It just so happens that the value of good decisions compounds over time.
I feel like what should have happened with AI is that teams should have started to put more effort into planning and pre-implementation discussions. Second thing, team leads should have felt more comfortable to reject large Pull Requests.
When I see a large 10k-lines PR, I feel a sense of panic. In a corporate setting, I also feel a kind of pressure to approve; the more work was done, the more pressure there is to approve the PR. This is why I think up-front pre-implementation discussion and alignment has become essential.
You really can't have people going rogue and weaponizing their AI-generated lines to gain control of a project through the duality of brittleness + complexity.
This reminds me of how the founders of the so-called 'open source' cryptocurrency project I joined suppressed my work in the community.
They monopolize opportunities, suppressing natural-born entrepreneurs; force us into very narrow roles and fire us if we step out of line ever so slightly. Even when it is beneficial to them.
IMO, we should get rid of trademark laws. They didn't mind their LLMs ripping off people's copyrights. Why should anyone uphold trademarks?
If I work at Google and want to represent myself as Google, I should be able to.
I feel like, even if I don't work at Google, I should be able to use the logo. It's the consumer's mistake for inferring a relationship. I'm just showing a logo of a well known company and letting their dumbass jump to a conclusion.
The current situation reminds me of how far we've come from old ideals of delaying gratification today in order to have more later.
It seems like this ideology has been corrupted into a short-sighted "Establish a monopoly position as soon as possible at all costs, don't worry about tomorrow."
It's ironic because monopolizing a sector by investing heavily and suppressing profits used to be a long term move but it seems to have become a short term move as investors are racing each other.
I think the discrimination aspect is downstream from this fact:
> We follow 3.4 million people who submit 4 million job applications to 1,700 job postings across 150 employers and 11 industry sectors. Each job application was assessed by an AI hiring tool built by a single third-party vendor.
3.4 million people applying to just 150 employers... Who are all using just 1 platform. WTF. This is where the discrimination is happening. Why the f do 3.4 million people feel forced to apply to just 150 employers and why the f do all these 150 employers feel forced to use just one platform. WTF.
I realise this but it's still incredible to think because that's about 22k applicants per company.
Even if that's just part of each company's total hiring pipeline, it's clear; something's wrong. I don't know how long this study has been running but 22k is a lot of people, even over a year. These companies are too big. That's the problem.
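For anyone who wants to check the ratios, here is the arithmetic from the quoted figures as a short Python snippet. It treats the rounded totals in the abstract as exact, which is an assumption; the variable names are only for illustration:

```python
people = 3_400_000
applications = 4_000_000
postings = 1_700
employers = 150

applicants_per_employer = round(people / employers)
applications_per_posting = round(applications / postings)
applications_per_person = round(applications / people, 2)

print(applicants_per_employer)   # 22667
print(applications_per_posting)  # 2353
print(applications_per_person)   # 1.18
```

So the "about 22k" figure holds, and each posting drew on the order of 2,350 applications on average.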
Capitalism develops into monopolies pretty inevitably. Economies of scale just make monopolies more profitable. Monopolies also get in bed with the state and do dirty tricks to stamp out competition. But monopoly capitalism has been the case for over 100 years now.
It's crazy to think that if Elon Musk hadn't mentioned Schmidhuber, most people would have no idea.
It's nauseating how all the researchers who happened to work for big tech got tons of media coverage but Schmidhuber and his team were getting zero coverage yet they made massive contributions. I bet there are many others not mentioned.
Nobody even knows about Frank Rosenblatt. It's insane how distorted our perception of innovation is.
Even science has been corrupted. It makes one doubt every story we're told about who invented what.
Very Trump-like statement - "Not many people know this, but ...". Yes, a lot of people know this. Any class that even says a little about the history of NNs will talk about Rosenblatt and the Perceptron.
> Any class that even says a little about the history of NNs will talk about Rosenblatt and the Perceptron.
Sure. I think it starts to get more interesting when the influences that Rosenblatt explicitly cites in his seminal Perceptron paper (e.g. Hayek) become part of the discussion (which rarely happens in my experience).