Hacker News | frumiousirc's comments

Thanks for continuing to engage in the community despite such horrid responses from a few.

I think everyone / every project needs to adopt a strategy consistent with their values.

Unfortunately, I see the choice space here as having "developer effort" anti-correlated with "negative repercussions".

On one end of the distribution, a "hair trigger ban" strategy is low-effort for the developer but will produce some fraction of false positives. Some fraction of those impacted will complain to "the socials", and some fraction of those complaints will gain traction and, as we have seen, can unfairly taint the project or worse. Responding to and managing the false positives also requires developer effort, unless the developers can sustain a "fsck the haters" attitude.

On the other end of the distribution, the developer can spend substantial effort engaging each submitter to ascertain and correct bad behavior, and to educate them on how they should engage other humans as a fellow human in this LLM era.

Different types of developer effort are needed along this distribution.

A divide-and-conquer strategy might go something like this:

- Rank each submission in some low dimension space (llm<-->human, malicious<-->helpful)

- When enough samples are collected, perform clustering in this space to determine stereotypes, name these clusters, and develop mitigating strategies and implementations as needed.
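A minimal sketch of those two steps, assuming each submission has already been given two scores in [0, 1] (llm-to-human, malicious-to-helpful) by whatever means. The scores, the seed centroids and the choice of plain k-means are all made up for illustration; any clustering method would do.

```python
from math import dist

def kmeans(points, seeds, iterations=20):
    """Plain k-means on 2-D points; returns (centroids, labels)."""
    centroids = list(seeds)
    labels = []
    for _ in range(iterations):
        # Assign each point to the index of its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of the points assigned to it.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

# Hypothetical (human, helpful) scores for six submissions.
submissions = [(0.1, 0.2), (0.15, 0.1), (0.9, 0.8),
               (0.85, 0.9), (0.2, 0.85), (0.1, 0.9)]
# Hypothetical seeds: llm+malicious, human+helpful, llm+helpful.
seeds = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

centroids, labels = kmeans(submissions, seeds)
```

Naming the resulting clusters and deciding what to do with each one stays a human judgment call.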

Mitigations from easy/extreme to hard/accommodating could include:

- Hair trigger ban button.

- Copy-paste a link to an explanation in a comment before closing and/or banning.

- Customized explanation in comment before closing and/or banning.

- Link or customized explanation of what must be done to move the sample to a more favorable category and close/ban if resistance or silence is returned.

- Ongoing engagement in the face of resistance or silence.
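The ladder above could be encoded roughly like this. It is only a sketch: the cluster names and the policy table are hypothetical, and a real project would pick its own according to its values.

```python
from enum import IntEnum

class Mitigation(IntEnum):
    # Ordered from easy/extreme (low effort) to hard/accommodating (high effort).
    BAN = 0
    LINK_THEN_CLOSE = 1
    CUSTOM_EXPLANATION_THEN_CLOSE = 2
    EXPLAIN_HOW_TO_IMPROVE = 3
    ONGOING_ENGAGEMENT = 4

# Hypothetical cluster names mapped to a first response.
POLICY = {
    "llm-malicious": Mitigation.BAN,
    "llm-helpful": Mitigation.EXPLAIN_HOW_TO_IMPROVE,
    "human-helpful": Mitigation.ONGOING_ENGAGEMENT,
}

def choose_mitigation(cluster, default=Mitigation.LINK_THEN_CLOSE):
    """Look up the first response for a named cluster."""
    return POLICY.get(cluster, default)

def after_followup(mitigation, submitter_improved):
    """Escalate to a ban if an 'explain how to improve' offer meets resistance or silence."""
    if mitigation is Mitigation.EXPLAIN_HOW_TO_IMPROVE and not submitter_improved:
        return Mitigation.BAN
    return mitigation
```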

This "meta development" program to provide such a system/facility could of course be highly automated with LLMs, fighting fire with fire.

(Despite the length of this reply, it was written entirely by a random human on the internet and not an LLM).


I think we can learn about the extent to which this is an adversarial relationship from fighting email spam. By that, I mean the attackers adapt to exploit loopholes in the system, and different attackers have different profiles (e.g. obviously fake messages that fish for fools vs. spear phishing).

Which is to say, your system sounds good but I expect much more complicated defenses are needed.


Yes, the spam arms race is a really good analogy. In that light, my thoughts are aligned with heuristics that might be applied with procmail or in the original, pre-learning, spamassassin.
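To make the heuristic idea concrete, here is a rough sketch of additive rule scoring in the general style of such filters: each matching rule adds or subtracts points and the total is compared to a threshold. The rule names, patterns, point values and threshold are all invented for illustration and are not taken from any real ruleset.

```python
import re

# Hypothetical rules: (name, pattern, points). Positive points look suspicious.
RULES = [
    ("EMPTY_DESCRIPTION", re.compile(r"^\s*$"), 2.0),
    ("BOILERPLATE_OPENER", re.compile(r"as an ai language model", re.I), 4.0),
    ("MENTIONS_TESTS", re.compile(r"\btests?\b", re.I), -1.0),
]
THRESHOLD = 3.0  # hypothetical cut-off

def score(text):
    """Return (total points, names of the rules that matched)."""
    hits = [(name, points) for name, pattern, points in RULES
            if pattern.search(text)]
    return sum(points for _, points in hits), [name for name, _ in hits]

def is_flagged(text):
    """True when the summed rule points reach the threshold."""
    return score(text)[0] >= THRESHOLD
```

As with email, fixed rules like these are cheap to run and easy to audit, but also easy for an adaptive attacker to learn and route around.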

A fight-fire-with-fire approach is to insert an LLM to judge and/or respond to new pull requests and issues. This brings its own risk, as it lets anyone who can make a PR/issue inject a prompt. It would also put one more wedge between the real human contributors and the real human developers.

A "humanity score" could also be an ingredient. GitHub or 3rd parties could maintain a score of how human an account is. The "humanity" of all text produced by an account could be judged by LLMs and/or humans. This could be centralized or based on a web-of-trust. Actually, I'd also like to have such a thing for reading HN and reddit comments.

But still, any system we can dream up can be attacked and we are back to an arms race.


> Basically, think of this not as the CLI program saying to an agent "answer me this question" or "edit this file for me", but rather, the CLI program popping open a mini "guided + 99%-of-the-time automated" TUI coding-agent micro-IDE "inside" the workflow, in about the same way that git pops open your EDITOR inside `git commit`.

Isn't this simply having your mechanistic script call `claude "Prompt that is well honed to provide a mini, guided, 99%-of-the-time automated LLM action to $THE_THING"`? And, possibly including some `--allowed-tools`?


> The target codebase is very large.

But, does every prompt need the entire codebase?


How could it not? Can you ever guarantee accurate answers about a book you haven't entirely read?


"microsoft", okay, off to a great start. First click in the ToC that looks interesting and something I'd actually like to know as a Rust outsider:

    Common Python Pain Points That Rust Addresses

But then number one:

    1. Runtime Type Errors
    
    The most common Python production bug: passing the wrong type to a function. 
    Type hints help, but they aren’t enforced.

Uh, okay. This rarely if ever has been a problem for me and I don't usually even use type hints.

Then comes calling out the existence of None, the GIL and packaging as common "pain points". None of these have posed any problem to me essentially ever. Packaging used to be honestly annoying but since uv hit the scene, not at all.

I should have known better and stopped after reading "microsoft".


> They do need to have some form of dedinator

And some dedotaded wam.


> Based on the article here, and Firefox's mythos article, they had found bugs with Opus 4.6 as well but mythos is finding more that it missed.

It's not quite apples-to-apples. It was Opus on Firefox 148, Mythos on 150. A better test of Mythos vs Opus would have been to apply Mythos to Firefox 148. Or also re-apply Opus to Firefox 150.

Do we know all the Opus+Firefox 148 bugs are fixed in Firefox 150? Do we know the number of new bugs introduced per Firefox release?


> Do we know all the Opus+Firefox 148 bugs are fixed in Firefox 150? Do we know the number of new bugs introduced per Firefox release?

That may be parsable from their bug tracker, though I don't know if all bugs raised by Mythos are public.

I'd be particularly interested in how many of the bugs found existed in 148. Assuming most or all of them weren't newly created bugs added in 149 or 150, the comparison should still hold even though Opus and Mythos looked at different releases.
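As a sketch of that check: if each bug could be tagged with the release that introduced it (an assumption, the tracker may not expose this), the fair comparison is the set of bugs the newer run found that already existed in the older release but were not reported by the older run. The bug IDs and versions below are placeholders, not real data.

```python
def comparable_misses(found_by_old_run, found_by_new_run, introduced_in, old_release):
    """Bugs the newer run found that predate old_release and the older run did not report."""
    preexisting = {bug for bug in found_by_new_run
                   if introduced_in[bug] <= old_release}
    return preexisting - set(found_by_old_run)

# Placeholder data: bug ID -> release that introduced it (hypothetical).
introduced_in = {"A": 140, "B": 147, "C": 149, "D": 150}
old_run_on_148 = {"A"}            # hypothetical findings of the older model
new_run_on_150 = {"B", "C", "D"}  # hypothetical findings of the newer model

missed = comparable_misses(old_run_on_148, new_run_on_150, introduced_in, 148)
```

With these placeholders only "B" counts against the older run; "C" and "D" did not exist in 148 and say nothing about which model is better.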


> When my Emacs opens a markdown file it immediately converts it into OrgMode format.

I want that. Can you give some details?

A search finds modeverv/markdown-to-org which looks 80% there but activates based on a yank or converting an already loaded markdown buffer. Perhaps it can be made to apply on opening a .md file.


> When an LLM provides you with an overconfident piece of writing with no sources to back it up, what do you do?

You draw made up lines on made up plots and call it evidence, obviously.


If I'm a malicious actor that gets root, can I killswitch the killswitch?


you're on the other side of the secure door already

killswitch is to prevent you from gaining root


Once you’ve got root, you don’t need to exploit compromised code to do whatever you want.


LSMs say otherwise


ring0 loadable kernel modules disagree.


Only to the extent there is not a deeply embedded core, of course. Or SMM


Neither of those is an LSM though, right?


Or malloc(), or open()... They are kind of discussing in the thread how to prevent this from a malicious actor (or from footgunning yourself), but my understanding is that it is not all that plain and simple...

