That directory is huge already! I guess the index.md helps the agent find what it needs, but even the markdown file is very long - this would consume a ton of tokens.
Also I wonder who/what decides what papers go in there.
In the blog post, the agent is allowed to do its own search.
Having an "indexed global data collection" of the markdown would be a kumbaya moment for AI. There's so much data out there but finite disk space. Maybe torrents or IPFS could work for this?
I'm actually sort of working on this! https://github.com/ctoth/propstore -- it's like Cyc, but there is no one answer. Plus knowledge bases are literally git repos that you can fork/merge. Research-papers-plugin is the frontend, we extract the knowledge, then we need somewhere to put it :)
Awesome! TIL about Cyc, and it's quite intriguing. I'd been thinking about how being able to integrate Prolog or similar tools might be a valuable endeavor (although I've yet to write anything in Prolog myself).
> The full setup works with any project that has a benchmark and test suite.
So having a clear and measurable verification step is key.
That means you can't simply give an AI agent a vague goal, e.g. "improve the quality of the codebase", because it's too general to verify.
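To make "clear and measurable" concrete, here is a rough sketch of what such a verification gate might look like. Everything in it is hypothetical (the `Candidate` type, the `accept` function, the 2% threshold); it is not taken from the blog post, just one way to read "has a benchmark and test suite".

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A change proposed by the agent (hypothetical structure)."""
    name: str
    tests_pass: bool          # result of running the test suite
    benchmark_seconds: float  # result of running the benchmark


def accept(candidate: Candidate, baseline_seconds: float, min_gain: float = 0.02) -> bool:
    """Keep a change only if tests pass AND the benchmark improves.

    min_gain is an assumed noise margin: the candidate must be at least
    2% faster than the baseline to count as an improvement.
    """
    if not candidate.tests_pass:
        return False
    return candidate.benchmark_seconds <= baseline_seconds * (1 - min_gain)


if __name__ == "__main__":
    baseline = 10.0
    for c in [
        Candidate("faster, tests green", True, 9.0),
        Candidate("fastest, tests red", False, 5.0),
        Candidate("within noise", True, 9.9),
    ]:
        print(c.name, "->", "keep" if accept(c, baseline) else "discard")
```

A goal like "improve the quality of the codebase" has no equivalent of `accept`, which is the whole problem.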
I am sure this would work well in general. There is a challenge with regard to how to make them communicate effectively, e.g. to 1) avoid duplicate work and 2) allow them to combine/overlay each other's findings to yield even better results.
it's written to _actively_ avoid any signs of AI generated code when "in a PUBLIC/OPEN-SOURCE repository".
Also, it's not about you. Undercover mode only activates for Anthropic employees (it's gated on USER_TYPE === 'ant', which is a build-time flag baked into internal builds).
There are probably different reasons for different people. I can definitely see the angle that trying to specifically pretend to not be AI when contributing to open source could be seen as a bad thing due to the open source supply chain attacks, some AI-driven, that we've been having, not to mention the AI-slop PR spam.
But I also get Anthropic's side: when they're contributing, they don't want their internals leaked. If it had been left at that, that would be fine, but having it pretend like it's not AI at all rubs me a little bit the wrong way. Why try to hide it?
>There are probably different reasons for different people. I can definitely see the angle that trying to specifically pretend to not be AI when contributing to open source could be seen as a bad thing due to the open source supply chain attacks, some AI-driven, that we've been having, not to mention the AI-slop PR spam.
But none of the other agents, Codex for example, advertise that a commit was made by an agent. Your panic should apply equally to already-existing agents like Codex, no?
Author here.
I've seen the docs you linked to: Slurm uses "gang scheduling" to mean something specific (timesliced oversubscription where jobs alternate on shared resources).
I'm using the term in its broader CS sense: all-or-nothing co-scheduling of related processes across multiple processors [1].
This is the definition used across the K8s ecosystem: e.g. Volcano [2], Kueue [3], and the Coscheduling plugin all define gang scheduling as "all or nothing" allocation.
I still stand by the original claim:
Slurm allocates multi-node jobs atomically, while vanilla K8s doesn't.
Its default scheduler places pods as resources become available, which can lead to partial allocations and deadlocks for distributed training.
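To illustrate the difference, here is a toy model I put together. It is a deliberately simplified sketch, not how kube-scheduler or Slurm actually order work: the function names are made up, "nodes" are just a counter, and the pod-by-pod policy is modeled as simple round-robin. Two jobs each need 4 workers, and only 4 nodes are free.

```python
def schedule_incremental(free_nodes: int, jobs: dict[str, int]) -> dict[str, int]:
    """Place one worker at a time, round-robin across jobs.

    Models pod-by-pod placement: each worker grabs a node as soon as
    one is available, with no regard for whether its job can complete.
    Returns how many workers each job ended up holding.
    """
    held = {name: 0 for name in jobs}
    progress = True
    while free_nodes > 0 and progress:
        progress = False
        for name, needed in jobs.items():
            if free_nodes > 0 and held[name] < needed:
                held[name] += 1
                free_nodes -= 1
                progress = True
    return held


def schedule_gang(free_nodes: int, jobs: dict[str, int]) -> dict[str, int]:
    """All-or-nothing: a job gets every worker it needs, or none."""
    held = {name: 0 for name in jobs}
    for name, needed in jobs.items():
        if needed <= free_nodes:
            held[name] = needed
            free_nodes -= needed
    return held


def runnable(held: dict[str, int], jobs: dict[str, int]) -> list[str]:
    """A distributed training job can only start once all its workers are placed."""
    return [name for name, needed in jobs.items() if held[name] == needed]


if __name__ == "__main__":
    jobs = {"A": 4, "B": 4}
    inc = schedule_incremental(4, jobs)
    gang = schedule_gang(4, jobs)
    print("incremental:", inc, "runnable:", runnable(inc, jobs))
    print("gang:       ", gang, "runnable:", runnable(gang, jobs))
```

In the incremental case each job holds 2 nodes and waits forever for the other 2, so nothing runs. In the gang case job A gets all 4 nodes and runs while B waits with nothing held.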
It's just a terminology clash. Thanks for the comment anyway.
In a few years everyone will be talking like this -- humans and LLMs alike. We're not there yet but our LLM masters will train us soon enough.
I am only half-joking. Kids are talking to LLMs to get homework done, and people use them for therapy or companionship, for work, even to "Google things". Pretty soon you'll find yourself at a bar, wanting to call your friend a dumbass for saying some stupid shit, and instead you'll hear yourself say "You're absolutely right, Jim! ..."