mkagenius · 2026-04-11T05:55:08 1775886908

Not just sides, the vents are much sharper.

A few years back, I looked on Reddit for complaints about this and found barely anything.

mkagenius · 2026-04-09T20:01:57 1775764917

> The biggest problem is the coding agents don't "Fail fast and loud". They fail deceivingly.

GPT-2 and GPT-3 used to fail fast (and loud, coz we could easily see them lying).

dataviz1000 · 2026-04-09T20:33:21 1775766801

My next exploration will be "Coding Agents: fail slow, silent, and deceivingly".

After one month of using Claude to create trading strategies, the one thing I learned: if the strategy looks like it can profit, it is a lie. The trading strategy agent doesn't find trading strategies that work; it is really a bug hunting agent.

mkagenius · 2026-03-30T15:47:10 1774885630

Has to be taken in light of median income too, right?

mkagenius · 2026-03-28T20:46:55 1774730815

I had the exact same idea but for AI agent harnesses.

I even created an app to explain it - https://news.ycombinator.com/item?id=47381803 (deleted the app as it got no traction whatsoever)

The idea was that AI models like Opus 4.6 and Codex 5.4 have become so good at trying new ways to attack a problem that even just a Bash() tool is enough.

Continuing the idea, in fact even File() operations are enough.

Again continuing the same line of thought, even just a Tape is enough. Given enough time, Codex and Opus will achieve your target.

mkagenius · 2026-03-28T20:37:12 1774730232

I think you are absolutely right. (had to)

mkagenius · 2026-03-27T06:00:05 1774591205

This is a bit restrictive: it doesn't take screenshots. So you can't say "take screenshots of my homepage and send them to me via email".

It doesn't allow egress curl, apart from a few hardcoded domains.

I have created Cronbox in the cloud, which has better utility than the above. Did a "Show HN: Cronbox – Schedule AI Agents" a few days back.

https://cronbox.sh

and a pelican riding a bicycle job -

https://cronbox.sh/jobs/pelican-rides-a-bicycle?variant=term...

mkagenius · 2026-03-25T14:16:47 1774448207

Had used Cactus before - https://news.ycombinator.com/item?id=44524544

Have since moved to PocketPal for local LLMs.

mkagenius · 2026-03-23T22:12:28 1774303948

Fwiw, my Pixel 8 runs Qwen3.5 4B at 2 tok/s via the PocketPal app. Somehow the Cactus app didn't work.

mkagenius · 2026-03-19T16:14:11 1773936851

Holding off on the update seems like a reasonable step until the patch comes. I also run a .local for Apple containers, though not Docker.

mkagenius · 2026-03-18T21:40:57 1773870057

You could do that with, say, Claude Code too, with a much simpler setup.

OP's question was more about sandboxes though. To that, I would say a sandbox is there to limit unintended actions on the host machine.

monkpit · 2026-03-18T22:49:20 1773874160

I want to be proven wrong, but every use case someone presents for OpenClaw is just a worse version of Claude Code, at least so far.