Hacker News
bisonbear's submissions
1. I benchmarked Opus 4.8 vs. GPT 5.5 on 2 open source repos (stet.sh)
3 points by bisonbear 10 days ago
2. I used autoresearch to improve my AGENTS.md, measured against real tasks (stet.sh)
8 points by bisonbear 16 days ago | 7 comments
3. A brief investigation into the GPT-5.5 regression claims (stet.sh)
1 point by bisonbear 24 days ago
4. The Opus 4.7 reasoning curve - Medium is the best default? (stet.sh)
1 point by bisonbear 31 days ago
5. GPT-5.5 low vs. medium vs. high vs. xhigh: the reasoning curve on 26 real tasks (stet.sh)
2 points by bisonbear 36 days ago
6. GPT-5.5 vs. GPT-5.4 vs. Opus 4.7 on 56 real coding tasks from 2 open source repo (stet.sh)
4 points by bisonbear 43 days ago
7. I ran Opus 4.7 vs. Old Opus 4.6 vs. New Opus 4.6 on 28 Zod tasks (stet.sh)
2 points by bisonbear 56 days ago
8. Coding evals are broken. CI is green while AI code quality goes unmeasured (stet.sh)
1 point by bisonbear 59 days ago
9. Agents.md is the highest-leverage code you're not testing (stet.sh)
1 point by bisonbear 64 days ago
10. Your AI coding benchmark is hiding a 2x quality gap (stet.sh)
3 points by bisonbear 3 months ago
11. Things I Learned at the Claude Code NYC Meetup (benr.build)
2 points by bisonbear 4 months ago
12. Claude vs. Codex in the Messy Middle (benr.build)
1 point by bisonbear 5 months ago
13. Spacetime as a Neural Network (benr.build)
11 points by bisonbear 5 months ago | 5 comments
14. One agent isn't enough (benr.build)
18 points by bisonbear 6 months ago | 2 comments
15. Context Engineering: The New Skill for Working with AI Agents (benr.build)
1 point by bisonbear 7 months ago
16. The New Math of Building with AI (benr.build)
2 points by bisonbear 7 months ago