kqr · 2026-05-22T10:11:29 1779444689

Does this mean the non-reserved tickets will get scarcer and command a higher price?

kqr · 2026-05-21T20:00:52 1779393652

Not quite true. Higher volatility makes both CAGR lower and holding costs higher. Lower-volatility investments are thus closer to their naïve historical mean. See also: https://entropicthoughts.com/do-stonks-go-up
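To make the CAGR half of that concrete, here is a minimal sketch with made-up return series (the numbers are illustrative assumptions, not market data). Both series have the same arithmetic mean return, but the volatile one compounds to a lower growth rate.

```python
import math

def arithmetic_mean(returns):
    """Naive average of per-period returns."""
    return sum(returns) / len(returns)

def cagr(returns):
    """Compound growth rate per period: geometric mean of (1 + r), minus 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

# Hypothetical return series, chosen so both average 5 % per period.
calm = [0.05, 0.05, 0.05, 0.05]
wild = [0.45, -0.35, 0.45, -0.35]

for name, series in (("calm", calm), ("wild", wild)):
    print(name, round(arithmetic_mean(series), 4), round(cagr(series), 4))
```

The calm series compounds at its arithmetic mean, while the wild one ends up with a negative compound rate despite the identical average.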

kqr · 2026-05-21T10:52:20 1779360740

I was able to delete the target ship in the tutorial by selecting it and pressing delete, when I wanted to delete a mark next to it. Is that intentional?

epaga · 2026-05-21T11:13:13 1779361993

Whoops, that is a debug functionality that I accidentally left in. Thanks, fixed!

kqr · 2026-05-21T10:39:51 1779359991

For what it is worth, I ended up playing the tutorial mostly in real time, i.e. without pausing, because that felt more authentic. Under that constraint, I did have to play it three times to get proficient enough with the tools to succeed, but it was a lot of fun to figure it out under time pressure too.

Something that helped me a lot was exploring how the firing solution changed when I changed various parameters in the TDC. I don't know if there's a way to build that kind of exploration into the tutorial. Maybe by splitting it up into segments, asking the user to handle only one measurement at a time, and illustrating how the firing solution changes?

It is, however, annoying to start up the tutorial and click next-next-next-next to dismiss all the text when playing around with its scenario. Maybe a "dismiss tutorial text" button somewhere at the start of the tutorial?

epaga · 2026-05-21T11:26:37 1779362797

Great feedback, thanks - and I'll think about a good way to let you just dismiss the tutorial text entirely...

kqr · 2026-05-21T11:43:51 1779363831

Something else it took some time to understand was how the TDC evolved the firing solution over time. I really appreciate the yellow ghost ship on the map, but maybe there could be four more ghost ships indicating computed positions one minute, two minutes, etc. into the future? Possibly with increasing transparency.
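A rough sketch of what I mean, assuming the target holds a constant course and speed. The function name, the coordinate convention (x east, y north, in nautical miles), and the opacity formula are all my own assumptions, not anything from the game's code.

```python
import math

def ghost_positions(x, y, course_deg, speed_knots, minutes=(1, 2, 3, 4)):
    """Project a target forward by dead reckoning.

    Assumes course is in degrees clockwise from north and speed is in knots,
    so distance covered in m minutes is speed_knots * m / 60 nautical miles.
    Returns (minutes, x, y, opacity) tuples, fading the later ghosts.
    """
    heading = math.radians(course_deg)
    ghosts = []
    for i, m in enumerate(minutes):
        dist = speed_knots * m / 60
        gx = x + dist * math.sin(heading)  # east component
        gy = y + dist * math.cos(heading)  # north component
        opacity = 1 - (i + 1) / (len(minutes) + 1)  # later ghosts are fainter
        ghosts.append((m, gx, gy, opacity))
    return ghosts

# Hypothetical target at the origin, heading due east at 12 knots.
for ghost in ghost_positions(0.0, 0.0, 90.0, 12.0):
    print(ghost)
```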

epaga · 2026-05-21T15:34:00 1779377640

Loved this idea, so I put it in the game. Check it out and tell me what you think. :)

kqr · 2026-05-21T19:23:41 1779391421

Yeah, I noticed when I tried the second tutorial mission. That mechanic worked out great!

On my first attempt at the second mission, I had the periscope up too much and forgot to dive, so the escort vessel rammed me. Maybe tell the player to pay attention to the detection-o-meter?

Second time I was very careful and managed to get hits with four torpedoes on the first salvo. I had to bring up the scope to watch. I had no expectation of even getting close so it was INCREDIBLY satisfying. Then at first it seemed like I'd escape but I lost track of the escort and then got blown up by depth charges.

I'm absolutely going to try again, though. This was a lot of fun.

kqr · 2026-05-21T10:31:06 1779359466

For me, part of the fun is in figuring out the "why" behind the "what". I really enjoyed getting a rundown of which tools were available to me, and then getting to figure out on my own how best to use them.

kqr · 2026-05-21T09:30:55 1779355855

> It's interesting how even 5 tok/s is still much faster than you'd typically type, but feels glacially slow for an agent.

Calling the token rate the rate at which they "type" is a bit misleading. They also do virtually all of their more complex reasoning in tokens, so 5 tokens per second is also their thinking speed. And thinking at 5 tokens per second is glacially slow.

This is why faster versions of strong models do so well on reasoning tasks like playing text adventure games[1]. Their output isn't better on a token-for-token basis, but they get so much more thinking in during a given time window, they get more opportunities to find the right conclusion.

[1]: https://entropicthoughts.com/updated-llm-benchmark
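As back-of-the-envelope arithmetic: the 2000-token reasoning trace and the 100 tok/s comparison rate below are assumptions for illustration, not measurements of any particular model.

```python
def seconds_to_think(tokens, tokens_per_second):
    """Wall-clock time needed to emit a reasoning trace of the given length."""
    return tokens / tokens_per_second

def tokens_in_window(window_seconds, tokens_per_second):
    """How many reasoning tokens fit into a fixed time window."""
    return window_seconds * tokens_per_second

trace = 2000  # hypothetical length of one reasoning trace, in tokens
for rate in (5, 100):
    print(rate, "tok/s:",
          seconds_to_think(trace, rate), "s for the trace,",
          tokens_in_window(60, rate), "tokens per minute")
```

At 5 tok/s the assumed trace takes 400 seconds, against 20 seconds at 100 tok/s, so the faster model gets twenty times as much thinking into the same window.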

michibertel · 2026-05-21T10:04:53 1779357893

How many tok/s does an average human think?

madwolf · 2026-05-21T10:10:06 1779358206

Most of my thinking is non-verbal. I don't think in sentences. I CAN think in sentences and internally rationalize my actions and explain them, and sometimes that's beneficial (rubber duck debugging; sometimes it's good to verbalize and explain something), but usually I don't do it.

kqr · 2026-05-21T12:30:38 1779366638

This question gets into information theory way beyond me, but I suspect it depends a lot on the task at hand. Human brains aren't very effective at combining sources of statistical variation, but they're great at other things. I'm personally most impressed by the cerebellum. It is highly trainable, yet if we tried to translate the things it does to maintain locomotion, proprioception, coordination of movement, etc. into tokens, it would probably result in a high token rate.

kqr · 2026-05-15T05:13:44 1778822024

That's interesting. Do you know what the fraction was of sprints that ended without all the work items associated with them being completed?

mplanchard · 2026-05-15T17:38:35 1778866715

At least for our team, quite low. We had a very comfortable buffer, which allowed us to deliver in time to coordinate with the rest of the org. It may not have been a bad strategy overall from the top-down perspective, but it wasn’t a lot of fun as an IC.

kqr · 2026-05-13T11:18:08 1778671088

Good article! Looks like a great blog too. Subscribed to the RSS feed. Thanks for referencing.

kqr · 2026-05-13T05:21:16 1778649676

If this were the end of the story, that would be a correct interpretation of the situation.

At Amazon, something like this is likely a closely watched experiment. They knew it would incentivise waste. But they don't know what the other effects will end up being. Nobody knows -- this thread is full of loose speculation. So Amazon runs the experiment and collects the data.

----

The annoying thing about goals and incentives is that they can either be phrased in terms of input metrics (behaviours within our control) or output metrics (the outcomes we want). Input metrics are bad because they lead to skewed incentives and gaming the metrics. Output metrics are bad because they're largely affected by chance and external circumstances. (This indeed means a goal cannot be SMART on its own, because A and R are typically in tension.)

Amazon knows this. Their WBR structure is essentially about trying to set goals and targets for input metrics, and then carefully observing how input metrics correlate with output metrics. They're using a semi-scientific process to tease out the causal structure of their business. I would assume this token target is followed very closely to learn exactly what its effects are on output metrics that drive revenue and cost.

For more on this, I think the best public writing is Carr's Working Backwards, and Chin has written about it on Commoncog too.

raxxorraxor · 2026-05-13T05:34:37 1778650477

I don't think this strategy is a viable experiment. There are far too many uncontrolled variables for such a shallow set of "input variables", as you call them.

A simpler explanation: management has no ideas and no goals, and this is a replacement strategy. They too are affected by "experimental metrics" to a degree, but that doesn't excuse this trite "science".

Any "answer" this would provide wouldn't be of higher quality than this speculation.

kqr · 2026-05-09T06:02:08 1778306528

> Try it yourself. Pick a topic that is important to you. Try searching Polymarket for probabilities, versus asking Claude about it. I wager you’ll prefer Claude’s take, even if it is less accurate. For one thing, Claude can speak to issues that are not properly resolvable forecasting questions.

I thought this was the very thing we wanted to avoid by creating reputation- or money-based prediction platforms that reward statistical accuracy. We already have plenty of pundits speculating inaccurately about vague things they don't know much about.

We don't need AI to get more of that!