Release date seems like a terrible x-axis given how much more compute they are using. Not to mention, while I like what METR is trying to measure, it is an uber-specific metric. And frankly, this is just me complaining, but I feel their prompts do most of the work for the AI. I’ve never gotten instructions as detailed as the ones they give the AI for the task.
Whilst true, even with unlimited compute 5 years ago, we wouldn't have been anywhere near Mythos level, purely because the technology behind the models wasn't refined enough.
I think people are typically referring to the task-completion time horizon at a fixed success rate [1]. That has had pretty robust exponential scaling for many years now.
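To make "exponential scaling" of a time horizon concrete, here is a minimal sketch of what a constant doubling time implies. The function name, the starting horizon, and the doubling time are all hypothetical placeholders chosen for illustration; they are not METR's reported figures.

```python
def time_horizon(months_elapsed: float,
                 initial_horizon_minutes: float,
                 doubling_time_months: float) -> float:
    """Task length (in minutes) completed at a fixed success rate,
    assuming the horizon doubles every `doubling_time_months`."""
    return initial_horizon_minutes * 2 ** (months_elapsed / doubling_time_months)


# Hypothetical inputs: a 10-minute horizon that doubles every 6 months.
after_one_year = time_horizon(12, 10, 6)
print(after_one_year)
```

Under these made-up inputs the horizon quadruples in a year. The point is only that a fixed doubling time compounds quickly, whatever the real parameters are.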
You’re probably right on crime. But I will say that the local train systems in both SF and Chicago (where most of my experience is) are constantly filled with homeless people with severe mental health issues. Generally, without fixing that, a large segment of the population is never going to want to ride public transport.
I’m not saying you’re wrong. But man, having lots of people who don’t know what a war crime is has really devalued the accusation. So much so that I read yours and just assume it isn’t one. (Again, idk.)
Yeah, I don’t find this article particularly insightful. If we don’t have troops on the ground to prevent attacks in the strait, it would always be vulnerable despite superiority. Shit, if we don’t control the land, they could drop a bunch of jet skis with bombs in the water in the middle of the night. The strait is only 21 miles wide at some points.
E2EE means end-to-end, where the ends are the participants in the chat. They can read it on your phone, but not on their servers. They need their app to separately transmit the plaintext to their servers to read it.
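A toy sketch of that point, assuming the two participants already share a key (real messengers use a key-exchange protocol and authenticated ciphers, not the one-time-pad XOR used here). `RelayServer` and `xor_bytes` are hypothetical names for illustration only.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching key byte (toy one-time pad)."""
    return bytes(d ^ k for d, k in zip(data, key))


class RelayServer:
    """Stands in for the provider's server: it only ever sees ciphertext."""

    def __init__(self) -> None:
        self.stored: list[bytes] = []

    def relay(self, ciphertext: bytes) -> bytes:
        self.stored.append(ciphertext)  # all the server can log
        return ciphertext


message = b"meet at noon"
# Assumption: this key exists only on the two participants' devices.
shared_key = secrets.token_bytes(len(message))

server = RelayServer()
ciphertext = xor_bytes(message, shared_key)   # encrypted on the sender's phone
delivered = server.relay(ciphertext)          # server passes it along unread
received = xor_bytes(delivered, shared_key)   # decrypted on the recipient's phone
print(received)
```

The server stores only `ciphertext`. For the provider to read the message, the app on one of the endpoints would have to send the plaintext through a separate channel.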