Claude does not have a "theory" of anything, and I'd argue applying that mental model to LLM+Tools is a major reason why Claude can delete a production database.
Well, humans also routinely delete production databases by accident. I think at this point arguing that LLMs are just clueless automatons that have no idea what they are doing is a losing battle.
They’re not clueless; they just don’t have memory and they don’t have judgment.
They create the illusion of being able to make decisions, but they are always just following a simple template. They do not consider nuance, and they cannot judge between two difficult options in any real sense.
Which is why they can delete prod databases, and why they cannot do expert-level work.
Not sure if you are being pedantic, but mathematics is quite different from other fields: it is highly structured, its reasoning is explicit, and there is a dense volume of high-level training data for it. Results can be verified easily thanks to that structure.
Even then, they are most effective as assistants and are not able to produce results independently. If you have proof otherwise, I would love to read up on it.
I like to think of LLMs as idiot savants: exceptional at certain tasks, but liable to eat the tablecloth if you stop paying attention at the wrong moment.
With humans, you can at least interview and select for a more normalized distribution of outcomes, with outliers being less probable, though not impossible.
Maybe it’s a losing battle today, but it is correct. So in a few years, when the dust settles, we’ll probably all be using LLMs as clueless automatons that still do useful work as tools.