> asked LLMs to compile a list of 10-20 writers considered canon in each decade since 1800, then identify all their notable works and years of publication. After some iterations with coding agents I got over 2,000 works by 200 authors.
Wait, so the source data is just LLM hallucinations? It makes sense to use an LLM to build the data collection, but not to build your source data.
This is, in my opinion, a better use of tech that has an error rate (hallucination): you just assume it's a fuzzy search, and sample the results to see how you did. I'd like to see a few samples from the results for sure!
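The "sample the results" idea above can be sketched as follows. This is a hypothetical illustration, not the commenter's actual workflow: draw a random sample of the LLM-generated entries, manually verify each one against a reliable source, and use the sample's error rate as an estimate for the whole dataset. The `verify` function here is a stand-in for that manual check.

```python
import random

def estimate_error_rate(entries, verify, sample_size, seed=0):
    """Verify a random sample of entries; return the observed error rate."""
    rng = random.Random(seed)
    sample = rng.sample(entries, min(sample_size, len(entries)))
    errors = sum(1 for entry in sample if not verify(entry))
    return errors / len(sample)

# Toy stand-in dataset of (author, work, year) tuples; in practice these
# would be the ~2,000 LLM-generated entries.
entries = [(f"Author {i}", f"Work {i}", 1800 + i % 200) for i in range(2000)]

# Placeholder verifier; a real one would be a human checking each entry
# against a library catalog or similar reference.
verify = lambda entry: entry[2] <= 1990

rate = estimate_error_rate(entries, verify, sample_size=100)
```

With a sample of around 100 entries you get a rough but useful estimate of the hallucination rate for a fraction of the effort of checking all 2,000. Note, as the replies below point out, that this only catches fabricated or mangled entries, not works the LLM omitted entirely.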
It feels a lot like storing your data as an essay in a Word doc instead of a spreadsheet. It can work and all of the math is probably correct, but it's very much the wrong tool when the structured data was right there to be used instead.
The structured data is scattered all over the place. This does the very important work of aggregating it and bringing it together. Doing that manually could take weeks.
Missing entries don't get corrected by looking at the LLM output. Sampling only helps when the LLM makes something up from thin air or mangles an entry; it can't tell you what was left out.
Of course it’s not the kind of question you can get an objectively correct answer for, but you could come up with the correct answer for a given methodology.
You can only correct for missing entries by doing the same work you'd need to do to build the list from scratch. And after that, you now have a second list to reconcile against the first.
What do you mean by due diligence here? Manually checking 2000 citations sounds a lot harder to me than just pulling the data from a reliable source to start with.