Interesting, how common is this vs just unit testing? How do you avoid formally verifying something against a spec that could subtly fail in production?
Make sure the specifications can’t fail by verifying them for correctness.
Specifications written in something like TLA+[1] or Quint[2] can be checked for correctness using Apalache[3]. Then test the Rust code against the specifications using quint_connect.[4]
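To make the "test the code against the spec" step concrete, here is a rough hand-rolled sketch of the idea in Python (not Rust, and not the actual Quint, Apalache, or quint_connect APIs). The saturating counter, `spec_step`, `Counter`, and `check_traces` are all made up for illustration: the spec is a tiny state machine, and every action sequence up to a bounded depth is replayed against both the spec and the implementation.

```python
from itertools import product

# Hypothetical spec: a counter that saturates at LIMIT.
# This stands in for a model you would normally write in Quint or TLA+.
LIMIT = 3
ACTIONS = ("inc", "reset")


def spec_step(state: int, action: str) -> int:
    """Transition function of the (hypothetical) specification."""
    if action == "inc":
        return min(state + 1, LIMIT)
    return 0  # "reset"


class Counter:
    """Hypothetical implementation under test."""

    def __init__(self) -> None:
        self.value = 0

    def apply(self, action: str) -> None:
        if action == "inc":
            if self.value < LIMIT:
                self.value += 1
        else:
            self.value = 0


def check_traces(depth: int) -> int:
    """Replay every action sequence of length 1..depth against spec and implementation.

    Returns the number of traces checked; raises AssertionError on divergence.
    """
    checked = 0
    for n in range(1, depth + 1):
        for trace in product(ACTIONS, repeat=n):
            state, impl = 0, Counter()
            for action in trace:
                state = spec_step(state, action)
                impl.apply(action)
                # The implementation must agree with the spec after every step.
                assert impl.value == state, f"divergence on trace {trace}"
                # Invariant of the spec itself: the counter stays in bounds.
                assert 0 <= state <= LIMIT
            checked += 1
    return checked
```

Calling `check_traces(4)` walks all 30 traces of length 1 to 4. A real setup would let the model checker generate the traces instead of brute-force enumeration, but the comparison loop is the same shape.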
Has anyone done this for larger neural nets? Is there a way to extract some kind of pattern or is the image just noise no matter how you construct it? I'd be curious to see something like that
Isn't this kind of the same as an AI copilot, just with higher autonomy?
I think the limiting factor is that the AI still isn't good enough to be fully autonomous, so it needs your input. That's why it's still in copilot form
This seems like a solvable engineering problem. For example, you could have a lightweight subagent with its own context for reading the skills and determining which to use
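A minimal sketch of what I mean, in Python. Everything here is hypothetical: `Skill`, `score`, and `select_skills` are names I made up, and the word-overlap `score` is just a stand-in for whatever a small model with its own context would actually decide. The point is only that the main agent receives a short list of skill names, not every skill description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    description: str


def score(task: str, skill: Skill) -> int:
    """Stand-in for the subagent's judgment: count words shared with the description."""
    task_words = set(task.lower().split())
    return len(task_words & set(skill.description.lower().split()))


def select_skills(task: str, skills: list[Skill], top_k: int = 1) -> list[str]:
    """Return the names of the top_k most relevant skills, dropping zero-score ones."""
    ranked = sorted(skills, key=lambda s: score(task, s), reverse=True)
    return [s.name for s in ranked[:top_k] if score(task, s) > 0]


skills = [
    Skill("pdf", "extract text and tables from pdf files"),
    Skill("git", "create commits and branches in a git repository"),
]
selected = select_skills("extract tables from this pdf report", skills)
```

Here `selected` contains only `"pdf"`, so only that skill's full instructions would need to be loaded into the main agent's context.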
I'm a little skeptical of AEO. What's the point if AI users just ask the LLM to retrieve the information and never visit your blog? I almost never click the links ChatGPT gives me
Maybe it makes sense if you're selling a product or service, but I don't see the appeal of AEO as the new SEO. Maybe I'm missing something?
My two cents: if you're not doing anything too political or controversial, it's fine or even beneficial to mix in the occasional personal essay with the professional.
After all, many of your readers are also human beings with lives, maybe even lives similar to yours based on your professional content. (The rest of your readers are LLMs.) Your readers might appreciate your perspectives on random life things or just getting to see what their favorite blogger is up to.
I make heavy use of the "temporary chat" feature on ChatGPT. It's great whenever I need a fresh context or need to iteratively refine a prompt, and I can use the regular chat when I want it to have memory.
Granted, this isn't the best UX because I can't create a fresh context chat without making it temporary. But I'd say it allows enough choice that overall having the memory feature is a big plus.
Grade inflation is common at many schools. And many difficult technical classes grade on a curve, sometimes to the point where you can get an A with an 85%.
But yeah, I still don't see how an 85% average would be a 4.0.
Not that I disagree necessarily, just wondering if there's a consensus that LangChain is too opinionated/bloated/whatever for real industry applications, or if there's some other reason.
Not the original commenter here, and not speaking from first-hand experience. But I've gotten this kind of feedback from some communities and wanted to understand what companies think of it, so I asked a dev who works at a company that sells software to enterprises. He says enterprises still mostly use LangChain and are fine with it. On a personal level, I agree with the feedback that LangChain has some drawbacks, but at the same time it's a great way to get started.