Hacker News | khasan222's comments

For sure every time you use ai you’re sacrificing understanding if you don’t plan out and understand how exactly the ai is going to do the work you asked it to do.

The same output that is such a bad thing in this article can also be used to gain context, by making a thorough plan with your ai first, reading through the plan and proposing changes just like you would with a real developer.

You can also use this output to have the ai write a journal. The journal can be as detailed as you like, and is essentially a ledger of all of the changes your ai has made to the code. This not only gives teammates reviewing your PR greater context, but can also be used by yourself, or even the ai itself, to figure out why a particular implementation was done the way it was, even far into the future.

Lastly how many of us ever deploy code without actually checking the feature works e2e? I would gather not many of us do, I don’t, because even though we may have a greater understanding of the code, we can make mistakes in the code or in our logic. And I keep coming back to why would we treat llms any differently? I believe we should be spending our energy thoroughly manually testing a feature to make sure when we brainstormed we actually did get every edge case, and it works well.


I think most people test at least a happy path of their code end to end. I think we can all agree that your last sentence is far more aspirational than bare minimum standard practice. (“I believe we should be spending our energy thoroughly manually testing a feature to make sure when we brainstormed we actually did get every edge case, and it works well.”)

I did one small side web project by only writing spec tests and prompts and testing the results in a browser, never reading nor editing a single line of generated code. It was something for home and so low stakes, but it worked remarkably well and was much better tested than the typical 2022-era home project of mine.


My last sentence is definitely aspirational, it is how I try and go about it, but for sure I make mistakes. However your comment about writing spec tests was interesting to me.

Honestly I don’t even write tests manually, because of coverage checks. Since the coverage check is not something easily manipulated, I always tell the ai: don’t ever change configs, and make the coverage pass whatever threshold I set, most times > 95%. I just tell the AI, make this coverage pass.

I find tremendous success with this technique, or really any time I can find an objective way for the ai to test its work.
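To make the "objective check" idea concrete, here is a minimal sketch of a coverage gate in Python. It assumes the coverage tool can emit a JSON report with a `totals.percent_covered` field (modelled on coverage.py's JSON report; treat the field names as an assumption), and the 95% threshold and function name are hypothetical stand-ins, not the commenter's actual setup.

```python
import json

# Hypothetical threshold: set by a human, and off-limits to the agent.
THRESHOLD = 95.0


def coverage_passes(report_json: str, threshold: float = THRESHOLD) -> bool:
    """Return True when the report's total coverage meets the threshold."""
    report = json.loads(report_json)
    # Assumed report shape: {"totals": {"percent_covered": <float>}}
    percent = float(report["totals"]["percent_covered"])
    return percent >= threshold


# Two made-up reports, only to show the gate passing and failing.
passing_report = json.dumps({"totals": {"percent_covered": 96.4}})
failing_report = json.dumps({"totals": {"percent_covered": 94.9}})
print(coverage_passes(passing_report), coverage_passes(failing_report))
```

The point of the design is that the threshold lives outside anything the agent is allowed to edit, so the only way to turn the check green is to add tests.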


What does “much better tested” mean to you?

If you don’t read the tests to check they convey your intent or specifications, they’re more like tautologies than tests, you know?


I don't understand the comment.

I wrote the tests. That was how I expressed the spec and my intentions.


Ah. Most Claude users let Claude write their tests for them and I assumed you were too. Sorry.


> I always wondered why people don't also ask the AI to generate code comments/documentation, summaries of those documentation, overview of the system, and re-review them all for correctness for the changes they asked the AI to do.

I now on all of my projects have an ai journal that stands as a ledger for every change the ai has made, and why it was made. I don’t read it that hard personally because I spend so much time planning with my agent before letting it code. However I have found it very useful in sharing code between people, or having Claude look through the journal to gain context when modifying or adding a feature.


I’m not very familiar with Go, however after looking at the repo I can’t help but notice there is no infra to ensure code quality. Do others see the same thing? Because if so, that is the real problem.

Yes I agree for sure llms write terrible code when left to their own devices, but so do most engineers. Which is why we have so many tools to help keep a certain level of quality. Duplication checks, tests, linters, other engineers.

I find whenever you make an llm repo without these checks, and more, it will write like an enthusiastic junior engineer, wrong and strong. However a junior engineer would be hard pressed to get 95% coverage on a codebase, the ai is more than willing and does it in a few minutes. We can use things like this to our advantage, how many people have ever seen a repo with 100% test coverage? With ai this is very possible, with people not so much.

LLMs write terrible code, we know this, but when dealing with humans that write terrible code we have many techniques. We should be using those same techniques to keep the llms honest, but more importantly verifiable.


Go has built-in tools that mimic formatting + linters. Also LSP is a first class citizen in Go. I don't know what other "code quality" infra there is out there aside from formatting and linting.


Completely agree! People tend to forget we are non deterministic too! Yet we are able to write code fine, and fairly reliably by using tools that can help keep us fairly honest.

I think most problems with ai come down to one question: can you deterministically test the thing you are asking it to do?

How many of us would never ever show work, without going to check the thing we just built first?


> can you deterministically test the thing you are asking it to do?

Of course: have it write tests first; and run them to check its work.

Works well for refactoring, but greenfield implementations still rely on a spec that is guaranteed to be incomplete, overcomplete and wrong in many ways.


You can't ask something to check its own work without external reward/penalty. It'll cheat.


Weirdly, and I fully think this is just some cognitive bias I don't have the knowledge to name, the ai seems very happy to please me. Like when it gets something done in one shot, it seems very happy to do so.


It's because expressing emotion tests well in RLHF (reinforcement learning, human feedback), which is the layer on top of the next-token-predictor LLM. As a bonus, it helps manipulate operator reactions to incorrect output, and improve engagement (aka token use).

The "thought process" of an LLM only exists as inference response to next token prediction prompts. It's the illusion of emotion.


Well if the spec is incomplete it sounds like you should lower scope for the AI, and then go from there. I wouldn't be too keen to give a junior engineer free rein and expect awesomeness.


I actually now think ai prompt writing in the IDE is completely overkill nowadays.

IDEs are made for just a human to interact with code. I think the paradigm of forcing these tools that weren’t built for this to do this, is us trying to fit a square peg in a round hole.

Call me old, but don’t put ai in my ide. My ide was made for a human, not an ai. For the established players for sure it makes sense since they already have space on our machines. But for the new ones imo terminal, or dedicated llm interfaces are where it’s at.

If I’m writing code, sure, suggest the next line. If the machine is writing code, let it, and just supervise properly, with an interface that plays to the strengths of each.


My IDE has nicer tooling for things like diffs, and has all of my LSPs configured, which the harness can utilize.


Yeah but how many tools is that just to get it to work, and how much burden on your PC? It just seems simpler to me to use as few tools as possible to accomplish the goal.

Also shameless plug, I wrote something about this very thing. https://khalah.medium.com/getting-ai-to-work-right-27b750dba...


I was just thinking the other day how ai will be way worse than social media in terms of influencing people. Before though I was thinking of times people have been convinced by chat bots they were god or a genius.

Social media was so powerful because it convinced people their world view was popular and correct even if it definitely wasn’t, but at least then it was actually happening somewhere. This is just completely made up, to feed into people’s world view, and they don’t even have to find the perfect figure; they can make one up.

We’re definitely going to suffer a lot before it gets better. Interesting times we’re living in.


It was definitely often made up before as well, just by rooms with 100s of humans working full time.


Ngl I’m reading this article after having used ai to build a beautiful front end that is pixel perfect.

Yes ai can’t see, it only understands numbers. So tell it to use ImageMagick to compare the screenshot to the actual mockup, tell it to get less than 5% difference and don’t use more than 20% blur. Thank me later.
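As a rough illustration of turning "compare the screenshot to the mockup" into a number, here is a minimal Python sketch around ImageMagick's `compare -metric RMSE`, which reports an absolute value followed by a normalized one in parentheses. Using the normalized RMSE as the "% difference" is an assumption (the comment does not say which metric was used), the function names and the 5% default are hypothetical, and the blur step is left out entirely.

```python
import re
import subprocess


def parse_normalized_rmse(stderr_text: str) -> float:
    """Extract the normalized value from output such as '2234.57 (0.0340974)'."""
    match = re.search(r"\(([-+0-9.eE]+)\)", stderr_text)
    if match is None:
        raise ValueError(f"unexpected compare output: {stderr_text!r}")
    return float(match.group(1))


def difference_fraction(screenshot: str, mockup: str) -> float:
    """Run ImageMagick and return the normalized RMSE (0.0 means identical)."""
    # compare writes the metric to stderr and exits with status 1 when the
    # images differ, so a non-zero status is not treated as an error here.
    result = subprocess.run(
        ["compare", "-metric", "RMSE", screenshot, mockup, "null:"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_normalized_rmse(result.stderr)


def close_enough(difference: float, max_difference: float = 0.05) -> bool:
    """Assumed reading of 'less than 5% difference'."""
    return difference < max_difference


# Parsing a sample output line; no images are needed for this part.
sample_difference = parse_normalized_rmse("2234.57 (0.0340974)")
print(close_enough(sample_difference))
```

`difference_fraction` needs ImageMagick installed and two images of the same dimensions, which is why the screenshot has to be taken at the mockup's width and height.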

I built a whole website in like 2 days with this technique.

Everyone seems to have trouble telling ai how to check its work and that’s the real problem imho.

Truly if you took the best dev in the world and had them write 1000 lines of code without stopping to check the result they would also get it wrong. And the machine is only made in a likeness of our image.

PS. You think Christian god was also pissed at how much we lie? :)


It's hard to interpret comments like this because we all have different standards and use cases. So it would really help if you could link to it. Even in a roundabout way if you want to avoid the impression of self-promotion.


I built a few websites, most of them it wouldn’t be wise to place on here. But someone emailed me about this, so I’ll do my best to help. I did build https://hartwork.life for a friend with a design from OpenAI (pre Google Stitch, which is my current preferred tool).

Here is the line from my Claude.md to get something like this. Keep in mind I didn’t use MCP for Playwright with this particular implementation, but it is my preferred method currently.

CRITICAL - When implementing a feature based off of an image mockup, use google chrome from the applications folder set the browser dimensions to the width and height of the mockup, capture a screenshot, and compare that screenshot directly to the mockup with imagemagick. If the image is less than 90% similar go back and try and modify the code so that way the website matches the mockup closer. If a change you make makes the similarity go down, undo it, and try something else. be mindful the fonts will never be laid out exactly like the mockup, please use blur at a max of 10% to see if the images are closer matching. If you spend more than 10 cycles screen-shotting and comparing, stop and show the user how similar they are mentioning any problems

The more text the harder it becomes, and it’s why we really need the blur, because fonts are almost always rendered differently.
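The control flow that prompt asks for (retry until 90% similar, undo any change that lowers the score, stop after 10 cycles) can be sketched in Python roughly like this. The three callables are hypothetical placeholders for "take a screenshot and score it", "edit the code" and "revert the edit"; this is one reading of the prompt, not what Claude actually runs.

```python
from typing import Callable, Tuple


def refine_until_similar(
    measure: Callable[[], float],
    try_change: Callable[[], None],
    undo_change: Callable[[], None],
    target: float = 0.90,
    max_cycles: int = 10,
) -> Tuple[float, int, bool]:
    """Return (best similarity, cycles used, whether the target was reached)."""
    best = measure()
    cycles = 0
    while best < target and cycles < max_cycles:
        try_change()
        cycles += 1
        score = measure()
        if score < best:
            # The change made the match worse, so roll it back.
            undo_change()
        else:
            best = score
    return best, cycles, best >= target


# Simulated run: made-up similarity scores stand in for real screenshots.
scores = iter([0.70, 0.80, 0.75, 0.92])
undone = []
outcome = refine_until_similar(
    measure=lambda: next(scores),
    try_change=lambda: None,
    undo_change=lambda: undone.append("reverted"),
)
print(outcome, undone)
```

When the cycle budget runs out without reaching the target, the third element of the result is `False`, which corresponds to the "stop and show the user how similar they are" branch of the prompt.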


Thanks. I would say yeah, it's not too bad, but it is also a pretty simple site.

There are some interesting issues that probably relate to your workflow, like the nav links are different sizes, the icons too. And the resolution of some of the images/icons on a MacBook is poor. But I suspect that's because a simple ImageMagick raster diff will fuzz over those kinds of differences.

I wonder if you can make some tweaks or find a better representation than pure raster screenshots to fix this. Can't really deal in vector images because AI sucks at outputting those, and you can't print a web page to SVG.

There was a super niche website framework that only used SVG a while ago. Would be funny if that kind of thing takes off just so AI can do better.


Oh yeah the natural images are also low quality. Email me I’ll share with you something else I built


I feel like 2 days to build this is a bit much given the simplicity. I think the point stands.

I will grant you that this is more tasteful than most of the AI sites I see. It’s a good looking little site but nothing here screams, “AI really accelerated this.”


Thank you. Yes took a bit but still way faster than by hand. There are other store pages that are also implemented. This 1 page took me like an hour lol.


Some feedback:

1. The main page asks for an email to be notified when the hoodie is available to buy, but I can add the hoodie to my shopping cart and proceed to check out

2. The product page mentions a 6’ model but there is no model in the images

3. The check out page says “there are no payment options, please contact us”


Thank you for the feedback. The store isn’t done; I’m waiting on the friend for actual products with their images etc.

If you really want to see what I was messing with email me. I’ll share on non public forums.


Please share what you created! I think people have very different views for what is a good interface, or a tolerable one. I think as a front-end developer and designer I notice a lot of problems most people don't care about.


https://poolometer.com

I built this frontend with Sonnet 4.5 last Fall and I’m about to “launch” it

I used only prompts, but those prompts included ChatGPT’s research on Memphis design ;)

Using codex for front end design is like asking the valedictorian mega nerd to paint your portrait. Gemini and Claude are both artists.


Sorry to be the one to break it to you, but no, you did not design a website just fine with AI. It’s not even just “good”. It’s average. Painfully average, to the point of it being easily mistaken for a scam.

Very bad results—as expected from an AI.

Nothing to brag about here.


I completely disagree. Making an average website is the goal of most businesses that are selling an actual product. His website looks modern and welcoming and does not distract or take away from the actual content. This is exactly what most people should aim for. Some actual constructive criticism is some of the icons in the example log mood look weird on my phone, with really small emojis overlapping the face emoji.


No one should aim for average, that’s an incredibly defeatist way of looking at it. Besides, design matters. I know HN is frequented mostly by people with very little interest in such topics, but design absolutely matters.

And yes, while the author’s website is perfectly passable, it is by no means “good”. People pick up on that, they might not know they do, but they do. Design wouldn’t be an industry and a school by itself if it didn’t matter and just the average were good enough.


A lot of people don't make websites for a living. If they are a small business and have other things to worry about in terms of actual work, being able to prompt for a clean, professional website frees up their time and means they don't have to use additional funds to hire a developer.


But you can use squarespace/shopify etc for that…


Thank you, agreed, I will iterate on this more.

When I shared this I wasn't thinking about the marketing site -- I meant to show the product itself. Given the feedback here I no longer think it's a good representative as-is, especially with the generic SVGs / rounded cards


I can't help but think you and the other commenters reducing this to slop didn't even try the product. I thought it went without saying that I wasn't posting to show a marketing landing page.


Nah


Well it works but it also looks like every other generic bootstrap based website with not even an original palette choice. Great for a project like this, unusable for any client work


Is this a critique of the marketing page or of the product itself?

I didn't intend for this to be about the marketing page -- what you say is true of just about every marketing page. They're prevalent because they are good at distilling information without overwhelming the user. But I agree I can do better and will work on this more, I really appreciate getting this feedback

Most people look at pool chemistry/maintenance as painfully overwhelming, so for everyone to say this looks boring or mundane is a bit validating. No one has (yet!) said they don't understand the product, its purpose, or its value :)


It's more that the palette looks a lot like the basic colors from Bootstrap. Which is what models tend to do a lot of the time, because you know, that's what's been learned.

Also:

- why shadows somewhere and not for other cards?

- why so many different font sizes with no hierarchy?

- the paddings and margins are inconsistent and don't convey visual rhythm. Sometimes there's too much space and sometimes they are too cramped.

etc...

Is this an ok amateur website you couldn't have made this quick? Yes. Is that a sufficient value proposition to say that AI has solved frontend? No.

On a side note: Would you pay the actual real price of these models to achieve these same results, if they weren't subsidized by delusional billionaires? Up to you to respond.


Though it's somewhat clear from the use of tiles with the icon colours and the choice of border colours and all, I quite like it. I would have expected the colour theme from the navbar to be repeated because that's a more non standard palette. I would do that, maybe use a different tile layout (use a tile shape resembling a pool tile? Or even a rectangle signifying a typical pool shape) and create some vector icons for them using the navbar colour scheme.


Some more serious critique of things I noticed within 30 seconds:

- Text isn't selectable on the page.

- The tooltip in the "day 1" to "day 14" cards gets cut off by the border (I see this mistake ALL the time with AI-generated frontends btw)

- It's sparse and very long. I think the information could be condensed in half the size, and it would improve the presentation. This is personal preference though.

- The playbooks' "mark complete" are not persisted on reload or navigation.

All in all, it's functional and quite decent. I agree with the other people saying it looks generic, but I disagree on it being necessarily a bad thing for this kind of product.

I know nothing about pools so I can't comment on the accuracy of the playbooks. It's nice that there's so many of them, but given the LLM vibe of the text I'm slightly suspicious.


I see that you haven't finished the Automatic Sensor Automation. If you need help with that, contact me, I have experience with embedded product development and I like working on interesting projects :)


Why don’t these LLMs just allow you to pick from a set of standardised templates and then allow you to customise it from there in terms of both functionality and design?

What you have got as output is what I also get as output from LLMs - they suck the soul out of everything. Which is fine in the right context, but that isn’t what we as a species should strive for in design imo.


Sorry but this website screams AI slop to me. Very sparse, lots of cards and random icons and rounded corners, looks like a few messages into a Claude Code session.


I intended to share the product rather than the marketing page. I mean, I didn't intend to share this at all yet because it's not done, but when I saw people asking for examples..

But yeah, marketing site looks like a marketing site. I'm realizing now that a lot of my app's internal design/flair is missing from the marketing page -- so I appreciate your looking/commenting


Looks as if AI sucks at frontend tbh.


Is this a critique of the marketing page or of the product?


Hey, one thing I made with this technique is hartwork.life, a simple WordPress store for a friend. I used OpenAI to design it for me, and then used the techniques above to get Claude Code to implement the proper designs.

I am still trying to learn how to wrangle Claude properly, but I have this Claude.md[1] that I used to make the website. In particular, see one of the last rules, about using ImageMagick for comparison.

I haven’t touched this website in a bit (waiting on client); these days I use Playwright MCP for the screenshots and the browser interactions.

[1] https://github.com/panda01/hartwork-woocommerce-wp-theme/blo...


I started with a boilerplate but AI has been huge at letting me get what I want in terms of frontend building when I was never talented at design or css.

I built https://bridge.ritza.co (demo@example.com username and password if you don't want to sign up) as a trello/linear replacement without looking at a single line of code and it's both good enough for me and doesn't have the obvious AI frontend 'look' as it was copying from the starter.


Highlighted text "no per-seat pricing" is unreadable in dark mode on the home page (dark blue on black). It's surprising for me to see someone use this as an example of decent design because I'm somewhat sure this front page text coloring was never seen/reviewed by a human.


> doesn't have the obvious AI frontend 'look' as it was copying from the starter.

Check out the other reply and scroll down a bit…


I mean the app itself, not really the landing page if that's what you're referring to?


Yeah you’re right my bad.


What’s the point in saying you built something beautiful and not showing it?

Share it. I used Claude earlier to test out its design capabilities and what I got as output was flat and tasteless.


It's kind of wild in terms of how it will use different random designs, even given a specific style guideline. Even if you tell it to use a given framework like MUI or Mantine, it will stray largely from format.

I don't mind working through a lot of the UI myself, but it's definitely a shortcoming IMO... that said, being able to scaffold boilerplate or testing harnesses for complex UI has been really nice overall. I came up with the following component as an image zoom component, where I can separately control the zoom in/out in under a couple hours... took longer to set up the CI/CD stuff than the primary component logic.

https://tracker1.github.io/image-zoomer-too/


Eh, for many reasons I am not posting it here. It is a passion project for something and would lead to problems if I post it here. That being said I was trying to share the technique.

The reason for the post is that even without the actual website one should be able to envision the technique and how it may or may not work. Also if you look above recently I added links to the Claude.md for another thing I was working on for a friend that also had to solve this problem.

Just want to give people the tools to use ai well from my own findings


Software developers have been calling their stuff "beautiful" for years now. It's bullshit. Almost none of it is beautiful. They just mean it looks like whatever is trendy at the time.


The Skeuomorphic stuff in the late 90s was beautiful.


Is Claude making tasteful and thoughtful recreations today? Or just cloning aesthetics and missing intention?


I am a backend guy, so forgive my ignorance, but for web based apps I am confused what "pixel perfect" even means. I can build a site to look one way on my computer, it will most likely not look the same way on whatever device you use to access the site.

Given my experience with the tools, feeding the model images from my local computer sounds like a recipe for having it over-optimize for the wrong end device.


Pixel perfect means it looks EXACTLY like the design comp.

It goes completely out of the window if the browser window isn't the exact size of the mockup.

You might charitably say that pixel perfect means that the implementation intersects with the design comp at some specific dimensions but where are the extra rules coming from, then?

It's an archaic term that conflates the artifact produced by an incomplete design process (an artist's rendering of what the web page might look like) with the actual inputs of the development process (values and constraints).


"Pixel perfect" is about attention to detail and consistency. Margins, padding, or the combination of these inside other containers will stick out when they're not consistent.

Here's an example that I personally encountered: what if you have a <h1>Text</h1> and it has a certain left margin. Then another heading except it has a nested button component (which internally comes with some padding). Then the "Text" in both aren't aligned from section to section and it is jarring.


Yes can you share the front end that you created using this technique?


Imagine being a luddite in the big 26


We built the frontend for https://brooked.io and https://app.brooked.io using only prompting so I agree!


You could argue that what you built isn't novel or complex in any way -- (politely) it's basically a clone of hundreds of other SaaS homepages, i.e. it's a perfect use-case for AI.

Perhaps the results would be different if you had a specific novel design or interaction in mind, and you wanted the AI to implement that exactly as you wanted.

edit: My point proven by the other examples from this thread. Same format, same "feature cards" etc. https://bridge.ritza.co/ https://poolometer.com/


I think it’s ok that it’s similar to other SaaS websites. It wouldn’t exist if it weren’t for LLMs and it gets the job done and looks decent.


It shows.

The landing page looks like every other AI slopped product page out there.


It’s funny how there are a bunch of responses to this post all showing off their great AI designs that are literally the same thing with different (each horrible) color palettes.


It's really weird. I don't even care if they used AI as part of their development process. But most AI™ developed stuff is just such insanely soulless crap that I can instantly tell and instinctively close the tab.

If your AI developed software were so great, I couldn't tell it's AI.

Like I cannot wrap my head around how anyone vibe slops and thinks "No, this is good. I will now proudly show this off."


  > Your data,
  > irectl in you
  > spreadsheets
I'm guessing the third word is "directly"? The D is cut off. And the grammar is wrong, should be "in your spreadsheets" - maybe that is another letter cut off?

Go back to human devs.


The last time I tried to make AI build a drag and drop UI, it failed miserably. Things wouldn't line up or even didn't work at all. Any tips for that?


Ask it to take control of a browser using something like Playwright and use the UI itself like an end user would and evaluate whether it is a good experience.


> Yes ai can’t see, it only understands numbers.

I've also used AI to build frontends that I'm more than satisfied with, and I think it can "see" perfectly fine. The frontier models are multi-modal and pretty good at vision. You can hook up your coding harness to your browser which will take screenshots of your rendered frontend and modify the code accordingly.


> Ngl I’m reading this article after having used ai to build a beautiful front end that is pixel perfect.

Was about to say the same thing


That is a clever trick!


It does indeed if it’s in plan mode, kind of just not as detailed. Me being an engineer I can spot certain things or question assumptions it may make


It was amazing to me how bad cursor is with using the same model I use in Claude. Even with little knowledge on how to test the llms I was able to get very minimal mvps. But I find the real trick is to have the proper tools to rein in the ai.

A thorough CLAUDE.md that makes sure it checks the tests, lints the code, does type checks, and code coverage checks too. The more checks for code quality the better.

It’s just a bowling ball in the hands of a toddler, and needs the ramp and guide rails to knock down some pins. Fortunately we get more than 2 tries with code.
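For what it's worth, the "guide rails" can be as simple as one script the agent is told to run after every change. Below is a minimal Python sketch of that idea; the check names and the stand-in commands are hypothetical, and a real project would list its own test, lint, type-check and coverage commands instead.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

# A check is a display name plus the command line to run.
Check = Tuple[str, Sequence[str]]


def run_checks(checks: List[Check]) -> List[str]:
    """Run every check and return the names of the ones that failed."""
    failed = []
    for name, command in checks:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


# Stand-in commands so the sketch runs anywhere: one exits 0, one exits 1.
demo_checks: List[Check] = [
    ("tests", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("lint", [sys.executable, "-c", "raise SystemExit(1)"]),
]
failed_checks = run_checks(demo_checks)
print(failed_checks)
```

Running every check rather than stopping at the first failure gives the agent the full list of problems in one pass, which tends to mean fewer round trips.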


Cursor needs a paradigm shift to remain relevant, what was spectacular at first now is just banal and better done by other tools.


Crying. I’m stealing this.

