
Yes, I understand a little about Unicode in this kind of problem, but a code point is an individual logical item even if it is composed of multiple bytes; it is a kind of 'string' in itself. I should have asked more carefully: what would be a better system in your view?

Thanks for the link, will check it out after Christmas.



I personally believe that Swift's strings, where graphemes are the smallest indexable unit, are the gold standard for writing logic that might truncate multilingual text. They're still not perfect, though: they add overhead, and updates to Unicode might change behaviour. But they should handle most cases gracefully.
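To make that concrete, here is a rough sketch of grapheme-based truncation in Swift. The helper name `truncated(_:toGraphemes:)` and the sample text are my own and purely illustrative; the only standard-library behaviour relied on is that `String.count` and `prefix(_:)` operate on `Character` values (extended grapheme clusters).

```swift
// Hypothetical helper (not a standard API): keep at most `limit`
// user-perceived characters of `text`.
func truncated(_ text: String, toGraphemes limit: Int) -> String {
    // prefix(_:) on String counts Characters, i.e. grapheme clusters,
    // so a cluster is never cut in the middle.
    return String(text.prefix(limit))
}

let accented = "e\u{301}"            // "e" + combining acute accent
let flag = "\u{1F1EF}\u{1F1F5}"      // two regional indicators forming one flag
let sample = accented + flag + "!"

print(sample.count)                  // graphemes
print(sample.unicodeScalars.count)   // code points
print(sample.utf8.count)             // UTF-8 bytes

// Keeps the accented letter and the whole flag, drops the "!".
let short = truncated(sample, toGraphemes: 2)
print(short)
```

The three counts differ for the same text, which is the point: cutting at a byte or code point offset can land inside the accented letter or the flag, while cutting at a grapheme boundary cannot. The cost is that counting and indexing walk the string instead of being constant-time.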



