Hacker Newsnew | past | comments | ask | show | jobs | submit | moring's commentslogin

The article shows nicely how "every byte matters" is false. First, it starts off by talking about the cost of a new field, when the actual topic is array-of-structs vs. struct-of-arrays. Then, this:

> How much of an impact can this have? > Reading is:alive (1 byte) Across 1M Monsters

You aren't reading one byte here, you are reading 1M bytes! Of course, optimizing the access to 1M bytes is something to consider. Optimizing the access to one byte isn't.

The article is definitely worth reading IMHO, but it really needs a better headline!


Even more so, it shows that SoA data structure means you can add fields to your 1M monsters with little impact.

This is valid for sequential scanning of the data. The CPU will fill whole cache lines at once with the arrays that do get used and the algorithm touches all the field instances in the array.

Now think about random access to single struct instances instead: the CPU loads a cache line worth of data for each field and uses only one element out of the whole cache line. This is much worse than a compact structure representation of the same data.

SoA is not universally better.


This sounds similar to relational databases vs document oriented databases, at least when I briefly looked into database like MongoDB when such things were all the rage 15-20 years ago.

For the internal web site that customer support people used a document oriented database would be great because that wants to load everything about one customer and pretty much doesn't need anything else until the user is done supporting that customer.

For the dozens or periodic reports that needed to be generated relational was way better. A given report generally only wanted a small amount of per customer data but wanted that for all customers.

A little bit of searching and LLM querying suggests that nowadays there are databases that are good at both kind of tasks, in particular Postgress with JSONB, at least at the scale we were looking at (maybe 30k or so customers), but maybe really big operations would need more specialized software.


The Array-of-Struct vs Struct-vs-Array organization is actually more similar to row-major ordering vs column-major ordering, i.e. the data structure that analysis databases use to optimize for aggregate calculations. Document databases are not really comparable because they don't impose structure on the data; with document databases you just have a tree of JSON elements, which is neither AoS nor SoA.

Or, another name for the same thing columnar: storage vs. row storage.

No it's not always better and I didn't mean to imply it was. I was simply saying that the article argues against its title.

In both cases you want to think about locality of the next read and structure the data accordingly.


> SoA is not universally better.

This is an important part of Data-oriented Design: the representation of the data should be pragmatically tied to its access patterns, not dogma.

Richard Fabian's DoD book gives the example that (x,y,z) is almost always better as a classic array-of-structs rather than a struct-of-arrays, because if you're accessing one dimension, you probably are want to process the other two dimensions at the same time:

https://www.dataorienteddesign.com/dodbook/node9.html#SECTIO...


> you can add fields to your 1M monsters with little impact.

Great for this access pattern, but I wouldn't make a general statement like that. This is the same thing as row-oriented vs column-oriented databases, OLTP vs OLAP. SoA is weak if you are adding/removing monsters more often than accessing a single "hot" field.


> SoA is weak if you are adding/removing monsters more often than accessing a single "hot" field.

Why is that? Genuinely curious. Does "weak" mean that it performs worse than AoS, or that the gains aren't as significant versus AoS?


It's because removing a monster with 20 fields from an SoA structure means resizing 20 arrays. Removing the same monster from an AoS array involves resizing a single array, which you're going to process in a very cache friendly way.

I'm not sure why anybody would at the same time be implementing SoA AND resizing 20 arrays for a single delete, those things seem to be on either ends of the "I care about performance" spectrum.

The point is that a simple SoA implementation requires this - each field in the monster struct is an item in 20 different arrays. So, removing one monster means removing that item from those 20 arrays.

Now, as others have suggested, you can have a more complex implementation, where instead of removing the monster's fields from those arrays, you just mark them as "dead" or whatever and then skip them when consuming the relevant arrays, with some relatively small extra bookkeeping overhead. Of course, this comes with its own drawbacks, especially if the number of monsters is very dynamic and you are memory constrained.

The point is not to say that SoA is never good for performance, it obviously and certainly is, probably even in most cases. It's just not always best for performance, this was all.


> So, removing one monster means removing that item from those 20 arrays.

Removing from an array is not the same as resizing, which is what I commented on. Resizing is a very deliberate, bad, choice.

If you need to support deletes, you can do this without resizing an array. Either by tracking object lifetimes and inserting tombstones, or by swapping to fill in deleted objects. Both of them retain good performance characteristics. Both of them are easy.

This is not "simple vs complex" this is "I misunderstand vs understand SoA"


Assuming ordering isn't a concern, can't you just have a field called "removed" and skip those when iterating?

Or swap it with the last monster, and keeping an index for the last monster alive.


Sure, but these schemes might have their own drawbacks depending on the exact use case - especially if you have a very dynamic number of monsters and constantly add and remove them (say, some kind of bullet hell style game).

Then you have to read the "removed" field on every field read on every operation.

SoA is only useful when you don't read multiple fields for most operations.


Two fields should be fine, actually. The way caches are organized you are very unlikely to thrash with the lookups (due to n-way associativity) while only keeping relevant data in the cache at the same time. You still have roughly the following layout (in the cache), where A is the field and V is valid:

  | A1 A2 A3 A4 | A5 A6 A7 A8 | ...
  | V1 V2 V3 V4 | V5 V6 V7 V8 | ...
The former access pattern still yields a clean cache layout where no unnecessary data is loaded (which is the most costly operation here by far) as opposed to

  | A1 V1 B1 C1 | ... | A2 V2 B2 C2  | ...
In the general case there will exist a number of fields for which SOA layout will be worse if all are accessed close to each other, but for just a validity indicator this should not be the case. I think your statement is not wrong, but also not 100% correct.

This is on par to linear search being faster than binary search for small n. As soon as caches and branch prediction chime in many rules of thumb just change. Most importantly, however, is that a distinction between small and large n basically _needs_ to happen at that point.


Presumably they're referring to resizing the arrays.

Array resizing is avoidable with an embedded free list if ordering is of no concern.

If you take out ordering, then lookups on your SoA are now a search, and n-field lookup on an entity is now a JOIN operation.

The smarter you get about it, the closer you get to an OLAP db

Which leads to my theory… I feel like Bevy could be implemented on top of an in-memory DuckDB and get away with it


Depending on your access patterns, maybe you could have a hash table mapping entities ids to indexes in your SoA. Perhaps that's viable if looking up a single entity is not typical to your use case?

> Which leads to my theory… I feel like Bevy could be implemented on top of an in-memory DuckDB and get away with it

Haha, it certainly does sound viable.


Yes. I think one of the big advantages of SoA is that you only pay for the fields you're currently using. If you need a field somewhere, you can add it and only pay the cost of iterating it where you need it.

Every Struct Matters

> You can't have an extension system that (...)

Yes you can. Extension systems of today have multiple problems that prevent that. The basic assumption that has to go, though, is that a core application like VSCode can be written once, then be extended to infinity without the core evolving. That's an assumption you see everywhere in extension systems, and it restricts everything to "features or security, but not both".

Taking your examples:

> run a locally installed linter

VSCode and its extensions have certain files opened. The linter can do much less if it gets read-only access to those files, but not write access and no other files, not the open internet or something.

This has then to be coupled with those permissions being displayed before installing, allowing them to be reviewed by users as well as plugin repo curators. Basically listing those permissions as declarative metadata.

Because then a user or curator won't see "this plugin can read and write all your files" but "this plugin can read (but not write) the files being opened by VSCode". If the plugin wants to exfiltrate those files, the permissions would also list "this plugin can send HTTP requests to totally-legit-site.ru" instead of "this plugin gets arbitrary internet access".

Main lession: permissions are WAY too coarse. But if they are fine-grained, they will soon no longer match the evolution of extensions, so the core system has to evolve too.

> view the status of docker containers

"This plugin can view the status of all docker containers started by other VSCode extensions in the same VSCode window".

> users will scream and cry about extensions being limited

Are those the same users? We might need two different products here, "feature VSCode" and "secure VSCode".


I don’t know what linters you use, but the ones I like are the ones that show you problems in the workspace stably, not just in the files that happen to be open and altering as files open and close.

You can always improve, but pretending like there’s an easy solution is lazy - if it was easy it would have been done.


This was bad wording on my part. I wrote "open" but that should have been "files in the workspace/project". Really, "open" WRT files is so overloaded already, they can be in the workspace, have an editor tab open for them, or have an active file handle, to name just three.

> You can always improve, but pretending like there’s an easy solution is lazy - if it was easy it would have been done.

I claimed that it is possible, not that it is easy.


Im highlighting that defining that sandbox policy cohesively in a way that works for all the different extensions types you’d want to support and doesn’t overwhelm the user with permission fatigue is difficult as to be impossible.

Browsers have a different problem - they protect different websites against each other. The IDE should probably protect you against extensions being able to access arbitrary files on disk, but even that’s difficult (eg a bundled linter often wants to read user defaults in a central location. But protecting even further than that is even harder, especially as here where the access was to the actual repo not anything else.


These kinds of permissions lists have been mostly a failure in history. Users see a massive list of permissions, or permissions constantly changing between updates and just ignore them because there’s no way to reasonably audit them or take any action on them.

Securing VS code would require making malware that has access to the system impossible, not just making it add a permission to the permission list.


Ideally, the permission list is meant for curators which end-users trust and can rely on.

Also, historically, permission lists have been fine-grained but too coarse at the same time, meaning they were "fine" in the wrong way, based on what is easy to implement instead of what the user needs.


Most of the recent compromises have been from trusted people who had their accounts compromised. So just picking someone you trust doesn’t work out.


To do things that a human could have done in theory, but did not do because it would have been too expensive.


> It seems like one needs a big machine farm and a vast corpus of training data with a lot of manual curation to get started creating a competitive LLM, plus whatever technical expertise that I don't even know about. The stuff that makes LLMs exist now and not earlier.

"big machine farm" reminds me of folding@home, which needed the same and got it.

"manual curation" is what Wikipedia did, as well as the free software community.

"technical expertise" is present in the free software world too. It is sparse since it is sparse in the world as a whole, but it exists.

"no Linus Torvalds figure" might be the main problem ATM.


I also thought of these after writing my comment. The main problems that I see with these solutions are:

- Training seems to need a lot of data available at the same time, which is difficult to handle on commodity hardware.

- Manual curation can be a mind-numbing task, it might need to be gamified somehow.

There is a chance that the curation could be higher quality than the current corporate stuff. Pretty sure that it's not an intrinsic property of LLMs to write like TED talks.


Bonus points for the kafkaesque responses you get as the end-user when you try to actually pass that information upstream where it could be fixed...


It seems hard to believe that a one-hour delay on such a counter is impossible to achieve, and one hour would reduce the risk from "catastrophic" to "serious problem" in most cases.

Also, if implementing a cap is a desired feature that justifies trade-offs to be made, then it is psosible to translate the budget cap (in terms of money) back into service-specific caps that are easier to keep consistent. Such as "autoscale this set of VMs" and "my budget cap is $1000/hour", with the VM type being priced at $10/hour, translated to "autoscale to at most 100 instances". That would need dev work (i.e. this feature being considered important) and would not respect the budget cap in a cross-service way automatically, but still it is another piece in the puzzle.


Eh, suddenly turning off all services in your account because you hit your cap is just as much a DoS type event - just of your services, not your wallet.


So? Many would prefer a DoS-type event over spending $WHATEVER_THEIR_HARD_CAP_IS. This is kinda the definition of a hard cap, so you would place it sufficiently high that DoSing your system is indeed preferable.

Also, doing this on a per-service basis doesn't seem that far-fetched to me, so you'd only kill that service and get at least some chance that the rest of your system remains usable.


It’s the trade offs.

If you have an actual enforced cap, those services will be disabled until you resolve the cap - which depending on the latency for usage updates, may be hours after you pass the cap, and hours after you resolve the issue.

Or you have ‘warnings’, and your services keep working, but you spend more $$.

Previously, people seemed to be more worried about service outages than raw $$. Now it’s the other way around.

It’s a common issue with disk quotas in on-prem systems too, and they tend to cause a lot of similar types of problems in both directions.


> If you made a local-first, P2P version of Figma what would break first?

The guy who has to keep it running day by day, next to the other 30 local-first systems.


What is there to run? There are millions of apps that don’t require maintenance, this was the default before SaaS.


Every app need maintenance if it's connected to the internet. Security updates at minimum.


> The deeply unexpected thing about that, to me, is, if they hate some parts of the process, why are they keeping them?

Why are you assuming that they are given a choice? In my experience, whenever a team is trying "agile" in some way but hate it AND are given the choice, they drop it ASAP and are 100% convinced that they are better off without it. Those that hate it and don't stop doing it, are doing so because they are forced to.


> Why are you assuming that they are given a choice?

Because self-determination was the defining component of every agile deployment I've ever been involved in or personally driven. I don't think I can really picture what something calling itself agile yet lacking this component would even look like, honestly -- hoping you will forgive me this failure of imagination.


> whenever a team is trying "agile" in some way but hate it AND are given the choice, they drop it ASAP

Isnt that in itself "agile"? And I specifically dont mean following a religous ceremony plan etc but recognizing that a part of their process isnt working and then changing it. To me thats the entire point of actual agile. You try a process, it doesnt work, you analyze, and adapt.


I think there is: It is the line between "not spending extra money to make sure it works" and "spending extra money to make sure it won't work".

There is a related problem with warranty: an inferior third-party replacement part may cause damage to higher-quality original parts. There is a line here between "making sure you don't have to deal with follow-up damage caused by inferior parts" and "preventing the use of inferior parts". This is a bit more blurry because most cases won't be clear-cut, and dealing with them will be a burden on the original manufacturer.

I think it is important that we reward the nice players as much as we punish the bad ones. A blanket "all companies bad" just means that no company has an incentive to be anything less than bad.


This may sound absurd, but it was a mental breakthrough for me to realize that all these "what-ifs" (multicore 6502 computers or whatever) are really simple to simulate in software only, with a bespoke simulator, WAY above register transfer level. There is really no need to deal with the peculiarities of FPGAs or Verilog; just write an instruction-level simulator for the ISA you choose (that's really simple for the 6502), make it run several instances for multicore architectures, and there you go for your custom computer.

I mean this as a hint for people who are similarly stuck with RTL tinkering when they actually want to tinker with system architecture.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: