This hasn't been my experience. ROCm is usually not only a bit slower for me (~32 t/s vs ~43 t/s on the main model I use), it is also way less reliable; any upgrade in kernel version or AMD driver and suddenly everything is broken.
It can be tricky to get/keep ROCm working, but around 7.2 it became reliable and as fast as or faster than ROCm 6.4.
And I think the first response time of ROCm is pretty consistently faster than Vulkan, even if Vulkan has a slightly higher token rate. Though I don't see that big of a difference in token rates, either. Honestly, though, I haven't done enough real testing to know for sure. The benchmarks Donato Capitella posts (https://kyuz0.github.io/amd-strix-halo-toolboxes/) have been my guide on what to run in what way, and the performance of most things that can run on the Strix Halo is Fast Enough(tm) that I don't agonize about it. When Vulkan was all that worked with llama.cpp, that's what I used. Now that ROCm is reliable, I'm using ROCm. ROCm feels faster, maybe just because it processes prompts faster and starts typing the answer sooner. It types at a rate faster than I can read, so when it starts answering is the more important metric, even if a faster token rate would lead to it finishing sooner.
In short: If ever I'm doing something that will take many hours to complete, and I need to optimize it, I'll do some tests first to be sure I'm using the optimal path. Otherwise, as long as ROCm is working, I'll probably just keep using it.
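To make the "first response time vs token rate" tradeoff concrete, here's a rough sketch of the arithmetic. The two backends and all the speeds below are made-up illustrative numbers, not measurements of ROCm or Vulkan, and the model ignores everything except prompt processing and generation speed.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical backend described by two throughput numbers."""
    name: str
    prompt_tps: float  # prompt processing speed in tokens/s (assumed)
    gen_tps: float     # generation speed in tokens/s (assumed)

    def time_to_first_token(self, prompt_tokens: int) -> float:
        # The whole prompt has to be processed before any output appears.
        return prompt_tokens / self.prompt_tps

    def total_time(self, prompt_tokens: int, output_tokens: int) -> float:
        # Prompt processing time plus time to generate the full answer.
        return self.time_to_first_token(prompt_tokens) + output_tokens / self.gen_tps


# Illustrative numbers only: A is quicker on prompts, B generates faster.
backend_a = Backend("A (fast prompt)", prompt_tps=800.0, gen_tps=32.0)
backend_b = Backend("B (fast generation)", prompt_tps=400.0, gen_tps=43.0)

if __name__ == "__main__":
    prompt_tokens = 4000
    for output_tokens in (500, 2000):
        for b in (backend_a, backend_b):
            ttft = b.time_to_first_token(prompt_tokens)
            total = b.total_time(prompt_tokens, output_tokens)
            print(f"{b.name}, {output_tokens} output tokens: "
                  f"first token after {ttft:.1f}s, done after {total:.1f}s")
```

With these assumed numbers, A always starts answering first, but B finishes first once the answer gets long enough. That's why the interactive case and the "many hours to complete" case can call for different choices.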
Sure, but on the flip side, skills not being in context means that for many harnesses, the model simply never finds them. Whether MCP or Skills are "better" depends extremely heavily on the context management functionality of your harness. If you use a relatively naive harness (i.e. one that implements MCP and Skills in a straightforward way), MCPs will generally be more effective, especially if your model is local-only (i.e. dumb), but at the cost of context.
That'll work great until your first customer from a CJK or RTL language writes in, "Hey, how come I can't type in your app?", or a blind user writes in, "Hey, how come your app is completely blank?" Then you'll be right in the middle of the "Find Out" phase.
These strategies are fine for toy apps, but you cannot ship a production app to millions or even thousands of people without these basics.
A pervasive "Someone needs to do something!!!" attitude is why. Americans will forever wait for the school principal to come and get everyone into trouble.
There is a lot of direct action happening right now in Minneapolis, with people keeping watch on every block. I agree this level of organizing should be happening nationwide.
The country that lived through pervasive mass state surveillance by secret police for 40 years is unsurprisingly quite cagey about digital centralization of records, even so many years later.
Transit companies are pretty bad at PKI and internet security. That, combined with the inefficiencies inherent in German bureaucracy and anti-centralization, as well as the inherent insecurity of the SEPA model, sometimes makes crime possible.
> Germany has a tendency to wish something into existence with a law
After living here five years I've finally realized the same thing: Germany is the country of Rules, often well-intentioned, but no one actually follows them. It's especially damning when those rules actually are important and would protect regular people, especially around labor and housing, but oops, zero meaningful enforcement. I wish we had 1/10th the rules but people had to actually follow them.