I don't really get the hype around the whole N1X thing when in reality this is the same almost year-old GB10 that was released with the DGX Spark and proved to be quite a disappointment.
It is great for single-user, single-session inference. It is not a replacement for a graphics accelerator that runs several concurrent inference sessions in parallel.
Basically the same tradeoff as a Mac mini with unified memory.
The RTX GPU laptops run very hot. Even though they are pound for pound better, they just run too hot for local LLM usage, for me at least. I prefer Macs for this. A lot of AMD cards also run cooler. I wonder if undervolting would help with heat when running smaller models.
I mean the GB10 is pretty efficient for the power it has, but IMHO it is nowhere near the power efficiency of Apple Silicon (it was never intended to be a chip for mobile devices). I guess this is kind of the move Apple made with the A12Z and the Mini but... the other way around?
I think it's going to be another failure, as we're used to seeing in the PC market these days.
It's probably more that LLM inference speed comes from having a large amount of fast RAM. And fast RAM is brutally expensive right now.
At this point, your cost-efficient options include used 3090s, "frankenrigs" using recycled data center cards, and a handful of "workstation" class cards, where the originally high margins and the long enterprise purchasing cycles have kept prices from going up too fast.
In contrast, a lot of these "personal" AI systems are basically a GPU-like core wired to larger amounts of slow RAM. Which is still semi-affordable. Generally speaking, they make for OK chatbots but extremely slow coding agents. Whereas you can run a modestly useful coding agent at reasonable speed on a 3090.
So yeah, a lot of these systems are a bit scammy. But not because it's a secret conspiracy to protect data center cards. Rather, there simply isn't enough fast RAM in the entire world. So they'll flog you disappointingly slow RAM instead.
TL;DR: Might be useful for some use cases, but benchmark very carefully.
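To put rough numbers on the "fast RAM" point: for a dense model, generating one token means reading every weight once, so memory bandwidth divided by model size gives a ceiling on single-stream decode speed. Here is a back-of-the-envelope sketch in Python. The bandwidth figures and the model size are assumed, illustrative values rather than measurements, and the function name is made up for this example. Real throughput will be lower, and prompt processing is compute-bound, so it behaves differently.

```python
def max_decode_tokens_per_sec(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Ceiling on single-stream decode speed for a dense model.

    Every generated token reads all weights once, so the memory bus
    limits throughput to roughly bandwidth / model size.
    """
    if bandwidth_gb_s <= 0 or weights_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / weights_gb


def compare(systems: dict[str, float], weights_gb: float) -> dict[str, float]:
    """Apply the same estimate to several systems (name -> GB/s)."""
    return {
        name: max_decode_tokens_per_sec(bw, weights_gb)
        for name, bw in systems.items()
    }


if __name__ == "__main__":
    # Assumed, illustrative figures only; check the spec sheet of your hardware.
    systems = {
        "card with fast VRAM (assumed 900 GB/s)": 900.0,
        "unified-memory box (assumed 270 GB/s)": 270.0,
    }
    weights_gb = 18.0  # assumed size of a quantized dense model

    for name, tps in compare(systems, weights_gb).items():
        print(f"{name}: at most ~{tps:.0f} tokens/s")
```

A coding agent burns through far more tokens per task than a chat session does, which is why the same ceiling that feels fine for a chatbot feels painfully slow for agentic work. Treat this as a sanity check before buying, not a substitute for benchmarking.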
I mean, isn't introducing safety guardrails as part of the system prompt actually a REALLY bad idea? This way you basically rely entirely on the model to follow the rule, but it's clear that even frontier models like Opus will start ignoring these things after a certain context length...
In our company we just run agents inside isolated containers with isolated network access, so an agent cannot even SSH out or fuck anything up, even if it gets access to something... That's the only and safest way... inconvenient, true, but the only safe option.
PS: At the same time, I've observed that this way people actually use the agent in a more reasonable way, e.g. producing helper scripts for their daily stuff, producing very specific things, creating simple PoCs, but they don't commit to vibe-coding all the functionality in their corresponding software products.
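For anyone wondering what "isolated containers with isolated network access" can look like in practice, here is a minimal Python sketch that only builds a locked-down `docker run` command line. It is an assumption about one possible setup, not a description of what we actually run: the image name, the mount point and the resource limits are hypothetical. `--network none` is the strictest choice; an agent that has to reach a model API would need a restricted network instead.

```python
import shlex
from pathlib import Path


def sandboxed_agent_command(image, workdir, agent_cmd, allow_network=False):
    """Build (but do not run) a docker command for a locked-down agent."""
    host_dir = Path(workdir).resolve()  # bind mounts need an absolute host path
    cmd = [
        "docker", "run", "--rm",
        "--read-only",                          # root filesystem is read-only
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--pids-limit", "256",                  # assumed limit, tune as needed
        "--memory", "2g",                       # assumed limit, tune as needed
        "--tmpfs", "/tmp",                      # scratch space that is thrown away
        "-v", f"{host_dir}:/work",              # the only writable host directory
        "-w", "/work",
    ]
    if not allow_network:
        cmd += ["--network", "none"]            # no network at all, so no SSH
    cmd.append(image)
    cmd += list(agent_cmd)
    return cmd


if __name__ == "__main__":
    # "agent-sandbox:latest" and "run-agent" are hypothetical names.
    cmd = sandboxed_agent_command(
        "agent-sandbox:latest", "./scratch", ["run-agent", "--task", "demo"]
    )
    print(shlex.join(cmd))
```

The point is that the guardrail lives outside the model: whatever the agent decides to do after a long context, it only has the scratch directory to write to and nothing to connect to.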
Cannot agree more with Josef on how dangerous this is for our intellectual property. Of course there are laws and mechanisms in China for the government to obtain any information retained by their companies under any possible justification, but the US does so too, and thanks to the CLOUD Act they can simply decide to do the same with any of the big players sitting in their territory (even for servers located outside their territory).
So, taking into account that >80% of European companies rely on either Amazon, Microsoft or Google to store all their most private and business-sensitive data, is this any different from all the data we are possibly leaking already? Same with AI, same with the phones and payment systems we use on a daily basis...
Sometimes I just have the impression that this has nothing to do with protecting our intellectual property but rather with finding an enemy and focusing on it while pretending everything else is fine... and a blog post from the owner of Prusa Research talking about their main competitor is a good demonstration of that.
What's frustrating is that Prusa isn't too far removed from how Bambu works today. Prusa-Link (the onboard firmware) allows you to do very basic job control but has essentially zero machine control and very little telemetry. All the major functionality is behind their PrusaConnect cloud service, which they've now added a paid tier to, and which they've been promising for years to open source in order to allow print farms to run offline.
I love Prusa printers and all my machines are Prusa, but they really do need to get their software situation sorted, because in its current form it's somewhat hard to distinguish from the operational reality of Bambu: if I want to use all the features on my XL, I need to send my files to Czechia first.
I don't want to sound like a Prusa shill or anything, but if they see Chinese state-sponsored competition as an existential threat (and I see no reason to doubt either claim), then I imagine that, from Prusa's perspective, it's a losing proposition either way ("damned if you do, damned if you don't"): anything they develop in the open directly strengthens their competitor and doesn't level the playing field for the benefit of all (unpunished license violations); anything they keep to themselves pushes Prusa further and further away from their ideological stance, making them more akin to their enemy and less relevant.
It is in my opinion reasonable to call out any violations of any law or any violations of the users' or companies' privacy as they are spotted. And everyone is best suited to spot issues in areas or fields in which they operate.
I mean, one of the very first things I would do on such a powerful device is run a voice-controlled agent with access to all the IO the Flipper has and let the agent take over the device to do whatever I want.
I can imagine that having your agent of choice write Python scripts on the fly for whatever scenario you have in mind, based on your spoken desires, is like... literally a dream device, at least for me.
My daily driver is a HarmonyOS NEXT device these days.
I had to find a bunch of workarounds to get payments (I ended up vibe-coding my payments app in ArkTS, don't ask) and messaging apps working, and well, I use it with almost zero compromises on a daily basis. It feels like a breath of fresh air to know there are other devices and platforms out there that, even if seen as the bad guys here in the Western world, can be used as a way to escape the established monopolies.
Maybe I should go for Graphene as a safer option to free myself from GMS and Google/Apple in general, but that would require me buying a Pixel device from Google... which I don't like to be honest.
The biggest mistake is that people trusted a company that, in reality, isn't that different from Apple. Just because everyone claimed Android as the true open source alternative to iOS, when only AOSP was that.
Google (before the sell-off) promoted a morality in 'don't be evil' that was a stark contrast to other tech firms. The adverts they carried were minimal. Their "free" stuff was top of the line, better than people were getting from paid services.
Apple (under Jobs) sold themselves as counter-culture, they used popstars (unironically), and design, to sell the idea that if you were your own person, or followed fashion, then you bought Apple.
I think the goodwill from those days still provides the foundations of their cultural position now. Although they chip away at those foundations.
OpenAI looked like it could follow Google's early model, until it didn't.
The writing was on the wall for "don't be evil" when Google started the process of acquiring the much reviled DoubleClick back in 2007, nearly 20 years ago at this point. That's longer than most people reading this have been in the tech industry; a generation has never seen Google be anything other than increasingly extractive and monopolistic.
They built products people like, and Apple especially has a good reputation for building reliable, long-lasting and easy-to-use stuff for most people, leading to heavy user adoption. But heavy user adoption without proper regulation and company ethics leads to, well, monopolistic practices.
I mean, Apple kind of used that position to build a good reputation. Their whole thing is/was how secure their devices were and how they had human verification on all apps that went through the App Store with a clear intents file (a file that describes exactly WHY an app needs permission for Bluetooth/etc.), and a secure enclave that prevented even the FBI from getting in (while Apple refused to give them a backdoor). Hackers and tinkerers will find a lot of these measures to be an annoyance and authoritarian control, but a lot of people just want their phone to be the product, not the user.
These kinds of things just make me want to use Graphene even more, or literally any platform that isn't one of the monopoly ones. Somehow I think AI and vibe-coding, even if it may sound like an unpopular opinion, will allow people to build free ecosystems and actually usable devices that don't rely on the usual providers.