Hacker News | cptskippy's comments

Why do you assume Broadcom has a ton of IP for AI SoCs but hasn't done any of the other work around data center scale deployments?

They have. That's why OpenAI was able to get a working demo in 9 months. But going from a small-scale system to a full-fledged data center deployment is likely much harder.

I don't know how much of the technology outside the chip Broadcom owns versus how much is Google's proprietary tech that isn't shared with Broadcom.

Nvidia's Vera Rubin has 6 unique chips working together in a single rack.[0]

[0]https://developer-blogs.nvidia.com/wp-content/uploads/2026/0...


I’m just happy to see diversity here; sometimes I feel like Nvidia is going to eat the world, with buying other fabs and branching out - or up, I guess - from chips and racks to models, frameworks, and end user stuff.

I thought most of the Google TPU magic is in wiring up these chips into supercomputer-like clusters with specialized interconnects and whatnot. The chips themselves are less interesting in isolation.

I know nothing of what is happening here but Broadcom has a lot of IP in high speed/low latency data transfer from chip to datacenter scales.

Where are all of the emissions coming from? I keep hearing that automobiles account for a small % and that trucking and shipping account for the majority, but you're saying shipping is only 1%.

There are two ways of doing the accounting, and the more common one is from the producer side such as by industry, by country, by use.

Our World in Data is linked in a sibling comment for the breakdown of the transport side, as is the California ARB inventory. There are other national inventories.

One thing to be very careful about is people arguing for a national or local policy using worldwide inventory numbers rather than an inventory applicable to where the policy applies. I see this a lot with local old New Leftists trying to argue that their old Toyota Tacoma isn't a big deal, but everybody had better become vegetarian right away, because worldwide beef accounts for a much larger proportion than cars (but locally cars dwarf anything from food production).

And the production-side inventories are a very poor basis for consumption-level decisions, because people always complain that we've merely shipped all our production emissions from manufacturing to China. In reality there are great Our World in Data pages showing that yes, cars really are much bigger emitters for Americans than the emissions exported to Chinese manufacturers.

So my favorite inventories of climate emissions are consumption based, and show that lifestyle is one of the biggest drivers of climate emissions in the US:

https://coolclimate.berkeley.edu/maps

There are rich cores of cities that are very low emission, surrounded by wealthy suburbs with sky-high emissions, and then rural areas with very low emissions. EVs have the chance to change high emission wealthy suburb life into low emissions. But if we simply legalized more housing in the wealthy city cores, it would allow a lot more people to choose to have lower emission lifestyles right now without technology change, while also spurring massive economic growth.


That isn't true about private automobiles and I would be interested to know where you have been hearing it. It must come from some party with a vested interest in making car emissions seem like a non-problem. In reality, car emissions are the main thing that Americans and most other rich countries should be trying to address.

Here's a graphic of the latest data available for California, as an example: https://ww2.arb.ca.gov/sites/default/files/images/2023_scopi...


Not sure if there's more recent stats, but 2020 data: https://ourworldindata.org/co2-emissions-from-transport

Maritime shipping is only 1%. Road transport is about 20x that.

Our energy aggregator is a non-profit Community Choice Aggregator with over 250,000 members that ensures 50% of the energy they purchase comes from renewables and 75% of all energy purchased is carbon free. And for an extra $0.0075 per kWh you can opt into your consumption coming 100% from renewable sources.
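
To put that opt-in rate in perspective, here is a quick back-of-the-envelope sketch. The 900 kWh monthly usage and the function name are made-up illustration values, not figures from my bill.

```python
SURCHARGE_PER_KWH = 0.0075  # USD per kWh, the opt-in rate quoted above


def monthly_surcharge(kwh: float) -> float:
    """Extra cost in USD for opting into 100% renewable at the quoted rate."""
    return kwh * SURCHARGE_PER_KWH


usage_kwh = 900  # hypothetical household usage for one month
print(f"${monthly_surcharge(usage_kwh):.2f} extra per month")
```

At that assumed usage the surcharge comes to a few dollars a month.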

It's an interesting idea. Tools like Grammarly exist to help with business communication. I wonder if there's a space for a Social Media or online writing assistant to help people. I for one could probably benefit from a tone shift away from acerbic troll.

I've been running qwen3-5-9b-q4-k-m and qwen3-6-27b-q6-k simultaneously on an Intel Arc Pro B70 with a lot of success.

https://github.com/cptskippy/battlemage-llm-gateway

Opencode has been a huge productivity accelerator. I have two Hermes agents that I'm training to support my workflow with pretty good success. One is a personal assistant that manages my backlog, keeps me on task, follows up with me on items, and puts together research briefs. The other I use as a general-purpose coder and researcher, and it's about 50:50 on the tasks I've given it. In fairness though, the task it failed at left me scratching my head to figure out as well.


Interesting setup, thx for sharing.

How many tokens/sec do you get with 27b? Are you using MTP?


I haven't done any in-depth synthetic benchmarks but I had my Hermes agent run some and I ran a couple directly on the LLM Gateway that showed similar results.

Hermes reported 18.45 tok/s consuming the llama-swap endpoint across the wire. Locally I got 19-19.1 tok/s on the gateway. I'm running the Qwen 3.6 27B Q6 model (qwen3-6-27b-q6-k) off LM Studio and it's less than 0.3s to first token.

It's not good for conversational use cases as it can take 1-2 minutes to respond to a prompt.
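
As a rough sanity check (a back-of-the-envelope sketch, not a benchmark): if that delay is all decode time, then at about 19 tok/s a 1-2 minute response works out to roughly 1,100 to 2,300 generated tokens. The function name is made up for illustration, and prompt processing time is ignored.

```python
def tokens_generated(tok_per_s: float, seconds: float) -> float:
    """Tokens produced at a steady decode rate, ignoring prompt processing."""
    return tok_per_s * seconds


RATE = 19.0  # tok/s, the local figure quoted above

for minutes in (1, 2):
    total = tokens_generated(RATE, minutes * 60)
    print(f"{minutes} min at {RATE} tok/s = {total:.0f} tokens")
```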

I have two Hermes Profiles running. One is a personal assistant that manages my backlog, gives me morning reminders, solicits evening updates, and will run overnight research projects for me. The other profile is a coding helper for personal projects. I can ask it to make changes and it will churn for 15 minutes, submit a PR, and notify me that the PR is ready to review. It's faster than me at basic coding tasks.


What's the value running the smaller model too? Why not just the big model for everything? I note both are dense, as well.

Tokens per second. The difference between 8B and something like 16B is not as big as you might think in practical usage, and 8B is a lot faster and more interactive than 16B, but there are certain things where it is useful to farm the work out to the large model.

Exactly this.

Creating conversation titles and parsing HTML/JSON don't benefit from 27B models.

The B70 can run both models comfortably side-by-side so it makes better use of time and resources.
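
The split can be sketched as a tiny router. This is a minimal illustration, not how my gateway actually does it: the model names are the two from this thread, but the task labels and the `pick_model` function are hypothetical.

```python
# Model names are the two mentioned in this thread.
SMALL_MODEL = "qwen3-5-9b-q4-k-m"
LARGE_MODEL = "qwen3-6-27b-q6-k"

# Hypothetical task labels for cheap, mechanical jobs that
# don't benefit from the 27B model.
LIGHT_TASKS = {"title", "parse_html", "parse_json"}


def pick_model(task: str) -> str:
    """Return the model name to request for a given task label."""
    return SMALL_MODEL if task in LIGHT_TASKS else LARGE_MODEL


for task in ("title", "parse_json", "code_review"):
    print(f"{task} -> {pick_model(task)}")
```

Anything not explicitly marked as light falls through to the large model, so an unknown task gets the more capable model by default.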


Agree. For local coding help, latency often matters more than raw benchmark quality. A slightly weaker model that answers immediately changes how often you reach for it.

Does Intel make decent GPUs now? I must be out of the loop...

I'm using an Intel Arc Pro B70 which has 32 GB of VRAM. It's estimated to get ~35-45 t/s, which works out to $21-27 per t/s. An RTX 5090 is ~61 t/s at ~$33 per t/s.

So in terms of raw power Nvidia is still easily king, but in price-to-capacity Intel is best in class.
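
The cost-per-throughput figures are just card price divided by tokens per second, so you can back out the price each estimate implies. A quick sketch using only the numbers above; the function name is mine, and pairing the low $/t/s figure with the high t/s figure (and vice versa) is my assumption about how the ranges line up.

```python
def implied_price(usd_per_tps: float, tok_per_s: float) -> float:
    """Card price implied by a cost-per-throughput figure at a given throughput."""
    return usd_per_tps * tok_per_s


# (throughput in t/s, USD per t/s), taken from the estimates above
estimates = {
    "Arc Pro B70 (high t/s)": (45, 21),
    "Arc Pro B70 (low t/s)": (35, 27),
    "RTX 5090": (61, 33),
}

for card, (tps, usd_per_tps) in estimates.items():
    print(f"{card}: ~${implied_price(usd_per_tps, tps):,.0f}")
```

Both ends of the B70 range imply the same card price, which suggests the range comes from the throughput estimate rather than from the price.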

Intel's Battlemage GPUs also natively support SR-IOV and GPU partitioning which allows you to isolate workloads. This is useful in homelab environments if you have workloads that benefit from GPU acceleration. I was able to split the B70 into 4 virtual GPUs and hand them out to Frigate NVR, Plex, and other workloads.


They released a few good value GPUs for LLM inference about a year ago: more memory than AMD and NVIDIA consumer GPUs, not too expensive, but also not great tokens/watt.

I am not sure whether you can find those in stock anywhere.


I think the OP was suggesting they contribute to FOSS, rather than shaming people who have contributed greatly for not contributing more.


Eh... not exactly.

MinWin was the response to Longhorn. When most of the major goals of Longhorn failed to ship and those that did resulted in Vista, Microsoft did a reset. The MinWin project was a massive cleanup effort that promoted cleaner API boundaries and layer separation that defined a minimal bootable NT core at the bottom with reduced overall dependencies.

WinRT was introduced as an alternative API/runtime layer alongside Win32. Both WinRT and Win32 used COM concepts and ran on top of the NT executive. WinRT was a modern, async-first, object-oriented, natively sandboxed, capability-based runtime that supported built-in language projections instead of manual COM.

Microsoft tried to encourage everyone to adopt WinRT and the new sandboxed App Model on Windows 8, Windows RT, and Windows Phone. It used modern concepts and was more secure than the uncontrolled legacy surface area that Win32 exposed. They shipped those devices with Metro, a new "desktop" interface, and didn't allow Win32 apps. Unfortunately they shot themselves in the foot by shipping full Win32-based Office on Windows RT. This demonstrated that yes, Win32 could run on ARM. After that, things fell apart and Microsoft decoupled many WinRT features from the WinRT/UWP model.

WinUI is an interesting UI framework that sits on top of this stack and is decoupled from it. This allows it to be updated independent of the operating system.


MinWin was a kernel refactoring mostly, still during Windows 7 days.

Missed the parts about the multiple reboots of the WinRT API surface between 8, 8.1 and 10.

The deprecation of .NET Native and C++/CX, replaced by tooling without feature parity to this day.

The set of Win32 APIs not available in UWP, even after the 8 and 8.1 (UAP) reboots.

WinUI 2.0 features not yet available on WinUI 3.0.

The pivot from Project Reunion, that six years later hasn't yet delivered on the goals from the BUILD 2020 announcement.

Microsoft is its own worst enemy when it turns its hardest advocates into tellers of tales from past wars.


Is there a reason to advertise TikZ like this?


I think I'm agreeing with you, but it's also not something easily dismissed. The DRAM Cartel has been found to be distorting the market on numerous occasions by various regulatory bodies. There is a boom-bust cycle that occurs with DRAM and Flash memory. The Cartel claims they always lose despite the fact that demand seems to always steadily rise.

The pandemic caused one such boom-bust, which resulted in a rather large downturn in demand in 2022-2023 referred to as the pandemic hangover. During that time demand dropped following overspend during the pandemic, and members of the cartel drastically cut production at times to keep prices above cost. Even after the demand recovery began in 2023, the cartel members were slow to increase production and made little to no investment in production capacity in 2024-2025, creating a shortage.

The AI hype cycle has exacerbated the shortage by creating speculative purchases and then panic buying. Remember the shoe company that pivoted to AI?

So Cartel market manipulation is partially to blame for the over 100% increase in prices and the shortages.


He said complicated code bases. LLMs are great at producing small snippets of code to address very targeted problems.


Great on small snippets of code, passable on larger pieces of code, great at finding vulnerabilities in large pieces of code, terrible in Zork. All-in-all, a jagged frontier that defies a simple sarcastic characterization.


Very kiki, not very bouba, as Aphyr rightfully stated.

