Hacker Newsnew | past | comments | ask | show | jobs | submit | NitpickLawyer's commentslogin

I struggle to see how the 3 examples go together. Your exposition implies a connection, but I struggle to see one. The best I could do is that it has to do with rights and responsibilities?

The first example is clear. And it has pretty much carried on, as the "right to property" and "the responsibility to cover damage to other's rights".

The second example, even though you wrote it as Uber vs. the cab driver, is more about Uber vs. the municipality. By the fact that almost all over the world people wanted Uber (or the other brands) over the imposed limitation of their municipalities, shows that the deal was wrong. In places where it was artificially limited, people have showed to prefer the alternatives. It has little to do with Bob the driver, and more to do with Alice the mayor who decided unilaterally that a taxi cab should require a 100k/yr medallion. That's what's changed, and society accepted it.

The third example is weirder still. Again you pose it as AI provider vs. average Joe, but here I struggle to even see what rights / who's rights are being infringed upon. I don't see any. While we generally have a right to work, there is absolutely no right to work in a certain industry, if the industry doesn't have demand. If someone else doesn't need your output, your right to work in that particular field has absolutely no basis in reality.

Unless you want to go back to the places and regimes that decided who works where, modern society has no place for such thinking. A right to work protects you from employers choosing not to hire you because of things that you are (race, age, gender, etc.) It absolutely doesn't protect you at all against "people don't need elevator operators anymore". And I say this as someone who's worked in this industry 20+ years. If tomorrow people don't need software done by hand anymore, tough luck for me. But it's absolutely not the problem of rights. I don't have a right to demand people wanting my services. That's not the social contract at all.


1st example was the progenitor of what eveolved into strict liability. (If you make money putting stuff into the stream of commerce, you're liable for unintended and evenunforseeable downstream damages. 2nd example is an illustration of that longheld legal precedent's being curiously ignored (nevermind the cost savings was a bum rush and livery costs are now higher than before the innovative advent) 3rd is a call to at least litigate who bears the downstream effects. Or perhaps we should just cancel public health measures and employ pestilence to solve the problem *organically.*

> If you make money putting stuff into the stream of commerce, you're liable for unintended and evenunforseeable downstream damages

So if you’re a business offering poor quality services, and I come along and start offering higher quality services, I owe you damages for the impact I have on your business?


Morally, maybe? It's what people tend to implicitly assume when a large chain displaces local mom & pops. You can argue it's for the greater good in the long-term, but that doesn't settle the question of the immediate injuries. Is it the fault of the stock boy who lost his job that he worked for a less efficient employer? Maybe?

The whole encyclical's argument is that morality requires an accounting and response to the pain inflicted upon each individual, and human morality is a distinct set of rules and norms than economic, physical, or even civil laws. I think it also follows that it's not just, e.g., Walmart or OpenAI who bares some responsibility for ameliorating temporary suffering. And to the extent people use the encyclical as fodder in the usual anti-corporate rhetoric, then that's unfortunate.

And this is coming from the Catholic church. It turns alot of people off who in isolated contexts often perceive hypocrisy, but in its charity it has always considered the personal responsibility of those receiving it. It understands the struggles and inherent tensions that comes from trying to square individualized justice & mercy, selflessness, and the "greater good".


No, the 2nd example has nothing to do with that. You're drawing a false equivalence.

> you're liable for unintended and evenunforseeable downstream damages.

so the people vs. otis, the people vs. IBM608, and so on? Has it ever worked?


you gravely understand #1 if you apply it in a blanket manner. You are not liable for all damages and consequences, only a vary narrow subclass.

People especially wanted uber because uber charged below market rates by subsidizing rides with vc money.

Maybe. But the fact that they're still in business shows that different people value different things. Be it rating schemes, payment alternatives, choosing their music, choosing their cars, one click hailing and so on. The people have spoken, the social contract has changed.

>> But the fact that they're still in business shows that different people value different things.

No it doesn't. It shows they could undercut the market, monopolise it, and then charge more once they'd killed the competition.


Except NYC taxis and the taxi cartel is still trash compared to Ubers, despite Uber being out of its subsidization era.

that's just goalpost shifting

The argument was "governments restricted taxi availability so Uber won" and now you've mott-and-bailied yourself down to "people want to pick music they listen to on the ride"


> over the imposed limitation of their municipalities

This was really just a few cities in the US. There's no artificial taxi scarcity in Houston or London or Tokyo.

You might reflexively say London has strict regulations, but it regulates safety not imposing an artificial cap. That's a NY/Boston/Chicago/Philly thing.

Uber won because:

1. on-demand app

2. VCs subsidized rides to destroy taxi companies by driving the customer cost to well below provider cost.


> 2. VCs subsidized rides to destroy taxi companies by driving the customer cost to well below provider cost.

Not sure about other regions but in NYC this is 100% the case. Ubers used to be nicer cleaner newer cars, better drivers.. for less than a taxi. Now they are about 4x what they cost in the 2010s, with cars about as dirty as a taxi and equally surly drivers.


> The DeepSeek provider on OpenRouter is only the 5th-cheapest for V4 Flash

You might have the default settings on your account, which limit Deepseek as a provider. If you disable that feature you see them on openrouter as well (and they serve it at the same cost as their own API).


I just checked my settings and I have everything enabled. https://openrouter.ai/deepseek/deepseek-v4-flash?sort=price (per-1M price) shows DeepSeek provider as #5. https://openrouter.ai/deepseek/deepseek-v4-flash/pricing?sor... (effective price) shows them as #3. The effective price will change your total cost since each provider has a different price for input vs output vs cache, so what's #1 and #5 for one person could be #5 and #1 for somebody else, depending on their workload.

However, I just double checked, and OpenRouter's pricing page for Flash v4 with DeepSeek provider shows a cache hit rate of $0.0028, which is the same as on DeepSeek's official API pricing page ($0.0028), so they do seem to be the same price, (assuming DeepSeek is able to pin your specific OpenRouter requests to the same DeepSeek server). OpenRouter adds 5% to that cost, but still it might be cheaper than the other providers.

Also just found out OpenRouter has a new feature "Response Caching" where they can cache identical requests and return them immediately with no billing. The entire request must be identical, though, not just a prefix, and you have to enable this feature. I don't know who would need to send multiple identical requests, but it's better than nothing?


Interesting, it seems we have some providers offering dsv4-flash cheaper than ds themselves. For the full model it's the other way around, all 3rd party providers are 2x+ more expensive.

The cheaper ones are fp4 and fp8 whereas I assume DeepSeek provider is unquantized, so that probably accounts for it. DeepSeek also doesn't necessarily have the cheapest hardware, other providers could be using it as a loss leader, etc

I belive no sane provider, antropic and openai included, serve BF16.

Side note: I suspect Antropic was experimenting with changing quant level based on server load a few months back which is what caused that major quality drop we saw then.


> But pretty much everyone thinks Artemis is Starship HLS or bust.

Right now it seems like it's Axiom or bust, with their suits. The suits have missed a lot of milestones, and there's not much point in going to the Moon without suits. Latest NASA OIG report put them somewhere in the 2030s at best...


Can always use the existing ISS suits modified for the moon.

Heh, I don't think so. The ISS suits have their own problems, are custom made for the ISS, are bulky AF, and "modified for the moon" might take longer than expected.

The Moon is frankly a totally different environment. Lunar dust is notorious.

Also the ISS suits themselves are being replaced, by Axiom, because they are failing in near-fatal ways periodically.


Been thinking about this quite a bit. I think every lunar airlock will end up with a wash station and will end up using basic water sprays to get the job done. Soil compactors can be used outside to compact the dust and prevent it from being an issue during moon walks as well. I think once a moon base is established most moon walks will be on-base where dust would have been dealt with in the first few years and then after that it's basically like walking on compacted soil so it won't be loose.

And if you're wondering why they might use water for this well... water will be abundant and basically every habitat will have a municipal water hookup. A moon base might quickly find itself with several hectares of water and no where to put it. Might even end up injecting it into the ground where the water will be able to maintain it's liquid state (moon crust quickly reaches room temperature only a few dozens of meters underground).


We can turn the regolith into Lunarcrete too. It's moon concrete. Mars with nice sidewalks.

https://techport.nasa.gov/projects/118526


Yup. That gets us to the stone age on the moon. I'm a big fan of bootstrapping the moon's industrial capacity straight to the industrial age. We need not only concrete production but also steel, water, etc. The faster we bootstrap all this stuff the faster the moon will feel like any other place you can go. I think there's gonna be hardware stores on the moon within the century.

Your definition is closer to ASI than AGI. And that's the explanation for your first sentence: it's not well defined because you ask 10 people and get 12 different definitions. And it gets even worse if you ask experts in the field :)

Then you have the process of drifting definitions (or, more colloquially moving the goalposts). Hassabis has said this himself: his definition of AGI has shifted. And we know that's true, because we have his definition from 2010 when he started DeepMind. His definition then was much much "simpler", and there are arguments to be made that we already have that. But, alas, he's changed the definition. As did most of us. Seeing the progress will do that to you.

Even going by your definition, even adjusting it for "General" instead of "Super", it's still not clear. What's better? Is a poem written by a nobel laureate better than one written by a lit student? Probably. Is one written by a nobel laureate better than another written by another nobel laureate? Maybe? Is the one scribbled on a card by your 5yo for your birthday better? It most certainly is better for you. And so on...

We're not dealing with easy to define things here. Hell, I could make arguments that every word in Artificial General Intelligence is so hard to define or ambiguous that you'd never reach a consensus between a group of people. There are good arguments to be made in ever each direction. That makes it by definition not well defined. It's all ... relative :)


> So how do you accurately price these tokens at all

Like anything else in the economy: at the point where enough customers can pay you, and not enough will go to the cheaper competition.


> at the point where enough customers can pay you

> (other than through price-discovery: which is slow, messy and fuzzy)

I notice a distinct lack of reading or comprehension (from everyone around me now, not just this comment) which worries me. I worry if LLM's are to blame. No one reads anymore...


I imagine a number of Hacker News members might be devs that haven't dealt with terms like “price discovery” before, so we should all try to show grace.

My apologies. I was taught that term in 7th grade econ. I assumed it was common vernacular.

Even among college graduates, only 17% could correctly answer a basic econ question (pre-AI) [1].

I looked up one ~1,000 page econ textbook, and it does not seem to mention price "discovery", or at least the only uses of the word "discovery" were about things like a scientific/oil discovery [2].

"Even high-achieving students demonstrate relatively lower understanding of economics compared with other subjects." [1]

[1] https://www.carolinajournal.com/opinion/why-americans-flunk-...

[2] https://assets.openstax.org/oscms-prodcms/media/documents/pr...


I really like the amount of exploration going on right now in this space. Even if this particular project (or the many terminal trackers/mergers/splitters, session managers, etc) don't end up being the thing, exploration is useful and might inform the next platforms.

The IDE has been "static" for most of the past ~20 years, with obvious improvements, but they were always incremental. The kind of exploration we see now is a bit more extreme, and I like it. It also seems like a lot of people are looking for alternatives, and I like some ideas. Even the funky ideas (I once saw a post comparing and proposing IDEs to follow RTS games UI) are interesting. Who knows what might stick.


Excellent post. Thank you.

I love looking through awesome-web-desktops. Most aren't infinite canvases but they are canvases, canvases of programs. There's fun stuff. UI paradigms being cooked up. I'll pitch in particular AetherOS, as being a neat web desktop that also is interestingly connected, networked, which is neat. https://github.com/syxanash/awesome-web-desktops https://bsky.app/profile/aetheros.computer

I do think we need to ask a little more "what next?". Taking Niri as a real desktop example, it's just so good, such simple but enjoyable new bones for a doing computing atop. So close to what was but so unique and nice. It intuitively connects me to the many many what ifs all around, makes me feel like there's such imminent possibility.

Especially today, who can make a UI that can be spoken well with, that is conversationally capable: that frontier feels barely explored. 9p is by far far far the most agentic desktop we have ever had, looked at this way. So beyond how it looks and works, how do computing surfaces express themselves?


Agreed. I'm actually excited to hope for a cambrian explosion of IDE experiments.

LLM powered visual diagramming of the code as you work? The ability to edit the diagrams and have tje LLM apply that back to the code? Visualisation of test coverage over the UI you are working on? Allowing you to attach user submitted videos of bugs directly to tests in the code?

I don't know if any of that is a good idea, but I really hope a bunch of people try.


Yeah, I agree. I think we are in an interesting phase where people are rethinking the “IDE as one fixed rectangle” model. Cate is one attempt at that from the spatial side: persistent canvases, terminals, browsers, notes, agents, docks, tabs, splits, worktrees etc. It may not be the final shape, but I think there is room for more experimentation around how long-running AI/dev workflows are organized.

I was exploring a similar approach, but not focused on AI, my idea was basically group projects by workspace, where each workspace has a path and is related to a project, you can spawn terminals, editor and web browser windows in this workspace, the web browser cookies and such should be associated to a workspace, that way it will not leak between workspaces and also this allows you to have different sessions opened in different workspaces.

Unlike Cate, the windows of the terminals, editor, browser, etc, each one was handled similarly like Niri tiling scrolling window manager, that way you can use the keyboard to move around, where you can group windows in a column or split them, have different sizes, is not quite where you have a free form, but an horizontal collection of windows that you can scroll.


I’m building something like this right now.

I’m already using it as my primary terminal emulator and have recently just been adding LSP support to the code editor.


I would love to have something like this I used itermocil when i was on macos, that was limited to iterm windows. On Linux, I have been playing/exploring with Hypr but without much success so far.

Now with AI I find myself in need for a space that can combine multiple repos into a single "project". For example for debugging an issue across the system, or asking it to verify if FE/BE communication schema has any mismatch, or describing the complete feature flow from one end to another.

Is Cate's canvas per git-repo or can I add multiple?


Maybe Repomix?

Since the 3rd party providers on openrouter have all converged on much higher prices in serving these models (both mimo and dsv4), there's obviously a question on how/why are they lowering the prices so much.

It's possible they've finally integrated cheap(er) chinese chips. It's also possible they're just subsidising inference for real-world usage data. Interesting either way.


> how/why are they lowering the prices so much

Like I responded to someone else:

- Cheap electricity - Cheap, domestically produced GPUs - Efficiency research. (a lot of it from Deepseek's research)

Also, the Chinese government wants the AI to be as accessible as EVs so everyone will use it.


Also if this is on the path of anything the Chinese do in the physical goods world, inference will be rockbottom cheap in a few years because they'll invest in the hell out of energy, GPUs, research, etc. The same thing they did with EVs.

Only artificial barriers will keep people using some of the frontier stuff in a couple of years. No costs will justify.


> there's obviously a question on how/why are they lowering the prices so much.

Same reason they release some of the models for free: They are trying to capture market share.


The difference is that releasing the model for free doesn't have ongoing cost for the company. Providing cheap tokens is very expensive - specially if you don't have access to the latest transistor node chips. So I think the parent comment is right, there's something else at play allowing DS and Xiaomi to offer these nearly free tokens.

LLM providers can't "capture" anything. People loved Claude Code because it was cheap and good. Not cheap anymore? People switching to Codex, DS4 etc.

Their only moat is maybe being SOTA but that only lasts so long before everyone else catches up.


This is why they are pushing more for non-tech folks to use their products with desktop apps. They are not going to switch on a whim.

I mean there is a minor moat. Most people don't enjoy switching providers or models. If you can get people to trust you'll stay near frontier, they'll stick around even when you aren't the best. Claude is a prime example of this

I switch models all the time.

/model in OpenCode

There is no "moat" for me. Using the standard chat applications as a normal conversational/question has a little bit of moat as its able to cross reference existing conversations, but I disable that mostly anyways to prevent as much data retention as possible.


Electricity in China are much, much cheaper than in U.S.

Also, DSv4 has access to Huawei Ascend GPUs that have native FP4 that allows all-native FP4+FP8 mixed compute that is more efficient than emulated FP4. Less so for 3rd party providers.


National security, training data

> But is the capability difference enough [..]

This is the (m/b)illion dollar question, isn't it? I think there's also a question of what do you think capability is exactly, and how the difference manifests itself.

On the one hand, when something becomes "good enough" that's a clear capability threshold. On the other hand, what's the limit of those capabilities, and equally as important, how does capability reflect on reliability?

We've seen "local models" lately improve on capabilities where they're "good enough" for some tasks. Reliability of solving those tasks is a bit harder to measure/benchmark/test. It'll get better as more people work with those models. But, something I've noticed in the past ~6months is that the frontier models are gaining a lot in both the breadth of capabilities, as well as the reliability of solving those tasks that they're capable of solving. I think this is where scaling (both compute and data) is showing, and where having more compute is simply better (more parallel exploration, more training data output, more broad data, etc).

There's also the problem of benchmarking true capabilities. The popular ones are getting old, and aren't as reliable as they used to be (not even touching on the subject of benchmaxxing, just thinking about their saturation, even with honest intentions).

So the question then becomes what will users prefer? Do you get the best of the best, or the one that's good enough? There might be a market for both, honestly. Not everyone does SotA stuff. And a lot of what people used to do in a company is probably mundane enough that a "good enough" model with "good enough" reliability can probably handle (w/ some supervision ofc).

What I'm more interested in is if things like Thaalas succeed and they get to provide local hardware that runs models "burned in silicon". That would be interesting, because speed and all the advantages of local models are a "quality" on their own. For example, right now I'd pay ~1k$ for an external hdd-sized block that can run a ~32B model that's popular right now, even knowing that it can only run that model. I have no idea if that's feasible or not, if it makes sense from a financial pov. But I'd buy one. And local inference on dedicated chips doesn't need to be "oss only". I'm sure oAI / etc would probably take the risk of licensing one of their -mini / -lite models provided that the risk of the weights leaking is small enough (and it probably is).

> This keeps a ceiling on how much or how fast the frontier labs can raise prices.

I generally agree, but from a different perspective. Up till now we've seen that the 3 labs influence each other's price points. When gpt5 came out at a radically smaller price, the others lowered them as well. Now with opus being SotA for coding, w/ 5.5 close behind, they've raised them back. Google seems to follow slowly. But there's hope that, being 3 top labs + 2 trailing (xAI & Meta), there'll be pressure once again. If any of those trailing labs manage to get to SotA again, the prices will drop once more. Some people say that open source also provides a pressure here, but I'm not yet convinced of this. There's still a question of who'll serve the models, at what scales, etc.


Your experience might be a bit dated, depending on when was the last time you tried it. MTP (which is a flavor of spec decoding) is showing really solid improvements on local models, even on consumer hardware.

In fact, as the article mentions, you get the biggest gains at low concurrency (so local should apply), with diminishing returns for higher concurrency (if you think in terms of unit of compute, it's probably better to serve more requests in parallel and get more throughput that way).

Eagle3 was great at low context tho, and this seems to improve things at high context. That's really cool, and hopefully it'll turn oout to be useful at those lengths. Eagle3 is also training dependant, so you could try training your own, if your use-cases diverge enough that 3rd party "generalist" models don't suit your needs. (in general nvda, redhat, etc. have provided general eagle3 models for popular families).


The reason speculative decoding shows diminishing returns in batched workloads is because the principle of both is the same.

Speculative decoding predicts a group of tokens and verifies this group using the main model in one pass instead of decoding each token separately. Eg. for this group, the weights are loaded from RAM per group instead of per token: roughly the same computation is performed but not the same memory movement (and other overhead like kernel launches).

Batching utilizes the same mechanism, so speculative decoding is essentially an attempt to batch a single stream using prediction. An attempt, because the verification may reject some tokens if the prediction was inaccurate.


Thanks, appreciate the info. For whatever it’s worth regarding recency, I’m testing the main llama-cpp branch that was pulled and built on 2026-05-25 running unsloth/Qwen3.6-35B-A3B-MTP-GGUF:Q4_K_M, my hardware platform is M1 Max 32GB VRAM. Is there a different fork or quant I should be using?

Wasn't there a 2000 source leak a while ago? I remember some exploits coming out after the leak.

Yes but it could not be legally used by anything.

OP said source available was acceptable. not even asking for compiler access which is also widely available.

Windows has always been more than modular enough for any repurposing and there were licenses that were not tied to specific hardware so you could use them even today.

Which is to say no one is stopping you from building a COPILOT.VBX for VisualBasic 3.0.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: