Hacker News | kingstnap's comments

Given that MiMo is as cheap as DeepSeek (previous discussion: https://news.ycombinator.com/item?id=48282814), multiplying that by 3x for ultra speed is still shockingly cheap.

MiMo and DeepSeek are not cheap. Anthropic and OpenAI are expensive for what they provide.

You don't consider input $0.435, output $0.87, and cache read $0.003625 per million tokens cheap for near-frontier intelligence?
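To put those numbers in perspective, here is a rough sketch of the arithmetic. The three rates are the ones quoted above; the function name and the `speed_multiplier` parameter are made up for illustration, and applying the 3x "ultra speed" factor uniformly to all three rates is an assumption, not something the pricing page is known to say.

```python
# Prices quoted in the comment above, in USD per million tokens.
PRICE_PER_MILLION = {
    "input": 0.435,
    "output": 0.87,
    "cache_read": 0.003625,
}


def request_cost(input_tokens, output_tokens, cached_tokens=0, speed_multiplier=1.0):
    """Return the USD cost of one request.

    input_tokens     -- uncached prompt tokens, billed at the input rate
    output_tokens    -- generated tokens, billed at the output rate
    cached_tokens    -- prompt tokens served from cache, billed at the cache-read rate
    speed_multiplier -- hypothetical flat factor (e.g. 3.0 for a faster tier),
                        assumed here to scale every rate equally
    """
    cost = (
        input_tokens * PRICE_PER_MILLION["input"]
        + output_tokens * PRICE_PER_MILLION["output"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
    ) / 1_000_000
    return cost * speed_multiplier


# Illustrative agent-style turn: mostly cached context, a little new input and output.
standard = request_cost(input_tokens=20_000, output_tokens=5_000, cached_tokens=200_000)
fast = request_cost(20_000, 5_000, 200_000, speed_multiplier=3.0)
print(f"standard: ${standard:.6f}  fast tier (assumed 3x): ${fast:.6f}")
```

The token counts in the usage lines are invented; plug in your own traffic to see where the bill actually lands.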

No. They still have enormous profit margins on inference with these prices.

Their margins don't impact my own assessment of end user pricing as cheap.

Any source to back up this claim, pretty please?

Source? There are countless providers serving open-weight models for fun and profit.

I highly doubt there is any margin on that inference pricing.

> I highly doubt there is any margin on that inference pricing.

And yet, OpenCode Go offers DeepSeek flash 6 times cheaper than DeepSeek itself. And they claim they are still profitable.


It’s near the frontier meaning it’s the best intelligence for the price.

It’s not even close to frontier meaning it’s the best intelligence.


I hardly notice DeepSeek being inferior to Claude Opus unless I have it working on tricky and under-defined problems. That is, I trust Opus to reason much better when it has the choice. Otherwise, IME DeepSeek is far cheaper and more effective for anything where the solution is even somewhat obvious.

Out of curiosity, what is your stack? And is this in a legacy project or new one?

I have tried using DeepSeek Flash and Pro, but they make amateur mistakes. Sonnet level at best.

However, V4 Flash is absolutely amazing as a generalist model, and it's what we're using on a product built on top of LLMs. I wish I could code with it, but it's not going to happen anytime soon.


I've used it across many new projects as well as many legacy ones. It does make amateur mistakes so you can't leave it unsupervised for hours like I do with Claude, but it's so much cheaper that weeks of heavy usage haven't even cost me $10 yet. Only other downside IMO is that Pro is pretty slow, even compared to frontier models; only around 120t/s IIRC.

Yes I also noticed it is pretty slow, which sort of defeated the purpose of using it for me.

Usually I'm working on a large task, typically with Opus, while also having a bunch of smaller tasks in their own independent worktrees. Those still need supervision, but less. My goal was to get DeepSeek to drive the cost of those down, but it was too slow and unreliable...


Yes, I could tolerate the unreliability better if it were faster, but it's really not. So it's too slow for me to actively supervise it, but too unreliable for me to trust it unsupervised. The shitty middle. I often have multiple of them open at a time and check my terminal every few minutes to lead them along. Mostly works.

Energy is likely more abundant in China. I am not sure about compute, but that must be part of the reason for such drastic price differences.

They're leaving us in the dust on solar, while our current administration is still trying to put people in the ground to dig up more coal and die of black lung. https://en.wikipedia.org/wiki/Solar_power_in_China

They're building more coal than anyone.

Also more nuclear than anyone, which one must assume you hate, because preferring solar requires that you don't actually understand things.


Energy from coal in China decreased last year. The change is happening very quickly.

They also don't have to inflate profits for a coming IPO.

The Chinese "Neijuan" is real & well reported: https://www.reuters.com/business/autos-transportation/what-i...

It is another thing that the big labs accuse open-weight models of benefiting from distillation and other techniques, essentially avoiding higher training costs (which typically bleed into the bills end users pay for inference).

Ex A: https://www.anthropic.com/research/2028-ai-leadership

Ex B: https://www.reuters.com/world/china/openai-accuses-deepseek-...


We buy cheap Chinese goods all the time. Absolutely nothing wrong with that.

In this case, at least it’s threatening multimillion dollar salary jobs instead of entire towns of working class people in America or Mexico.

And the Chinese labs actually release their weights. You could call it… open AI.


Lololol.

Big labs ripped videos off YouTube without caring about the ToS, and grabbed as much published literature as they could get their hands on, regardless of legality (Books3, The Pile). The goal of "democratizing human knowledge" by way of thinking machines is far too noble to worry about frivolities like copyright and authorial consent, they said. Until it was their output being exploited, and their earning potential threatened.

We just had years of US model providers arguing it was fine to rip off the world's cultural output for their own profit. Why should their work be treated any differently?

True, but why would end users care about that? If anything, training on synthetic AI output is more ethical than on scraped human works (of course, not to say the Chinese labs aren't doing the latter)

Chinese are also simply better at making a lot of things cheaper, e.g. solar panels or electric vehicles.

The fact that this article seems to honestly recommend people run 5 different type checkers on library test suites really reflects the tacked-on feeling of Python typing.

I am not sure it is recommending more than it is commenting on the current state of developing public-facing APIs in Python.

The downstream users that import the package either have to ignore checking its exported types altogether, manually stub it, or have a subpar development experience to varying degrees.

This is something I saw the other day with some package that provided comprehensive stubs for an untyped library. The .pyi file was littered with comments about quirks from the numerous type checkers (five now).


No, it reflects the nature of misunderstanding Python by people who think their system is better, have no idea how Python in production actually works, and just publish things like the article to make themselves feel better.

Typing is not a huge issue, period. In Python, if you pass a wrong type to something, the program just throws an exception. Exceptions are not the end of the world like people make them seem. Functionally, finding errors by compiling code with type checking is no different than running that code against a set of tests, which every production codebase has (or should have).

The only way typing ever saves you from it is by being absolutely strict: every type defined has a finite range of values, and every operation has a bounded domain and range. I.e. if you have a string field, it's not enough that it's a string; you must also define the total number of characters that string can have, and the values for each character, along with more complex rules on sequences of characters.

If you have this system (something like Coq comes close), then if your program compiles, it's by definition correct. But even the strongest proponents of typing don't really want to do this, because they realize how long it would take to write code.

The simple truth is that Python is easy and flexible enough to work in that you don't even need type checking. An LLM can effectively function as a type checker for you if you care enough. For any errors that you encounter due to lack of typing, it's ultimately way faster to fix them in Python than it is to spend time writing in a strongly typed language.


It's ridiculous. They should have made it an explicit part of the language. The interpreter knows about types already, it's crazy that they couldn't just let the user make the types explicit rather than implicit, and have the interpreter enforce that.

The interpreter knows types at runtime, not at parse/compile time. The interpreter already does a lot of dynamic type checking. It has a much stricter type system than e.g. JavaScript; JavaScript will pretty much always convert operands to produce some result (even if it's just NaN or the string "[object Object]"), while Python will often just give you a type error.

The interpreter doesn't know about static types.
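A quick sketch of the distinction, using only plain CPython behaviour (the function name `add` is made up for the example): annotations are stored as metadata and never checked at call time, while the operation itself is still type-checked dynamically.

```python
def add(a: int, b: int) -> int:
    # The annotations live in add.__annotations__; nothing enforces them at call time.
    return a + b


# Violates the annotations but runs fine, because str + str is a defined operation.
concatenated = add("1", "2")

# Fails only when the operation is unsupported for the actual runtime types.
try:
    add(1, "2")
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(repr(concatenated))
print(error_message)
print(add.__annotations__)
```

So "enforcing existing types" would mean defining what the interpreter should do with those annotations in the first place, which is exactly the part that is left to external checkers today.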

I agree that they should've made typing more a proper part of the language and not left it in this weird half-defined state of "standard syntax and some standard typing imports but undefined semantics". But it's not just a matter of enforcing existing types.


Next to no one would be using less than the subscription price given how expensive Opus API is.

For those who don't know about this: Phi was announced with a paper called "Textbooks Are All You Need". What they did was use GPT-3.5 to create synthetic textbook chapters and exercises.

They also did some more interesting work like showing very small models can be coherent as long as you have very simple children's book style training data (TinyStories is pretty famous).

Lots of these ideas are still used. "Learning Facts at Scale with Active Reading" is an ICLR 2026 paper from Meta AI that does a lot of similar work.


This is something I thought about some time back, when I was considering the feasibility of somehow using the upper and lower registers inside a multiplier as general-purpose storage, for fun and to see if you could make them more compact.

Anyway here is a fun pattern you get when you multiply 8 bit unsigned integers. Not all pairs of (upper bits, lower bits) are reachable, and it has a lot of distinct patterns.

https://i.imgur.com/Gb3HDR0.png

(Should I host the image on GitHub Gists so it doesn't vanish?)
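For anyone who wants to reproduce the reachability pattern without the image, here is a minimal sketch. It assumes "upper" and "lower" mean the high and low 8 bits of the 16-bit product; the function name is made up.

```python
def reachable_pairs(bits=8):
    """Return the set of (upper, lower) halves produced by multiplying
    two unsigned integers that are each `bits` wide."""
    mask = (1 << bits) - 1
    pairs = set()
    for a in range(1 << bits):
        # Multiplication is commutative, so b only needs to start at a.
        for b in range(a, 1 << bits):
            product = a * b
            pairs.add((product >> bits, product & mask))
    return pairs


pairs = reachable_pairs()
total = 1 << 16
print(f"{len(pairs)} of {total} (upper, lower) pairs are reachable")
print("(0xFF, 0xFF) reachable:", (0xFF, 0xFF) in pairs)
```

One easy sanity check: the largest product is 255 * 255 = 65025 = 0xFE01, so no pair with an upper byte of 0xFF can ever appear.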


OpenAI's Dota 2 adventures were super hype back in the day.

OpenAI Five doesn’t really know how to play games in general — it only knows how to play Dota.

The only game that matters.

This seems like an interesting case to test AI agents on.

Like we had weird examples like C compilers and Bun. This is a much more interesting example because it's highly nontrivial.

AV1 exists, Dav1d exists. Let's see AI take the AV2 spec and Dav1d code and try to make a working high performance AV2 decoder.


> Let's see AI take the AV2 spec and Dav1d code and try to make a working high performance AV2 decoder.

That sounds like one of these high-risk, high-reward things that are great for people / projects / companies who have nothing to lose, but is not a great baseline strategy for an established market player. AV2 is here with support from aomedia and its members. AV2 will be used, and we need a production-grade decoder regardless of where AI is at, so it makes much more conservative business sense to use established approaches (language: c/asm, devteam: ffmpeg/dav1d) as a starting point. While that's happening, we can dabble in AI and other risky stuff and see if it helps. If so, great, and if not, nothing lost.


What's the high risk? That you end up with useless code?

I didn't mean that the Dav1d people should yolo vibe code Dav2d. My point was that this is a very interesting possible experiment since there is no existing Dav2d contamination in the training data.


The reduction is in cached inputs. I've commented about this before, but many labs, except DeepSeek and Xiaomi now, absolutely scam you for cached reads.

You are basically paying out the nose for a few seconds of VRAM residence if you are giving significant money for cache reads.

The very nature of autoregressive language modeling is that every single output token produced "reads" the cache.

So in principle the price floor for a cache hit is the flat cost of 1 output token.

Now in reality it has to be more than that because you are occupying VRAM with the cache that forces out other users. But it can still be really cheap.
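A toy count of what "every output token reads the cache" means. This is a sketch under simplifying assumptions: a plain KV cache where each decode step attends over every position stored so far, with no batching, paging, sliding windows, or sparse attention. The function name is made up.

```python
def cache_positions_read(prompt_tokens, output_tokens):
    """Count cached positions attended over while decoding.

    Assumes decode step i (0-based) attends over the whole prompt
    plus the i tokens generated before it.
    """
    return sum(prompt_tokens + i for i in range(output_tokens))


prompt = 100_000
# A single output token already sweeps the entire cached prompt once.
print(cache_positions_read(prompt, 1))
# A longer reply sweeps it once per generated token.
print(cache_positions_read(prompt, 500))
```

Under those assumptions, generating one token touches the cached prefix exactly as much as a cache hit needs to, which is where the "flat cost of 1 output token" floor comes from.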


No one is producing one output token though.

And using up GPUs for that cache is a pretty big opportunity cost. I highly doubt it's done in VRAM. That would be insane for the one-hour caches.

So it's memory + the time it takes to unload/load into VRAM + the extra cost per output token.

Is it a scam? Idk


I'm surprised they don't dwarf them by orders of magnitude.

You can make arbitrarily random ETFs with all kinds of weird positions. Crypto, bonds, derivatives of all kinds, international stuff.

Plus you can give them funny names like $NANC.


You named the two biggest platforms [0]. YouTube and WhatsApp are social media.

This is kinda like asking if Saudi Arabia and Russia are petrostates lol.

[0] https://en.wikipedia.org/wiki/List_of_most_popular_social_pl...


They were made into social media, but when they were acquired they were just video hosting and SMS replacements with little (no?) social engagement aspect.
