Iran, Gaza, Cuba, Irak, Afghanistan, Yemen, Lebanon... These people do not only suffer their tyrannical governments, but they must suffer also the war actions of the US and its allies.
You know that there are regular people living in these terror states that have to suffer not only their terror states but the US? It's not that I feel pity of the terror states, but of the regular people. It's a very easy distinction that for some reason (racism?) people is troubled to make.
Hyper presidentialist state that allows one administration (and realistically one person) to start a war against another nation without having authorization from congress.
Why do we ignore all the human right abuses the US perform abroad? Iraq, Afghanistan, now Iran, Gaza and Lebanon through Israel, support to Saudi Arabia (which would not exist without the US), El Salvador... And inside it's also horrible with its treatment to immigrant.
That should be at least comparable (if not worse) than what China is doing.
El Salvador is blessed by evil criminals put away from the streets. It took thousands of those who you defend for a whole country to be free to enjoy tranquility and security. I was born there and I know better than you calling us evil
I am not telling that imprisoning the criminals is a bad things, but the conditions in which this has been done and how they're treated in prison is against human rights by any measure.
This is how china tried to justify its genocide against uighers. Was theboutrage against that just politically motivated? Or do americans only care about ethnic cleansing when theyre not the ones doing it
Competition with the Soviet Union gave all the workers in the world better conditions, also advances in science and technology... (And risk of mutual destruction ;)), even if the USSR wasn't good.
For those who rely on open source models but don't want to stop using frontier models, how do you manage it? Do you pay any of the Chinese subscription plans? Do you pay the API directly? After GPT 5.5 release, however good it is, I am a bit tired of this price hiking and reduced quota every week. I am now unemployed and cannot afford more expensive plans for the moment.
I have $20 ChatGPT subscription. Stopped Anthropic $20 subscription since the limit ran out too fast. That's my frontier model(s).
For OSS model, I have z.ai yearly subscription during the promo. But it's a lot more expensive now. The model is good imo, and just need to find the right providers. There are a lot of alternatives now. Like I saw some good reviews regarding ollama cloud.
I've been on Kimi K2.5 on openrouter for a couple of months for anything I can't run locally. Really is dirt cheap for how good it is. Haven't assessed K2.6 yet but the price is higher so it needs to be more efficient, not just more capable.
But more broadly: openrouter solves the problem of making a broad range of models available with a single payment endpoint, so you can just switch around as much as you like.
How do you find the token speed of open router with kimi?
I have tasks that used to take ~3-5min with Sonnet 4.6. With OpenRouter Kimi, the same task takes 10+ min. It's also just obviously slower in opencode sessions. The results are good, and I love the lower cost, but the speed can be frustrating.
Have you considered... not subscribing? You can ask the top models via chats for specific stuff, and then set up some free CLI like mistral.
If you're trying to make a buck while unemployed, sure get a subscription. Otherwise learn how to work again without AI, just focus on the interesting stuff.
I just want to try to make something useful out of my time, that's why I'm subscribed to Codex at the moment. 20€ is affordable, not really a problem. But yes, maybe I would do me a favor unsubscribing and going back to the old ways to learn properly.
I'm "working" on some open source stuff with minimal AI. But I will probably cave in at some point and get a subscription again, the moment I need to spin up a mountain of garbage, fast.
At home I currently use MiniMax via OpenRouter - it’s pretty good and very cheap. They have a subscription plan, but I’m not ready to commit to it yet.
Another way to keep the ability to try out new models is to buy a reseller subscription like Cursor’s.
I tried OpenRouter but I feel the money flies even with these models, it is not comparable to a subscription but yes, it's very good for trying. Maybe I should test other models alongside GPT 5.5 to see which one fits me.
I'm also unemployed. So far the models that I've used the most are Kimi and GLM. I haven't done that much agentic coding though, I've mostly used them for studying math and general conversations and I'm generally happy with their performance.
I had Claude make me a quick tool to combine my Claude Code token usage (via ccusage util) with OpenRouter pricing from the models API
I'm on Max x5 plan and any of the 'good' models like Kimi 2.6, GLM, DeepSeek would have cost 3-5x in per-token billing for what I used on my Claude plan the last three months
So unless my Claude fudged the maths to make itself look better, seems like I'm getting a good deal
I was thinking the same. How can it be than other providers can offer third-party open source models with roughly the similar quality like this, Kimi K2.6 or GLM 5.1 for 10 times less the price? How can it be that GPT 5.5 is suddenly twice the price as GPT 5.4 while being faster? I don't believe that it's a bigger, more expensive model to run, it's just they're starting to raise up the prices because they can and their product is good (which is honest as long as they're transparent with it). Honestly the movement about subscription costing the company 20 times more than we're paying is just a PR movement to justify the price hike.
Anthropic recently dropped all inclusive use from new enterprise subscriptions, your seat sub gets you a seat with no usage. All usage is then charged at API rates. It’s like a worst of both worlds!
SSO Tax is a large part of it, controls around plug-in marketplace, enforcement of config, observeability of spend. But it’s all pretty weak really for $20 a month.
And Microsoft are going the same route to moving Copilot Cowork over to a utilisation based billing model which is very unusual for their per seat products (I’m actually not sure I can ever remember that happening).
For at least a year now, it has been clear that data quality and fine-tuning are the main sources of improvement for mediym-level models. Size != quality for specialized, narrow use cases such as coding.
It’s not a surprise that models are leapfrogging each other when the engineers are able to incorporate better code examples and reasoning traces, which in turn bring higher quality outputs.
If all you're looking at is benchmarks that might be true, but those are way too easy to game. Try using this model alongside Opus for some work in Rust/C++ and it'll be night and day. You really can't compare a model that's got trillions of parameters to a 27B one.
I often do need in-depth general knowledge in my coding model so that I don't have to explain domain specific logic to it every time and so that it can have some sense of good UX.
You should try it out. I'm incredibly impressed with Qwen 3.5 27B for systems programming work. I use Opus and Sonnet at work and Qwen 3.x at home for fun and barely notice a difference given that systems programming work needs careful guidance for any model currently. I don't try to one shot landing pages or whatever.
From what I understand, ~30b is enough "intelligence" to make coding/reasoning etc. work, in general. Above ~30b, it's less about intelligence, and more about memorization. Larger models fail less and one-shot more often because they can memorize more APIs (documentation, examples, etc). Also from my experience, if a task is ambiguous, Sonnet has a better "intuition" of what my intent is. Probably also because of memorization, it has "access" to more repositories in its compressed knowledge to infer my intent more accurately.
SWE-REbench should not be gameable. They collect new issues from live repos, and if you check 1-2 months after a model was released, you can get an idea. But even that would be "benchmaxxxable", which is an overloaded term that can mean many things, but the most vanilla interpretation is that with RL you can get a model to follow a certain task pretty well, but it'll get "stuck" on that task type, or "stubborn" when asked similar but sufficiently different tasks. So for swe-rebench that would be "it fixes bugs in these types of repos, under this harness, but ask it to do soemthing else in a repo and you might not get the same results". In a nutshell.
well, your own, unleaked ones, representing your real workloads.
if you can't afford to do that, look at a lot of them, eg. on artificialanalysis.com they merge multiple benchmarks across weighted categories and build an Intelligence Score, Coding Score and Agentic score.
A small model can be made to be "comparable to Opus" in some narrow domains, and that's what they've done here.
But when actually employed to write code they will fall over when they leave that specific domain.
Basically they might have skill but lack wisdom. Certainly at this size they will lack anywhere close to the same contextual knowledge.
Still these things could be useful in the context of more specialized tooling, or in a harness that heavily prompts in the right direction, or as a subagent for a "wiser" larger model that directs all the planning and reviews results.
My experience with qwen-3.6:35B-A3B reinforces this, gonna give this a spin when unsloth has quants available
Gemini flash was just as good as pro for most tasks with good prompts, tools, and context. Gemma 4 was nearly as good as flash and Qwen 3.6 appears to be even better.
Appreciate what y'all do! We were slacking about how many HGX-B300 it would take to run Kimi and it looks like we could actually fit 2-3 Kimis on a single HGX.
Opus 4.5 mind you, but I’m not too surprised given how good 3.5 was and how good the qwopus fine tune was. The model was shown to benefit heavily from further RL.
reply