
It's a well known technique. I first heard about it from Barbara Oakley, so there is probably some neuroscience research done about it.

Can you clarify what you mean?

If you check HF you will see it's Apache 2.0, and the datasets were also permissive.

It's one of the few models on the market where the creator indemnifies users against copyright claims.

https://research.ibm.com/blog/granite-ethical-ai


Oh sorry. Do we have the sources like Nvidia's Nemotron?

You could have found that in 5 seconds. The weights are open sourced as well.

https://github.com/ibm-granite
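
The weights themselves can be pulled straight from HF. A minimal sketch, assuming the transformers library is installed and that the checkpoint name below (one of the published Granite models) is still current:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: one of the Granite checkpoints published on HF.
    model_id = "ibm-granite/granite-3.0-8b-instruct"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tok("What license is Granite released under?", return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=50)
    print(tok.decode(out[0], skip_special_tokens=True))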


Maybe I suck but I didn’t find that in 5 seconds. Or with more time.

I meant the full training datasets and the complete recipes to make the models.


The training datasets are listed there and are all open source.

> the complete recipes to make the models.

You mean the weights, which most companies don't release. Again, you can find them from that link.


Where is the list?

No I didn't mean the weights, but the source code to make the weights.


I'd like to assume you are not trolling and just want everything handed to you.

The Granite site covers everything you keep asking for. Granite is made using lm-engine, and the details are there.

Without the weights, you are not going to reach the same level of accuracy unless you put in some serious work.


I want open source to stay a label for actual open source software. That means handing everything to me, yes.

No, it doesn't.

If you google, you can find plenty of open source datasets. Kaggle also has training datasets you can use.

I know, but that is not my point.

You never had one. You tried to be clever and failed.

I don’t know. Since you are perhaps clever, can you show me the training datasets and recipes so I can replicate this model locally? I have access to good HPCs.

I think it’s fair if you use a bit more than 5 seconds as someone stated above. I would gladly be proven stupid.


https://github.com/ibm-granite

https://huggingface.co/ibm-granite

I think if you were genuinely interested, you could have found this yourself.


This is not what I asked?

If I can't reproduce the artifact, is it really open source?

If IBM themselves can't reproduce the artifact, do they have the source?

I guess not

> specifically ask it to think hard before doing anything that gets close to the production data, etc.

Standard rule is you never let your developers at the production instance. So I can't see why an LLM would get a break.


I tested with Gemma4 and it sent it into an endless loop.


LLMs don't see words. They see tokens, which is why they previously had a hard time counting the r's.

You can certainly prove it mathematically, and if you give that proof to an LLM it will give you the correct answer. That's a prompting technique, btw, to improve the accuracy of an LLM's results.
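
To see the token issue concretely, here's a minimal sketch using the tiktoken library (assuming the cl100k_base encoding; any BPE tokenizer shows the same effect):

    import tiktoken

    # cl100k_base is the byte-pair encoding used by several OpenAI models.
    enc = tiktoken.get_encoding("cl100k_base")

    token_ids = enc.encode("strawberry")
    pieces = [enc.decode([t]) for t in token_ids]

    # The word arrives as a few multi-letter chunks, not ten letters, so
    # counting r's asks the model about characters it never sees directly.
    print(pieces)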

Although the more recent LLMs have built-in approaches to answer this question.

Gemma4 for example:

-----

The user is asking for the number of 'r's in the word "strawberry".

* S

* T

* *R* (1)

* A

* W

* B

* E

* *R* (2)

* *R* (3)

* Y

There are 3 'r's.

-----

GPT-OSS-120b example.

-----

The user asks: "How many r's in strawberry." This is a straightforward counting problem: count the letter 'r' in the word "strawberry". The word "strawberry" contains letters: s t r a w b e r r y. Count r's: there are three 'r's? Let's check: s(1) t(2) r(3) a(4) w(5) b(6) e(7) r(8) r(9) y(10). Actually the word is "strawberry". Let's write out: s, t, r, a, w, b, e, r, r, y. So there are three r's: at positions 3, 8, 9. Yes, three r's. So answer: 3.

But we may also note that "r" appears three times.

Thus answer: 3.

We can provide a short answer.

Thus final: There are three r's in "strawberry".

----


I doubt you can make a dumb model smart by feeding it proofs.



Sounds like a great way to fill up the context before you even start.


Yes, what's your point? That is literally what it does - it adds relevant knowledge to the prompt before generating a response, in order to ground it more effectively.


My point is that this doesn't scale. You want the LLM to have knowledge embedded in its weights, not prompted in.


It scales fine if done correctly.

Even with knowledge in the weights, the extra context allows it to move to the correct space.

Much the same as with humans, there are terms that are meaningless without knowing the context.
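
As a rough sketch of what adding that context looks like in practice (the retrieve function and its snippets here are hypothetical stand-ins for whatever search or vector store you use):

    def retrieve(query: str) -> list[str]:
        # Hypothetical stand-in for a real search / vector-store lookup.
        return [
            "Here, 'ticket' means an internal change request, not a support case.",
            "Change requests must reference a tracking ID.",
        ]

    def build_prompt(question: str) -> str:
        # Prepend the retrieved snippets so that ambiguous terms like
        # "ticket" land in the right region of the model's learned space.
        context = "\n".join(f"- {s}" for s in retrieve(question))
        return f"Use the following context to answer.\nContext:\n{context}\n\nQuestion: {question}"

    print(build_prompt("How do I open a ticket?"))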


Would it be possible to make GPT-3 from GPT-2 just by prompting? It doesn't work/scale.


Bit of a straw-man there.


> It lowers the cost for experimentation. A whole series of “what if this was…”

Anecdotal, but I've noticed that while this is true, it also adds the danger of not knowing when to stop.

Early on I would take forever trying to get something exactly to what's in my head, which meant I would spend more time in one sitting than if I had built it by hand.

Now I try to time box with the mindset "good enough".


> Edmund McMillen the creator resolved to never publish on Apple platforms again

It was only temporarily banned. It has been back on the App Store since 2017.


> 0 people use emoji to create a bulleted list.

I haven't seen this yet, but I guess the only reason I haven't done it is because it never crossed my mind.

What I have found to be an easy detection signal is non-breaking spaces. They tend to get littered through passages of text without reason.
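
A quick way to check for them, as a minimal sketch:

    def count_nbsp(text: str) -> int:
        # U+00A0 (no-break space) tends to get littered through LLM output
        # where an ordinary space belongs.
        return text.count("\u00a0")

    sample = "This\u00a0sentence has stray non-breaking\u00a0spaces."
    print(count_nbsp(sample))  # 2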


> If you have a Thunderbolt or USB4 eGPU and a Mac, today is the day you've been waiting for!

I got an eGPU back in 2018 and could never get it to work, to the point that it soured me on trying again.

These days for heavy duty work I just offload to the cloud. This all feels like NVidia trying to be relevant versus ARM.


> This all feels like NVidia trying to be relevant versus ARM.

Except it's done by a third group, tinygrad, so it's more non-Nvidia people wanting to use Nvidia hardware on Apple hardware than "Nvidia trying to be relevant".


Yeah, Nvidia couldn't give less of a fuck about consumers. And eGPU is inherently consumer-targeted.


FWIW Nvidia already supports UNIX OSes and AArch64 with their drivers. CUDA and cuDNN could be working overnight if Apple signed the drivers.


Thanks for the correction. I guess my PTSD from trying to get this running before is biasing my response.


When MLX support arrives you will see a huge difference. I moved to LM Studio as it already supports MLX.

