Hacker News | ljouhet's comments

Real question: what tool do you use? (for long/complex documents with tables, code, maths)

- marker (with --force-ocr) gives me the best results

- Mistral OCR (seems really great, but I never managed to get it to work)

- Mathpix (tried a long time ago)

- docling (gives me garbage, I must be using it wrong)

- Unlimited OCR (will try it)

- ???


- Azure Document Intelligence (has an option to return markdown too including headers and footers).

- AWS Textract


Exactly. They're both very expensive and prone to surprising you. Sometimes in a good way, sometimes in a bad way. I'd rate them 85%, but you have to run a test because they both fail in different ways on the 15%.

poma-ai has really great chunking techniques that split the document based on its structure/hierarchy.

We use it on 200-page IEEE standards that are notoriously complex, filled with tables and diagrams. Highly recommend.


Most of my aliases contain `--` for the same reason, `git--progress`, `grep--rIn`, `nvidia--kill`, `ollama--restart`, `rsync--cp`, `pdf--nup`...

Easy autocomplete, I know there won't be any collision, and which command is mine.


Kinda makes no sense to me: so you don't use '--' as a prefix, you use it in the middle of an alias, so you first have to autocomplete, say, 'gi' not to 'git' but to 'git--progress'. What does that alias do? Doesn't it call git with some args? If so - why not just alias it to git?


Great hack!


Something like

    ollama run hf.co/ngxson/GLM-4.7-Flash-GGUF:Q4_K_M

It's really fast! But, for now it outputs garbage because there is no (good) template. So I'll wait for a model/template on ollama.com


It's available (with tool parsing, etc.): https://ollama.com/library/glm-4.7-flash but requires 0.14.3 which is in pre-release (and available on Ollama's GitHub repo)


I have a regular scroll wheel and it moves two items each time. Totally unusable for me.

(great idea, though)


TLDR: "I drive an ambulance and I could save more people if I could drive faster, so speed limits are bad!"


First line of https://t3x.org/klong/prime.html

"The braces around an expression denote a function. Because this function contains a single variable, x, it is a monadic function or a monad."

I never understood that about monads, even if it's literally their name.


A monadic function in APL-family languages is not related to monads from category theory, which are the ones you see in Haskell, nor to Leibniz's monads.


In this context it just means "one parameter function".

It looks like every apparently free variable in a Klong brace expression is actually bound as a function parameter.

The same holds in basic algebra: we can think of, say, x^2 + y^2 as a two-parameter function without writing out the full f(x, y) = x^2 + y^2 notation with the f(x, y) head.

In the jargon that calls one-argument functions "monadic", a two-parameter function is called "dyadic".
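For what it's worth, here is a rough Python sketch of that naming scheme. It is not Klong, and the helper names `arity` and `classify` are made up for illustration; it only shows that "monadic" and "dyadic" are labels for how many parameters a function takes.

```python
import inspect


def arity(fn):
    """Return the number of parameters that fn declares."""
    return len(inspect.signature(fn).parameters)


def classify(fn):
    """Name a function by its parameter count, APL-family style."""
    names = {1: "monadic", 2: "dyadic"}
    return names.get(arity(fn), "other")


# One parameter, like the Klong expression {x*x}
square = lambda x: x * x

# Two parameters, like reading x^2 + y^2 as f(x, y)
sum_of_squares = lambda x, y: x**2 + y**2

print(classify(square))          # monadic
print(classify(sum_of_squares))  # dyadic
```

Nothing here involves Haskell-style monads; the word is just counting arguments.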


It is a satire, right?

I don't know what to think anymore



uv is an incredible tool; ty will be too. It's insanely fast.

For now, I get some false-positive warnings:

'global' variables are flagged as undefined `lint:unresolved-reference: Name ... used when not defined` (yeah, it's bad, I know)

f(*args) flagged as missing arguments `lint:missing-argument: No arguments provided for required parameters ...`
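A minimal sketch of the two patterns, assuming they look roughly like this (the names `init_counter`, `bump`, `add` and `args` are invented for illustration, and I have not checked which ty versions flag these exact lines). The code itself is valid and runs fine, which is why the warnings look like false positives:

```python
def init_counter():
    # The module-level name 'counter' only comes into existence when this runs.
    global counter
    counter = 0


def bump():
    # Uses the global created by init_counter(); a static checker may not
    # see any module-level definition of 'counter'.
    global counter
    counter += 1
    return counter


def add(a, b):
    return a + b


args = (1, 2)

init_counter()
# Both required parameters are supplied by unpacking the tuple.
result = add(*args)
print(result)
```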


Don't forget ruff, their linter and formatter.


Who defines "value-aligned, safety-conscious project"?

"Instead of our current complex non-competing structure—which made sense when it looked like there might be one dominant AGI effort but doesn’t in a world of many great AGI companies—we are moving to a normal competing structure where ..." is all it takes


Most likely the same people who define "all natural chicken" - the company that creates the term.


I actually lol-ed at that. It's like asking the inventor of a religion who goes to heaven.

