Hacker News | anglerfish's comments

OP here, and thank you, kind sirs and ladies, for your feedback.

I'd just like to answer the recurring objection: yes, our visual experience contains a lot of frames, and that seemingly refutes my MNIST example. However, you're forgetting about the other part of a supervised dataset, namely the labels. Do we get a label for each thing we see in our life? Obviously not. How much time do you need to familiarize yourself with a new entity, like an unknown glyph or symbol? I can't provide a concrete example, but I'd guess a single math class was enough for all of you to recognize all the digits the next day. You can test it right now by looking at some unknown alphabet and then looking at it again upside down - you'll recognize it perfectly, except for mental rotation issues (which occur even for well-known letters and symbols).
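To make the "one labeled exposure" claim concrete: one minimal way to model it (my own sketch, not anything from the article) is a classifier that stores a single labeled exemplar per glyph and labels new inputs by nearest stored exemplar. The glyph data here is made up for illustration.

```python
import numpy as np

def learn_one_shot(examples):
    """One-shot "learning": store one labeled exemplar per glyph.
    examples: dict mapping label -> one flattened image array."""
    return {label: img.astype(float) for label, img in examples.items()}

def predict(model, img):
    """Classify by nearest stored exemplar (Euclidean distance)."""
    dists = {label: np.linalg.norm(img - tmpl) for label, tmpl in model.items()}
    return min(dists, key=dists.get)

# Toy 2x2 "glyphs", one exposure each (hypothetical data).
model = learn_one_shot({
    "dot": np.array([0, 1, 0, 0]),
    "bar": np.array([1, 1, 1, 1]),
})

noisy_bar = np.array([1.0, 0.9, 1.0, 0.8])  # slightly corrupted "bar"
print(predict(model, noisy_bar))  # -> "bar"
```

A single exposure is enough here because the toy glyphs are far apart in pixel space; real handwritten digits overlap much more, which is where the debate below picks up.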


> I guess a single math class was enough for all of you to recognize all the digits the next day.

I'm curious what makes you think that. My experience with what's going on at my son's school tells me that the children spend a massive amount of time getting recognition of digits and letters right.


When I was living in China I had difficulty recognizing handwritten 9's and 1's. Their 9's are less half-circular than Westerners' are, and their 1's have a long stroke at the top that looks like a sloppily written 7 to me. I would frequently look at a handwritten number and have to analyze what it was.


> How much time do you need to familiarize yourself with a new entity, like an unknown glyph or symbol?

I think you're looking at your subjects wrong - don't pretend your computer is an adult (which has been learning most of its life); rather, consider it an infant learning letters/digits/objects for the first time. Doing so, you might come across a learning curve similar to the one you have described. With the additional caveat that we (or at least I) don't know how to bring the computer to the level of a fully grown adult.

As for the problem presented with CNNs: if the problem is not having the structure, why not feed the gray-scaled structure in as a secondary level for the CNN?

I'm not really from the field, so excuse me if this was complete BS.


What is the upside down experiment meant to prove? Seems to me that the mental rotation issues indicate that mental image processing is not very rotation tolerant, but rather needs a hardwired (and slow) counter-rotation step added to cope with rotated symbols, which you could just as well tack onto a neural network. Am I missing the point?
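The "counter-rotation step" mentioned above can be sketched in a few lines. This is my own illustration, not the commenter's code: assume some pretrained classifier (here a toy template matcher over made-up 3x3 glyphs), and bolt on a step that tries each 90-degree rotation of the input and keeps the most confident answer.

```python
import numpy as np

def classify(img, templates):
    """Toy "pretrained" classifier: nearest template by pixel distance.
    Returns (label, score); higher score means a better match."""
    best_label, best_score = None, -np.inf
    for label, tmpl in templates.items():
        score = -np.sum((img - tmpl) ** 2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score

def classify_rotation_tolerant(img, templates):
    """The tacked-on counter-rotation step: try every 90-degree
    rotation of the input and keep the most confident answer."""
    best = max(
        (classify(np.rot90(img, k), templates) for k in range(4)),
        key=lambda pair: pair[1],
    )
    return best[0]

# Tiny 3x3 "glyphs" standing in for learned digit templates.
templates = {
    "L": np.array([[1, 0, 0],
                   [1, 0, 0],
                   [1, 1, 1]], float),
    "T": np.array([[1, 1, 1],
                   [0, 1, 0],
                   [0, 1, 0]], float),
}

upside_down_L = np.rot90(templates["L"], 2)  # the glyph rotated 180 degrees
print(classify(upside_down_L, templates)[0])          # plain matcher mislabels it "T"
print(classify_rotation_tolerant(upside_down_L, templates))  # -> "L"
```

Note the cost: the counter-rotation step multiplies inference work by the number of candidate rotations tried, which matches the intuition above that the extra step is slow rather than free.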

Also, being able to consciously recognize letters is relatively easy, but the normal reading process, in which people recognize well-known letters and instantly, unconsciously convert them to sounds, does require quite a bit of repetition of those letters before it starts to kick in...

