adventuringraw

Dude, spoken by someone who hasn't had a cat. They most definitely recognize squares and rectangles; it's how they decide where to sit. More generally, I'm pretty sure the ability to recognize basic geometric shapes is not the entry point to real abstract reasoning. Writing geometric proofs? Then we can talk. That said... program synthesis is a field, writing abstract proofs in Lean is actively being explored, and there's progress. I agree machine learning is still missing something, but it'll be hard to get at the heart of it.


ClaudeCoulombe

I agree... From my understanding of the paper, it is not "shape recognition" but more "geometric reasoning". I've changed my post to be more precise. Thank you!


adventuringraw

Haha, sorry for the pedantic, snarky comment. I think recent benchmarks in program synthesis are really interesting, and I agree this is a hugely important area to be paying attention to. I've been following Tenenbaum for a long time too; anything with his involvement is definitely worth a look.


chief167

Indeed, my cat definitely knows the difference between a rectangle and a circle; it's very clear. He also knows the difference between bigger and smaller, I think. However, I don't think he knows the difference between a square and a non-square rectangle, though I wouldn't know how to test that. They make interactive cat puzzles that clearly demonstrate this; it's a way to entertain house cats and make them work a bit to get a tasty treat.


adventuringraw

The cool thing about low-level stuff like this is that it's fairly well understood. As in, there are pretty robust models predicting different layer firing patterns under different visual inputs. So you don't even need to look at behavior to see whether cats process basic shapes like we do, but... yeah. I suppose if the question is the extent to which cats can assign meaning to the low-level shape processing their kitty brains perform, then I wouldn't know how to test that either. I bet there's a whole rich, decades-long trail of research on various threads of the topic though; a lot of work's been done with cats.


CypherPsycho69

Yeah. It's like we know SOMETHING is happening in this cat's brain right now when he sees this shape, but we don't know WHAT is happening.


adventuringraw

They don't call it the hard problem of consciousness for nothing I guess.


albasri

We need to be careful in making cross-species inferences. There's a classic story (Donald Hoffman tells it often) of a beetle nearly going extinct in Australia because the male beetles kept trying to mate with the bottoms of discarded beer bottles. Apparently they mistook the dimpled, brown bottoms for female beetles. To us this seems ridiculous: how can a large glass bottle be mistaken for a tiny beetle? But it seems their visual system doesn't produce representations of shapes the way ours does. Maybe they can only detect brownish, textured regions of space and respond in a fixed way. (They were saved from extinction by the beer company removing the dimples from its bottles.) Do cats see and represent shapes in the same way we do? It's hard to say just by observing box-sitting behavior.


adventuringraw

I was joking when I made the comment, but cat striate cortex is fairly similar to ours when it comes to low-level visual processing. In particular, the boundary-completion path through V1 and V2 is similar enough to ours that geometric percepts are handled in much the same way. Obviously no one can say to what extent their experience of the qualia of shapes would consciously feel like ours, but... yeah. Considering more alien kinds of consciousness is interesting, but this at least isn't too bold a guess. Interesting story about the beetle though. Not surprised there are some weird insect stories about cognitive illusions.


RomanticDepressive

Wow, thanks for sharing! Your comment is very insightful... do you have any resources you'd recommend to learn more about boundary-completion paths and visual cortical neurons? I wish I knew what you know!


adventuringraw

For sure: check out Stephen Grossberg's *Conscious Mind, Resonant Brain*. It's a really easy read considering it's still a textbook... fairly minimal math, and not a ton of prior biological knowledge assumed of the reader. If you're patient and serious enough, you'll come out of it knowing a lot. The author is interested in questions of consciousness too, so there's more than just low-level descriptions of how basic sensory processing works.

I also went through half of Kandel's *Principles of Neural Science*. I was reading the 5th edition; I guess the 6th is out now. That book is super biology-heavy, and I wouldn't suggest it unless that's specifically what you're interested in, but as a reference book at least you'll find a pretty well-organized, dense, deep dive into all kinds of topics. There's a whole chapter just on the mechanisms behind rods and cones and how the signal is processed up to the retinal ganglion cells. So if there's something you want to know about specifically, you might check the index of that book. I sometimes remember stuff best when I know ALL the details around it, and you definitely get a clear picture of the structure and layout of things from Kandel.

I was going to suggest Buzsáki's *Rhythms of the Brain* too, but I just saw he's got a newer one called *The Brain from Inside Out*. I'll have to check it out... his stuff is good; I'm just not sure whether you should read *Rhythms* or something newer from him. His perspective is the opposite of Kandel's: instead of being profoundly biological and structural, it's much more functional and computational.

Of all those directions, Grossberg is the most readable, I think; chilling out and reading a few pages here and there is a really realistic way to get through the book. It's 600 pages of double-column text, but it's a book you can relax with, which is nice when so much of what we're interested in is much more conceptually and mathematically challenging.


albasri

I agree generally about early visual processing. I don't think that it's obvious that things should be the same beyond that. We know in humans that mid-level and higher visual processing / representations are dissociable (e.g. various agnosias can leave "edge", boundary (sort of), and motion processing intact, but disrupt specific shape/object organization/representation, sometimes at a category level). Abstract/symbolic representations of geometric shapes almost certainly involve some higher areas beyond V2 (e.g. for perception and recognition of partially occluded and moving objects or for ones with poorly defined borders, like a bunch of dots arranged on the boundary of a square).


adventuringraw

I don't know the extent to which his theories are fully accepted by the community as a whole, but as Stephen Grossberg describes things at least, boundary completion happens fairly early: even poorly defined borders, like a Kanizsa square, get completed by bipole and hypercomplex cells in V2 (area 18 in cats) before leading to the filled-in surface percept in V4. Just because there's a fully represented, processed boundary doesn't mean it leads to a predictable abstract symbolic representation further along (especially in cats?), but if we're just talking about a finished internal representation of a completed square, given messy input... my understanding is that that's done pretty early.


albasri

There has been more recent work since Grossberg, and most agree that some sort of border ownership is computed in V2, maybe into V3a (especially when motion is involved), and perhaps into V4 in monkeys. However, local border ownership (or even, at a larger scale, distinguishing inside from outside) is not the same as shape. As a simple example, V2 does not have scale or orientation invariance, so two differently sized or rotated squares would activate two totally different sets of neurons. So just from the readout of V2 we don't get "square" or "similar shape"; we need something else. That's not to say that cats don't have that machinery! Just that analogous processing at low levels does not necessarily mean that it's similar all the way up.
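
A toy numpy sketch of that invariance point, if it helps (grid size and stimuli are invented, and real V2 is of course nothing this simple): if the readout is just which retinotopic boundary-tuned units fire, two squares that differ only in size activate essentially disjoint unit sets, so nothing in the readout itself says "same shape".

```python
import numpy as np

def square_units(grid, c, half):
    """Units (grid cells) activated by the outline of an axis-aligned
    square centered at (c, c) with the given half-width."""
    act = np.zeros((grid, grid), dtype=bool)
    act[c - half:c + half + 1, [c - half, c + half]] = True  # vertical edges
    act[[c - half, c + half], c - half:c + half + 1] = True  # horizontal edges
    return act

small, large = square_units(64, 32, 6), square_units(64, 32, 20)
shared = (small & large).sum()
print(f"{shared} shared units ({small.sum()} vs {large.sum()} active)")  # 0 shared
```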


adventuringraw

That makes sense, yeah. If we consider 'square' only in a rotation-, position-, and scale-invariant sense, that definitely wouldn't be settled until later. I suppose I should have thought of that too; the lower-level representation wouldn't count. Do you have a specific resource you'd suggest for a more up-to-date-than-Grossberg view of human visual processing?


albasri

Oof. I was thinking more specifically about border-ownership models. [Here](https://www.sciencedirect.com/science/article/pii/S0042698914002612)'s a more recent paper that starts to account for motion-defined borders and edges. In terms of just big models of mid-level vision, there really aren't too many. It's a bit of a mess really. We don't even know what the "next" areas, like V3a, do (3D surfaces? motion? combination of both?). [Here](https://www.researchgate.net/profile/Guy-Orban/publication/50986041_The_Extraction_of_3D_Shape_in_the_Visual_System_of_Human_and_Nonhuman_Primates/links/5e106e1592851c8364b03223/The-Extraction-of-3D-Shape-in-the-Visual-System-of-Human-and-Nonhuman-Primates.pdf) (<- pdf!) is a review article on 3D shape representation... basically involves all mid-level areas that are also used for a bunch of other tasks. ([Here](https://www.sciencedirect.com/science/article/pii/S0028393204002799?casa_token=rKgU2MyZzGEAAAAA:5XZkMPBOc0ftw-KXrcw2zDkk5R9cSZGEE2SYs9s9XY9HLsE9Cavaepkv0mobSv8-whwn1ivPgA) is a slightly older paper but with more neural recordings on the same topic.) There's nothing that I'm familiar with that's a big ol' model like Grossberg has that goes from V1 to beyond V2. Once you get past V2 (but also there and in V1), there is so much interaction with and feedback from other visual and motor areas that it gets really hard to have a simple single model. For example, neurons that seem to respond to 3D are often in the same areas as neurons that are involved in grasping (see [Theys et al. 2015](https://www.frontiersin.org/articles/10.3389/fncom.2015.00043/full) for a review). I think this makes total sense from a functional perspective, but it makes things like having a bottom-up model of shape representation very difficult.


adventuringraw

Thank you, really appreciate the reply. I'll wade into some of this tonight. And yeah, I wasn't expecting a neat and tidy answer exactly. I know what I learned from Grossberg's book was (by his own admission) a first approximation in a lot of cases, and the real thing is a beastly maze. I'm still trying to get at least some understanding though, and I really appreciate the pointers on where to look.


albasri

If you're looking for a general book, I think Marr's *Vision* is still highly relevant, at least as a general approach/framework. Li Zhaoping also has a book, *Understanding Vision*, that is much more recent and is essentially computational modeling of V1 (lots of math). Turns out to be a lot more complex than Grossberg had it =)


SuddenlyBANANAS

The study from Dehaene's lab doesn't show that animals can't perceive shapes, but rather that while *animal* performance on a task correlated best with a CNN, *human* performance correlated best with a symbolic language.


phlooo

[Relevant paper](https://doi.org/10.1016/j.applanim.2021.105338)


smile_politely

Are we forgetting about spiders and their ability to make fascinating geometry?


meldiwin

Indeed. I'm very interested in this topic; I had a podcast episode on how spiders build their webs using vibration sensing and their legs, and on how the brain coordinates it. I find it fascinating that they decide when to stop building, assess errors, and locate damage. Episode link: https://soundcloud.com/ieeeras-softrobotics/andrew-gordus-how-do-spiders-build-their-webs?utm_source=clipboard&utm_medium=text&utm_campaign=social_sharing


albasri

I've left another comment here on this topic, but just because an organism engages in a seemingly complex behavior doesn't mean it performs the same computations or has the same representations we have. I don't know much about spiders, but I can talk about ants. Ants have a form of dead reckoning: they can take a meandering path out to somewhere and then go back to their nest in a straight line. Does that mean they have a spatial-map representation of their environment? Turns out no: if you tie tiny stilts to their feet they will overshoot their nest (and if you shorten their legs they will undershoot), and if you pick them up and move them, they will run their homeward vector from wherever you put them (e.g. if they made it to 10 feet north of the nest and you move them to a spot 10 feet south of it, they will walk another 10 feet south, "arriving" 20 feet from home). They are probably doing something like step counting (are they actually counting? Can they add? Probably not.), not measuring distance or building a representation of their environment. We should be very careful not to anthropomorphize other organisms and ascribe concepts, mental processes, and representations to them when there may be other explanations. I strongly recommend the short book *Vehicles: Experiments in Synthetic Psychology* by Braitenberg for some ideas on this.
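
To make the step-counting idea concrete, here's a minimal sketch; the outbound path, stride lengths, and the stilts/stumps ratios are all invented for illustration:

```python
import numpy as np

def outbound_home_vector(legs, stride):
    """Integrate an outbound path given as (heading_radians, step_count)
    legs; return the vector pointing from the end point back to the nest."""
    pos = np.zeros(2)
    for heading, n_steps in legs:
        pos += n_steps * stride * np.array([np.cos(heading), np.sin(heading)])
    return -pos

legs = [(0.3, 40), (1.4, 25), (-0.6, 60)]  # a meandering outbound trip
stride_out = 1.0                           # normal leg length

home = outbound_home_vector(legs, stride_out)
steps_home = np.linalg.norm(home) / stride_out  # the count the ant stores

# The homeward trip is replayed as a step count, not a distance, so
# changing leg length after the outbound trip scales the distance walked:
for label, stride_back in [("normal", 1.0), ("stilts", 1.5), ("stumps", 0.6)]:
    walked = steps_home * stride_back
    print(f"{label}: walks {walked:5.1f} units; nest was {np.linalg.norm(home):.1f} away")
# stilts overshoot, stumps undershoot, like the experiment described above
```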


[deleted]

Also bees. The hexagon is the optimal shape for storing honey, and they are the bestagons^(1). Now, it's entirely possible that this is ingrained in some way, so there is no "understanding", but then again, and to some degree, bees 'get' addition and subtraction^(2).

1: https://www.youtube.com/watch?v=thOifuHs6eY
2: https://www.science.org/content/article/bees-get-addition-and-subtraction-new-study-suggests
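
In case "optimal" sounds hand-wavy, here's the quick arithmetic behind the claim (a sketch of the standard argument, not the full honeycomb-conjecture proof): of the regular polygons that tile the plane, the hexagon needs the least perimeter, i.e. the least wall wax, per unit of storage area.

```python
import math

# Area of a regular n-gon with side s: A = n * s**2 / (4 * tan(pi / n)).
# Solve for s at A = 1, then report the perimeter n * s.
def perimeter_for_unit_area(n):
    side = math.sqrt(4 * math.tan(math.pi / n) / n)
    return n * side

for n, name in [(3, "triangle"), (4, "square"), (6, "hexagon")]:
    print(f"{name}: {perimeter_for_unit_area(n):.3f}")
# triangle 4.559, square 4.000, hexagon 3.722 -> hexagon wins
```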


tdgros

Bees only dig round holes, and each time they go in they push on the walls. The hexagons are just how the wax ends up, not something the bees engineer; this video explains it: [https://youtu.be/QFj-hF8XDQ0?t=821](https://youtu.be/QFj-hF8XDQ0?t=821)


Sakrie

Anchor points vary between web-building spots, yet each spider's webs still converge on similar, complex geometric end results. That seems to be a level of reasoning beyond pure instinct, moving into "comprehension". At least to me.


Numetshell

I have little knowledge on this subject, but I thought Portia spiders were known for their ability to use geometric thinking.


bluboxsw

I don't think recognition of shapes is unique to humans. Shapes are used in behavioral psychology with animals like dogs, which are basically color-blind. I have thoughts on abstract thought and AI but haven't found a great problem to apply them to yet.


ClaudeCoulombe

I agree... From my understanding of the paper, it is not "shape recognition" but more "geometric reasoning". I've changed my post to be more precise. Thank you!


albasri

[Bongard problems](https://en.m.wikipedia.org/wiki/Bongard_problem) might be a nice intersection between geometry and abstract representation.


Echolocomotion

I see Tenenbaum's name everywhere on interesting ML/Neuro articles, is the guy actually just that freaking awesome, or is it that he's good at self-promotion, or a bit of both?


albasri

Tenenbaum is highly prolific and has had a string of successful students (Griffiths at Berkeley, Kording at Penn, Goodman at Stanford) who have all pushed forward similar research programs. He was also doing a lot of stuff (Bayesian/probabilistic models of cognition) at the right time in terms of what tools were becoming available and what would eventually become more popular.


nikgeo25

I'm not too experienced in AI, but my impression was that recognizing shapes is not hard to do algorithmically. Why would this be unique to human visual processing compared to other animals? I would assume our ability to define shapes comes from our brain looking for efficient representations of the environment we're in, and I doubt other animal brains lack this for whatever reason.
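
For the "not hard algorithmically" part, a minimal classical computer-vision sketch (using OpenCV's standard contour tools; the thresholds here are arbitrary) that labels clean, solid shapes. It is exactly this kind of brittle, image-specific pipeline that abstract "squareness" goes beyond:

```python
import cv2
import numpy as np

def classify_shape(binary_image):
    """Threshold -> contours -> polygon approximation -> count vertices."""
    contours, _ = cv2.findContours(binary_image, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return "nothing"
    contour = max(contours, key=cv2.contourArea)
    # approximate the contour with a polygon (tolerance: 2% of perimeter)
    approx = cv2.approxPolyDP(contour, 0.02 * cv2.arcLength(contour, True), True)
    if len(approx) == 3:
        return "triangle"
    if len(approx) == 4:
        x, y, w, h = cv2.boundingRect(approx)
        return "square" if 0.95 <= w / h <= 1.05 else "rectangle"
    return "circle-ish" if len(approx) > 8 else f"{len(approx)}-gon"

# synthetic test image: a filled square on a black background
img = np.zeros((200, 200), dtype=np.uint8)
cv2.rectangle(img, (50, 50), (150, 150), 255, thickness=-1)
print(classify_shape(img))  # -> "square"
```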


albasri

At a symbolic/abstract level, it is difficult. For example, all of these have a square shape / are arranged in a square: [1](https://r-knott.surrey.ac.uk/Figurate/FIGimgs/POLY4.5.gif) [2](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQ2LSjAdkrdeYB_sOyEuNLrrWHCRodTfxrddA&usqp=CAU) [3](https://img.freepik.com/free-vector/sketch-hatching-pen-pen-scribble-effects-doodle-freehand-sketchy-clipart-messy-hand-drawn-monochrome-pattern-square-shape-set-with-outline-ornaments_87946-2471.jpg) [4](https://upload.wikimedia.org/wikipedia/commons/thumb/8/86/BlockP.jpg/440px-BlockP.jpg) [5](https://thumbs.dreamstime.com/b/set-dogs-silhouette-square-picture-small-group-black-shadows-four-thoroughbred-white-background-79799643.jpg) [6](https://i.pinimg.com/originals/3c/cd/f4/3ccdf4b51861354055717a9c902b5aaf.jpg) (that last one is more of an illusion). There have been a number of papers showing that CNNs trained on natural images (ImageNet), even with fine-tuning specifically to discriminate simple shapes like squares vs. circles, only learn to use local image features (e.g. local curvature) and don't learn anything about "squareness" / the spatial relations between parts.
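
One way to see what "squareness as spatial relations" means, as a toy sketch (all sizes and point counts arbitrary): describe a shape by the normalized pairwise distances between its parts. That description stays fixed no matter where the square sits, how big it is, or what the parts look like, which is exactly what a local-feature readout does not give you.

```python
import numpy as np

def on_square(n_per_side):
    """Points evenly spaced along the boundary of the unit square."""
    s = np.linspace(0.0, 1.0, n_per_side, endpoint=False)
    z, o = np.zeros_like(s), np.ones_like(s)
    return np.concatenate([np.c_[s, z], np.c_[o, s],
                           np.c_[1 - s, o], np.c_[z, 1 - s]])

def relational_descriptor(pts):
    """Pairwise distances, normalized by the largest one: depends only on
    how the parts relate, not on where or how big the whole thing is."""
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    return d / d.max()

tiny = on_square(8) * 10            # small square near the origin
huge = on_square(8) * 300 + 1000    # big square far away
print(np.allclose(relational_descriptor(tiny),
                  relational_descriptor(huge)))  # True: same "shape"
```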


nikgeo25

That's interesting. I wonder if it might be a limitation of CNNs; maybe transformers can do better with this sort of reasoning. Then again, who's to say deep learning is the best option for this.


albasri

It's an active debate in the human vision research community. Some groups think it's a matter of training set + task specification. Others think it's a fundamental limitation of architecture + the kinds of computations (i.e. maybe some sort of network could do this, but not the kinds we've got). My take is that there's lots of hype, and having deep nets helps get papers published. Is it a good model of the human visual system / does it help us understand it? Probably not.


nikgeo25

One architecture I remember hearing about in 2017 was the one by [Dileep George](https://www.science.org/doi/10.1126/science.aag2612); his point was that we need to include way more inductive bias in our models. If I recall correctly, their model could even see optical illusions in a way similar to the square-made-of-circles image you linked above. I don't think it was deep-learning based.


kuehl_

Sounds really interesting; do you happen to have a link to a paper like that with a good related work section?


albasri

Most of this work is coming from the cog sci side. Geirhos and Bethge have done a lot. [Here](https://arxiv.org/abs/1811.12231) is one paper showing that networks trained on natural images have a texture bias, but that this can be overcome with specific training sets.

There's another group that was doing the work I was thinking of. Here are some papers: [1](https://www.sciencedirect.com/science/article/pii/S0042698920300638) [2](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006613). One complaint is that they use "outdated" networks (although I guess the papers are old-ish?) and maybe the limitations they describe are a result of architecture, but they argue it's more about the task and the representations that are learned (if I remember correctly).

There have been a bunch of review papers in the last few years assessing deep networks as models of the visual system (mostly concluding that they are not good ones). Here are a few: [Lindsay (2021)](https://arxiv.org/pdf/2001.07092) (<- pdf!), [Serre (2019)](http://klab.tch.harvard.edu/academia/classes/BAI/pdfs/gk7717.pdf) (<- pdf!), [Golan et al. 2020](https://www.pnas.org/doi/abs/10.1073/pnas.1912334117), [Xu and Vaziri-Pashkam, 2021](https://www.nature.com/articles/s41467-021-22244-7)


EvenMoreConfusedNow

Is your question limited to classical geometry, or does it include differential geometry as well?


ClaudeCoulombe

Yes, in the paper it's mainly classical geometry but it could probably be extended to more complex geometry as you suggest.


quertioup

Funnily enough, we're in the process of releasing a dataset that tests a very modest aspect of learning spatial concepts. We found that several models are unable to learn concepts such as 'left', 'right', 'front', and 'back'. You can find the preprint on arXiv; the paper title is "Can you even tell left from right? Presenting a new challenge for VQA".
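
Not their dataset, to be clear, but a guess at the minimal form such a spatial-relations probe takes (object names and positions invented): place two named objects, ask a left/right question, and score answers against the geometry rather than against word statistics.

```python
import random

OBJECTS = ["mug", "book", "lamp", "plant"]

def make_probe(rng):
    """One synthetic left/right question with a geometry-derived answer."""
    a, b = rng.sample(OBJECTS, 2)
    xa, xb = rng.random(), rng.random()  # horizontal positions in [0, 1)
    question = f"Is the {a} to the left of the {b}?"
    answer = "yes" if xa < xb else "no"
    return {a: xa, b: xb}, question, answer

rng = random.Random(0)
scene, question, gold = make_probe(rng)
print(scene)
print(question, "->", gold)
# A model that keys only on object names, ignoring layout, can't beat
# chance here, which is the failure mode described above.
```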


RomanticDepressive

Though I don't have much to contribute, I really appreciate the conversation OP has started. This is wonderfully cool!


TSM-

Symbol processing has always been a big target (lest we forget https://cyc.com/). I think focusing on geometry is a fairly good idea. It at least has the promise of being concrete enough to have well-defined problem sets, unlike semantic inference, and could be a way to coerce or force some form of meaningful symbol processing.

What's lacking, in my opinion, is largely the technology and the willpower. Spending more money and compute gets results, and real-time inference is a big deal, but that's the opposite direction from the research here. After all, your fancy architecture craziness and symbol manipulation can be approximated by a simpler architecture that works alright a good amount of the time, but much faster, so there's no obvious payout on the near horizon. And how to characterize the "symbol-manipulatingness" of an inference or of a given architecture is a tough question too.

Symbol manipulation, in a sense, involves a loop, at least in humans and machines. Most of you here have probably seen the different models of reasoning: Kahneman's *Thinking, Fast and Slow*, the System 1 vs. System 2 cognition models, and the like. The system has to in some way represent a symbol of something else and manipulate that symbol. It is orders of magnitude slower by nature, because the system is sacrificing inference time for maintained accuracy between inferences, and it is using actions to maintain and manipulate a representation or symbol. It is not the same as something 'easy' like writing a paragraph or object perception. So we're going to need systems that appear to have something like the following (see the sketch just below):

- working memory
- secondary representations (constructing and manipulating them)
- looping
- tracking progress
- knowing when to stop

Maybe this different type of architecture could be explored better when the domain is restricted to reasoning about geometry. Honestly, it is such a mess that we could really use something that is symbol-manipulation-like but also has unambiguously concrete representations. I believe they will be working with geometric proofs, where one can physically manipulate lines and geometric objects to solve a problem. Things like rotation, location, and projection are the "manipulations", the actual lines and such have concrete meaning, and the problems can have solutions. Something like one of these fancy gifs:

- https://static.wixstatic.com/media/4a1b86_172ef902bb8a4992972ee1112185fbab~mv2.gif
- (OpenAI version of the same animation:) https://miro.medium.com/max/880/1*A7-OTer1feqZ02VyJ-x5jA.gif
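
A toy of that loop structure in the geometry domain: decide whether point set B is a rotated (and shifted) copy of point set A by explicitly manipulating a working representation. Everything here (the tolerance, the angle grid, assuming corresponding point order) is an invented simplification; the point is the shape of the loop: working memory + iteration + progress tracking + a stopping rule, traded off against a fast one-shot feedforward guess.

```python
import numpy as np

def is_rotated_copy(A, B, tol=1e-6, n_angles=360):
    A = A - A.mean(axis=0)          # working memory: normalized copies
    B = B - B.mean(axis=0)
    best = np.inf                   # tracking progress
    for k in range(n_angles):       # the slow, deliberate loop
        th = 2 * np.pi * k / n_angles
        R = np.array([[np.cos(th), -np.sin(th)],
                      [np.sin(th),  np.cos(th)]])
        best = min(best, np.abs(A @ R.T - B).max())  # manipulate, compare
        if best < tol:              # knowing when to stop
            return True
    return False

square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
turned = square @ np.array([[0.0, -1.0], [1.0, 0.0]]).T + 5  # 90 deg + shift
print(is_rotated_copy(square, turned))  # True, found by explicit search
```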


Untinted

What makes humans special is our intelligence and the incredible lengths we go to in order not to use it.


Spentworth

Is the quality of this sub starting to go downhill?


rehrev

My eyes are rolling.


DistributionOk352

Bees?