Created: April 4, 2024
Tags: Analysis-by-Synthesis, Neuroscience
Link: https://www.pnas.org/doi/full/10.1073/pnas.1917565117
Status: Reading
Three-dimensional (3D) shape perception is one of the most important functions of vision. It is crucial for many tasks, from object recognition to tool use, and yet how the brain represents shape remains poorly understood. Most theories focus on purely geometrical computations (e.g., estimating depths, curvatures, symmetries). Here, however, we find that shape perception also involves sophisticated inferences that parse shapes into features with distinct causal origins.
Making sense of such structures (objects shrouded in cloth) requires segmenting the shape according to the causes of its features, to distinguish whether lumps and ridges are due to the shrouded object or to the ripples and folds of the overlying cloth.
- Richness of Shape Representation: The paper suggests that the human visual system's representation of 3D shape goes beyond merely mapping local surface properties like depth, orientation, or curvature. Instead, shape perception involves inferences about other aspects of objects, including their material properties and causal origin.
- The "Veiled Virgin" Effect: The paper introduces the concept of the "Veiled Virgin" effect, where observers can distinguish between shape features caused by a hidden object and those resulting from the draping of cloth. This inference of causal origin is crucial for understanding and interacting with objects successfully.
- Computational Models: Previous studies have proposed computational models based on approximate physical simulation to explain shape perception in draped objects. These models outperform conventional deep neural networks, suggesting the importance of internal "physics engines" in the human visual system.
- Connection to Amodal Completion: The findings of the paper are related to amodal completion, where observers perceive a surface continuing behind an occluder. However, the "Veiled Virgin" effect differs from amodal completion as it involves inferring hidden object shape based on its effects on visible surfaces rather than interpolation of surface structure across gaps in the sensory signal.
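
To make the "Computational Models" point concrete, here is a toy analysis-by-synthesis sketch. It is not the paper's model or any published one. Everything in it is an assumption chosen for illustration: the 1D height profile, the Gaussian-bump "object", the slope-limited `drape` function standing in for a cloth simulation, and the grid search over candidate objects. The idea it illustrates is only the general one: simulate how each candidate hidden object would shape the visible surface, then keep the candidate whose simulated surface best matches the observation.

```python
import math

def object_profile(xs, center, width):
    # Hypothetical hidden object: a Gaussian bump of height 1.
    return [math.exp(-0.5 * ((x - center) / width) ** 2) for x in xs]

def drape(xs, heights, max_slope=3.0):
    # Crude stand-in for cloth physics (an assumption, not a real simulator):
    # the cloth rests on the object and can fall away no faster than
    # max_slope, so it bridges steep flanks instead of following them.
    return [
        max(h - max_slope * abs(xi - xj) for xj, h in zip(xs, heights))
        for xi in xs
    ]

def infer_hidden_object(xs, observed, centers, widths):
    # Analysis by synthesis: simulate the draped surface for every
    # candidate object and keep the one with the smallest squared error.
    best = None
    for c in centers:
        for w in widths:
            simulated = drape(xs, object_profile(xs, c, w))
            err = sum((s - o) ** 2 for s, o in zip(simulated, observed))
            if best is None or err < best[0]:
                best = (err, c, w)
    return best

# Noise-free "observation": only the draped surface is visible.
xs = [i / 100 for i in range(101)]
hidden = object_profile(xs, 0.3, 0.1)
observed = drape(xs, hidden)

centers = [0.2, 0.3, 0.4, 0.5]
widths = [0.05, 0.1, 0.2]
best_err, best_center, best_width = infer_hidden_object(xs, observed, centers, widths)
print(best_center, best_width)
```

Because the observation here is generated by the same forward model and contains no noise, the search recovers the generating parameters exactly. A realistic version would need a proper cloth simulator, a richer space of object shapes, and a probabilistic treatment of noise.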