Created: February 14, 2024
Tags: object discovery, unsupervised learning, object-centric learning, neuroscience
Link: https://arxiv.org/pdf/2204.02075.pdf
Status: Reading
Discovering objects in an unsupervised manner is crucial for addressing the binding problem, which concerns how humans come to understand the world in terms of objects.
Following a coding scheme theorized to underlie object representations in biological neurons, the proposed model's complex-valued activations carry two messages: their magnitudes express the presence of a feature, while the relative phase differences between neurons express which features should be bound together to create joint object representations.
Slot-based models, although competitive and becoming applicable in real-world systems, still have some issues. First, they require elaborate structural biases and intricate training schemes to achieve a good separation of object features into slots. Second, they limit the information flow and expressiveness of models, which leads to failure cases for complex objects, e.g. textured objects.
The question is: “Can we find a simpler alternative to slot-based methods?”
Inspired by the temporal correlation hypothesis (similar to spiking NNs, maybe?), the paper introduces a new complex-valued model for object discovery. In short:
The brain binds information from different neurons by synchronizing their firing patterns, while desynchronized firing patterns represent information that should be processed separately.
Two types of messages that a neuron sends: (1) whether a feature is present, and (2) which features should be bound together.
This second type of message is not well explored in current NNs, and it is exactly what this paper explores via complex-valued activations.
Remember, complex numbers have a magnitude and a phase.
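To make the magnitude/phase idea concrete for myself, here is a minimal NumPy sketch. It is only an illustration of the arithmetic, not the paper's model: the three "neurons", their values, and the helper `to_complex` are all made up. It shows that a complex activation can be split back into its two messages, and that activations with the same phase add up while activations with opposite phases cancel, which is a rough analogue of synchronized vs. desynchronized firing.

```python
import numpy as np

def to_complex(magnitude, phase):
    """Build complex activations from magnitudes and phases (in radians)."""
    return magnitude * np.exp(1j * phase)

# Three hypothetical neurons: the first two share a phase,
# the third is shifted by pi (fully out of phase).
magnitude = np.array([1.0, 1.0, 1.0])
phase = np.array([0.0, 0.0, np.pi])
z = to_complex(magnitude, phase)

# Recover both messages from the complex values.
recovered_magnitude = np.abs(z)   # "is the feature present?"
recovered_phase = np.angle(z)     # "which features belong together?"

# Same phase: the activations reinforce each other.
in_phase_sum = np.abs(z[0] + z[1])
# Opposite phase: the activations cancel each other.
out_of_phase_sum = np.abs(z[0] + z[2])

print(recovered_magnitude, recovered_phase)
print(in_phase_sum, out_of_phase_sum)
```

The magnitudes are identical for all three neurons, so only the phase tells the third one apart from the first two.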
How to extract object-wise representations from the latent space?