Foveated Retinotopy Improves Classification and Localization in CNNs

Publication
Vision
*Foveated Retinotopy in CNNs.* We represent (left) an input image and how it is transformed by foveated retinotopy. Below, a representative reconstruction shows that the transform also acts as a cortical zoom on the image around the point of fixation. The transformed image is then fed to a ResNet deep learning architecture.

From falcons spotting prey to humans recognizing faces, the ability to rapidly process visual information depends on a foveated retinal organization that provides high-acuity central vision while preserving low-resolution peripheral vision. This organization is conserved along early visual pathways, yet remains under-explored in machine learning. Here, we examine the impact of embedding a foveated retinotopic transformation as a preprocessing layer in convolutional neural networks (CNNs) for image classification. By applying a log-polar mapping to off-the-shelf models and retraining them, we achieve comparable accuracy while improving robustness to scale and rotation. We demonstrate that this architecture is highly sensitive to shifts in the fixation point, and that this sensitivity provides an effective proxy for defining saliency maps that facilitate object localization. Our results show that foveated retinotopy encodes prior geometric knowledge, providing a solution for visual search and a meaningful trade-off between classification robustness and localization. These findings provide a proof of concept for connecting principles of biological vision with artificial networks, suggesting new, robust, and efficient approaches for computer vision systems.
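The saliency-map idea sketched in the abstract can be made concrete: sweep the fixation point over the image and record, at each candidate fixation, the classifier's response on the foveated view centred there. Below is a minimal numpy sketch of that sweep; the function name `fixation_saliency` and the `score` callback are illustrative only — in the paper the score comes from the retrained CNN's confidence for the target class, whereas here any scalar scoring function can be plugged in.

```python
import numpy as np

def fixation_saliency(img, score, step=8):
    """Sweep candidate fixation points over a coarse grid and record the
    score obtained when the foveated view is centred at each point.

    `score(img, (y, x))` is a stand-in for the classifier's confidence
    for the target class, given the foveated transform centred at
    (y, x).  The argmax of the resulting map serves as an
    object-localization proxy.
    """
    h, w = img.shape
    ys = np.arange(step // 2, h, step)  # grid of fixation rows
    xs = np.arange(step // 2, w, step)  # grid of fixation columns
    sal = np.array([[score(img, (y, x)) for x in xs] for y in ys])
    return sal, ys, xs
```

As a toy check, scoring each fixation by the mean brightness of a small window around it on an image containing a single bright patch yields a saliency map whose argmax sits on the patch.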

*Foveated Retinotopy simulated by a log-polar map.* We represent (left) an input image containing simple geometric objects and how it is transformed by the log-polar representation that implements foveated retinotopy. A rotation amounts to a translation along the polar axis (abscissa), and a zoom to a translation along the ordinate. We show (right) a representative reconstruction, which illustrates that the mapping also acts as a cortical zoom on the image around the point of fixation.
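The geometric property described in this caption is easy to verify numerically. Below is a minimal numpy sketch of a log-polar sampler, not the authors' implementation: the function name and parameters (`n_theta`, `n_rho`, `r_min`) are illustrative, and nearest-neighbour sampling is used for brevity where a trainable model would interpolate.

```python
import numpy as np

def log_polar(img, n_theta=64, n_rho=64, r_min=0.5):
    """Map a square grayscale image to log-polar coordinates around its centre.

    Rows index log-eccentricity and columns index angle, so a rotation
    of the input becomes a horizontal (circular) shift of the output,
    and a zoom becomes a vertical shift.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    # radii spaced geometrically (log-polar), angles spaced uniformly
    rho = r_min * (r_max / r_min) ** np.linspace(0.0, 1.0, n_rho)
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    # nearest-neighbour sample at each (rho, theta) position
    ys = np.round(cy + rho[:, None] * np.sin(theta[None, :])).astype(int)
    xs = np.round(cx + rho[:, None] * np.cos(theta[None, :])).astype(int)
    return img[ys, xs]
```

With this convention, rotating the input by 90° circularly shifts the log-polar map by a quarter of its columns, which is exactly the rotation-to-translation property the figure illustrates.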
Jean-Nicolas Jérémie
PhD in Computational Neuroscience

During my PhD, I focused on ultra-fast processing using convolutional neural networks.

Laurent U Perrinet
Researcher in Computational Neuroscience

My research interests include machine learning and computational neuroscience applied to vision.