Next-generation neural computations
Neuro-Inspired AI
A saccade-inspired approach to image classification using vision transformer attention maps
Saccade selection method: (a) The input image of dimension $H \times W$ is split into $\frac{H}{16} \times \frac{W}{n}$ sized patches and embedded into token vectors. (b) The tokens are passed through the DINO transformer, and the attention flowing from the patch tokens to the [CLS] token (white arrows) is extracted and reshaped into one attention map per attention head. (c) The multiple attention maps are fused into one by taking the maximum value across heads. (d) The highest-attention locations define square regions (“saccades”) whose tokens are retained. (e) Selected regions are revealed sequentially, and the image variants are classified by a pre-trained linear head.
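Steps (c) to (e) of the caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `select_saccades`, the `radius` parameter that sets the size of the square regions, and the rule that an already revealed region is never selected again are all assumptions made for illustration. The sketch takes per-head attention maps as given and does not run the DINO transformer or the linear head.

```python
import numpy as np

def select_saccades(head_maps, num_saccades=3, radius=1):
    """Hypothetical sketch of saccade selection on a patch grid.

    head_maps: array of shape (heads, rows, cols), one attention map per head
               (attention from patch tokens to the [CLS] token).
    Returns the fused map, the saccade centres, and one cumulative boolean
    mask per saccade marking which patch tokens are revealed so far.
    """
    # Step (c): fuse the heads by taking the maximum value across heads.
    fused = head_maps.max(axis=0)
    remaining = fused.astype(float).copy()
    rows, cols = fused.shape

    revealed = np.zeros((rows, cols), dtype=bool)
    centers, masks = [], []
    for _ in range(num_saccades):
        # Step (d): the highest remaining attention value defines the centre
        # of a square region, clipped at the borders of the patch grid.
        r, c = np.unravel_index(np.argmax(remaining), remaining.shape)
        r0, r1 = max(r - radius, 0), min(r + radius + 1, rows)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, cols)

        # Step (e): regions are revealed sequentially, so masks accumulate.
        revealed[r0:r1, c0:c1] = True
        # Assumption: a revealed region is excluded from later selections.
        remaining[r0:r1, c0:c1] = -np.inf

        centers.append((int(r), int(c)))
        masks.append(revealed.copy())
    return fused, centers, masks
```

Each mask in the returned list would select the patch tokens retained for one image variant; in the pipeline described above, those variants are then classified by the pre-trained linear head.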
Matthis Dallain, Laurent Rodriguez, Laurent U Perrinet, Benoît Miramond