Salience-based object prioritization during active viewing of naturalistic scenes in young and older adults

A. Nuthmann, I. Schütz, et al.

This study by Antje Nuthmann, Immo Schütz, and Wolfgang Einhäuser examines how visual salience and objects jointly guide fixation selection in naturalistic scenes, and how that guidance differs between young and older adults. Older adults prioritize salient objects more strongly than young adults do, while relying less on location-based salience.

Introduction
The study investigates what guides attention and gaze in naturalistic scene viewing, comparing two perspectives: the salience view (fixations driven by low-level, image-computable salience maps) and the object view (objects as primary units of saccadic selection). Building on prior work demonstrating a preferred viewing location near object centers and mixed evidence for the predictive power of salience maps, the authors aim to reconcile these views by testing both location-based (grid cells) and object-based salience effects. A further goal is to examine age-related differences by comparing young adults to older adults (65+ years), given known age-related changes in visual and cognitive processing and prior findings that the influence of low-level features on fixation selection may decrease with age. The central questions are: (1) Does visual salience independently guide fixation selection beyond central bias when analysed location-wise and object-wise? (2) Does salience prioritize among objects? (3) How do these effects differ between young and older adults? (4) Where within objects do viewers fixate first (preferred viewing location) across age groups?
Literature Review
Saliency map models based on low-level features (e.g., Itti & Koch and successors) predict fixation selection reasonably well in free viewing. Object-based accounts show that fixations within objects cluster near object centers (preferred viewing location, PVL), modulated by visuomotor factors (object size, saccade direction/launch site) and sometimes object category/affordances. Prior work suggests saliency maps may partly succeed by correlating with object locations; when object locations are known, traditional saliency maps provide little additional predictive power, though results differ with newer models. Object-based models that incorporate PVL can match or outperform saliency models, especially when objects are experimentally dissociated from high-salience regions. Proto-object models offer image-computable proxies for objects but may lack PVL unless overlapping with annotated objects. Developmentally, the influence of low-level salience tends to be higher in children/infants and appears to decrease with age; findings in older adults are mixed regarding attentional capture. Meaning maps (human-rated patch meaningfulness) often outperform salience in predicting fixations, but definitions of meaning vary and may conflate high-level visual features; meaning may differ across ages. The present work remains agnostic about semantics and uses AWS salience to isolate visual salience effects while directly testing object-based prioritization and PVL across ages.
Methodology
Participants: 42 young adults (mean age 22.1 years, range 18–29; 8 men, 34 women) and 34 older adults (mean age 72.1 years, range 66–83; 17 men, 17 women), all with normal or corrected-to-normal vision by self-report. Ethics approval and informed consent were obtained.
Stimuli and task: 150 color photographs of real-world scenes (800×600 px; 25.78°×19.34° at 90 cm). Each scene was viewed for 6 s with free eye movements, preceded by a central fixation check. On 30 of 150 trials (20%), a yes/no object-related memory question followed to assess encoding (e.g., presence of a specific object).
Apparatus and data acquisition: EyeLink 1000 (desktop mount, 1000 Hz, binocular recording; right eye analyzed).
Gaze preprocessing: The initial central fixation was excluded from all analyses. The last fixation was included in position analyses but excluded from duration analyses. Processing was implemented in MATLAB and re-implemented in Python as the GridFix toolbox.
Salience computation: The AWS (Adaptive Whitening Saliency) model was used with default parameters except output scaling=1.0 (full resolution). AWS maps were normalized to unit integral.
Object annotation: An independent annotator labeled 1,032 objects (bounding boxes) across the 150 scenes, focusing on moderately sized, minimally occluded objects not spanning the vertical midline. Mean object width was 2.5° (SD 1.4°) and mean height 2.6° (SD 1.5°); mean eccentricity (center-to-center distance from scene center) was 8.6° (SD 2.6°). Scenes contained 6.9 annotated objects on average (SD 2.1). Feature Congestion served as a global clutter measure; clutter was unrelated to the number of annotated objects. Mean object salience exceeded the scene mean for 73% of objects; the maximum scene salience fell within an annotated object in 26% of scenes.
Analytic approach: Generalized linear mixed models (GLMMs) and linear mixed models (LMMs) were fitted in R (lme4), using the Laplace approximation (glmer) and REML (lmer). Predictors were centered and z-scored; age group was treatment-coded (young as reference). Central bias was modeled via anisotropic Euclidean distance to scene center (vertical distances scaled by 0.45).
- Grid-based GLMM (location-based salience): Scenes were partitioned into an 8×6 grid (48 cells). Binary response per observer×scene×cell (excluding the first-fixated cell): fixated (1) vs not (0). Fixed effects: intercept, age, central bias, central bias×age, AWS salience (cell mean), salience×age. Random effects: by-subject (intercept, slopes for central bias and salience with correlations), by-scene (intercept, slopes for central bias and salience with correlations). The observation matrix had 535,330 rows.
- Object-based GLMM (object prioritization): Binary response per observer×object: fixated (1) vs not (0). Fixed effects: intercept, age, object eccentricity (central bias), size (log area of bounding box), salience (mean AWS within box), and each ×age interaction. Random effects: by-subject (intercept and slopes for central bias, size, salience with correlations), by-object (intercept) nested within scene, by-scene (intercept).
- Fixation time LMMs (objects only): Log-transformed first-fixation duration and gaze duration (sum from first entry to first exit). Fixed effects matched the object GLMM. Random structures included by-subject slopes, plus by-object and by-scene intercepts as supported by the data.
- Within-object landing positions (PVL): Initial fixations on objects were analyzed. Landing positions were normalized to object size and entry direction (x, y in [-0.5, 0.5], 0 at object center; negative values indicate undershoot). Separate LMMs for horizontal and vertical positions had fixed effects for intercept (PVL) and object size (log area), each with age interactions; random structures were simplified as supported by the data.
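The grid-based predictors described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the GridFix implementation: all function names are hypothetical, a random array stands in for an AWS map, and distances are measured from grid-cell centers. The summary states only that vertical distances are "scaled by 0.45"; the sketch assumes the squared vertical offset is divided by 0.45, so that vertical displacement counts more than horizontal displacement. If the original scaling differs, only the last line of `anisotropic_distance_to_center` would change.

```python
import numpy as np


def normalize_to_unit_integral(salience_map):
    """Rescale a non-negative map so that its values sum to 1."""
    total = salience_map.sum()
    if total <= 0:
        raise ValueError("salience map must have a positive sum")
    return salience_map / total


def grid_cell_means(values, n_cols=8, n_rows=6):
    """Mean value inside each cell of an n_rows x n_cols grid."""
    height, width = values.shape
    if height % n_rows or width % n_cols:
        raise ValueError("image size must be divisible by the grid size")
    cell_h, cell_w = height // n_rows, width // n_cols
    # Split both axes into (cell index, within-cell offset), then average
    # over the two within-cell axes.
    blocks = values.reshape(n_rows, cell_h, n_cols, cell_w)
    return blocks.mean(axis=(1, 3))


def anisotropic_distance_to_center(width, height, n_cols=8, n_rows=6, v=0.45):
    """Distance (px) from each cell center to the scene center.

    ASSUMPTION: the squared vertical offset is divided by v.
    """
    x = (np.arange(n_cols) + 0.5) * (width / n_cols)
    y = (np.arange(n_rows) + 0.5) * (height / n_rows)
    dx = x[np.newaxis, :] - width / 2
    dy = y[:, np.newaxis] - height / 2
    return np.sqrt(dx**2 + dy**2 / v)


def zscore(x):
    """Center and scale a predictor, as done before model fitting."""
    return (x - x.mean()) / x.std()


# Toy stand-in for an AWS map of one 800x600 scene (rows = y, columns = x).
rng = np.random.default_rng(0)
toy_map = rng.random((600, 800))

salience = normalize_to_unit_integral(toy_map)
cell_salience = grid_cell_means(salience)                # shape (6, 8)
cell_distance = anisotropic_distance_to_center(800, 600)  # shape (6, 8)

# z-scored predictors for the 48 cells of this scene
salience_z = zscore(cell_salience.ravel())
central_bias_z = zscore(cell_distance.ravel())
```

In a full analysis, these per-cell values would be paired with a binary fixated/not-fixated response for every observer, scene, and cell before fitting the GLMM; here the z-scoring is done within one toy scene purely for illustration.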
Key Findings
Memory and basic eye movements:
- Memory performance: All observers performed above chance. d′ young M=1.54 (SD=0.45), old M=1.33 (SD=0.47); difference not significant, t(69.8)=1.97, p=0.053. Older adults used a more conservative criterion c (M=0.67 vs 0.45), t(66.8)=2.68, p=0.009.
- Basic measures: No significant age differences in number of fixations (≈21/trial), mean fixation duration (~248–252 ms), or saccade amplitude (~4.5–4.7°).
Object-based GLMM (fixation probability on objects):
- Overall probability: Intercept (young) b=-0.2278, SE=0.071, z=-3.207, p=0.001; older vs young difference b=-0.208, SE=0.0865, z=-2.406, p=0.016 (lower overall object fixation probability in older adults).
- Central bias (eccentricity): Young b=-0.1795, SE=0.0434, z=-4.131, p<0.001; age interaction ns (b=0.033, p=0.420).
- Object size: Young b=1.028, SE=0.0403, z=25.485, p<0.001; stronger in older adults (interaction b=0.1567, SE=0.032, z=4.901, p<0.001).
- Object salience: Young b=0.3823, SE=0.0384, z=9.949, p<0.001; stronger in older adults (interaction b=0.0523, SE=0.0219, z=2.387, p=0.017).
Fixation times on objects (log-transformed):
- First-fixation duration (FFD): For young, eccentricity b=0.0221, SE=0.0055, t=4.003; size b=-0.0122, SE=0.0045, t=-2.726; salience b=0.0225, SE=0.0046, t=4.933. No significant age interactions.
- Gaze duration (GD): Older adults longer overall (intercept old-young b=0.1046, SE=0.026, t=4.025). For young, eccentricity b=0.0314, SE=0.0075, t=4.176; size b=0.0599, SE=0.0079, t=7.579; salience b=0.039, SE=0.0075, t=5.207. Age interactions: larger eccentricity effect in older (b=0.0153, SE=0.0071, t=2.165) and larger size effect in older (b=0.043, SE=0.0078, t=5.478); salience interaction ns (b=-0.0008, t=-0.128).
Within-object landing positions (PVL):
- Intercepts: Horizontal intercept (young) b=-0.0509, SE=0.0051, t=-9.952; vertical intercept b=-0.0294, SE=0.0069, t=-4.232, indicating undershoot relative to object center in both axes.
- Age: No significant age differences (horizontal interaction b=0.0102, t=1.456; vertical interaction b=-0.0109, t=-1.144).
- Object size: Larger objects associated with greater undershoot (horizontal size b=-0.0113, t=-4.238; vertical size b=-0.0173, t=-5.227); size×age interactions ns.
Grid-based GLMM (location-based salience on grid cells):
- Intercepts: No significant age difference (old-young b=0.0223, SE=0.0406, z=0.549, p=0.583). Mean fixation probability (probability scale): young 0.220, older 0.224.
- Central bias: Young b=-0.5665, SE=0.0364, z=-15.564, p<0.001; age interaction ns (b=0.0424, p=0.319).
- Cell salience (AWS): Young b=0.6830, SE=0.0289, z=23.617, p<0.001; reduced in older adults (interaction b=-0.0399, SE=0.0150, z=-2.653, p=0.008).
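The GLMM coefficients reported above are on the log-odds (logit) scale, which makes them hard to read directly. As a rough reading aid, the sketch below converts the object-based GLMM intercepts into fixation probabilities and the salience slopes into odds ratios. It assumes that all z-scored predictors sit at their means (zero) and that random effects are zero, so the values describe a typical observer viewing an average object; they are not marginal means and do not appear in the source. Variable names are hypothetical.

```python
import math


def inverse_logit(log_odds):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Coefficients copied from the object-based GLMM results above.
b_intercept_young = -0.2278  # intercept (young is the reference group)
b_age_old = -0.208           # older vs young difference
b_salience_young = 0.3823    # salience slope for young adults (per SD)
b_salience_x_age = 0.0523    # additional salience slope for older adults

# With treatment coding, the older group's coefficient is the reference
# coefficient plus the corresponding age term.
p_young = inverse_logit(b_intercept_young)
p_old = inverse_logit(b_intercept_young + b_age_old)

# Odds ratio for a one-SD increase in object salience.
or_salience_young = math.exp(b_salience_young)
or_salience_old = math.exp(b_salience_young + b_salience_x_age)

print(f"P(fixate average object): young {p_young:.3f}, older {p_old:.3f}")
print(f"Odds ratio per SD salience: young {or_salience_young:.2f}, "
      f"older {or_salience_old:.2f}")
```

Under these assumptions, an average object has a fixation probability somewhat below one half in both groups, and a one-SD increase in object salience multiplies the odds of fixation by roughly 1.5, slightly more for older adults.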
Discussion
Findings demonstrate that both location-based and object-based visual salience independently influence fixation selection beyond central bias, but their age modulation diverges: older adults show reduced reliance on location-based salience while exhibiting stronger effects of object-based salience and object size on which objects are fixated. This supports objects as key units of saccadic and attentional selection and suggests that visual salience helps prioritize among objects. The preserved preferred viewing location near object centers (with systematic undershoot) across ages indicates stable oculomotor targeting principles. Longer gaze durations and stronger effects of object size in older adults suggest more extended engagement with selected objects, potentially contributing to fewer distinct objects being fixated overall without increased central bias. These results reconcile salience- and object-based accounts by showing that while objects dominate region selection, salience contributes to ranking objects for fixation, with an increased relevance of object-bound information in aging.
Conclusion
The study integrates salience- and object-based perspectives on gaze guidance in scenes by showing that visual salience prioritizes among objects and that objects are central units of selection. Age-related differences reveal decreased influence of location-based salience but increased effects of object-based salience and size on object selection in older adults, alongside preserved PVL. Methodologically, combining GLMMs with grid- and object-based parcellations and open-source tooling (GridFix) enables disentangling central bias, salience, and object properties. Future research should incorporate scene/object semantics (e.g., meaning maps with improved operationalization), examine task effects, assess individual visual abilities, and leverage computational object detection to scale object-based analyses. Experimental manipulations of object size/eccentricity could further probe age-related constraints in peripheral processing and targeting.
Limitations
- Visual abilities were not independently assessed beyond self-report, which may influence peripheral processing and salience effects.
- Object annotations were manual, non-exhaustive, and constrained (moderate size, low occlusion, not crossing vertical midline), potentially biasing object sampling.
- The presence of object-related memory probes may have biased viewers toward object-focused strategies.
- Semantic influences were not modeled; AWS was chosen to avoid implicit semantics in DNN saliency, but meaning likely contributes to fixation selection and may differ by age.
- Analyses were conducted under free-viewing with a specific display geometry and grid resolution; generalization to other tasks or viewing conditions may vary.
- Proto-object and DNN-based models were not directly compared here; object detection automation was not used.
- PVL analyses did not include incoming saccade amplitude/direction covariates due to normalization constraints.