Non-line-of-sight snapshots and background mapping with an active corner camera

S. Seidel, H. Rueda-Chacón, et al.

Discover an innovative active non-line-of-sight imaging technique that not only reconstructs moving objects but also reveals the hidden stationary background with just a single snapshot. This research, conducted by Sheila Seidel, Hoover Rueda-Chacón, Iris Cusini, Federica Villa, Franco Zappa, Christopher Yu, and Vivek K. Goyal, promises to elevate situational awareness across numerous applications.

Introduction
The paper addresses the challenge of non-line-of-sight (NLOS) imaging, where light reaching the sensor has undergone multiple diffuse bounces, losing directional information and experiencing strong attenuation. Prior passive NLOS methods leverage occluding structures (e.g., window apertures, moving-object “inverse pinholes,” and especially vertical wall edges) to recover directional information, enabling azimuthal (and, with additional constraints, limited longitudinal) reconstruction of hidden scenes. Active NLOS approaches commonly raster-scan a pulsed laser across a relay surface and use time-resolved single-photon detection to reconstruct large-scale scenes, but they require large scan areas and openings, limiting practicality. Edge-resolved transient imaging (ERTI) combines edge occluders with transient measurements to reconstruct stationary scenes but still requires laser scanning. Snapshot methods that avoid scanning typically assume one or two point targets and static scenes during acquisition. The research question is whether one can eliminate laser scanning and still achieve frame-by-frame reconstruction of moving objects’ size, position, and reflectivity, while also recovering the stationary background they occlude. The study proposes an active corner-camera system that uses the floor as a relay surface, with a SPAD array positioned near a vertical edge occluder to obtain azimuthal information (from the edge) and longitudinal information (from transient timing). This enables single-snapshot acquisition per frame, reconstruction of both moving foreground objects and the stationary background they occlude, and accumulation over frames to map the hidden scene.
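As a rough illustration of how transient timing supplies the longitudinal coordinate, the sketch below inverts a photon's arrival-time bin to an approximate range of a hidden scatterer. It assumes the visible legs of the path (laser to floor spot, floor to detector) have been calibrated out and that the two hidden legs are roughly equal; the ~390 ps bin width matches the detector described later, but this symmetric-path geometry is a simplifying assumption, not the paper's full model.

```python
C = 0.299792458  # speed of light in m/ns


def hidden_path_range(time_bin, bin_width_ns=0.39):
    """Approximate one-way range of a hidden scatterer from a TDC time bin.

    Assumes the measured delay covers only the hidden path
    (laser floor spot -> hidden point -> observed floor point), so the
    one-way range is about half the hidden path length.
    """
    total_path = C * time_bin * bin_width_ns  # hidden path length in meters
    return total_path / 2.0
```

For example, a return near time bin 51 corresponds to a range of roughly 3 m, consistent in order of magnitude with the foreground returns reported in the experiments.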
Literature Review
- Passive NLOS imaging: Methods exploit occluders to disambiguate light paths (e.g., window apertures, inverse pinholes), with vertical edge occluders offering known geometry and reliable azimuthal cues. Prior works achieved 1D azimuthal reconstructions and, in controlled cases, 2D reconstructions, though longitudinal information in purely passive setups is weak and often needs multiple edges.
- Active NLOS imaging: Commonly involves scanning a pulsed laser across a Lambertian relay wall and using SPADs for time-resolved measurements (confocal and wave-based methods). These achieve 3D reconstructions but require large scan areas and openings, limiting scalability and acquisition speed.
- ERTI: Combines edge occlusion with transient imaging by scanning along an arc around a vertical edge; processes differences between consecutive scan positions to reconstruct stationary hidden scenes. Still constrained by the scanning requirement.
- Snapshot tracking with floor as relay: Prior work using a 32×32 SPAD array and stationary laser shortened acquisition time and tracked a moving point-like target’s horizontal position using simultaneous measurements and reference subtraction, but did not reconstruct extended object shape and assumed point reflectors and static scenes during acquisition.
- Other moving-object methods: Some require rigid motion with fixed orientation and clutter-free environments.

Overall, prior snapshot methods either treat targets as points or require scanning; robust reconstruction of extended moving targets and simultaneous background mapping without scanning remained open.
Methodology
Acquisition and model:
- Illumination: A pulsed laser illuminates the floor near the base of a vertical edge occluder. A SPAD array images a field of view (FOV) adjacent to the edge. The edge provides azimuthal discrimination; transient timing gives longitudinal (range) information.
- Measurement model: For each spatial pixel n and time bin k, photon counts x_{nk} are modeled as Poisson-distributed with a rate composed of a known stationary component b_{nk}, plus contributions from moving foreground targets, minus counts removed from the background regions they occlude. A prior stationary-scene reference measurement estimates b, enabling statistically grounded subtraction of visible-side and other stationary contributions.
- Scene parameterization: Moving objects are modeled as vertical, planar rectangular facets on the ground. For M objects, foreground parameters Φ_fg = {(a^m, r^m, h^m, θ^m_min, θ^m_max)} include albedo a, range r, height h, and azimuthal extent around the edge. Each corresponding occluded background region is parameterized by (a_oc^m, r_oc^m); its height depends on r_oc^m, the object’s range r^m, and height h^m. Each frame is processed independently to estimate Φ_fg and Φ_oc from snapshot data. Estimated occluded background regions from successive frames are accumulated to form a stationary background map.
- Occlusion modeling: The algorithm explicitly models (i) the vertical edge occluder, to recover azimuthal structure, and (ii) the occlusion that moving objects impose on the stationary background, leveraging both added photon counts from the foreground and count reductions at longer ranges due to shadowing of the background.

Experimental setup:
- Laser: 120 mW master oscillator fiber amplifier picosecond laser (PicoQuant VisUV-532) at 532 nm; ~80 ps FWHM pulse width; triggered by the SPAD at a 50 MHz repetition rate.
- Detector: 32×32 SPAD array, ~3.14% fill factor, per-pixel TDC with ~390 ps time resolution (160.3 MHz clock), ~30% photon detection probability at 532 nm, ~100 Hz average dark count rate. Lens: 50 mm focal length; FOV ~25×25 cm at ~1.20 m height.
- Acquisition: Frame length 10 µs with 800 ns gate-on time (8% duty cycle); ~40 laser pulses per frame. Theoretical frame rate ~100 kHz (10 µs readout); practical rate ~17 kHz, limited by USB 2.0 transfer.
- Geometry: Hidden room 2.2 m wide, 2.2 m deep, 3 m high; walls of white foam board, ceiling covered with black cloth. The SPAD is positioned to look down near the edge so that about half the array is occluded. The laser is directed through a small hole in the occluding wall to reject ballistic photons; after alignment, the spot lands ~6 cm to the right of the origin.

Data acquisition protocol:
- Reference: 30 s acquisition before any motion to estimate the stationary-scene response b.
- Motion frames: Objects are placed at discrete positions along their trajectories; per-frame integration time is 0.4 s (total acquisition time is longer due to data transfer). Two scenarios: (1) two moving white rectangular facets (20×110 cm); (2) four targets of varying albedo and shape: a white facet (20×110 cm), a gray (painted) facet, a fabric mannequin (30×80 cm), and a staircase-shaped object (75×75 cm). All objects face the occluding edge.

Processing:
- Spatial-temporal photon-arrival histograms across the SPAD array are used to jointly infer foreground facets and occluded background regions per frame, guided by the physical light-transport and occlusion model. Accumulation across frames builds a background map without assuming morphological continuity of the foreground.
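The Poisson measurement model above can be sketched as a negative log-likelihood over the per-(pixel, bin) rate. The decomposition below mirrors the paper's model in spirit (stationary reference plus foreground contribution minus occluded background counts), but the function names, array shapes, and the simple multiplicative occlusion term are illustrative placeholders, not the authors' implementation.

```python
import numpy as np


def poisson_nll(counts, rate):
    """Negative log-likelihood of Poisson counts x_{nk} under rate lambda_{nk},
    dropping the constant log(x!) term. Arrays of shape (n_pixels, n_bins)."""
    rate = np.maximum(rate, 1e-12)  # guard against log(0)
    return np.sum(rate - counts * np.log(rate))


def snapshot_rate(b, fg, occ_mask, occ_frac):
    """Illustrative per-(pixel, bin) rate: stationary reference b, plus a
    foreground-facet response fg, minus the fraction occ_frac of background
    counts that the facet shadows over the bins flagged in occ_mask."""
    rate = b + fg - occ_frac * occ_mask * b  # occlusion removes background counts
    return np.maximum(rate, 0.0)
```

A fitting procedure would search over the facet parameters Φ_fg and Φ_oc to minimize `poisson_nll(x, snapshot_rate(...))`; at the true rates the likelihood is maximized, which is what makes the count reductions from shadowed background statistically informative rather than just noise.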
Key Findings
- Snapshot NLOS reconstruction without scanning: Each frame is acquired in a single snapshot, enabling tracking of moving objects while reconstructing their position (range and azimuth), size (height and width), and albedo, along with the stationary background they occlude.
- Two-object scenario: With 0.4 s per frame and a 30 s reference, two moving targets along arcs are accurately resolved in Frames 1 and 7; reconstructed heights, widths, and ranges match ground truth, and occluded background regions are correctly placed in range. When targets overlap in azimuth (Frame 6), the reconstruction prioritizes the nearer target’s range. Temporally integrated measurements show distinct penumbra shadows from each object; spatially averaged histogram differences show a peak near ~3 m (added counts from objects) and a dip near ~6 m (reduced counts from occluded background).
- Background mapping: Accumulating occluded-region estimates over 8 frames yields a stationary hidden-scene map closely matching true wall locations in both bird’s-eye and side views.
- Robustness to target properties: Single-frame reconstructions succeed for targets with lower reflectivity (gray facet), non-planar geometry (mannequin), and non-rectangular shape (staircase). The stairs’ varying height profile cannot be fully captured by the facet model, but the reconstruction correctly indicates increased width and rightward extent.
- Practical performance: Demonstrations used a 32×32 SPAD with ~390 ps timing and ~3.14% fill factor at ~17 kHz frame rate. The method effectively subtracts visible-side contributions using the reference measurement while preserving information about the stationary hidden background via explicit occlusion modeling.
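The peak-and-dip signature described above (added counts from the foreground return, reduced counts where the background is shadowed) can be exposed with a simple reference subtraction. The sketch below is a minimal, hypothetical version of that diagnostic; the function name and array shapes are assumptions, and real data would require Poisson-aware processing rather than a plain difference.

```python
import numpy as np


def histogram_difference(frame_hist, reference_hist):
    """Spatially averaged difference between a motion-frame histogram and the
    stationary-reference histogram, both of shape (n_pixels, n_bins).

    Positive excursions indicate added counts from a foreground object;
    negative excursions indicate background counts the object now occludes.
    """
    diff = frame_hist.mean(axis=0) - reference_hist.mean(axis=0)
    peak_bin = int(np.argmax(diff))  # foreground return (~3 m in the paper)
    dip_bin = int(np.argmin(diff))   # shadowed background (~6 m in the paper)
    return diff, peak_bin, dip_bin
```

On the reported data, the peak sits at the time bin corresponding to ~3 m and the dip at ~6 m, which is exactly the dual evidence (foreground addition, background removal) the joint estimator exploits.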
Discussion
The proposed active corner camera method shows that careful modeling of both the vertical edge occluder and intra-scene occlusions enables snapshot (no-scan) reconstruction of moving, extended targets and simultaneous mapping of the stationary background. This addresses key limitations of prior active NLOS methods that require laser scanning or assume point targets, and of snapshot point-target trackers that cannot recover extended object dimensions or background structure. By using a reference stationary-scene measurement, the method removes arbitrary visible-side contributions without erasing stationary hidden-scene information, overcoming a challenge for configurations where the SPAD array’s spatial extent causes visible-side variability. Experimental results demonstrate accurate per-frame estimation of target size, position, and reflectivity, robust performance across different shapes and albedos, and consistent background mapping via multi-frame accumulation. The findings indicate that exploiting edge-induced azimuthal information together with time-of-flight range cues can deliver practical situational awareness in occluded environments, and performance is poised to improve significantly with higher-resolution, higher-fill-factor, and faster SPAD arrays.
Conclusion
This work introduces a snapshot active NLOS imaging technique that reconstructs moving, extended objects (count, location, size, reflectivity) and simultaneously maps stationary background regions they occlude, all without laser scanning. The approach leverages a vertical edge occluder for azimuthal discrimination, transient timing for range, a reference measurement for stationary-scene subtraction, and explicit modeling of foreground and background occlusions. Experiments validate accurate reconstructions for multiple targets and shapes, and show that accumulated frame-wise occlusion estimates recover the hidden environment’s layout. Future research directions include: (i) joint multi-frame processing with motion and shape priors; (ii) modeling and/or estimating wall thickness; (iii) higher-resolution target modeling via horizontal segmentation with per-segment albedo and height; and (iv) optimizing system parameters (FOV size/position, laser placement) for improved information balance and reconstruction fidelity.
Limitations
- Frame-wise independence: Current processing treats frames independently, not leveraging inter-frame motion continuity or shape constancy, which could improve robustness and accuracy.
- Thin-wall assumption: Modeling assumes a thin occluding wall; significant thickness would require different handling of angular ranges and potentially estimating wall thickness.
- Target model simplification: Foreground objects are modeled as vertical planar rectangular facets, limiting reconstruction fidelity for complex shapes (e.g., stairs’ varying height profile) and fine details.
- Hardware constraints: Limited spatial (32×32) and temporal (~390 ps) resolution constrain parameter precision; low fill factor (~3.14%) and practical frame rate (~17 kHz, USB-limited) reduce SNR per acquisition time and restrict tracking speed/range.
- Dependence on reference measurement: Accurate subtraction of stationary-scene and visible-side contributions relies on a good reference; changes in the stationary environment between reference and motion frames could degrade performance.
- Ballistic light handling: The setup uses a physical hole to mitigate ballistic photons; more robust fast gating would simplify operation and reduce sensitivity to alignment.