Engineering and Technology
Waveguide holography for 3D augmented reality glasses
C. Jang, K. Bang, et al.
Changwon Jang, Kiseung Bang, Minseok Chae, Byoungho Lee, and Douglas Lanman introduce a holographic near‑eye display that combines waveguide pupil expansion with holographic wavefront control, delivering true 3D augmented reality imagery with enhanced resolution in a glasses‑scale form factor.
Introduction
The paper addresses the challenge of creating compact, comfortable near‑eye displays for augmented reality that provide realistic 3D imagery with natural focus cues and a large eyebox. Conventional waveguide image combiners are attractive for AR due to compact form factor and étendue expansion via exit‑pupil replication, but they typically provide fixed (infinity) focus and suffer from focus-spread and ghosting for finite-conjugate imagery, as well as low efficiency with conventional sources and coherence‑induced artifacts with lasers. Holographic displays, which modulate the optical wavefront using an SLM, promise aberration-free, high-resolution, per-pixel depth control with large color gamut, and recent advances in CGH computation improve image quality and reduce speckle. However, near‑eye holography faces étendue bottlenecks, and prior retinal projection approaches from the temple side have limited space and angular bandwidth. Earlier work embedding static or non‑replicating holograms in waveguides could not scale to glasses form factor and did not resolve focus spread or support pupil replication, often requiring thick slabs and yielding small eyebox and FOV. To overcome these limitations, the authors propose waveguide holography, a system that models and exploits coherent interactions in exit‑pupil expanding waveguides, enabling control of the out‑coupled wavefront by modulating the SLM at the input coupler. This approach targets true 3D holographic AR glasses with software‑steered eyebox, natural focus cues, and enhanced resolution.
Literature Review
The study situates its contribution within AR near‑eye display architectures, including birdbath, curved mirror, retinal projection, and pin‑mirror designs, highlighting waveguide combiners as leading due to form factor and étendue expansion. It reviews waveguide types (geometric with partially reflective surfaces; diffractive using surface relief, volume Bragg, polarization, and metasurface elements) and the limitations of fixed depth and focus‑spread/ghosting with finite‑conjugate imagery. Multi‑plane waveguide architectures can mitigate vergence–accommodation conflict but increase bulk and reduce performance. Brightness is challenging with micro‑LEDs due to efficiency, and coherent laser use in waveguides has been limited by interference artifacts during TIR propagation. In holography, the paper cites progress in CGH algorithms, GPU acceleration, and mitigation of speckle and image artifacts, while noting the étendue constraint for compact near‑eye systems. Prior retinal‑projection holographic approaches from the temple side face limited angular/space bandwidth and require mechanical pupil steering. Early waveguide holograms produced static patterns only. More recent attempts to transmit dynamic holograms through light‑guiding slabs improved aberrations but fundamentally avoided pupil replication, requiring thick substrates (3–8 mm), leading to small eyebox/FOV and unsuitability for glasses. These gaps motivate a scalable, replication‑compatible holographic waveguide approach.
Methodology
Architecture: The system comprises a collimated laser source, a spatial light modulator (SLM), and an exit‑pupil expanding (EPE) waveguide with surface‑relief gratings (in‑coupler, EPE grating, out‑coupler), with linear polarizers on the SLM and out‑coupler. The SLM is placed without projection optics for compactness; a benchtop prototype uses a 4‑f relay to demagnify the SLM for design iteration, while a compact prototype removes the relay.
Coherent propagation model: The waveguide is first approximated as a linear shift‑invariant (LSI) system comprising free propagation in the substrate, TIR at boundaries, and first‑order diffraction at gratings, enabling a single convolution representation. To account for practical spatial variance (grating non‑uniformity, edge clipping, substrate imperfections, scattering), a multi‑channel convolution model is introduced with complex apertures and kernels: Q (input in‑coupler aperture, channel‑selective), h (multi‑channel convolution kernels emulating distinct light paths due to pupil replication), and R (output aperture correcting phase/amplitude fluctuations near the out‑coupler). Outputs from all channels are coherently summed, providing a differentiable forward model. Additional front‑end parameters model SLM behavior (3×3 crosstalk kernel; spatially varying phase response function) and physical propagation (free‑space propagation with 3D tilt; homography to capture alignment and aberrations).
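To make the multi‑channel structure concrete, here is a minimal differentiable sketch in PyTorch, assuming FFT‑based (circular) convolution per channel and illustrative array sizes; Q, H, and R stand for the complex apertures and kernels described above, and none of the names, shapes, or initializations come from the authors' code.

```python
import torch

class WaveguideForward(torch.nn.Module):
    """Multi-channel convolution model (sketch): out = R * sum_k (h_k * (Q_k . u_in))."""
    def __init__(self, n_channels=9, n=1200, roi=512):
        super().__init__()
        init = lambda *s: torch.nn.Parameter(torch.randn(*s, dtype=torch.cfloat) * 0.01)
        self.Q = init(n_channels, n, n)  # channel-selective complex input apertures
        self.H = init(n_channels, n, n)  # per-channel transfer functions (kernels h)
        self.R = init(roi, roi)          # complex output aperture near the out-coupler
        self.roi = roi

    def forward(self, u_in):
        # u_in: (n, n) complex field entering the in-coupler after the SLM
        u = torch.fft.ifft2(self.H * torch.fft.fft2(self.Q * u_in))  # convolve per channel
        u = u.sum(dim=0)                                             # coherent sum over paths
        return self.R * u[: self.roi, : self.roi]                    # crop to the output ROI
```

Because every operation is differentiable and complex‑valued, both calibration (fitting Q, H, R to measured fields) and CGH rendering can run through this forward pass with standard autograd.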
Model calibration via complex wavefront capture: A Mach–Zehnder phase‑shifting interferometer (“wavefront camera”) at the out‑coupler measures complex wavefronts for random SLM phase inputs, retrieving amplitude and phase using phase‑shifting digital holography. Training minimizes L1 loss between estimated and measured complex fields over ~1000 captured wavefronts. Complex measurements are pivotal to disambiguate overlapping, coherently interfering replicated wavefronts inside the waveguide. The calibrated model supports one‑time calibration over a large 3D eyebox: eye pupil position, size, and eye relief are numerically selected via cropping and propagation within the ROI without re‑calibration.
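As an illustration of the wavefront‑capture step, the following is the standard 4‑step phase‑shifting retrieval, assuming reference phase shifts of 0, π/2, π, and 3π/2 (the paper's exact shift sequence and normalization are not specified here):

```python
import numpy as np

def four_step_psi(i0, i1, i2, i3, r_amp=1.0):
    """Recover the complex object field O from four interferograms.

    With I_k = |O|^2 + r^2 + 2*r*Re(O * exp(-1j*delta_k)) and shifts
    delta_k = 0, pi/2, pi, 3*pi/2:  O = ((I0 - I2) + 1j*(I1 - I3)) / (4*r).
    """
    return ((i0 - i2) + 1j * (i1 - i3)) / (4.0 * r_amp)
```

Each retrieved field, paired with the random SLM phase pattern that produced it, forms one training example; the model parameters are then fit by minimizing the L1 norm |u_pred − u_meas| over the ~1000 captures.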
Ablation and scalability: The study evaluates models ranging from single‑kernel (no Q/R/DC) to full multi‑channel with Q, R, and DC components. PSNR and complex‑PSNR against measured complex fields show that multi‑channel modeling and complex apertures markedly improve fidelity; performance saturates around 9–16 channels. Model performance is largely independent of ROI size (e.g., 3.5 mm to 7 mm square), with required channel count set by in‑coupler path diversity rather than out‑coupler size.
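Complex‑PSNR differs from ordinary PSNR in that phase errors also count. One plausible formulation of the two metrics (the paper's exact normalization may differ):

```python
import numpy as np

def psnr(ref, est):
    """Amplitude-only PSNR between measured and estimated fields."""
    mse = np.mean((np.abs(ref) - np.abs(est)) ** 2)
    return 10 * np.log10(np.max(np.abs(ref)) ** 2 / mse)

def complex_psnr(ref, est):
    """PSNR over the full complex difference, so phase errors are penalized too."""
    mse = np.mean(np.abs(ref - est) ** 2)
    return 10 * np.log10(np.max(np.abs(ref)) ** 2 / mse)
```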
CGH rendering: After calibration, CGHs are optimized by back‑propagation through the differentiable waveguide model plus numerical free‑space propagation to the retinal/image plane. Starting from a random initialization, the SLM phase is iteratively updated to minimize the L1 loss against target images; for 3D scenes, losses at multiple depths (e.g., from focal stacks or light‑field inputs) are summed to reproduce defocus blur and ocular parallax. Optimization typically converges after roughly 1000 iterations; runs of up to 3000 iterations are reported.
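A minimal sketch of this optimization loop, assuming the calibrated model above and a hypothetical propagate(field, depth) free‑space propagator; the target dictionary stands in for a focal stack:

```python
import math
import torch

def optimize_cgh(model, propagate, targets, n_iter=3000, lr=0.05):
    # targets: dict mapping depth (diopters) -> target amplitude image at that depth
    # model parameters stay fixed; only the SLM phase is optimized
    phase = (2 * math.pi * torch.rand(1200, 1200)).requires_grad_()  # random init
    opt = torch.optim.Adam([phase], lr=lr)
    for _ in range(n_iter):
        u_out = model(torch.exp(1j * phase))                    # field at the eyebox ROI
        loss = sum((propagate(u_out, d).abs() - t).abs().mean() # L1 loss per depth
                   for d, t in targets.items())
        opt.zero_grad(); loss.backward(); opt.step()
    return phase.detach()
```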
Waveguide fabrication: Glass substrate with n = 1.5, thickness 1.15 mm, and a center TIR angle of ~50° at λ = 532 nm. Designed diagonal FOV is 28°; out‑coupler size 16×12 mm. Fabricated via nano‑imprinting; the saw‑tooth surface‑relief gratings are optimized to balance spatial/angular uniformity against efficiency (average end‑to‑end throughput >5%). Residual −1st‑order diffraction necessitated an external beam‑splitter illumination arrangement.
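Two quick sanity checks on these design numbers, assuming normal‑incidence, first‑order in‑coupling (the fabricated grating pitch itself is not stated in the text):

```python
import numpy as np

lam, n, theta = 532e-9, 1.5, np.deg2rad(50)
theta_c = np.degrees(np.arcsin(1 / n))   # critical angle ~41.8 deg, so 50 deg satisfies TIR
pitch = lam / (n * np.sin(theta))        # grating equation: n*sin(theta) = lambda/pitch
print(f"critical angle = {theta_c:.1f} deg, implied in-coupler pitch = {pitch*1e9:.0f} nm")
```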
Prototypes and instruments: The benchtop prototype uses a 532 nm Cobolt Samba laser (1500 mW), a Meadowlark E‑series 1920×1200 SLM (62.5% of the area used: 1200×1200 px), a piezo actuator for phase shifting, two wavefront cameras (for the waveguide output and the relayed SLM), ND filters in the reference paths, and an imaging camera with a 3D‑printed 3.4 mm square pupil mask on motorized stages. Benchtop FOV is ≈11° diagonal with a ~16×12 mm eyebox. The compact prototype uses a 4K SLM (3840×2160, 3.74 µm pitch) without a relay, supporting ~12° FOV, with ~20% of the SLM area (≈1300×1300 px) active. Calibration of the compact prototype is hampered by ~10% SLM phase flicker and fringe‑field effects.
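The compact prototype's ~12° figure is consistent with the diffraction limit of the 3.74 µm pixel pitch; a back‑of‑envelope check, assuming a square active area and no projection optics:

```python
import numpy as np

lam, pitch = 532e-9, 3.74e-6
fov_axis = 2 * np.degrees(np.arcsin(lam / (2 * pitch)))  # max diffraction cone ~8.2 deg/axis
fov_diag = np.sqrt(2) * fov_axis                         # ~11.5 deg diagonal, square area
print(f"per-axis FOV = {fov_axis:.1f} deg, diagonal = {fov_diag:.1f} deg")
```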
Algorithmic details: SLM crosstalk kernel 3×3; SLM phase response modeled by an 18‑coefficient polynomial; free‑space propagation with 3 parameters (2D tilt angles, distance); homography up to second order with 12 coefficients. Q sized to in‑coupler (≈1200×1200); R and DC sized to ROI. Kernel h side length equals sum of Q and R extents. Dataset size ~1000 complex wavefronts for training.
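Bookkeeping for these sizes, with an illustrative ROI; only the reported relationships (Q matching the in‑coupler, h spanning the sum of the Q and R extents, R and DC matching the ROI) are taken from the text:

```python
q_side = 1200                 # Q matches the active in-coupler/SLM area
roi_side = 512                # R and DC match the output ROI (illustrative value)
h_side = q_side + roi_side    # kernel h side length = sum of Q and R extents
channels = 9                  # within the reported 9-16 channel saturation range
complex_params = channels * (q_side**2 + h_side**2) + 2 * roi_side**2
print(f"h side = {h_side} px, ~{complex_params/1e6:.0f}M complex parameters")
```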
Key Findings
- Demonstrated control of out‑coupled wavefronts through an exit‑pupil expanding waveguide by modulating only the input SLM, enabling true 3D holographic display via waveguides.
- Software‑steered eyebox: The calibrated model allows arbitrary selection of pupil size and 3D location within the ROI without re‑calibration, fully leveraging étendue expansion; practical ROI up to ~7 mm square (limited by wavefront camera sensor).
- Image quality improvements: Substantial reduction of focus‑spread/ghost artifacts for finite‑depth holograms versus waveguide‑agnostic propagation; improved infinity‑depth image quality as well.
- Full depth range demonstrations from 0 diopters to infinity, including finite depths (e.g., 3 D), captured both with an imaging camera and via numerically propagated wavefront camera data.
- Resolution enhancement: By stitching phase discontinuities caused by beam clipping during pupil replication, the method produces a smooth phase and uniform amplitude at the eyebox, effectively increasing the numerical aperture. Experimental tilted plane‑wave tests achieved sub‑arc‑minute resolution and a more than threefold increase in Strehl ratio (per Mahajan's formula; see the sketch after this list) compared with a baseline without the method.
- Modeling performance: Multi‑channel convolution with complex apertures significantly boosts PSNR and complex‑PSNR versus single‑channel models; gains saturate around 9–16 channels. Model performance is largely independent of ROI size, consistent with in‑coupler‑determined channel diversity.
- Practical prototypes: The benchtop prototype delivered ~11° diagonal FOV and a ~16×12 mm eyebox; the compact prototype achieved ~12° FOV, demonstrating see‑through 3D AR imagery. Pseudo‑color 3D results were obtained by capturing each color channel sequentially at the single operating wavelength and merging the captures.
- Waveguide throughput designed for >5% end‑to‑end efficiency (average) while maintaining eyebox uniformity.
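For the Strehl figure cited above, Mahajan's approximation relates the Strehl ratio to the RMS wavefront error; a small helper, with an illustrative error value (the paper's measured errors are not reproduced here):

```python
import numpy as np

def strehl_mahajan(sigma_rms, lam=532e-9):
    """Mahajan approximation: S = exp(-(2*pi*sigma/lambda)^2), sigma in meters RMS."""
    return np.exp(-(2 * np.pi * sigma_rms / lam) ** 2)

print(strehl_mahajan(532e-9 / 20))   # lambda/20 RMS wavefront error -> S ~ 0.91
```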
Discussion
The findings demonstrate that coherent interactions within exit‑pupil expanding waveguides can be modeled and exploited to synthesize desired out‑coupled wavefronts using only an input SLM, solving focus‑spread artifacts and enabling 3D holography with a large, software‑steered eyebox. This approach addresses étendue constraints that limit compact near‑eye holography and mitigates resolution loss from beam clipping by phase stitching, yielding sub‑arc‑minute resolution and significantly improved Strehl ratio. The one‑time, complex‑wavefront calibration enables flexible pupil placement and eye relief adjustment through numerical propagation, improving practicality relative to camera‑in‑the‑loop calibrations that would otherwise require per‑position tuning. The work points toward ultra‑compact, lens‑free holographic AR glasses, while highlighting the dependencies on SLM performance and calibration robustness. Future improvements in SLM pixel pitch, phase stability, complex modulation, and refresh rate, along with refined, possibly more efficient yet expressive models (e.g., reducing channel redundancy, modeling partial coherence/polarization), are expected to further enhance image quality, FOV, and system scalability.
Conclusion
The paper introduces waveguide holography, a compact near‑eye display architecture that unifies waveguide pupil‑expansion with holographic wavefront control. A differentiable, multi‑channel convolution model with complex apertures, calibrated using complex wavefront measurements via phase‑shifting interferometry, accurately captures coherent light interactions in exit‑pupil expanding waveguides. Using this model, the system renders CGHs that reconstruct high‑quality 3D images across depths through the waveguide, provides a large software‑steered eyebox, and overcomes conventional resolution limits by correcting phase discontinuities from pupil replication. Prototypes validate feasibility, including see‑through 3D AR demonstrations and resolution gains (sub‑arc‑minute, >3× Strehl). Future work should address SLM limitations (phase flicker, DC artifacts, pixel pitch), improve calibration robustness and model efficiency, extend to broader spectral operation and partial coherence/polarization modeling, and pursue further miniaturization toward practical, full‑color, wide‑FOV holographic AR glasses.
Limitations
- SLM limitations: phase flicker (~10% in the compact prototype) degrades calibration fidelity and image quality; DC noise artifacts (e.g., at FOV center) and fringe‑field effects further impact performance; FOV constrained by SLM pixel pitch unless a projection lens is added (at the expense of form factor).
- Calibration sensitivity: Complex‑wavefront interferometric calibration is mechanically sensitive; data acquisition is more susceptible to perturbations than subsequent display operation.
- Spectral constraints: Demonstrations use a single wavelength waveguide; pseudo‑color results require separate captures and merging, not simultaneous full‑color operation.
- Efficiency and uniformity trade‑offs: Waveguide gratings optimized for >5% average throughput sacrifice some efficiency for uniformity; −1st order diffraction necessitates additional optical handling.
- Model coverage and redundancy: The multi‑channel model, while effective, exhibits parameter redundancies and is an approximation; further accuracy and efficiency improvements are desirable.
- ROI and hardware limits: Practical eyebox ROI (e.g., up to ~7 mm square) was limited by the wavefront camera sensor size in experiments; FOV in prototypes (~11–12°) is below the designed 28° of the waveguide due to SLM constraints.