From remote sensing and machine learning to the history of the Silk Road: large scale material identification on wall paintings

S. Kogou, G. Shahtahmassebi, et al.

Discover how cutting-edge machine learning techniques can unveil hidden writings and material variations in ancient wall paintings! This groundbreaking research, conducted by Sotiria Kogou, Golnaz Shahtahmassebi, Andrei Lucian, Haida Liang, Biwen Shui, Wenyuan Zhang, Bomin Su, and Sam van Schaik, sheds light on the rich history of Mogao Cave 465, dating its exquisite paintings to the late 12th to 13th centuries.

Introduction
The study addresses how to efficiently process and interpret very large reflectance spectral imaging datasets of wall paintings to identify materials and automatically reveal faded or ‘hidden’ writings and drawings, and how these findings can inform art-historical questions such as dating. The Mogao Cave 465 wall paintings are stylistically unique and difficult to date by stylistic comparison alone. Traditional analyses relied on limited, point-based or sampled measurements, which are time-consuming and unrepresentative for large murals. Recent advances enable large-scale imaging spectroscopy, but create new challenges in data analysis. The authors propose an automated clustering approach to map material variations across entire interiors and a data-driven method (PCA/ICA) to reveal hidden texts/drawings, aiming to integrate scientific material evidence with palaeography and archaeology to refine the dating of Cave 465.
Literature Review
Prior work showed that non-invasive VIS–NIR spectral imaging can reveal underdrawings and faded writings and aid material identification in cultural heritage. Clustering and dimensionality reduction have been used in remote sensing, astronomy, medicine, and selected paintings, often via ENVI’s Spectral Hourglass Wizard. However, these workflows require operator intervention (e.g., manual cluster selection) and may be insensitive to spectral intensity (e.g., spectral angle mapper). This is suboptimal for cultural heritage, where illumination is controlled and intensity differences carry material information (e.g., mixtures, particle size). Salerno et al. applied SOM to limited-band (RGB+NIR) images of a single easel painting, an approach that does not scale to many spectral cubes. PCA and ICA have been applied to enhance writings in documents, with varying success. The present work builds on these by designing a scalable SOM-based pipeline that leverages both spectral shape and intensity, processes large numbers of cubes sequentially, and systematically applies PCA/ICA to entire cubes to detect hidden information.
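The point about spectral angle mapper being insensitive to intensity can be illustrated with a small numerical sketch. This is an illustration only, written in Python/NumPy rather than taken from the paper; the function names and the 10-band reflectance values are invented for the example. It compares a spectrum with a darker copy of itself (same shape, 60% of the intensity), roughly the situation of an ink wash that lowers reflectance without changing spectral shape.

```python
import numpy as np

def spectral_angle(a, b):
    """Angle in radians between two spectra; unaffected by overall scaling."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def euclidean_distance(a, b):
    """Straight-line distance between two spectra; sensitive to intensity."""
    return float(np.linalg.norm(a - b))

# Hypothetical 10-band reflectance spectrum (values invented for illustration)
bright = np.array([0.10, 0.12, 0.15, 0.20, 0.35, 0.55, 0.62, 0.65, 0.66, 0.68])
dark = 0.6 * bright  # same spectral shape, lower absolute reflectance

angle = spectral_angle(bright, dark)      # essentially zero
dist = euclidean_distance(bright, dark)   # clearly non-zero

print(f"spectral angle: {angle:.2e} rad, Euclidean distance: {dist:.3f}")
```

A purely angle-based measure would treat the two spectra as the same material, whereas a distance that keeps absolute reflectance separates them.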
Methodology
Automated large-scale spectral imaging: The in-house PRISMS VIS/NIR spectral imaging system comprises a Jenoptik CCD camera, a 10-band filter wheel from 400–880 nm (nine 40 nm FWHM filters at 50 nm intervals from 400–800 nm and one 70 nm FWHM at 880 nm), and a Meade ETX90 telescope. It acquires high spatial resolution data (80 μrad) from 3–30 m stand-off, enabling automated imaging of large murals with simultaneous reflectance and 3D topography. Calibration to reflectance is automated. The 10-band design balances spectral informativeness with survey efficiency, adequate for most pigments/dyes with broad features; high-resolution FORS or hyperspectral imaging is reserved for materials with fine features.
Automated materials cluster map generation: A Kohonen Self-Organizing Map (SOM) approach implemented with the ‘kohonen’ package in R is used. To handle millions of pixels per cube, an initial dimensionality reduction clusters each cube into a sufficiently large number of preliminary clusters (e.g., ~100). Workflow:
(1) Unsupervised SOM on an initial cube yields N clusters whose mean spectra form a Reference Spectral Database.
(2) Sequential supervised SOM processes each subsequent cube using the reference database: pixels are mapped to existing clusters when their spectra match within ±k·σ (k≈2) of the reference mean across bands; unmatched pixels form an ‘unclassified’ pool. Unclassified pixels undergo unsupervised SOM; resulting clusters are merged with existing ones if their mean spectra match (±k·σ), otherwise added as new reference clusters, updating the database to N+w clusters.
(3) A final spectral comparison merges clusters whose mean spectra are indistinguishable (±k·σ in all bands).
Final clusters are mapped back to produce materials cluster maps. The method considers both spectral shape and absolute reflectance intensity, enabling discrimination of mixtures, pigment concentration, particle size, and overlays (e.g., ink wash).
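The ±k·σ matching rule in step (2) can be sketched as follows. This is a minimal Python/NumPy illustration, not the authors' R ‘kohonen’ implementation, and it does not train a SOM. The function name `match_to_reference`, the tie-break rule (choose the cluster with the smallest worst-band deviation), and the toy spectra are all assumptions made for the example. It also assumes every per-band σ is positive.

```python
import numpy as np

def match_to_reference(pixels, ref_means, ref_stds, k=2.0):
    """Assign each pixel spectrum to a reference cluster, or -1 if none fits.

    pixels:    (n_pixels, n_bands) reflectance spectra
    ref_means: (n_clusters, n_bands) mean spectrum of each reference cluster
    ref_stds:  (n_clusters, n_bands) per-band standard deviation (assumed > 0)
    """
    # Deviation of every pixel from every cluster mean, in units of sigma
    z = np.abs(pixels[:, None, :] - ref_means[None, :, :]) / ref_stds[None, :, :]
    # A cluster is acceptable only if the pixel is within k sigma in ALL bands
    within = np.all(z <= k, axis=2)                    # (n_pixels, n_clusters)
    # Assumed tie-break: pick the acceptable cluster with the smallest
    # worst-band deviation
    score = np.where(within, z.max(axis=2), np.inf)
    labels = score.argmin(axis=1)
    labels[~within.any(axis=1)] = -1                   # the 'unclassified' pool
    return labels

# Toy reference database with two clusters and 10 bands (invented values)
ref_means = np.vstack([np.full(10, 0.20), np.full(10, 0.60)])
ref_stds = np.full((2, 10), 0.02)

pixels = np.vstack([
    np.full(10, 0.21),   # within 2 sigma of cluster 0
    np.full(10, 0.58),   # within 2 sigma of cluster 1
    np.full(10, 0.40),   # matches neither cluster
])

labels = match_to_reference(pixels, ref_means, ref_stds)
print(labels)
```

Pixels labeled -1 correspond to the pool that would be re-clustered and, if distinct, added to the reference database. For real cubes with millions of pixels the full pixel-by-cluster array would need to be computed in chunks.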
Multimodal, non-invasive confirmation: Detailed, local analyses at accessible regions extend identifications to inaccessible areas via cluster membership. Techniques include: FORS (ASD FieldSpec, 350–2500 nm; 3 nm resolution 350–1000 nm; 10 nm in SWIR), Raman (Horiba HE785, 785 nm excitation, ~20 mW, ~30 μm spot, ~15 cm⁻¹ resolution; libraries by Bell et al. and Burgio et al.), and portable XRF (Niton XL3t; Au anode, max 50 kV/200 μA; detects Z>14 in air). Kubelka–Munk modeling on PRISMS/FORS spectra aids mixture identification.
Automatic uncovering of hidden writings/drawings: For each cube, PCA (R ‘princomp’ on correlation matrix) and ICA (‘fastICA’) are computed. Early principal components (PC2/PC3) commonly enhance writings/drawings not evident in any single band; later PCs reveal acquisition imperfections. PCA/ICA outputs are inspected to locate faded or hidden features.
Comparative framework: Pigment combinations in Cave 465 are compared with those in dated Mogao caves from Tibetan (8th–9th c.), Tangut (11th–13th c.), and Mongol/Yuan (13th–14th c.) periods using new non-invasive surveys and published XRD/XRF data from Mogao, Yulin, and East Thousand Buddhas caves within ~200 km, acknowledging methodological differences.
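The per-cube PCA step described above for uncovering hidden writings can be sketched like this. It is a Python/NumPy analogue of PCA on the correlation matrix, not the R ‘princomp’ call used in the study, and ICA is not shown. The function name `pca_component_images` and the synthetic cube (a smooth gradient plus a faint checker pattern standing in for writing) are assumptions for illustration.

```python
import numpy as np

def pca_component_images(cube):
    """Return PC score images (H, W, B) and explained-variance ratios (B,)."""
    h, w, b = cube.shape
    x = cube.reshape(-1, b).astype(float)        # one row per pixel spectrum
    # Standardizing each band makes this PCA on the correlation matrix
    x = (x - x.mean(axis=0)) / x.std(axis=0)
    corr = (x.T @ x) / x.shape[0]
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = x @ eigvecs                         # project pixels onto the PCs
    return scores.reshape(h, w, b), eigvals / eigvals.sum()

# Synthetic 10-band cube (all values invented for illustration)
rng = np.random.default_rng(0)
h, w, b = 40, 60, 10
yy, xx = np.mgrid[0:h, 0:w]
background = 0.3 + 0.4 * xx / w                      # smooth gradient
mask = ((xx // 6 + yy // 6) % 2).astype(float)       # stand-in for faint writing
band_weight = np.linspace(-1.0, 1.0, b)              # contrast varies by band
cube = (background[..., None]
        + 0.01 * mask[..., None] * band_weight
        + 0.002 * rng.standard_normal((h, w, b)))

pc_images, var_ratio = pca_component_images(cube)
print(pc_images.shape, np.round(var_ratio[:3], 3))
```

Each slice `pc_images[..., i]` is an image that can be inspected visually; in practice one would look through the early components (the text mentions PC2/PC3) for features that no single band shows.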
Key Findings
- Scale and clustering: The eastern ceiling (~10 m²) required ~5000 image cubes and produced over 10⁶ spectra. SOM clustering reduced the east ceiling data to 960 unique spectral clusters; ~300 correspond to physical damage (cracks, exposed substrate), useful for conservation survey. The algorithm grouped spectrally similar areas across widely separated cubes, enabling coherent material maps.
- Extension of identifications: Cluster co-membership between inaccessible upper regions and accessible lower regions allowed extrapolation of detailed local identifications. Example: a figure high on the wall clustered with a tiger at ground level; multimodal analysis identified cinnabar + orpiment over an indigo/gypsum white-blue ground; scaffold-enabled measurements later confirmed identical composition aloft.
- Pigment palette and mixtures in Cave 465 (Table 1 summary):
  • Reds: cinnabar; cinnabar + orpiment; cinnabar + orpiment + red lead; red lead + orpiment; red ochre.
  • Blues: indigo; azurite; indigo + azurite (dark blue).
  • Greens: atacamite; indigo + orpiment.
  • Whites: gypsum + dolomite.
  • Yellows: orpiment (no yellow ochre detected in 465).
  • Browns/blacks: plattnerite (PbO₂) as degradation of red lead, occurring with cinnabar or azurite; thin carbon ink wash over red lead used to darken shades (intensity decrease without shape change in spectra).
  • Lilac: cinnabar + plattnerite.
  • Organics: red organic dyes were not confidently identified; indigo was identified, whereas other organics are difficult to detect in situ because of ageing and the constraints of non-invasive analysis.
- Spectral-intensity insight: Inclusion of absolute reflectance intensity enabled differentiation of same-shape spectra at different intensities, revealing practices like carbon ink washes over red lead and variations in pigment concentration/particle size.
- Hidden writings revealed: PCA/ICA on 10-band cubes from the west ceiling unveiled faint Sanskrit stamped texts (not visible in color image or individual bands). The stamps, in cinnabar on small paper sheets, appear to have been pasted face-down (letters flipped), consistent with consecration practices. The text is the Pratītyasamutpāda-gāthā ("ye dharma hetuprabhavā...").
- Palaeographic dating: The Nagari script form aligns with late 12th century onward (comparanda include Turfan manuscripts and inscriptions such as Feilaifeng, 1287–1292 CE). Similar printed texts in nearby caves 462 and B168 date to Mongol/Yuan.
- Comparative material analysis across caves: Cave 465’s pigment combinations are most similar to Mongol/Yuan Cave 95 and distinct from typical Tibetan-period materials. Patterns include: lapis lazuli present in Tibetan period caves and Tangut Cave 65, absent in Caves 97, 95, and 465; dark blue via azurite+indigo seen only in Caves 95 and 465; whites: talc/calcite in Tibetan period, gypsum+dolomite in Caves 95, 97, 465; greens: malachite+atacamite in Tibetan Cave 159, mainly atacamite otherwise; Tangut caves often lack yellows, while 465 uses orpiment. Expanded comparisons using published XRD/XRF from Mogao/Yulin/East Thousand Buddhas corroborate these trends (e.g., talc absent in Yuan-period whites).
- Chronological synthesis: Material evidence disfavors a Tibetan-period date for Cave 465; the overall palette and mixtures align best with Mongol/Yuan, though some Tangut-period attributions of comparison caves are uncertain. Archaeological finds (documents dated to Mongol/Yuan in adjacent areas) and graffiti (1309–1373 CE in the middle hall) indicate activity in the 13th–14th centuries. Integrating materials, palaeography, and archaeology narrows the painting date to late 12th to 13th century.
Discussion
The findings address the core questions by demonstrating that automated, scalable SOM clustering of reflectance spectra across thousands of spectral cubes can generate coherent materials maps that support material identification at scale, overcoming the limitations of point analyses. Incorporating both spectral shape and intensity proved crucial for distinguishing subtle material variations, mixtures, degradation products, and overpaints/washes. The PCA/ICA pipeline systematically revealed faded or hidden writings and drawings that were not evident in any single band, enabling the discovery of stamped Sanskrit consecration texts in cinnabar. Interpreting these results within an interdisciplinary framework advances the dating of Mogao Cave 465. The pigment combinations (e.g., widespread orpiment for yellow, azurite+indigo dark blues, gypsum+dolomite whites, atacamite greens, and prevalence of plattnerite) match best with Mongol/Yuan-period practices and comparanda (especially Cave 95), and differ from Tibetan-period materials (e.g., talc whites, lapis lazuli blues). Palaeographic analysis of the Nagari script on the hidden stamps places the consecration texts no earlier than the late 12th century. Archaeological evidence (Mongol/Yuan documents in nearby passages, 1309–1373 graffiti in the middle hall) aligns with a 13th to 14th century context. While uncertainties in Tangut-period cave attributions limit absolute precision, convergent evidence indicates the main hall paintings date from the late 12th to 13th century. Methodologically, the work demonstrates how large-scale non-invasive spectral imaging, machine learning clustering, and component analyses can yield historically meaningful conclusions in cultural heritage studies.
Conclusion
This work introduces a scalable, automated SOM-based clustering pipeline for large-scale reflectance spectral imaging that accounts for both spectral shape and intensity, producing materials cluster maps across entire interiors. Coupled with PCA/ICA for hidden-feature detection and multimodal non-invasive confirmation (FORS, Raman, XRF, Kubelka–Munk modeling), the approach enables robust, site-wide material identification and discovery of faded or hidden texts/drawings. Applied to Mogao Cave 465, it mapped complex pigment mixtures, extended identifications to inaccessible areas, revealed Sanskrit consecration stamps, and—together with palaeographic and archaeological evidence—narrowed the dating of the main hall wall paintings to the late 12th to 13th century, most consistent with Mongol/Yuan. Future research should expand large-scale material analyses across more caves tentatively attributed to the Tangut period to refine regional chronologies, improve detection of organic dyes with targeted high-resolution/hyperspectral methods, and further integrate conservation mapping (e.g., damage clusters) with materials maps for preservation planning.
Limitations
- Non-invasive constraints precluded radiocarbon dating and limited definitive identification of some organic colorants (e.g., red dyes), which are difficult to detect in situ due to ageing and weak spectral features.
- Initial access limitations necessitated extrapolation from accessible areas via clustering, later validated in a subset when scaffolds were available.
- Comparative datasets from literature (XRD/XRF of extracted samples) differ methodologically and may underreport organics; inter-site stylistic differences (Mogao vs. Yulin vs. East Thousand Buddhas) may also influence materials.
- Uncertainties in attribution and dating of several comparison caves (especially those classified as Tangut) limit the precision of period-based comparisons.
- PCA/ICA enhancements are data dependent; in some cases ICA offered only subtle improvements over PCA.