Explainable dimensionality reduction (XDR) to unbox AI ‘black box’ models: A study of AI perspectives on the ethnic styles of village dwellings

Interdisciplinary Studies

X. Li, D. Chen, et al.

This study presents an explainable dimensionality reduction (XDR) framework that translates the high-dimensional tacit knowledge learned by AI models into explicit, human-understandable insights. In a case study on the ethnic styles of village dwellings in Guangdong, China, Xun Li, Dongsheng Chen, Weipan Xu, Haohui Chen, Junjun Li, and Fan Mo identify the features that best distinguish village types and relate them to culture and architecture.

Introduction
The increasing use of AI in various fields raises concerns about the "black box" nature of AI models and their potential biases. Explainable AI (XAI) aims to address this by making the decision-making processes of AI models transparent and understandable. Current XAI methods often fall short in translating the tacit knowledge learned by AI models into explicit knowledge useful to domain experts. This study addresses this limitation by proposing a novel XDR framework. The framework focuses on improving dimensionality reduction techniques, a universally applicable method for understanding the behavior of various AI models. Traditional unsupervised dimensionality reduction methods, such as PCA and ICA, transform high-dimensional features into lower-dimensional spaces but fail to quantitatively translate these features into domain-specific knowledge. The proposed XDR framework incorporates domain knowledge through a supervised approach, using labeled data to train a model that extracts the most significant feature maps from a pre-trained AI model. This allows for a more effective translation of tacit knowledge into explicit, human-understandable knowledge. The study uses a case study involving the recognition of ethnic styles of village dwellings in Guangdong, China to demonstrate the effectiveness of the XDR framework. This region presents a rich context for investigating building styles due to its history of large-scale migrations and the resulting cultural diversity. Existing architectural and historical geographical knowledge provides a strong foundation for incorporating domain knowledge into the XDR framework.
Literature Review
Existing literature highlights the challenges of interpreting AI models' decision-making processes and the need for XAI. Concerns about bias in AI models based on race, gender, and age are discussed, alongside the potential for AI models to "cheat" the learning system. The importance of understanding how AI models work before trusting their decisions is emphasized. XAI is presented as both a process and a product, involving not only identifying causes but also facilitating knowledge transfer between the explainer and explainee. The paper reviews various XAI methods, including dimension reduction, feature importance, attention mechanisms, knowledge distillation, and surrogate models, focusing on the potential of dimension reduction due to its broad applicability. The limitations of conventional unsupervised dimension reduction approaches are noted, particularly their inability to quantitatively translate neural network features into domain knowledge. Existing research on knowledge-infusion XAI methods is reviewed, highlighting the scarcity of studies that demonstrate the generation of new domain knowledge using these methods. The study highlights the potential of XAI to extract tacit knowledge from AI models and translate it into explicit knowledge.
Methodology
The proposed XDR framework consists of four main steps:
(1) **Pyramid Layer Selection:** This step selects relevant feature maps from the Feature Pyramid Network (FPN) of a pre-trained Mask R-CNN model. The selection is based on the size of the target objects (buildings) and uses Eq. 1 to determine the optimal FPN layer for extracting building-related features.
(2) **Building- and Village-Scale Feature Extraction:** This step transforms image-scale feature maps into building-scale and village-scale features. Building-scale features are extracted by cropping the image-scale feature maps according to building bounding boxes. Village-scale features are obtained by averaging the building-scale features for all buildings within a village (Eqs. 2 and 3).
(3) **Infusion of Domain Knowledge:** This step uses domain expert knowledge to quantitatively estimate the importance of different features in differentiating historical village types. Domain experts label satellite images with specific historical village types. These labels, along with village-scale feature maps, are used to train an XGBoost-SHAP model. XGBoost builds a predictive model for village types, while SHAP values quantify the importance of individual features (Eq. 6). The prominent features are identified using a random sampling strategy, repeated 500 times to ensure robustness.
(4) **Proximity Evaluation:** This step computes the proximity relationships and geographical distributions of different village types based on the selected features. Cosine similarity (Eq. 7) is used to measure the similarity between village feature vectors, and a village network is constructed using Gephi. K-means clustering is applied to group villages based on their similarity.
The Mask R-CNN model, pre-trained on the COCO dataset and fine-tuned on over 10,000 building footprint annotations in Guangdong, is used as the AI black box model.
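Steps (1) and (2) can be illustrated with a minimal sketch. This summary does not reproduce Eqs. 1 to 3, so the code makes two assumptions: `fpn_level` uses the level-assignment rule commonly applied with FPN-based detectors (level = floor(k0 + log2(sqrt(w·h) / 224)), clamped to the available levels), and `building_feature` pools each cropped channel to its mean. All function names and the toy feature map are hypothetical, not the authors' implementation.

```python
import math

def fpn_level(width, height, k0=4, canonical=224, k_min=2, k_max=5):
    """Assumed stand-in for Eq. 1: the usual FPN level-assignment rule."""
    k = math.floor(k0 + math.log2(math.sqrt(width * height) / canonical))
    return max(k_min, min(k_max, k))

def building_feature(feature_map, box):
    """Crop a channel-first feature map (C x H x W nested lists) to a box
    (x0, y0, x1, y1) in feature-map cells, then average each channel.
    Mean pooling inside the crop is an assumption of this sketch."""
    x0, y0, x1, y1 = box
    feats = []
    for channel in feature_map:
        cells = [v for row in channel[y0:y1] for v in row[x0:x1]]
        feats.append(sum(cells) / len(cells))
    return feats

def village_feature(building_feats):
    """Average building-scale vectors into one village-scale vector."""
    n = len(building_feats)
    return [sum(col) / n for col in zip(*building_feats)]

# Toy 2-channel, 2x2 feature map holding two one-cell-wide "buildings".
fmap = [[[1.0, 2.0], [3.0, 4.0]],   # channel 0
        [[0.0, 0.0], [2.0, 2.0]]]   # channel 1
b1 = building_feature(fmap, (0, 0, 1, 2))   # left column
b2 = building_feature(fmap, (1, 0, 2, 2))   # right column
village = village_feature([b1, b2])
print(fpn_level(112, 112), village)          # 3 [2.5, 1.0]
```

Under the assumed rule, small objects such as individual dwellings map to the finest pyramid level, which is consistent with selecting the layer by building size.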
The study focuses on three main ethnic groups in Guangdong: Canton, Hakka, and Teochew, with their associated architectural styles. The datasets used include high-resolution satellite imagery from MapQuest and Place of Interest (POI) data to identify traditional villages.
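The SHAP attribution used in step (3) of the framework rests on Shapley values. The sketch below computes them exactly by enumerating feature subsets for a tiny hand-written scorer. It is only an illustration of the idea: the hypothetical `toy_model` stands in for the trained XGBoost classifier, exact enumeration stands in for the SHAP library's estimators, and the 500-fold random sampling strategy is not modeled because its details are not given here.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` keep their real value, the rest use the baseline.
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for s in combinations(others, size):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi

def rank_features(model, samples, baseline):
    """Order feature indices by mean absolute Shapley value over samples."""
    n = len(baseline)
    mean_abs = [0.0] * n
    for x in samples:
        for i, p in enumerate(shapley_values(model, x, baseline)):
            mean_abs[i] += abs(p) / len(samples)
    return sorted(range(n), key=lambda i: mean_abs[i], reverse=True)

def toy_model(z):
    # Hypothetical scorer: feature 0 matters most, feature 2 not at all.
    return 3.0 * z[0] + 1.0 * z[1] + 0.0 * z[2]

baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, [1.0, 2.0, 5.0], baseline)
ranking = rank_features(toy_model, [[1.0, 2.0, 5.0], [2.0, -1.0, 3.0]], baseline)
print([round(p, 6) for p in phi], ranking)
```

For a linear scorer each Shapley value reduces to the weight times the feature's offset from the baseline, and the values sum to the difference between the two predictions, which makes the toy case easy to check by hand.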
Key Findings
The XDR framework identified eleven prominent features (M_semantics) that significantly contribute to distinguishing village types (Fig. 4). These features, visualized using SHAP values, were interpreted by domain experts as representing specific architectural characteristics such as patio presence (Fig. 5), building size, length, direction, and shape. The analysis reveals significant variations in the distribution of these prominent features across different village types.
The XDR results enabled the classification of villages at scale, leading to the identification of eight distinct village clusters: Hakka village (Shaoguan-Qingyuan type), Canton-Hakka mixed village, Canton village, Canton-Teochew mixed village, Hakka-Teochew mixed village, Hakka village (Meizhou type), Teochew village, and Modern village (Fig. 6). Geographical mapping of these clusters showed spatial correlations between village types and ethnic group distribution (Fig. 7). The study discovered a previously undocumented migration pattern, evidenced by the presence of Hakka architectural styles in an area predominantly characterized by Canton-style dwellings (Fig. 9).
An ablation study compared the XDR framework with three alternatives: original feature maps, PCA-based dimensionality reduction, and village-scale features without the domain knowledge infusion process (Fig. 10). The XDR framework outperformed these alternatives in clarity and in consistency with existing geographical knowledge, producing a village network with clear clusters and meaningful spatial interpretations. The framework confirmed existing domain knowledge regarding architectural features such as dwelling shape, patio size, building size, and symmetrical layout (Fig. 11). It also uncovered new knowledge, particularly the presence of previously undocumented mixed-ethnic villages, adding to the understanding of cultural integration and human migration (Fig. 11).
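The proximity evaluation behind the village network and clusters can be sketched as follows. The code computes pairwise cosine similarity, keeps strongly similar pairs as network edges, and groups villages with a plain k-means. Several choices are assumptions of this sketch rather than details from the study: the toy two-dimensional vectors, the `threshold` value, seeding k-means with the first k points, and clustering on unit-normalized vectors (for unit vectors, squared Euclidean distance equals 2 − 2·cosine similarity, so the two measures rank neighbors identically).

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def sq_dist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical village-scale vectors restricted to two prominent features.
villages = {"A": [1.0, 0.1], "B": [0.9, 0.2], "C": [0.1, 1.0], "D": [0.2, 0.8]}
names = list(villages)

threshold = 0.9  # assumed cut-off for drawing a network edge
edges = []
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        sim = cosine_similarity(villages[names[i]], villages[names[j]])
        if sim >= threshold:
            edges.append((names[i], names[j], sim))

labels = kmeans([normalize(villages[n]) for n in names], k=2)
print([(a, b) for a, b, _ in edges], labels)
```

An edge list of this shape (source, target, weight) is the kind of table a network tool such as Gephi can import for layout and visualization.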
Discussion
The XDR framework successfully addresses the challenge of translating AI's tacit knowledge into explicit domain knowledge. The integration of domain expertise throughout the process ensures meaningful interpretation of the results. The framework’s ability to confirm existing domain knowledge and uncover new knowledge highlights its potential for advancing research in diverse fields. The use of SHAP values and the attention mechanism in the Mask R-CNN model facilitates the identification and interpretation of key features. The framework's few-shot learning capability is significant given the limited availability of labeled data in many human geography studies. The scalability of the XDR framework is a key advantage, offering a cost-effective method for large-scale analysis of spatial patterns in human settlements. The study's findings demonstrate the potential of AI to enhance understanding of cultural integration and human migration patterns. The XDR methodology offers a new perspective for studying historical and contemporary settlements.
Conclusion
This study presents a novel XDR framework that successfully bridges the gap between AI's tacit knowledge and human-understandable explicit knowledge. By integrating domain expertise and utilizing advanced machine learning techniques, the framework facilitates the discovery of new knowledge and confirmation of existing understanding. The case study on ethnic village styles in Guangdong exemplifies its effectiveness. The XDR framework holds significant potential for future research across a range of fields. Future work could explore the integration of additional architectural features to further refine the accuracy and detail of analysis.
Limitations
The accuracy of the XDR framework is dependent on the performance of the underlying Mask R-CNN model for building footprint detection. Inaccuracies in bounding boxes could affect the extracted features and subsequent analysis. The current framework primarily focuses on the most important features identified by the SHAP values. Integrating additional key features could enhance the analytical depth and provide a more nuanced understanding of the relationships between architectural styles and ethnic groups. The current study is limited to Guangdong province, China; further research is needed to explore the generalizability of this framework in other geographical contexts.