Engineering and Technology
Neural structure fields with application to crystal structure autoencoders
N. Chiba, Y. Suzuki, et al.
Materials science traditionally studies structure–property relationships, but conventional discovery can be labor-intensive and serendipity-driven. Materials informatics leverages machine learning on large databases to predict properties from structures and compositions. While encoding crystal structures is well-handled by graph neural networks, decoding or determining structures remains a bottleneck due to variable atom counts, lack of ordering, and the need for fixed-size tensors in ML. The authors propose Neural Structure Fields (NeSF), representing a crystal structure as continuous vector fields over 3D space, enabling neural networks to decode structures. NeSF uses a position field that points from any query point to the nearest atom and a species field that predicts the species of that nearest atom. This implicit representation avoids issues with directly outputting unordered, variable-size atomic sets and aims to enable inverse design tasks that map desired properties to structures.
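The two fields are easiest to see through the ground-truth targets they learn to predict. A minimal sketch, assuming a toy list of (position, species) pairs; the helper name `nearest_atom_targets` is hypothetical, and in the paper these targets are predicted by neural networks f_r(p, z) and f_s(p, z) rather than computed from known atoms:

```python
import math

def nearest_atom_targets(p, atoms):
    """Ground-truth targets for NeSF's two fields at a query point p:
    the vector to the nearest atom (position field, ideally a - p)
    and that atom's species (species field).
    `atoms` is a list of (position, species) pairs; illustrative only."""
    pos, species = min(atoms, key=lambda a: math.dist(p, a[0]))
    return tuple(c - q for c, q in zip(pos, p)), species

# Toy rock-salt-like pair of atoms.
atoms = [((0.0, 0.0, 0.0), "Na"), ((2.0, 2.0, 2.0), "Cl")]
vec, species = nearest_atom_targets((0.5, 0.0, 0.0), atoms)
# vec points back toward the Na atom at the origin.
```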
Prior materials informatics work focuses on predicting properties from given structures or compositions using deep learning and graph neural networks. Inverse design of crystal structures has been explored via generative models (e.g., GANs, autoencoders) and grid-based 3D representations (ICSG3D), which voxelize unit cells and use 3D CNNs. However, voxelization in 3D suffers from a severe tradeoff between spatial resolution and computational cost, limiting practical resolutions (e.g., 32×32×32 in ICSG3D) and hindering accurate representation of complex, elongated, or distorted cells. Implicit neural representations in computer vision (e.g., DeepSDF, occupancy networks, NeRF) address analogous representation issues for 3D shapes by learning continuous fields. The authors extend this paradigm to crystalline materials, directly modeling nearest-atom vectors and species to more precisely capture atomic positions than scalar distance or occupancy fields.
NeSF represents a crystal structure via two neural fields conditioned on a latent vector z: a position field f_r(p, z) that outputs the 3D vector from a query point p to its nearest atom a (ideally a − p), and a species field f_s(p, z) that outputs a categorical distribution over the atomic species of that nearest atom. Lattice constants (lengths a, b, c and angles α, β, γ) are predicted from z by separate MLPs. The position field can be viewed as the gradient of a scalar potential representing the squared distance to the nearest atom. Estimation algorithm (decoding from z):
1) Initialize: predict the lattice, then place particles on a regular 3D grid within a dataset-specific bounding box.
2) Move: iteratively update p ← p + f_r(p, z), so that particles flow toward their nearest atoms.
3) Score: rank particles by ||f_r(p, z)|| (the estimated residual distance) and discard those above a 0.9 Å threshold.
4) Detect atoms: cluster particles by non-maximum suppression: repeatedly accept the lowest-score particle and suppress its neighbors within a 0.5 Å sphere (requiring ≥10 particles per sphere), yielding atomic positions {a_i}.
5) Estimate species: around each detected a_i, densely sample local query points, evaluate f_s at each, and assign the majority-vote species.
The algorithm is deterministic. Training: the fields are trained directly, without the iterative particle movement, by sampling 3D query points and supervising the outputs against the nearest atom's offset and species. Two sampling strategies support the decoding dynamics: global grid sampling (uniform coverage with Gaussian perturbations) for field accuracy across the unit cell, and local grid sampling (around atomic sites, with perturbations) for high accuracy near atoms. The position field is trained with both sampling types; the species field with local sampling only. Autoencoder: a PointNet/DeepSets-style encoder maps an input crystal (atomic positions, species, and lattice) to a 192-D latent vector z.
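The decoding steps above can be sketched as follows. This is a simplified illustration, not the paper's implementation: the latent z is folded into the field function, species voting is omitted, and the "oracle" field stands in for a trained f_r. The thresholds follow the paper (0.9 Å score cutoff, 0.5 Å suppression spheres, ≥10 supporting particles):

```python
import math

def decode_structure(f_r, particles, n_steps=8,
                     dist_thresh=0.9, nms_radius=0.5, min_support=10):
    """Sketch of NeSF decoding: flow particles along the position field,
    score them, and cluster survivors into atomic positions."""
    # Step 2: let particles flow toward their nearest atoms.
    for _ in range(n_steps):
        particles = [tuple(pi + vi for pi, vi in zip(p, f_r(p)))
                     for p in particles]
    # Step 3: score by estimated residual distance, drop far particles.
    scored = [(math.hypot(*f_r(p)), p) for p in particles]
    scored = sorted((s, p) for s, p in scored if s <= dist_thresh)
    # Step 4: non-maximum suppression yields the atomic positions.
    atoms = []
    while scored:
        _, p0 = scored[0]
        cluster = [p for _, p in scored if math.dist(p, p0) <= nms_radius]
        scored = [(s, p) for s, p in scored if math.dist(p, p0) > nms_radius]
        if len(cluster) >= min_support:
            atoms.append(p0)
    return atoms

# Toy check with an exact "oracle" field around two known atoms.
true_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
def oracle_field(p):
    a = min(true_atoms, key=lambda a: math.dist(p, a))
    return tuple(ai - pi for ai, pi in zip(a, p))

grid = [(0.1 * j, 0.0, 0.0) for j in range(30)]
found = decode_structure(oracle_field, grid)
```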
The NeSF decoder (the two fields plus lattice decoders) reconstructs the structure from z via the estimation algorithm above.
Network architecture: the encoder embeds each atom into 512-D features, aggregates them via max pooling, and maps the result to a 192-D z. Position field: 9-layer MLP; species field: 3-layer MLP; lattice length and angle MLPs: 2 layers each.
Datasets: three datasets derived from the Materials Project: ICSG3D (cubic AB, ABX2, ABX3; 7897 samples; easiest), LCS6Å (unit cell sizes ≤6 Å along x, y, z; 6005 samples; diverse), and YBCO-like (100 samples with a narrow c-axis, YBCO-type; most challenging). Splits: 90.25% train, 4.75% validation, 5% test (YBCO-like uses 20-fold cross-validation).
Training: loss L = 10·L_pos + 0.1·L_spe + 1·L_len + 1·L_ang, with MSE for positions and lattice parameters and cross-entropy for species. Optimizer: Adam, learning rate 1e-3 decayed by 0.5 every 640 epochs, batch size 128, 3200 epochs with early stopping (except YBCO-like).
Computation: a single Quadro RTX 8000; training takes ~11 h (ICSG3D), ~9 h (LCS6Å), and ~1 h (YBCO-like).
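The weighted loss can be written out directly. A minimal sketch, assuming batching and tensor shapes are simplified to flat Python lists (the paper uses a deep-learning framework); the function name `nesf_loss` is hypothetical:

```python
import math

def nesf_loss(pos_pred, pos_true, spe_prob, spe_true,
              len_pred, len_true, ang_pred, ang_true):
    """L = 10*L_pos + 0.1*L_spe + 1*L_len + 1*L_ang, with MSE for
    positions and lattice parameters and cross-entropy for species."""
    def mse(pred, true):
        return sum((x - y) ** 2 for x, y in zip(pred, true)) / len(true)

    def cross_entropy(probs, labels):
        # probs: per-query categorical distributions; labels: class indices.
        return sum(-math.log(p[t]) for p, t in zip(probs, labels)) / len(labels)

    return (10.0 * mse(pos_pred, pos_true)
            + 0.1 * cross_entropy(spe_prob, spe_true)
            + 1.0 * mse(len_pred, len_true)
            + 1.0 * mse(ang_pred, ang_true))

# Perfect predictions drive every term to zero.
loss = nesf_loss([0.0, 0.0], [0.0, 0.0],
                 [[1.0, 0.0]], [0],
                 [4.0, 4.0, 4.0], [4.0, 4.0, 4.0],
                 [90.0, 90.0, 90.0], [90.0, 90.0, 90.0])
```

The 10× weight on positions reflects that sub-ångström position accuracy dominates reconstruction quality, while the species term is down-weighted relative to it.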
- Across the three datasets, the NeSF autoencoder consistently outperforms the voxelization-based ICSG3D baseline at reconstructing crystal structures, particularly in species identification and position accuracy on complex structures.
- ICSG3D dataset (cubic AB/ABX2/ABX3): ICSG3D achieves good atom-count and position errors but a high species error (~65%), whereas NeSF achieves similar or better position and atom-count accuracy with a drastically lower species error (~4%).
- LCS6Å dataset (diverse, ≤6 Å cells), NeSF vs ICSG3D (mean ± SD):
  • Error in number of atoms: 6.35% ± 0.50 vs 17.28% ± 1.13.
  • Position error (actual): 0.1161 Å ± 0.0073 vs 0.2006 Å ± 0.0132.
  • Position error (detected): 0.1632 Å ± 0.0123 vs 0.2886 Å ± 0.0159.
  • Species error (actual): 14.78% ± 0.81 vs 55.70% ± 1.40.
  • Species error (detected): 16.11% ± 0.66 vs 58.32% ± 1.45.
  • Lattice length error: 0.05 Å ± 0.00 vs 0.06 Å ± 0.01.
  • Lattice angle error: 0.19° ± 0.04 vs 0.34° ± 0.07.
- YBCO-like dataset (narrow c-axis, complex): ICSG3D failed to output any atom for 44 of 100 materials, so its actual-position error is computed only on the remaining subset, which understates its true error. NeSF vs ICSG3D:
  • Error in number of atoms: 12.00% vs 91.00%.
  • Position error (actual): 0.2631 Å vs 0.6311 Å (ICSG3D scored excluding failures).
  • Position error (detected): 0.2448 Å vs 0.4358 Å.
  • Species error (actual): 14.78% vs 78.08%.
  • Species error (detected): 19.89% vs 54.24%.
  • Lattice length error: 0.25 Å vs 0.10 Å.
  • Lattice angle error: 2.91° vs 0.06°.
- Performance vs complexity: errors generally increase with the number of atoms for both methods, but NeSF degrades less and consistently outperforms ICSG3D; ICSG3D underestimates atom counts more on LCS6Å, likely due to its 32³ voxel resolution limit.
- Model efficiency: the NeSF autoencoder has 0.76M parameters (2.24% of ICSG3D's 34M), aiding generalization, especially on small datasets.
- Latent space quality: Interpolations between known structures (e.g., ZnS→CdS; MgO→NaCl) preserve AX composition and cubic prototypes along paths, showing smooth, meaningful structure changes. t-SNE on LCS6Å latent vectors reveals clustering correlated with cell volume, atom count, and space group, indicating the latent space captures structural similarity.
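The interpolation experiments amount to walking a straight line in latent space and decoding each point. A minimal sketch, assuming simple linear interpolation (the paper does not specify the interpolation scheme); the toy 2-D latents stand in for the 192-D codes, and the decoder call is omitted:

```python
def interpolate_latents(z_a, z_b, n=5):
    """Return n latent codes evenly spaced on the segment from z_a to z_b;
    each intermediate z would then be decoded by NeSF into a structure."""
    steps = []
    for k in range(n):
        t = k / (n - 1)
        steps.append([(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return steps

# Toy 2-D latents for illustration (e.g. z(ZnS) -> z(CdS) in the paper).
path = interpolate_latents([0.0, 2.0], [2.0, 0.0], n=3)
```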
The findings demonstrate that representing crystal structures as continuous neural fields eliminates the spatial-resolution and memory tradeoffs inherent to voxel grids, enabling accurate decoding of complex and elongated/distorted unit cells. NeSF’s direct vector-to-nearest-atom and categorical species predictions provide more precise atomic reconstructions than electron-density voxel peaks, improving both position and species accuracy. The smaller parameter count relative to 3D CNNs likely contributes to better generalization, particularly on limited data (e.g., YBCO-like). Analysis across atom-count regimes shows robustness as structural complexity grows. Qualitative latent-space analyses (interpolations and t-SNE) indicate the encoder-decoder learns a meaningful structural manifold where similar structures map nearby and interpolations yield plausible intermediate structures that preserve key prototypes and compositions. These characteristics support the suitability of NeSF for inverse design workflows where decoding structures from target representations is required.
The paper introduces Neural Structure Fields (NeSF), a continuous, implicit neural representation for crystal structures comprising position and species fields, enabling neural networks to decode variable-size, unordered atomic configurations. Integrated into a crystal-structure autoencoder with a PointNet/DeepSets-style encoder, NeSF reconstructs diverse structures and consistently outperforms a voxel-based baseline (ICSG3D), particularly for species and position accuracy on complex datasets. The approach circumvents voxel resolution limits and is parameter-efficient. The learned latent space captures structural similarities and supports smooth interpolations. Future work includes enforcing space-group symmetry in decoding, integrating NeSF into generative models (e.g., VAEs, GANs) for inverse design of novel crystals, and learning latent representations that better disentangle and predict material properties.
- Symmetry: the current NeSF does not explicitly enforce space-group symmetry, so reconstructed conventional cells may violate crystallographic symmetry constraints.
- Generative evaluation: experiments focus on autoencoding; applying NeSF within generative models to produce novel, out-of-distribution structures remains future work, and quantitative evaluation is limited by the lack of ground truth for generated structures.
- Property prediction: latent vectors learned via reconstruction have limited effectiveness as descriptors for property prediction; disentangling structural and property factors would require dedicated training objectives.
- Practical sensitivities: decoding involves hyperparameters for particle sampling, thresholds, and clustering; although these are analyzed in the Supplementary Information, performance can be sensitive to the settings.