
Graphene-based 3D XNOR-VRRAM with ternary precision for neuromorphic computing

B. Alimkhanuly, J. Sohn, et al.

Discover how microfabricated, graphene-based vertical RRAM (VRRAM) can advance neuromorphic computing by improving energy efficiency and recognition accuracy. This research by Batyrbek Alimkhanuly, Joon Sohn, Ik-Joon Chang, and Seunghyun Lee showcases the advantages of graphene electrodes in 3D memory architectures.

Introduction
Deep neural networks require high compute and memory bandwidth, which is constrained by the von Neumann separation of memory and processing. Memory-centric paradigms (in-/near-memory and neuromorphic computing) and emerging NVMs such as RRAM are promising. Quantizing neural networks to binary or ternary weights (e.g., XNOR-Net) reduces model size and computation cost while retaining high accuracy, making binary RRAM attractive as synapses. To achieve high density and bit-cost efficiency, 3D vertical stacking (VRRAM) is used, but conventional metal word-plane (WP) electrodes introduce parasitics that limit scaling and weighted-sum operation across multiple layers. This work explores replacing metal WPs with atomically thin graphene in 3D VRRAM, aiming to improve device switching characteristics, interconnect performance, and system-level neuromorphic computing, and proposes an XNOR-inspired architecture to implement 1-bit ternary synaptic weights in graphene-based VRRAM.
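The efficiency gain from binarization rests on a simple identity: for vectors with entries in {−1, +1}, a dot product can be computed with a bitwise XNOR followed by a bit count. The snippet below is a minimal Python sketch of that identity as used in the general XNOR-Net idea; it is an illustration, not code from the paper.

```python
# Minimal sketch: for x, w in {-1, +1}^n with +1 encoded as bit 1,
# the dot product x.w equals 2*popcount(XNOR(x, w)) - n.
import numpy as np

def dot_via_xnor(x_bits: int, w_bits: int, n: int) -> int:
    """x_bits/w_bits hold n binary-encoded entries (+1 -> 1, -1 -> 0) packed into an int."""
    xnor = ~(x_bits ^ w_bits) & ((1 << n) - 1)   # bitwise XNOR, masked to n bits
    return 2 * bin(xnor).count("1") - n          # 2*popcount - n

# Check against the ordinary dot product on random +/-1 vectors
rng = np.random.default_rng(0)
n = 16
x = rng.choice([-1, 1], size=n)
w = rng.choice([-1, 1], size=n)
to_bits = lambda v: int("".join("1" if s > 0 else "0" for s in v), 2)
assert dot_via_xnor(to_bits(x), to_bits(w), n) == int(x @ w)
print(dot_via_xnor(to_bits(x), to_bits(w), n), int(x @ w))
```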
Literature Review
The paper situates its contribution within: (1) advances in DNN quantization to binary/ternary weights enabling XNOR/bit-count operations with energy and size benefits; (2) use of RRAM and 3D VRRAM architectures for dense, bit-cost-efficient memory arrays; (3) limitations of conventional metal electrodes in 3D VRRAM due to parasitic resistance and scaling; (4) prior demonstrations of graphene electrodes/interconnects improving conductivity and device behavior, and potential further gains via graphene doping; and (5) prior circuit/array modeling (e.g., Stanford RRAM Verilog-A) and neuromorphic systems using memristive arrays. These studies motivate integrating graphene WPs into VRRAM and tailoring circuits/algorithms (XNOR with ternary precision) to exploit the device/interconnect advantages.
Methodology
Devices: Two-layer TiN/HfO2/Pt (Pt-RRAM) and TiN/HfO2/graphene (Gr-RRAM) VRRAM cells were fabricated. Pt WP thickness ~5 nm; graphene WP ~0.3 nm (monolayer), with an Al2O3 adhesion layer under the graphene. TEM confirmed stack integrity. Electrical characterization captured DC I–V, endurance under pulses, and retention/read noise at elevated temperatures. Distinctive for Gr-RRAM is an inverted SET/RESET polarity due to graphene acting as an oxygen reservoir; lower switching voltages and currents arise from the sub-nm electrode and edge-field concentration. Cycle-to-cycle switching variability was quantified over 30–100 cycles.

Programming/read protocols: Based on measured distributions, safe 1/2-bias write windows were set to avoid half-select disturbance. Pt-RRAM: V_SET ≈ +2 V and V_RESET ≈ −2.4 V (safe ranges ~1.5–2 V and −1.5 to −2.4 V). Gr-RRAM: V_SET ≈ −1.5 V and V_RESET ≈ +0.5 V (safe ranges −1.0 to −1.5 V and 0.4–0.5 V). Read voltage V_R = 0.1 V.

Modeling: A Verilog-A compact model combining tunneling-gap evolution and conductive-filament radius evolution was calibrated to experimental data, including intrinsic programming variations and read noise. Switching current constraints: Pt-RRAM I_w ≈ 80 µA; Gr-RRAM I_w ≈ 5 µA.

Array architecture and simulation: A 3D VRRAM array was modeled in HSPICE using 2×2×2 subcircuit tiles with a virtual node to build large arrays (up to 416×224×8). Array biasing: 1/2-bias write; single-WP read at V_R with other lines grounded; vertical selector transistors (1T-kR configuration) control cell access. Weighted-sum (WS) operations are performed along bit-lines (BLs), with WPs serving as inputs. Interconnect parameters (WP and pillar resistances) were derived from ITRS and the literature; selector transistors used PTM sub-45 nm models. Three WP options were evaluated: metal Pt (Pt-RRAM), pristine graphene (Gr-RRAM), and doped graphene (DGr-RRAM) with reduced sheet resistance; DGr-RRAM assumed the same device switching as Gr-RRAM but lower WP resistivity.

Neural network and XNOR architecture: A two-layer MLP (400 input, 200 hidden, 10 output neurons) was trained on binarized 20×20 MNIST (60,000 training / 10,000 test images). Weights were trained at higher precision (e.g., 6-bit) and quantized to 1-bit ternary values (−1, 0, +1) using a sign bit and condition bits, enabling XNOR-based vector–matrix multiplication (VMM) in the 3D VRRAM with two vertical layers per synapse. Inputs were encoded by differential read pulses on the top/bottom layers; pruning (weight = 0) was implemented by programming both cells to HRS. Worst-case WS read inaccuracy from the array simulations was injected during inference, and Monte Carlo analyses sampled read inaccuracy uniformly over device-specific ranges. The effects of read and write noise (cycle-to-cycle and device-to-device variations) on accuracy were evaluated.
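To make the weight mapping concrete, here is a minimal NumPy sketch of the scheme summarized above: trained weights quantized to {−1, 0, +1}, each ternary weight stored as a top/bottom cell pair (pruned weights leave both cells in HRS), binary inputs applied as differential read pulses, and an optional uniform relative read inaccuracy injected in Monte Carlo fashion. The conductance values, pruning threshold, and error range are illustrative assumptions, not parameters from the paper.

```python
# Minimal sketch (not the authors' code) of the 1-bit ternary, two-cells-per-synapse scheme.
import numpy as np

G_LRS, G_HRS = 1e-5, 1e-7          # assumed ON/OFF conductances (S), illustrative only

def quantize_ternary(w, threshold=0.05):
    """Map trained float weights to {-1, 0, +1}; small weights are pruned to 0."""
    t = np.sign(w).astype(int)
    t[np.abs(w) < threshold] = 0
    return t

def map_to_cells(t):
    """Each ternary weight uses two vertically stacked cells:
       +1 -> (LRS, HRS), -1 -> (HRS, LRS), 0 -> (HRS, HRS) (both cells pruned)."""
    g_top = np.where(t == +1, G_LRS, G_HRS)
    g_bot = np.where(t == -1, G_LRS, G_HRS)
    return g_top, g_bot

def weighted_sum(x_bin, g_top, g_bot, v_read=0.1, rel_err=0.0, rng=None):
    """Differential weighted sum along a bit-line: inputs in {+1, -1} select +V_R on the
       top layer and -V_R on the bottom layer, so the bit-line current approximates
       sum_i x_i * w_i. rel_err injects a uniform relative read inaccuracy."""
    rng = np.random.default_rng() if rng is None else rng
    v_top = np.where(x_bin > 0, +v_read, -v_read)
    i_bl = v_top @ g_top + (-v_top) @ g_bot      # Kirchhoff current sum per column
    if rel_err > 0:
        i_bl *= 1 + rng.uniform(-rel_err, rel_err, size=i_bl.shape)
    return i_bl

# Toy usage: one 400-input layer with 10 outputs and binary inputs in {-1, +1}
w = np.random.randn(400, 10) * 0.2
g_top, g_bot = map_to_cells(quantize_ternary(w))
x = np.where(np.random.randn(400) > 0, 1, -1)
print(weighted_sum(x, g_top, g_bot, rel_err=0.10)[:3])
```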
Key Findings
- Device behavior: Gr-RRAM exhibits inverted bipolar switching polarity relative to Pt-RRAM due to graphene's role as an oxygen reservoir. Switching voltages and currents are significantly lower in Gr-RRAM (I_w ≈ 5 µA) than in Pt-RRAM (I_w ≈ 80 µA). Cycle-to-cycle variation σ/μ for the SET voltage: ~13% (Pt-RRAM) vs ~6.4% (Gr-RRAM). Measured distributions: Pt-RRAM V_SET mean ≈ +1.27 V (σ ≈ 0.063 V) and V_RESET mean ≈ −1.37 V (σ ≈ 0.052 V); Gr-RRAM V_SET mean ≈ −0.89 V (σ ≈ 0.047 V) and V_RESET mean ≈ +0.31 V (σ ≈ 0.034 V). Gr-RRAM endurance under pulses (1.5 V, 500 ns) showed an ON/OFF ratio >10×; read noise increased with temperature (up to ~200 °C).
- Programming windows: Safe write pulses were established: Pt-RRAM V_W = {+2.0 V SET, −2.4 V RESET}; Gr-RRAM V_W = {−1.5 V SET, +0.5 V RESET}; read at V_R = 0.1 V.
- Array scalability and access: In worst-case access, Pt-RRAM fails to meet the minimum access voltage beyond ~128×128 planar size, entering probabilistic switching; Gr-RRAM and DGr-RRAM maintain safe programming across the considered sizes up to 416×224×8.
- Energy: Array-level programming energy is reduced in Gr-RRAM vs Pt-RRAM by ~8× (SET) and ~262× (RESET), reaching sub-pJ levels for RESET, aided by the lower switching current and reduced parasitics; half-selected cells account for the reduced ratio versus single-device estimates.
- Read/WS performance: All three arrays met a 100 nA read margin at 416×224 for sense discrimination. WS read inaccuracy (worst-case weights): Gr-RRAM and DGr-RRAM ≤ ~10%; Pt-RRAM significantly higher. Parallel BL read degrades WS accuracy due to sneak paths; ~8 BLs are optimal for 416×224×8, and DGr-RRAM tolerates more parallelism due to its lower WP resistivity.
- Interconnect and thickness constraints (Shmoo analysis): Metal WP resistivity rises sharply below ~5 nm; Pt-RRAM requires a WP thickness >~30 nm to pass write/read/WS operations, conflicting with high-density stacking goals. Below ~3 nm, metal-WP arrays fail regardless of I_w. The graphene WP (0.3 nm) passes all operations. Voltage-drop analysis shows severe parasitic losses in thin-metal WPs, undermining the selected-cell bias, while graphene avoids this (a simplified illustration of this IR-drop check is sketched after this list).
- Recognition accuracy: With injected array WS inaccuracies and device noise, MNIST MLP accuracy reached ~83.5% (Pt-RRAM) vs ~94.1% (Gr-RRAM), approaching ideal ternary NN performance; graphene-based arrays showed tighter accuracy distributions in Monte Carlo runs compared to Pt-RRAM.
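As a rough illustration of the word-plane voltage-drop argument behind the Shmoo analysis, the sketch below checks whether the farthest selected cell on a word-plane still sees a voltage inside its safe write window after the series IR drop. The sheet resistances, write currents, array width, and window limits are placeholder assumptions chosen only to show the shape of the calculation; they are not values reported in the paper.

```python
# Rough sketch of a word-plane (WP) IR-drop check; all numbers are illustrative assumptions.

def far_cell_voltage(v_applied, r_sheet_ohm_sq, n_cells, i_write):
    """Worst-case voltage at the farthest selected cell, lumping the WP as n_cells
       series squares that all carry the selected cell's write current (pessimistic)."""
    r_wp = r_sheet_ohm_sq * n_cells           # total series WP resistance (ohms)
    return v_applied - i_write * r_wp         # applied bias minus IR drop

# Hypothetical comparison along a 224-cell word-plane:
cases = {
    # name: (applied write bias, assumed sheet resistance, assumed write current, window minimum)
    "ultra-thin metal WP":   (2.0, 150.0, 80e-6, 1.5),
    "monolayer graphene WP": (1.5, 300.0, 5e-6, 1.0),
}
for name, (v_app, r_sq, i_w, v_min) in cases.items():
    v_far = far_cell_voltage(v_app, r_sq, n_cells=224, i_write=i_w)
    print(f"{name}: far-cell voltage ≈ {v_far:.2f} V "
          f"({'within' if v_far >= v_min else 'below'} the assumed safe window)")
```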
Discussion
Replacing metal WPs with graphene directly addresses parasitic resistance and scaling bottlenecks in 3D VRRAM, enabling reliable access and WS operations at larger planar sizes and thinner interconnects. Graphene’s ultrathin, conductive WP reduces IR drops, supports low switching currents/voltages, and improves read/write margins, which collectively translate into substantial programming energy savings and accurate in-memory VMM. The XNOR-based ternary architecture leverages binary device behavior and natural pruning to mitigate device/circuit nonidealities, making the system less sensitive to read/write noise compared to analog-weight implementations. Consequently, graphene-based VRRAM arrays achieve significantly higher inference accuracy than metal-based counterparts, demonstrating that material-level innovation can propagate benefits through circuit and algorithm layers for neuromorphic computing.
Conclusion
The study presents a holistic device–circuit–architecture approach showing that graphene word-planes in 3D VRRAM enhance scalability, reduce parasitics, and enable low-energy, accurate in-memory computing. An XNOR-inspired architecture with 1-bit ternary synapses implemented on graphene-based VRRAM achieves robust WS operations and high MNIST recognition accuracy (~94.1%), with large energy reductions (~8× SET, ~262× RESET) versus metal WP arrays. Results highlight the synergy between 2D materials and vertical memory architectures for memory-centric neuromorphic systems. Future work should focus on: (1) process integration advances for graphene (low-temperature synthesis, high-quality dry transfer); (2) exploring doped/multilayer graphene WPs to further reduce resistivity and expand parallelism; (3) experimental demonstration of larger-scale stacked arrays; (4) selector/device engineering to support even lower currents and higher stack counts; and (5) expanding XNOR-based schemes to deeper networks and more complex datasets.
Limitations
Large-scale array results are based on compact models calibrated to two-layer devices; full experimental validation on large stacked arrays is pending. Assumptions for doped graphene (DGr-RRAM) treat device switching as identical to pristine graphene, which may differ in practice. Graphene integration faces BEOL temperature limits and transfer-induced defects; process variability could affect interconnect performance. The monolithic WP pattern and 1T-kR configuration constrain input vector flexibility and can idle entire pillars for certain inputs, limiting multilayer parallelism. WS accuracy degrades with excessive parallel BL reads due to sneak currents, necessitating constraints (e.g., ~8 BLs) on parallelism.