

Advancing mathematics by guiding human intuition with AI

A. Davies, P. Veličković, et al.

Discover groundbreaking advances in pure mathematics achieved with machine learning by a team of experts from DeepMind and the University of Oxford. This research unveils patterns linking algebraic and geometric structures of knots and proposes a novel algorithm tied to the combinatorial invariance conjecture.
Introduction

The paper investigates whether and how modern machine learning can guide human mathematical intuition to discover new patterns, conjectures, and ultimately theorems in pure mathematics. Motivated by the long-standing use of computation in mathematics and the relative lack of established roles for AI systems in conjecture discovery, the authors propose and test a framework that integrates supervised learning and attribution techniques into the mathematician’s workflow. The goal is to verify the presence of meaningful structure between complex mathematical objects, understand that structure via model interpretation, and use the insights to formulate mathematically interesting, plausible, and provable conjectures. The importance lies in augmenting human insight with data-driven pattern recognition to make progress on challenging open problems, demonstrated here in knot theory and representation theory.

Literature Review

The article situates its work within a history of computational assistance in mathematics, from early data-driven discoveries like the prime number theorem and Birch–Swinnerton-Dyer conjecture, to computational proofs such as the four color theorem. It reviews prior applications of AI/ML in mathematics, including finding counterexamples to conjectures, accelerating computations, generating symbolic solutions, and detecting structure. Previous systems for generating conjectures either produced valuable but specialized conjectures or provided general methods that had not yielded mathematically impactful results. Recent supervised learning efforts have identified patterns in mathematical data, and related work in combinatorics and knot theory has explored ML-based prediction of invariants. The authors build on this by emphasizing model interpretability to translate learned patterns into human-understandable conjectures and proofs.

Methodology

The framework is an iterative, interactive process:
  1. Hypothesize the existence of a relationship between mathematical objects X(z) and Y(z), i.e., a function f such that f(X(z)) = Y(z).
  2. Generate datasets of paired examples (X(z), Y(z)) from an appropriate sampling distribution.
  3. Train a supervised learning model to approximate f and assess its predictive accuracy to confirm nontrivial structure.
  4. Apply attribution techniques, specifically gradient saliency, to identify which components or substructures of X(z) most influence predictions of Y(z).
  5. Use the attribution results to refine hypotheses, focus on salient features, propose candidate closed-form relationships, alter the sampling distribution if needed, and iterate.
  6. Formulate precise conjectures and pursue mathematical proofs.

Case study: knot theory. Hypothesis: previously undiscovered relationships exist between hyperbolic (geometric) invariants and algebraic invariants of knots. The team trained supervised models to predict the signature σ(K) from a suite of geometric invariants. Attribution highlighted cusp-geometry features, the real and imaginary parts of the meridional translation μ and the longitudinal translation λ, as most salient, and a reduced model using only these features maintained accuracy, suggesting they suffice. Guided by this, the authors introduced the natural slope slope(K) = Re(λ/μ) and explored its relationship to the signature. An initial conjectured inequality |2σ(K) − slope(K)| < c₁ vol(K) + c₂ held broadly but admitted counterexamples. Refining the analysis, they proved an inequality incorporating the injectivity radius inj(K), namely |2σ(K) − slope(K)| ≤ c vol(K) inj(K)⁻³, with additional results (e.g., using short geodesics) given in the Supplementary Information.

Case study: representation theory. Hypothesis (the combinatorial invariance conjecture): the Kazhdan–Lusztig (KL) polynomial of a pair of permutations can be computed from the unlabelled Bruhat interval, a directed graph. Graph-based supervised models predicted KL polynomials with notable accuracy. Varying the input graph representation revealed that certain subgraphs, inspired by prior work, improved performance. Attribution over the salient subgraphs showed that extremal reflections were overrepresented relative to baselines, even though the network received no edge labels. This led to the discovery of a natural decomposition of a Bruhat interval into a hypercube, induced by the extremal reflections, and a component isomorphic to an interval in a smaller symmetric group. From this structure, the authors derived a formula computing KL polynomials from the two components and proved a canonical hypercube decomposition theorem. Extensive computational verification supported the stronger conjecture that any hypercube decomposition suffices.
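Steps 3 and 4 of the loop can be sketched in a few lines. The snippet below is a toy illustration with synthetic stand-in data and a linear model trained by gradient descent; it is not the authors' architecture or their invariant data. For a linear model, the gradient saliency of feature j reduces to |w_j|, which is enough to show how attribution singles out the features the model actually relies on.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 2 (stand-in): synthetic paired examples (X(z), Y(z)).
# Only the first two of six features actually determine y.
n, d = 2000, 6
X = rng.normal(size=(n, d))
true_w = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_w + 0.1 * rng.normal(size=n)

# Step 3: fit an approximation of f by gradient descent on squared error.
w = np.zeros(d)
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / n
    w -= 0.05 * grad

# Step 4: gradient saliency. For a linear model, |∂f̂/∂x_j| = |w_j|
# at every input, so the dataset-averaged saliency is just |w|.
saliency = np.abs(w)
ranked = np.argsort(saliency)[::-1]
print(ranked[:2])  # the two features the model depends on
```

In the paper's setting the model is a deep network and the saliency is a per-example input gradient, but the workflow is the same: train, rank features by saliency, then retrain on the top-ranked features to test whether they are sufficient.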

Key Findings
  • A general, machine-learning-guided framework can effectively validate and elucidate relationships between complex mathematical invariants, enabling mathematicians to formulate and prove new results.
  • Knot theory: Attribution pointed to cusp-geometry features (Re(μ), Im(μ), λ) as the key predictors of the signature. This motivated defining the natural slope slope(K) = Re(λ/μ), which empirically tracks 2σ(K) approximately linearly. The authors proved the inequality |2σ(K) − slope(K)| ≤ c vol(K) inj(K)⁻³ for hyperbolic knots; known examples force c ≥ 0.23392, and c ≈ 0.3 appears plausible over the explored range. The result links algebraic and geometric invariants and has implications for Dehn surgery and surface genus.
  • Representation theory: Learned models and attribution uncovered structural prominence of extremal reflections in Bruhat intervals, leading to a canonical hypercube decomposition along extremal reflections from which the KL polynomial is directly computable. Computational checks verified correctness for all 3 × 10^6 intervals in symmetric groups up to S_7 and over 1.3 × 10^9 non-isomorphic intervals sampled from S_8 and S_9. This supports a conjecture that any hypercube decomposition determines the KL polynomial, suggesting a resolution path for the combinatorial invariance conjecture in symmetric groups.
  • Practicality: Models for these tasks can be trained within hours on a single GPU, demonstrating feasibility for mathematical workflows.
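The natural slope and the proved bound are directly computable for any hyperbolic knot whose cusp geometry, volume, and injectivity radius are known. A minimal sketch follows; the numeric values are illustrative placeholders, not data for a real knot, and the constant c = 0.3 is the plausible value noted above rather than a proven one.

```python
def natural_slope(mu: complex, lam: complex) -> float:
    """slope(K) = Re(λ/μ) for meridional translation μ and
    longitudinal translation λ of the cusp."""
    return (lam / mu).real

def bound_holds(sigma: int, slope: float, vol: float, inj: float,
                c: float = 0.3) -> bool:
    """Check |2σ(K) − slope(K)| ≤ c · vol(K) · inj(K)⁻³."""
    return abs(2 * sigma - slope) <= c * vol * inj ** -3

# Illustrative placeholder values (not a real knot's invariants).
mu, lam = 2.0 + 1.0j, -3.0 + 4.0j
s = natural_slope(mu, lam)
print(s, bound_holds(sigma=-1, slope=s, vol=10.0, inj=0.8))
```

In practice the inputs would come from a hyperbolic-geometry package such as SnapPy rather than being entered by hand.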

Discussion

The findings validate the core hypothesis that supervised learning, combined with attribution, can reveal actionable structure in mathematical objects that is otherwise difficult to discern. In knot theory, the framework distilled a complex set of geometric invariants down to cusp-geometry features, guiding the introduction of the natural slope and enabling a provable quantitative link between algebraic and geometric invariants—one of the first of its kind. In representation theory, the approach translated predictive signals into structural insights (extremal reflections and hypercube decompositions), yielding a new decomposition theorem and a compelling conjecture that, if proven, would effectively settle the combinatorial invariance conjecture for symmetric groups. These results illustrate how ML can serve as a test bed for mathematical intuition, rapidly vetting patterns, focusing attention on salient aspects, and suggesting proof strategies, thereby enhancing the mathematician’s toolkit and fostering productive human–AI collaboration.
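The Bruhat intervals at the heart of the second case study are concrete, enumerable objects. The sketch below builds one as a plain set of permutations using the classical tableau (Ehresmann) criterion for Bruhat order; it illustrates the kind of input the graph models were trained on, not the authors' actual graph encoding, and brute-force enumeration like this is only feasible for small symmetric groups.

```python
from itertools import permutations

def bruhat_leq(u, v):
    """u ≤ v in Bruhat order iff, for every prefix length k, the sorted
    k-prefix of u is entrywise ≤ the sorted k-prefix of v (tableau criterion)."""
    n = len(u)
    for k in range(1, n):
        for a, b in zip(sorted(u[:k]), sorted(v[:k])):
            if a > b:
                return False
    return True

def bruhat_interval(u, v):
    """All permutations w in S_n with u ≤ w ≤ v."""
    n = len(u)
    return [w for w in permutations(range(1, n + 1))
            if bruhat_leq(u, w) and bruhat_leq(w, v)]

# The interval from the identity to the longest element is all of S_4.
full = bruhat_interval((1, 2, 3, 4), (4, 3, 2, 1))
print(len(full))  # 24
```

The unlabelled interval the conjecture speaks of is this set together with its covering relations only, with the edge labels (which reflections connect which pairs) discarded; the surprise in the paper is that extremal reflections can nevertheless be recovered from the model's saliency.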

Conclusion

The paper introduces and demonstrates an ML-guided framework that augments mathematical intuition, leading to notable contributions in two distinct domains: a quantitative relationship between knot signature and cusp geometry via the natural slope, and a canonical hypercube decomposition of Bruhat intervals that computes KL polynomials, with strong empirical support for a broader conjecture. Rather than replacing human creativity, the framework channels learned patterns into interpretable insights that inform conjectures and proofs. Future directions include proving the conjectured generality of hypercube decompositions for KL polynomials, strengthening the knot-theoretic bounds to avoid dependence on inj(K)⁻³, extending the methodology to other areas of pure mathematics, and developing richer attribution and model-interpretability tools tailored to mathematical structures.

Limitations

The framework requires the ability to generate large datasets of mathematical objects and associated invariants; it is most effective when patterns are detectable within computationally accessible examples. Some target functions may be too complex to learn accurately with available models or data. In the knot theory result, the current bound depends on the injectivity radius term inj(K)⁻³, which may be undesirable and motivates further refinement (alternative bounds using short geodesics are discussed in supplementary materials). More generally, insights from attribution depend on model choice and training stability, and care is needed to avoid overinterpreting spurious correlations.
