Synthesizing theories of human language with Bayesian program induction

Linguistics and Languages

K. Ellis, A. Albright, et al.

This work presents a framework that combines Bayesian inference with program synthesis to generate interpretable morpho-phonological models from 70 datasets spanning 58 languages. The research, by Kevin Ellis, Adam Albright, Armando Solar-Lezama, Joshua B. Tenenbaum, and Timothy J. O'Donnell, points toward machine-enabled discovery of interpretable models in linguistics and other scientific domains.
Abstract
Automated, data-driven construction and evaluation of scientific models and theories is a long-standing challenge in artificial intelligence. We present a framework for algorithmically synthesizing models of a basic part of human language: morpho-phonology, the system that builds word forms from sounds. We integrate Bayesian inference with program synthesis and representations inspired by linguistic theory and cognitive models of learning and discovery. Across 70 datasets from 58 diverse languages, our system synthesizes human-interpretable models for core aspects of each language's morpho-phonology, sometimes approaching models posited by human linguists. Joint inference across all 70 datasets automatically synthesizes a meta-model encoding interpretable cross-language typological tendencies. Finally, the same algorithm captures few-shot learning dynamics, acquiring new morpho-phonological rules from just one or a few examples. These results suggest routes to more powerful machine-enabled discovery of interpretable models in linguistics and other scientific domains.
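To make the idea of combining Bayesian inference with program synthesis concrete, here is a minimal toy sketch. It is not the authors' system: the data, the candidate hypotheses, the cost values, and all names (`devoice`, `HYPOTHESES`, `map_hypothesis`, `EPS`) are assumptions made up for illustration. Each hypothesis is a small program that maps a stem to a surface form; a prior penalizes longer descriptions, a likelihood rewards programs that reproduce the observed forms, and the highest-scoring program wins.

```python
import math

# Toy assumption: these letters stand in for voiceless consonants.
VOICELESS = set("ptkfs")

def devoice(form):
    """Hypothetical rule: z -> s when the preceding segment is voiceless."""
    out = []
    for ch in form:
        if ch == "z" and out and out[-1] in VOICELESS:
            out.append("s")
        else:
            out.append(ch)
    return "".join(out)

# Made-up (stem, observed plural) pairs in a simplified spelling.
DATA = [("dog", "dogz"), ("cat", "cats"), ("bud", "budz"), ("cap", "caps")]

def make_suffix_model(suffix, rules):
    """Build a program: attach the suffix, then apply each rule in order."""
    def predict(stem):
        form = stem + suffix
        for rule in rules:
            form = rule(form)
        return form
    return predict

MEMORIZED = dict(DATA)

# Each hypothesis is (program, description cost). Costs are illustrative
# symbol counts, not values taken from the paper.
HYPOTHESES = {
    "suffix z, no rule": (make_suffix_model("z", []), 1),
    "suffix s, no rule": (make_suffix_model("s", []), 1),
    "suffix z + devoicing": (make_suffix_model("z", [devoice]), 5),
    "memorize every form": (lambda stem: MEMORIZED.get(stem, stem),
                            sum(len(surface) for _, surface in DATA)),
}

EPS = 0.01  # assumed probability that an observed form deviates from the program

def log_posterior(predict, cost, data):
    """Unnormalized log posterior: description-length prior plus noisy likelihood."""
    log_prior = -cost * math.log(2)
    log_lik = sum(
        math.log(1 - EPS) if predict(stem) == surface else math.log(EPS)
        for stem, surface in data
    )
    return log_prior + log_lik

def map_hypothesis(data):
    """Return the name of the best-scoring hypothesis and all scores."""
    scores = {name: log_posterior(p, c, data) for name, (p, c) in HYPOTHESES.items()}
    return max(scores, key=scores.get), scores

best, scores = map_hypothesis(DATA)
print(best)
```

In this toy setting the compact "suffix plus one rule" program beats both the rule-free suffixes (which mispredict half the forms) and pure memorization (which fits the data but has a much longer description). The sketch only enumerates four hand-written candidates; searching a large space of possible rules is the part that program synthesis would have to handle.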
Publisher
Nature Communications
Published On
Aug 30, 2022
Authors
Kevin Ellis, Adam Albright, Armando Solar-Lezama, Joshua B. Tenenbaum, Timothy J. O'Donnell
Tags
morpho-phonology
Bayesian inference
program synthesis
language models
few-shot learning
cross-language typology
machine learning