This paper investigates whether large language models (LLMs) can generate novel research ideas. The authors conducted a large-scale human study involving over 100 NLP researchers to compare LLM-generated ideas with those written by human experts. LLM-generated ideas were judged significantly more novel than human expert ideas (p < 0.05), though slightly weaker on feasibility. The study also identifies limitations of current LLMs, including a lack of diversity in idea generation and unreliable self-evaluation.
Authors
Chenglei Si, Diyi Yang, Tatsunori Hashimoto
Tags
large language models
novel research ideas
human study
NLP researchers
idea generation
feasibility
self-evaluation