Larger and more instructable language models become less reliable
L. Zhou, W. Schellaert, et al.
Recent research by Lexin Zhou, Wout Schellaert, Fernando Martínez-Plumed, Yael Moros-Daval, Cèsar Ferri, and José Hernández-Orallo highlights a paradox: while larger language models perform better on difficult tasks, they still falter on simpler ones, giving plausible but incorrect answers. This raises critical questions about the reliability of AI models, especially in high-stakes situations.

