AI model GPT-3 (dis)informs us better than humans

Computer Science

Explore how humans and the GPT-3 AI model fare at distinguishing accurate from inaccurate information, and synthetic (AI-generated) from organic (human-written) tweets. This study reveals how difficult it is to judge both the veracity and the origin of online content.

Introduction
The proliferation of misinformation and the increasing sophistication of AI-generated text pose significant challenges to information integrity. This study compares the abilities of humans and the GPT-3 AI model to identify disinformation and to differentiate between synthetic (AI-generated) and organic (human-generated) tweets. The central research question is whether humans outperform GPT-3 at detecting disinformation and at identifying the source of tweets. The context is the urgent need to understand human susceptibility to misinformation in the digital age and to assess whether AI will mitigate or exacerbate the problem. Accurate identification of disinformation is crucial for informed decision-making and for effective countermeasures against the spread of false narratives. By empirically comparing human and AI performance on these tasks, the study provides insight into the strengths and limitations of both and contributes to the development of more effective strategies against online misinformation.
Literature Review
While the supplementary materials do not include an explicit, comprehensive literature review, they implicitly draw on existing research on human susceptibility to misinformation and on the capabilities of large language models such as GPT-3. The work builds on previous studies of fake-news detection and of the challenge of distinguishing human-generated from AI-generated content, and it acknowledges the need to understand the complexities of human credibility judgments in the context of evolving AI technologies.
Methodology
The study employed a quantitative methodology based on an online survey. A total of 697 participants evaluated tweets drawn from 11 topics, with 20 tweets per topic: 5 each categorized as organic true, synthetic true, organic false, and synthetic false. Organic tweets were sourced from Twitter; synthetic tweets were generated with GPT-3. Participants rated the accuracy (True/False score) and the origin (Organic/Synthetic score) of each tweet, and their confidence in these abilities was assessed before and after the task. Data analysis involved correlations between variables (scores, demographics, survey duration) and ANOVA tests, with normality checked via the Shapiro-Wilk test and the non-parametric Kruskal-Wallis test used where appropriate. A power analysis determined the necessary sample size, setting a target of 449 respondents. A budget of €492.24 was allocated for data collection, primarily a Facebook dissemination campaign targeting a range of demographics.
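The summary names the statistical tests but not how they fit together. The following is a minimal Python sketch of one plausible test-selection pipeline, using SciPy: check normality with Shapiro-Wilk, then compare groups with one-way ANOVA or its non-parametric alternative, Kruskal-Wallis. All data, column names, and thresholds below are illustrative assumptions, not the authors' actual analysis code.

import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical per-respondent results; column names and score scales
# are illustrative, not the study's actual coding.
df = pd.DataFrame({
    "tf_score": rng.uniform(0, 1, 300),  # True/False (disinformation) accuracy
    "age_group": rng.choice(["18-25", "26-41", "42-57", "58-76"], 300),
})

# Shapiro-Wilk normality check decides between the parametric and
# non-parametric group comparison, as described above.
_, p_normal = stats.shapiro(df["tf_score"])

groups = [g["tf_score"].to_numpy() for _, g in df.groupby("age_group")]
if p_normal > 0.05:
    stat, p = stats.f_oneway(*groups)   # one-way ANOVA
else:
    stat, p = stats.kruskal(*groups)    # Kruskal-Wallis
print(f"difference across age groups: stat={stat:.3f}, p={p:.4f}")

The sample-size target mentioned above could similarly be derived with a standard power calculation (for example, statsmodels' FTestAnovaPower), given effect-size and significance assumptions the summary does not report.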
Key Findings
The key findings reveal a complex interplay between human judgment and AI capabilities. Humans were not uniformly superior to GPT-3: while they were better at identifying disinformation on some topics, GPT-3 outperformed them on others. Human performance at identifying tweet origin (organic vs. synthetic) was likewise inconsistent, with better recognition of organic tweets than of synthetic ones. Analysis of True/False scores showed that disinformation tweets were more often misjudged as accurate than accurate tweets were misjudged as disinformation; for Organic/Synthetic scores, organic tweets were more often wrongly classified as synthetic. Correlation analysis revealed small but significant associations of age and education with accuracy scores: younger participants (18-25 years) tended to score higher at recognizing AI-generated tweets, participants aged 42-57 scored higher on disinformation recognition than the 58-76 group, and higher education levels were associated with better performance. Pre- and post-task confidence correlated with actual performance, especially for disinformation recognition. Notably, there was no correlation between survey completion time and performance.
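As a concrete illustration of the correlation analysis described above, here is a minimal sketch computing rank correlations between ordinal demographics and accuracy scores. Spearman's rho is one reasonable choice for ordinal variables such as age bracket and education level; the summary does not specify which correlation statistic the authors used, and all data below is synthetic.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical per-respondent data; the ordinal encodings are
# assumptions, since the study reports only the resulting correlations.
age_rank = rng.integers(0, 4, 300)   # 0: 18-25 ... 3: 58-76
edu_rank = rng.integers(0, 5, 300)   # ordinal education level
os_score = rng.uniform(0, 1, 300)    # Organic/Synthetic accuracy
tf_score = rng.uniform(0, 1, 300)    # True/False accuracy

# Spearman rank correlation handles ordinal predictors without
# assuming a linear relationship.
pairs = [
    ("age vs. AI-tweet recognition", age_rank, os_score),
    ("education vs. disinformation recognition", edu_rank, tf_score),
]
for name, x, y in pairs:
    rho, p = stats.spearmanr(x, y)
    print(f"{name}: rho={rho:+.2f}, p={p:.4f}")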
Discussion
The findings highlight the limitations of both human judgment and AI in reliably detecting disinformation and discerning the origins of online content. The inconsistency in performance across different topics suggests that expertise and contextual knowledge play significant roles in accurate evaluation. The study's findings challenge the assumption that humans are always better at identifying disinformation than sophisticated AI models. Future research should explore the development of AI-assisted tools that leverage the strengths of both humans and AI in identifying misinformation and its source. This could involve techniques that combine AI's pattern recognition capabilities with human contextual understanding.
Conclusion
This study provides valuable insights into the challenges of detecting disinformation in the digital age. Both humans and AI struggle to achieve consistent accuracy. Future research should focus on integrating human expertise with AI capabilities to improve detection strategies and address the limitations identified here.
Limitations
The study's limitations include the potential for sampling bias, despite efforts to target a diverse demographic through Facebook advertising. The reliance on a single AI model (GPT-3) may limit the generalizability of findings to other AI systems. The specific selection of tweets may not fully represent the diversity of disinformation tactics online. Further research with larger, more representative samples and broader AI model comparisons is needed.