Using think-aloud protocol to identify cognitive events while generating data-driven scientific hypotheses by inexperienced clinical researchers

Medicine and Health

X. Jing, B. N. Draghi, et al.

In this data-driven study, clinical researchers generated hypotheses using either VIADS or conventional analytical tools while think-aloud sessions were recorded and coded to trace the cognitive events involved. The inexperienced VIADS group showed the fewest cognitive events per hypothesis, relying most heavily on “using analysis results” (30%) and “seeking connections” (23%).
Introduction

The study investigates how clinical researchers generate data-driven scientific hypotheses, with a focus on identifying and quantifying the cognitive events involved during the process. It examines whether the visual interactive analysis tool VIADS influences the efficiency and pattern of cognitive events compared with commonly used analytical tools (e.g., SPSS, SAS, R). The context is the critical role of hypothesis generation in scientific research, particularly in data-rich, knowledge-intensive domains like medicine. The purpose is to characterize cognitive mechanisms during hypothesis generation using a think-aloud protocol and to compare inexperienced researchers using VIADS versus controls, providing insights that may improve informatics tools and research training. The importance lies in filling a gap in prospective, controlled human-subject studies that directly observe users’ interactions with tools during the formation of new hypotheses, enhancing understanding of an early, pivotal phase of the research lifecycle.

Literature Review

Prior work has explored scientific hypothesis generation and reasoning in both scientific and clinical domains. Tools for hypothesis generation include literature-mining systems and interactive visual analytics; however, many were validated retrospectively without controlled human-subject observation. Research distinguishes between scientific reasoning (often convergent, knowledge-lean, starting from defined problems) and data-driven hypothesis generation (often divergent, open discovery in knowledge-rich domains). Medical diagnostic hypothesis generation shares features with scientific reasoning because it starts with a defined case. Foundational studies (e.g., Klahr & Dunbar) emphasized the roles of memory search and prior results; Patel and colleagues demonstrated differences in reasoning processes between experienced and novice clinicians. The authors previously developed a conceptual framework for hypothesis generation and a tool (VIADS) for analyzing hierarchical, code-based datasets, along with usability studies and instruments for evaluating hypothesis quality. This study extends that work by prospectively observing cognitive events during data-driven hypothesis generation using think-aloud protocols.

Methodology

Design: A 2×2 mixed-methods study compared hypothesis generation between participants using VIADS and those using non-VIADS analytical tools, stratified by experience (experienced vs. inexperienced clinical researchers). Participants used identical datasets and scripts, worked under the same facilitator, and followed a 2-hour think-aloud protocol. VIADS users received a 1-hour training session; control participants used tools familiar to them (e.g., SPSS, SAS, R, Excel).

Datasets: 2005 and 2015 National Ambulatory Medical Care Survey (NAMCS) data. Preprocessing aggregated ICD-9-CM diagnostic and procedural codes and their frequencies; full code names were provided to participants.

Data capture: Screen activity and audio were recorded (BB Flashback), transcribed professionally, and accuracy-checked by a content expert. Consent for recording was obtained.

Coding framework: A preliminary conceptual framework (informed by the literature and pilot analyses) defined initial codes and code groups for cognitive events (e.g., Analyze data, Use analysis results, Seek connections, Analogy, Use PICOT). One session served as a pilot to establish coding principles and train two coders. The remaining transcripts were coded independently by the two coders; discrepancies were resolved through discussion with the facilitator and iterative refinement of the coding rules, and new codes/groups were added as needed. Each hypothesis-generation instance was treated as an independent unit, with cognitive events labeled per hypothesis.

Analysis strategy: Four levels of analysis were used: (1) per hypothesis (n=199), (2) per participant (n=16) and overall, (3) the inexperienced VIADS group (n=9), and (4) the inexperienced control group (n=7). Two participants’ sessions (both inexperienced controls) had partial recordings and were excluded. Independent t-tests compared cognitive events between VIADS and control among inexperienced participants, and between experienced and inexperienced participants.
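The group comparisons described above can be sketched numerically. Below is a minimal illustration (not the authors' code) of a Welch two-sample t-statistic applied to reported summary statistics for cognitive events per hypothesis; the per-group hypothesis counts `n1` and `n2` are hypothetical placeholders, since the split of hypotheses between groups is not restated here.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic for two independent samples with unequal variances."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / se

# Reported summary statistics: inexperienced control 7.38 (SD 5.02),
# inexperienced VIADS 4.48 (SD 2.43). Sample sizes n1 = n2 = 80 are assumed
# for illustration only.
t = welch_t(7.38, 5.02, 80, 4.48, 2.43, 80)
print(round(t, 2))  # ≈ 4.65 with these assumed sample sizes
```

With the study's actual per-group hypothesis counts and degrees of freedom, the statistic and its p-value would differ from this illustration.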
Hypothesis quality was evaluated by seven expert clinicians using a brief instrument rating significance, validity, and feasibility on 5-point Likert scales; a hypothesis was deemed invalid if ≥3 experts rated validity as 1. Analyses considered all hypotheses and valid-only subsets.
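The validity rule above lends itself to a one-line check. A minimal sketch (not the study's actual scoring code), assuming each hypothesis receives a list of seven integer Likert scores for validity:

```python
def is_invalid(validity_ratings, threshold=3):
    """A hypothesis is deemed invalid if >= `threshold` of the expert raters
    gave the minimum validity score of 1 on the 5-point Likert scale."""
    return sum(r == 1 for r in validity_ratings) >= threshold

# Seven hypothetical expert ratings for two hypotheses:
print(is_invalid([1, 1, 1, 4, 5, 3, 2]))  # three raters scored 1 -> True
print(is_invalid([2, 1, 5, 4, 5, 3, 1]))  # only two raters scored 1 -> False
```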

Key Findings
  • Participants: 16 clinical researchers (9 female, 7 male); majority with <2 years’ study design/data analysis experience and <5 significant publications. Total hypotheses: 199 (inexperienced: 163; experienced: 36). Total cognitive events: 1216.
  • Distribution of cognitive events (all hypotheses): Processing evidence accounted for ≥70% of events. Top events: Use analysis results 29.85% (363/1216); Seek connections 23.03% (280/1216); Analyze data 20.81% (253/1216). Similar pattern held for valid-only hypotheses (Use analysis results 31.77%; Seek connections 24.48%; Analyze data 18.98%).
  • Average cognitive events per hypothesis:
    • Inexperienced control: 7.38 (SD 5.02)
    • Inexperienced VIADS: 4.48 (SD 2.43); significantly fewer than inexperienced control (p<0.001)
    • Experienced: 6.15 (SD 3.03); significantly more than inexperienced VIADS (p<0.01)
  • Valid hypothesis rates: Experienced 72.22% (26/36) vs. inexperienced 63.19% (103/163), about 9 percentage points higher for experienced researchers.
  • Inexperienced VIADS vs. control (percentages among cognitive events; t-tests):
    • Use analysis results: 31.3% vs. 27.1% (p<0.001)
    • Seek connections: 25.4% vs. 17.8% (p<0.001)
    • Analyze data: 22.1% vs. 21.1% (ns)
    • Pause/think: 3.8% vs. 9.3% (p<0.05)
    • VIADS showed higher counts for Preparation, Using analysis results, and Seeking connections; control showed higher Needing further study, Inferring, Pause/think, Using checklists, and Using PICOT.
  • Experienced vs. inexperienced (percentages):
    • Use analysis results: 31.7% vs. 29.4% (p<0.01)
    • Seek connections: 27.6% vs. 21.9% (p<0.01)
    • Analyze data: 17.5% vs. 21.6% (p<0.01)
    • Experienced showed higher Using analysis results, Seeking connections, Inferring, Pausing; inexperienced showed higher Preparation, Data analysis, Using suggestions, Using checklists, Using PICOT.
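The headline proportions in the findings can be reproduced directly from the reported counts. This small arithmetic check (illustrative only) uses the per-event counts and the 1216-event total given above:

```python
# Reported counts of the three most frequent cognitive events,
# out of 1216 total events across all hypotheses.
total_events = 1216
counts = {
    "Use analysis results": 363,
    "Seek connections": 280,
    "Analyze data": 253,
}

for event, n in counts.items():
    print(f"{event}: {100 * n / total_events:.2f}%")
# Use analysis results: 29.85%, Seek connections: 23.03%, Analyze data: 20.81%
```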

Discussion

Findings indicate that most cognitive activity during data-driven hypothesis generation involves processing evidence, with the most frequent events being Using analysis results, Seeking connections, and Analyzing data. Inexperienced participants using VIADS required significantly fewer cognitive events per hypothesis and less time per hypothesis (per prior publication) than controls, suggesting improved efficiency. Moreover, the distribution of cognitive events in the inexperienced VIADS group resembled that of experienced researchers—greater emphasis on Using analysis results and Seeking connections and reduced reliance on Pause/think and external scaffolds—implying that VIADS may guide novices toward more expert-like cognitive patterns during hypothesis generation. While the control group used familiar tools and VIADS users received only a brief training, differences favoring VIADS may be conservative estimates; greater familiarity with VIADS might further accentuate these effects. Causality cannot be established from these observational metrics alone, but the study provides baseline quantitative evidence of cognitive event patterns in a controlled setting, underscoring the feasibility of measuring and comparing hypothesis-generation processes.

Conclusion

Experienced clinical researchers produced a higher proportion of valid hypotheses than inexperienced researchers. Inexperienced participants using VIADS generated hypotheses with the fewest cognitive events and lowest variability and showed cognitive event distributions more akin to those of experienced researchers, indicating that VIADS may provide better guidance than other analytical tools for hypothesis generation. The study delivers foundational metrics and a cognitive framework for data-driven hypothesis generation in clinical research and demonstrates the feasibility of such experiments. Larger, more naturalistic studies are warranted to confirm effects, explore sequence patterns of cognitive events, and enhance both efficiency and quality of generated hypotheses.

Limitations
  • Small number of experienced participants (n=3) limits generalizability and precluded VIADS vs. control comparisons within the experienced stratum.
  • Think-aloud protocol captures only verbalized, conscious processes; nonverbal and tacit cognitive activities were not measured.
  • Two participants’ sessions (inexperienced controls) had partial recordings and were excluded; technical failures suggest need for pre-session recording checks.
  • Potential tool familiarity bias: controls used familiar tools, whereas VIADS users had limited training; group differences may be underestimated or influenced by learning curves.
  • Recruitment challenges for experienced researchers may bias participation and limit representativeness.