The interrelationship between confidence and correctness in a multiple-choice assessment: pointing out misconceptions and assuring valuable questions

R. Grazziotin-Soares, C. Blue, et al.

This study by a team of researchers, including Renata Grazziotin-Soares and Diego Machado Ardenghi, examines the relationship between confidence and correctness in a multiple-choice assessment among dental students and what it implies for educational strategies.

Introduction
The study investigates how confidence relates to correctness in multiple-choice assessments within a preclinical endodontics course. Confidence is distinguished from self-efficacy and is considered context-specific certainty about content, while self-efficacy includes affective and motivational factors. Prior literature shows a positive relationship between confidence and accuracy and highlights four confidence-correctness interactions: (1) correct and confident (well-calibrated knowledge), (2) correct and unconfident (possible guessing), (3) incorrect and confident (misconceptions that are resistant to change and potentially harmful in clinical decision-making), and (4) incorrect and unconfident (awareness of knowledge gaps). The purpose was to identify students’ performance patterns (correctness, confidence, and misconceptions) and evaluate the value of questions (appropriateness, clarity, and potential to induce misconceptions), extending prior work to a Canadian context. The research questions were: (1) How was students’ performance considering correctness, misconceptions, and confidence? (2) Were the questions valuable, appropriate, and friendly, and which ones led to misconceptions?
Literature Review
Evidence indicates that confidence correlates positively with correctness, and that confidence, self-concept, self-efficacy, and anxiety predict academic achievement. Confidence ratings in assessments can reveal calibration and identify misconceptions, which are often resistant to change and can lead to unsafe decisions. Confidence-based scoring approaches and the inclusion of confidence scales offer richer diagnostic insight than traditional MCQs alone and can improve feedback and learning retention. Well-constructed MCQs can assess higher cognitive levels. Prior dental education studies found that students tend to be overconfident and that confidence data can identify misconceptions. Negative question stems can be misleading, while questions involving clinical scenarios or images may better support understanding. This study tests the generalizability of prior findings to a Canadian cohort and explores question characteristics associated with misconceptions.
Methodology
Design: Cross-sectional study with a convenience sample of second-year dental students in a Canadian dental school’s preclinical endodontics course.
Ethics approval: University Behavioral Research Ethics Board (Beh-REB App ID #50).
Participants: Entire class invited (n=31); final analyzed sample n=29 (one declined; one excluded for many missing confidence responses).
Assessment instrument: 20 multiple-choice questions (MCQs) aligned to course content (access opening, instrumentation, irrigation, intracanal medication, obturation), each with four options (one correct), pre-designated by two faculty as “basic” or “moderate.” Each MCQ was followed by a confidence question on a 4-point scale (very unsure, unsure, sure, very sure), later dichotomized as ‘confident’ (sure/very sure) vs ‘unconfident’ (very unsure/unsure). Four confidence-correctness situations were defined: (1) correct+confident, (2) correct+unconfident, (3) incorrect+confident (misconception), (4) incorrect+unconfident.
Research questions: (1) Student performance profile (correctness, confidence, misconceptions). (2) Question value and which questions led to misconceptions.
Analyses: Frequencies/percentages of the four situations at student and item levels.
Statistical tests (α=0.05): Chi-square tests for associations between question difficulty (basic/moderate) and correctness; difficulty/correctness and confidence; negative stems and misconceptions; clinical/mental scenario/image-based questions and misconceptions. Mann–Whitney test for association between students’ performance and number of misconceptions. Fisher’s exact test used where appropriate.
Software: OpenEpi (v3.01) and Jamovi.
Faculty were blinded to student identities until course end.
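The dichotomization and the four confidence-correctness situations described above can be sketched in a few lines of Python. This is only an illustration of the classification rule as summarized here, not the authors' analysis code (the study reports using OpenEpi and Jamovi). The function name `classify` and the sample responses are hypothetical.

```python
from collections import Counter

# 4-point confidence scale from the summary, dichotomized as described.
CONFIDENT_LEVELS = {"sure", "very sure"}
UNCONFIDENT_LEVELS = {"very unsure", "unsure"}


def classify(is_correct: bool, confidence: str) -> int:
    """Return the situation number (1-4) for one answered question.

    1 = correct + confident
    2 = correct + unconfident
    3 = incorrect + confident (misconception)
    4 = incorrect + unconfident
    """
    level = confidence.strip().lower()
    if level in CONFIDENT_LEVELS:
        confident = True
    elif level in UNCONFIDENT_LEVELS:
        confident = False
    else:
        raise ValueError(f"unknown confidence level: {confidence!r}")

    if is_correct:
        return 1 if confident else 2
    return 3 if confident else 4


# Hypothetical responses for one student: (answer was correct?, confidence rating).
responses = [
    (True, "very sure"),
    (True, "unsure"),
    (False, "sure"),
    (False, "very unsure"),
    (True, "sure"),
]

# Frequency of each situation, as in the student-level and item-level analyses.
tally = Counter(classify(correct, rating) for correct, rating in responses)
misconceptions = tally[3]
```

Tallying the same way per question instead of per student gives the item-level view used to flag questions that induce misconceptions.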
Key Findings
- Overall performance: 92.5% correctness (537/580 total answers correct) and 84.6% confidence.
- Misconceptions: 12 total misconceptions produced by 9 students (31.0%). Students with more misconceptions had lower overall correctness (Mann–Whitney, P < 0.001).
- High achievers (85–100% scores) tended to be unconfident in their incorrect responses (more situation 4) (Chi-square, P = 0.047).
- Distribution of confidence-correctness situations across all responses (n=580): Situation 1 (correct+confident) 77.9% (452/580); Situation 2 (correct+unconfident) 14.65% (85/580); Situation 3 (incorrect+confident; misconceptions) 2.06% (12/580); Situation 4 (incorrect+unconfident) 5.34% (31/580).
- Question difficulty: ‘Moderate’ questions induced more incorrect responses than ‘basic’ (Chi-square, P < 0.05) and prompted lower confidence even when correct (Chi-square, P = 0.02). Basic questions had 95.17% correct; moderate questions accounted for 67.44% of all incorrect answers.
- Question characteristics: Items using images or requiring a mental picture of a clinical scenario produced fewer misconceptions than theoretical/descriptive questions (Fisher’s exact, P = 0.007). Negative stems had misconception rates similar to those of other questions (Chi-square, P = 0.96).
- Item-level examples: Question 9 (moderate; Ni-Ti instruments) had the highest incorrect rate (31%) and highest misconceptions (17%). Question 20 (moderate; access cavity underextension) had lower confidence and more uncertainty but fewer misconceptions relative to theoretical items.
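The arithmetic behind these figures can be checked with a short Python sketch. The four situation counts are taken from the findings above. The difficulty-by-correctness table is a reconstruction and rests on an assumption the summary does not state: that the 20 questions split evenly into 10 basic and 10 moderate items (290 responses each), which is the split consistent with the reported 95.17% and 67.44%. The test shown is a plain Pearson chi-square without continuity correction. The summary does not say which variant the authors used, and it does not report the test statistic.

```python
import math

# Reported counts for the four confidence-correctness situations (n = 580).
counts = {1: 452, 2: 85, 3: 12, 4: 31}
total = sum(counts.values())

# Percentage share of each situation.
shares = {situation: 100 * n / total for situation, n in counts.items()}

correct = counts[1] + counts[2]      # situations 1 and 2
incorrect = counts[3] + counts[4]    # situations 3 and 4
correct_pct = 100 * correct / total

# ASSUMPTION (not stated in the summary): 10 basic + 10 moderate questions,
# each answered by 29 students, i.e. 290 responses per difficulty level.
basic_total = moderate_total = 10 * 29

# Moderate items accounted for 67.44% of all incorrect answers.
moderate_incorrect = round(0.6744 * incorrect)
basic_incorrect = incorrect - moderate_incorrect
basic_correct = basic_total - basic_incorrect
moderate_correct = moderate_total - moderate_incorrect
basic_correct_pct = 100 * basic_correct / basic_total


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability of the chi-square distribution is erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: basic, moderate. Columns: correct, incorrect.
stat, p_value = chi_square_2x2(
    basic_correct, basic_incorrect, moderate_correct, moderate_incorrect
)

print({k: round(v, 2) for k, v in shares.items()})
print(round(correct_pct, 2), round(basic_correct_pct, 2))
print(round(stat, 2), round(p_value, 4))
```

Under these assumptions the reconstructed table reproduces the 95.17% figure for basic questions, and the p-value falls below α = 0.05, consistent with the reported P < 0.05. Computed to two decimals, overall correctness and the situation 3 share round to 92.59% and 2.07%, so the 92.5% and 2.06% above appear to be truncated rather than rounded. The 84.6% confidence figure cannot be reproduced from the four counts alone (situations 1 and 3 sum to 464/580), so it presumably rests on a different basis.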
Discussion
Including confidence ratings with MCQs provided nuanced insight into learners’ knowledge calibration, enabling detection of guessing (correct but unconfident), misconceptions (incorrect but confident), and awareness of uncertainty (incorrect and unconfident). The high accuracy and confidence observed likely reflect a straightforward, technical preclinical curriculum and consistent instruction by a single lecturer. Calibrated confidence with correctness is desirable for safe clinical decision-making, while uncertainty in incorrect responses among high achievers suggests productive self-awareness that may promote help-seeking and reduce clinical error. Moderate-difficulty items yielded more errors and lower confidence, especially when bridging theory to clinical application, consistent with the cohort’s preclinical status. Nevertheless, clinical scenario and image-based items were associated with fewer misconceptions, supporting their value for assessment and learning in dentistry. Negative stems, while sometimes misleading, did not statistically increase misconceptions in this sample but may merit careful redesign. Overall, the approach informs targeted feedback (especially for students with misconceptions) and iterative improvement of question quality and cognitive level.
Conclusion
Preclinical endodontic students were highly accurate and generally confident. Students with more misconceptions performed worse overall. All questions were deemed valuable, though some warrant refinement (e.g., negative stems, items demanding complex theory–clinic transfer). Pairing MCQs with confidence ratings effectively identifies misconceptions, improves feedback, and informs question design. Future work should expand to multiple cohorts and institutions, include repeated assessments, refine negative stems, and integrate more clinical scenario and image-based items; educational interventions (e.g., videos or clinical shadowing) may facilitate theory-to-practice transfer.
Limitations
- Convenience sample from a single institution and a single assessment limits generalizability.
- Small class size; results may reflect local teaching context (single lecturer).
- Confidence is a subjective, multifaceted measure with inherent imprecision.
- MCQ format may encourage superficial or strategic learning focused on recall rather than higher-order understanding.
- No concurrent traditional (no-confidence) assessment for direct comparison.