Abstract
This paper introduces PURPLE, a machine learning method designed to accurately estimate the relative prevalence of underreported health conditions, such as intimate partner violence (IPV). PURPLE addresses the challenge of skewed prevalence estimates caused by underreporting, which varies across different demographic groups. The method leverages positive unlabeled learning but avoids restrictive assumptions about data separability. Experiments on synthetic and real health data demonstrate PURPLE's superior accuracy in recovering relative prevalence compared to existing methods. Applying PURPLE to two large emergency department datasets reveals higher IPV prevalence among Medicaid recipients, non-white patients, unmarried individuals, lower-income populations, and those residing in metropolitan counties. Correcting for underreporting using PURPLE yields more plausible estimates than methods that don't account for underdiagnosis.
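To make the underreporting problem concrete, the short sketch below shows how group-specific reporting rates can distort a naive relative prevalence estimate. It does not implement PURPLE. The function name, the prevalence values, and the reporting rates are all hypothetical, chosen only for illustration.

```python
def observed_relative_prevalence(true_prev_a, true_prev_b,
                                 report_rate_a, report_rate_b):
    """Ratio of *recorded* prevalence between groups A and B.

    Illustrative assumption: each true case is recorded with a
    group-specific probability, so recorded prevalence equals
    true prevalence times the reporting rate.
    """
    observed_a = true_prev_a * report_rate_a
    observed_b = true_prev_b * report_rate_b
    return observed_a / observed_b


# Hypothetical numbers: group A has twice the true prevalence of group B,
# but its cases are recorded half as often.
true_ratio = 0.04 / 0.02
naive_ratio = observed_relative_prevalence(0.04, 0.02, 0.25, 0.5)

print(f"true relative prevalence:  {true_ratio:.2f}")
print(f"naive relative prevalence: {naive_ratio:.2f}")
```

With these made-up inputs the recorded data show no difference between the groups even though the true prevalence differs by a factor of two. This is the kind of distortion a relative prevalence method has to correct for.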
Publisher
npj Women's Health
Published On
May 15, 2024
Authors
Divya Shanmugam, Kaihua Hou, Emma Pierson
Tags
machine learning
health conditions
underreporting
intimate partner violence
prevalence estimation
demographics
data accuracy