Introducing smdi: An R package to perform structural missing data investigations for real-world evidence studies


Real-world data are increasingly used to complement evidence from clinical trials. However, missing data in key covariates, e.g., those needed to adjust for confounding, are a major statistical challenge when the underlying missingness mechanisms are unknown. This talk introduces the smdi R package, which aims to streamline routine missing data investigations of partially observed confounders based on a suite of three group diagnostics. The underlying structural missingness assumptions were recently validated in a simulation study and are characterized through M-graphs that depict realistic relationships between a partially observed confounder and an exposure, an outcome, and other fully observed covariates. To differentiate between missingness mechanisms, the package implements three group diagnostics that 1) compare distributions between patients with and without an observed value of the partially observed confounder, 2) assess the ability to predict missingness based on observed covariates, and 3) examine whether missingness is associated with the outcome under study. Taken together, the three group diagnostics can give guidance on how the underlying missingness of partially observed confounders could be characterized and approached in downstream analyses.
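The idea behind the three group diagnostics can be illustrated with a minimal, language-agnostic sketch. The Python code below is not the smdi API and does not reproduce the package's implementation. The function names (asmd, missingness_auc, outcome_difference), the toy data, and the simplified statistics are all assumptions chosen for illustration. In particular, a single covariate stands in for a prediction model, and a crude difference in outcome means stands in for a regression model.

```python
import math
from statistics import mean, variance


def asmd(values, missing):
    """Diagnostic 1 (sketch): absolute standardized mean difference of an
    observed covariate between patients with and without the confounder."""
    observed = [v for v, m in zip(values, missing) if not m]
    unobserved = [v for v, m in zip(values, missing) if m]
    pooled_sd = math.sqrt((variance(observed) + variance(unobserved)) / 2)
    return abs(mean(observed) - mean(unobserved)) / pooled_sd


def missingness_auc(scores, missing):
    """Diagnostic 2 (sketch): how well a score separates missing from
    non-missing rows, as a pairwise-comparison AUC (ties count one half)."""
    pos = [s for s, m in zip(scores, missing) if m]
    neg = [s for s, m in zip(scores, missing) if not m]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def outcome_difference(outcome, missing):
    """Diagnostic 3 (sketch): crude difference in mean outcome between
    patients with a missing and with an observed confounder value."""
    with_missing = mean(o for o, m in zip(outcome, missing) if m)
    with_observed = mean(o for o, m in zip(outcome, missing) if not m)
    return with_missing - with_observed


# Hypothetical toy data: age is fully observed, `missing` flags patients
# whose confounder value is unobserved, `outcome` is a binary event.
age = [50, 55, 60, 65, 70, 75, 80, 85]
missing = [False, False, False, True, False, True, True, True]
outcome = [0, 0, 1, 1, 0, 1, 1, 1]

print("ASMD of age:", round(asmd(age, missing), 2))
print("AUC for missingness from age:", missingness_auc(age, missing))
print("Outcome difference:", outcome_difference(outcome, missing))
```

In this toy example all three quantities are large, meaning missingness is related to both an observed covariate and the outcome. How such patterns map onto specific missingness mechanisms is what the combined diagnostics are meant to inform.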

Presented at 2023 Conference