Assuring SDTM data quality with the sdtmchecks package


A data scientist on a clinical trial team in the pharmaceutical industry is responsible for providing the most accurate analysis possible so that the data yield valid insights. Ensuring data quality is extremely hard work: teams of people at clinical trial sites, vendor companies, and within the sponsor institution all work to identify and resolve data issues so that datasets are analysis-ready. Before performing an important analysis, a data scientist may want a way to reassure themselves about the quality of their data and to identify any important issues that have slipped through the cracks. sdtmchecks is a simple, easy-to-use, open-source R package that helps identify analysis-impacting and actionable data quality issues in SDTM datasets. This talk will touch on the package’s crowd-sourced development history at Roche/Genentech, where contributing served as an accessible way for non-R coders to get initial, practical experience with R; its current use at the company within a Shiny app; and its future potential as an open-source tool publicly available for cross-industry collaboration.

Presented at 2022 Conference