RMarkdown for Enhanced Quality Control & Documentation


When working with big data sources, such as medical claims data, data review and quality control (QC) can be both complex and tedious. R and RMarkdown have become common tools for data analytics and report writing in the pharmaceutical space. Because an RMarkdown document integrates the code used to query the data with the results and visualizations that code produces, a single output can contain both the process and the results, unifying two steps that are kept separate in other approaches to QC. We present our analytical QC pipeline, which can be developed at the start of a new project. It combines the early-stage work of deeply understanding the underlying data with the development of an ongoing set of reports and QC procedures that can be automated and rerun whenever the data is refreshed. The shortened timeline greatly increases the speed at which updated data can be ingested accurately and confidently, giving key stakeholders quick access to the most up-to-date information for downstream processes and decision making.

Presented at 2022 Conference