Visualization methods for RNA-sequencing data analysis


RNA-seq data are subject to bias, and accurate detection of differentially expressed genes (DEGs) is not a trivial task. While data collection can be considered high-throughput, data analysis has intricacies that require careful human attention. The most effective approach to modern data analysis is to iterate between models and visuals, and to improve the appropriateness of models based on feedback from visuals. As it stands, there is a need to make it easier for scientists and clinicians to use models and visuals in a complementary fashion during RNA-seq data analysis. Here, we use public RNA-seq data to show that our visualization tools can detect normalization problems, DEG designation problems, and common errors in RNA-seq analysis. We also show that our tools can identify genes of interest that cannot be obtained with models. Through this project, we propose that users slightly modify their approach to data analysis by quickly checking with statistical graphics whether their models are sensible. We plan to publish a new R software package that includes the plotting techniques introduced in this project, which can be useful for exploring several types of multivariate biological data, including RNA-seq data.

Presented at 2018 Conference