Enhancing Quality Risk Classification with Natural Language Processing


Product quality is vital to the success of biotech and pharmaceutical organizations, and accurate classification of quality risks is essential to delivering high-quality products. However, the current practice of assigning risk levels (Critical, Major, or Minor) to self-reported quality issues (QIs) is subjective and noisy, which leads to unreliable risk assessments. To address this limitation, we develop a web-based application that uses Natural Language Processing (NLP) algorithms to infer the risk level from the free-text description of an issue. We propose a novel data-driven framework for risk-level classification that integrates state-of-the-art deep neural network (DNN) models with ensemble learning. The framework enables the automatic discovery and analysis of quality risks. Through extensive numerical experiments, we evaluate the approach with appropriate performance metrics and demonstrate its effectiveness. The findings show the potential of NLP for uncovering quality risks, offer practical insights to practitioners in the pharmaceutical industry, and contribute to the growing body of knowledge on risk management and quality assurance in the biotech and pharmaceutical domain. Our technology stack consists of Python, Streamlit, and Posit Connect for developing and deploying the model.
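
The abstract does not say how the DNN models are combined. As one possible illustration of the ensemble idea, the sketch below uses soft voting: each model outputs a probability for every risk level, the probabilities are averaged, and the highest average wins. The function name `soft_vote`, the label ordering, and the example probabilities are hypothetical and are not taken from the study.

```python
# Risk levels named in the abstract; the ordering here is an assumption.
RISK_LEVELS = ["Critical", "Major", "Minor"]


def soft_vote(model_probs, weights=None):
    """Combine per-model class probabilities by (weighted) averaging.

    model_probs: list of probability vectors, one per model,
                 each ordered like RISK_LEVELS.
    weights:     optional per-model weights; equal weights if omitted.
    Returns the winning label and the averaged probability vector.
    """
    if not model_probs:
        raise ValueError("need probabilities from at least one model")
    if any(len(p) != len(RISK_LEVELS) for p in model_probs):
        raise ValueError("each vector needs one probability per risk level")
    if weights is None:
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")

    total = sum(weights)
    # Weighted mean of each class probability across models.
    averaged = [
        sum(w * probs[i] for w, probs in zip(weights, model_probs)) / total
        for i in range(len(RISK_LEVELS))
    ]
    best = max(range(len(RISK_LEVELS)), key=averaged.__getitem__)
    return RISK_LEVELS[best], averaged


# Made-up outputs from three hypothetical classifiers for one QI description.
example_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.6, 0.1],
]
label, averaged = soft_vote(example_probs)
print(label, [round(p, 3) for p in averaged])
```

Averaging probabilities rather than counting hard votes lets a confident model outweigh two uncertain ones, which can help when the training labels are themselves noisy. Other combination schemes, such as stacking or majority voting, would fit the same description.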

Presented at 2023 Conference