We are in the midst of a data revolution. In just the past 5 years, the cost of sequencing a human genome has dropped approximately 10-fold. Development is moving equally fast in areas such as mass spectrometry and in vitro immuno-peptide screening, among others. This facilitates the search for biomarkers, biologics, therapeutics, etc., but it also redefines both the requirements for storing, accessing and working with data and the skill set expected of bio data scientists. In this talk I will present tidysq, an R package that aims to extend the Tidyverse framework to include (tidy) bio-data-science / bioinformatics. Tidysq will be presented in the context of the current state of ML-driven (neo)epitope prediction in cancer immunotherapy.