A Bayesian Approach combining Peptide Intensity and Missingness Modelling to analyse Label Free Mass Spectrometry based Proteomics Data

Lambert, Philippe; Gatto, Laurent; Hauchamps, Philippe
2021
2025-05-14
https://hdl.handle.net/2078.2/23032

Label-free mass spectrometry (MS) based quantitative proteomics presents important data challenges. Among them are the high proportion of missing data, with a missingness process that is Not Missing At Random (NMAR), and the fact that the quantities of interest, protein abundances, are not obtained directly but must be estimated from measurable quantities, i.e. peptide intensities. In bioinformatics pipelines, these two challenges are often addressed with pre-processing steps, namely missing data imputation and peptide intensity summarization, which bring benefits but also drawbacks. In this thesis, a Bayesian statistical model combining peptide intensity modelling and missingness process modelling is built in several steps. Two main practical issues are addressed: model identifiability concerns arising for some proteins, and intractable execution times. Finally, an ad hoc two-step empirical Bayesian approach is proposed which, on a benchmark dataset, shows slightly better protein classification accuracy than simpler model counterparts.

Keywords: Bayesian statistics; Hamiltonian Monte Carlo; Label Free MS based proteomics; missing data
text::thesis::master thesis
thesis:29911