Gatto, Laurent; Dupont, Pierre; Maes, Kilian
2025-05-14
2021
https://hdl.handle.net/2078.2/23725

Shotgun proteomics is a widely used method for identifying the proteins present in biological samples. Identifying these proteins is essential to understanding how cells work and the mechanisms of certain diseases. Shotgun proteomics experiments generate a large number of spectra. Their analysis, typically by database search, identifies many of the proteins present in the samples. However, a large proportion of these spectra remain unidentified. Previous work has shown that clustering the spectra can help to identify new proteins, but the increasing amount of data requires efficient clustering algorithms. This work focuses on Falcon, a recently released clustering tool. First, we analyze its algorithms and study its scalability. Then, we use it to cluster a large dataset and examine whether useful information can be extracted from the clusters. Finally, we discuss several areas of improvement for Falcon.

Mass spectrometry; Proteomics; Falcon; Clustering
Clustering and analyzing large mass spectrometry-based proteomics data
text::thesis::master thesis
thesis:30716