Applying Deep Convolution Neural Network (CNN) on flow cytometry data for the prediction of a survival type clinical outcome

(2023)

Files

Mouaden_98022000_2023.pdf
  • Open access
  • Adobe PDF
  • 4.47 MB

Details

Abstract
In 2020, a group of researchers published a paper presenting a new approach to processing cytometry data. Using a deep convolutional neural network (CNN) architecture, they showed that such models can not only outperform existing automated biomarker identification methods but also provide new insights into cytometry data analysis. Building on that work, this Master's thesis evaluates whether deep CNN models can predict survival outcomes from flow cytometry data. The dataset comprises 383 HIV-positive patients at risk of progressing to AIDS. Of these, 191 were assigned to training and 192 to testing as part of an automated data analysis challenge, which was won by a random forest model. To this end, a modified version of the CNN model is introduced and adapted to survival data through a change of loss function: the negative log-likelihood. The thesis describes the design of the architecture and the process of hyperparameter selection. Model performance is evaluated with the concordance index, which measures how well the predicted risks rank patients in the order of their observed survival times. During testing, significant overfitting emerged, which led to the addition of dropout layers and data augmentation to mitigate it. The final model is a deep CNN with two convolutional layers of 6 feature maps each, followed by an average pooling layer and 3 fully connected layers. This model was evaluated on the test set and compared with the challenge-winning model. Although its performance falls slightly short of the winner's, it remains competitive.
This thesis thus substantiates the potential of deep CNNs for cytometry data analysis and underscores the need for further exploration and refinement in this direction.
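
The concordance index used for evaluation can be illustrated with a short sketch. The code below is not taken from the thesis. It is a minimal pure-Python version of Harrell's C-index, written on the assumption that the model outputs one risk score per patient, with higher scores meaning earlier expected progression. The function name `concordance_index` and its arguments are hypothetical, and pairs with tied observed times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index (illustrative sketch, not the thesis code).

    times       -- observed follow-up time per patient
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- predicted risk; higher is assumed to mean earlier event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only when the shorter time is an observed
        # event; tied times are skipped in this simplified version.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event correctly given higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 means every pair is ranked in reverse. Censored patients still contribute, but only as the longer-surviving member of a pair.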