JCR Journal
Bioinformatics
Impact Factor: 0
Quartile: TBD

Application of Data Augmentation techniques towards metabolomics

Francisco J. Moreno-Barea, Leonardo Franco, David Elizondo, Martin Grootveld

Computers in Biology and Medicine, 2022, Vol. 148
Citations: 13
Views: 74
Downloads: N/A
Altmetric Score: N/A
Published: 27 July 2022
Abstract

Niemann–Pick Class 1 (NPC1) disease is a rare and debilitating neurodegenerative lysosomal storage disease (LSD). Metabolomics datasets from NPC1 patients available for predictive analysis are often limited in the number of samples and severely unbalanced. To improve predictive capability and identify new biomarkers in an NPC1 disease urinary dataset, data augmentation (DA) techniques based on computational intelligence were employed to create synthetic samples, namely the addition of noise, oversampling techniques and conditional generative adversarial networks. The predictive capacities of these techniques were evaluated on a set of urine samples donated by 13 untreated NPC1 disease and 47 heterozygous (parental) carrier control participants. Predictions were obtained using different machine learning classification models and partial least squares techniques. The results provide strong evidence for the ability of DA techniques to generate good-quality synthetic data, showing increases of 20%–50% in sensitivity, 6%–30% in F1 score, and 0.3 (out of 1) in predictive capacity. Additionally, more conventional forms of multivariate data analysis were employed. Applied to synthetically augmented datasets, these allowed the detection of unusual urinary metabolite profiles and the identification of biomarkers. The results indicate that urinary metabolites such as the branched-chain amino acid valine, 3-aminoisobutyrate and quinolinate may be employable as valuable biomarkers for the diagnosis and prognostic monitoring of NPC1 disease.
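
As a rough illustration of the simplest DA technique named in the abstract (the addition of noise), the sketch below creates synthetic minority-class samples by jittering real ones with Gaussian noise until the two classes are balanced. It is not the authors' implementation: the function name `augment_with_noise`, the `noise_scale` parameter, and the random toy data (5 placeholder features) are all assumptions made for this example; only the class sizes, 13 and 47, come from the abstract.

```python
import random
import statistics


def augment_with_noise(X, y, minority_label, n_new, noise_scale=0.05, seed=0):
    """Return copies of X and y extended with n_new noisy minority samples.

    noise_scale is a hypothetical setting: the noise standard deviation,
    expressed as a fraction of each feature's spread in the minority class.
    """
    rng = random.Random(seed)
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_features = len(minority[0])
    # Scale the noise per feature so large-valued features are not swamped.
    sigma = [
        noise_scale * statistics.pstdev([row[j] for row in minority])
        for j in range(n_features)
    ]
    X_new = []
    for _ in range(n_new):
        # Pick a real minority sample (with replacement) and jitter it.
        template = rng.choice(minority)
        X_new.append([v + rng.gauss(0.0, s) for v, s in zip(template, sigma)])
    return X + X_new, y + [minority_label] * n_new


# Toy stand-in for the urinary dataset: 13 patients (1), 47 controls (0).
# The feature values are random placeholders, not real metabolite data.
toy_rng = random.Random(42)
X = [[toy_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(60)]
y = [1] * 13 + [0] * 47

# 47 - 13 = 34 synthetic patient samples balance the two classes.
X_aug, y_aug = augment_with_noise(X, y, minority_label=1, n_new=34)
print(y_aug.count(1), y_aug.count(0))
```

In practice, augmentation of this kind would normally be applied to the training split only, so that synthetic copies of a sample cannot leak into the data used to measure sensitivity or F1 score.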

Keywords
Data Augmentation
Machine Learning
Metabolomics
Niemann–Pick type C disease
Rare Diseases
Publication Information
Volume: 148
Published: 27 July 2022
Received: 26 April 2022
Accepted: 23 July 2022
Impact Metrics
Citations: 13
Impact Factor: 0
Quartile: TBD
Views: 74