Conference paper
Machine Learning

Data Augmentation Meta-Classifier Scheme for imbalanced data sets

Francisco J. Moreno-Barea, José M. Jerez, Leonardo Franco

2022 IEEE Symposium Series on Computational Intelligence (SSCI), 2022, pp. 1392-1399
Published: 30/1/2023
Abstract

Categorical data obtained from real-world domains are commonly imbalanced, as they often contain more samples belonging to one of the classes. Imbalanced data tend to be a problem for classifiers, because the majority class biases them and degrades overall performance. Among the techniques used for dealing with imbalanced data sets, data augmentation (DA) is one alternative: it can improve prediction accuracy for the minority class (usually the relevant one), but usually at the cost of poorer predictions for the majority class. To benefit from both behaviours, we introduce in this study a meta-classifier scheme that works as a mixture of two classifiers, one trained with the original data and the other trained with augmented data. Experiments carried out with 12 imbalanced data sets, 5 of them obtained from the TCGA database and related to cancer survival prediction, show an improvement in accuracy, area under the ROC curve and Matthews correlation coefficient compared with the results obtained using the original data sets.
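
The abstract does not specify how the two classifiers are combined, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a Gaussian naive Bayes base classifier, jittered oversampling of the minority class as a stand-in for DA, and a weighted average of the two posteriors as the mixture rule. All function names, the `weight` and `threshold` parameters, and the toy data are hypothetical.

```python
import math
import random

def fit_gnb(X, y):
    """Fit Gaussian naive Bayes: per-class (prior, means, variances)."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        n, d = len(rows), len(rows[0])
        means = [sum(r[j] for r in rows) / n for j in range(d)]
        # Small constant keeps variances strictly positive.
        varis = [sum((r[j] - means[j]) ** 2 for r in rows) / n + 1e-6
                 for j in range(d)]
        model[c] = (n / len(y), means, varis)
    return model

def proba_minority(model, x, minority=1):
    """Posterior probability of the minority class for one sample."""
    log_post = {}
    for c, (prior, means, varis) in model.items():
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, varis):
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_post[c] = ll
    top = max(log_post.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(v - top) for c, v in log_post.items()}
    return unnorm[minority] / sum(unnorm.values())

def augment_minority(X, y, rng, minority=1, noise=0.1):
    """Stand-in DA (assumption): add jittered minority copies until balanced."""
    X_aug, y_aug = list(X), list(y)
    pool = [x for x, label in zip(X, y) if label == minority]
    deficit = (len(y) - len(pool)) - len(pool)
    for _ in range(max(deficit, 0)):
        base = rng.choice(pool)
        X_aug.append([v + rng.gauss(0.0, noise) for v in base])
        y_aug.append(minority)
    return X_aug, y_aug

def meta_predict(model_orig, model_aug, x, weight=0.5, threshold=0.5):
    """Assumed mixture rule: weighted average of the two posteriors."""
    p = (weight * proba_minority(model_orig, x)
         + (1 - weight) * proba_minority(model_aug, x))
    return 1 if p >= threshold else 0

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels in {0, 1}."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy 1-D data set with a 9:1 class imbalance (purely illustrative).
rng = random.Random(0)
X = ([[rng.gauss(0.0, 1.0)] for _ in range(90)]
     + [[rng.gauss(2.0, 1.0)] for _ in range(10)])
y = [0] * 90 + [1] * 10

X_aug, y_aug = augment_minority(X, y, rng)
clf_orig = fit_gnb(X, y)          # trained on the original data
clf_aug = fit_gnb(X_aug, y_aug)   # trained on the augmented data
preds = [meta_predict(clf_orig, clf_aug, x) for x in X]
print("MCC on the toy training data:", mcc(y, preds))
```

In this sketch, balancing the classes raises the minority prior of the second model from 0.1 to 0.5, so it favours the minority class more than the first model does. The averaged posterior is one simple way to trade off the two behaviours; a gating rule or a learned combiner would be equally plausible readings of "mixture".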

Keywords
Data Augmentation
Imbalance Learning
Data Mining
Meta-Classifier
Publication Information
Pages: 1392-1399
Published: 30/1/2023
Accepted: 4/12/2022