Improving classification accuracy using Data Augmentation on small data sets

Francisco J. Moreno-Barea, José M. Jerez, Leonardo Franco

Expert Systems with Applications•2020•Vol. 161: 113696


Authors

Fco. Javier Moreno-Barea (corresponding author)

Departamento de Lenguajes y Ciencias de la Computación, Escuela Técnica Superior de Ingeniería Informática, Universidad de Málaga, Málaga, Spain

José Jerez Aragonés

Departamento de Lenguajes y Ciencias de la Computación, Escuela Técnica Superior de Ingeniería Informática, Universidad de Málaga, Málaga, Spain

Leonardo Franco Ruiz

Departamento de Lenguajes y Ciencias de la Computación, Escuela Técnica Superior de Ingeniería Informática, Universidad de Málaga, Málaga, Spain

Abstract

Data augmentation (DA) is a key element in the success of Deep Learning (DL) models, as it can lead to better prediction accuracy when large data sets are used. DA was rarely used with neural network models before 2012, possibly because of the type of models and the size of the data sets used at the time. In this work we investigate the effect of DA on small data sets, applying several state-of-the-art models based on Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs). We analyze the prediction accuracy obtained as a function of the characteristics of the training samples: number of instances, number of features, and degree of class imbalance. We further introduce modifications to the standard methods used to generate synthetic samples in order to alter the class balance representation. The overall results indicate that, with some computational effort, a significant increase in prediction accuracy can be obtained on small data sets.

Keywords

Deep Learning

Data Augmentation

GAN

VAE

Unbalanced sets


Publication Information

Volume: 161

Pages: 113696

Published: 21 July 2020

Received: 17 October 2019

Accepted: 24 June 2020

Impact Metrics

Citations: 174

Impact Factor: 0

Downloads: 139

Altmetric: 8
