Francisco J. Moreno-Barea, José M. Jerez, Nuria Ribelles, Emilio Alba, Leonardo Franco

Hospital Universitario Virgen de la Victoria, Málaga, Spain

Breast cancer is a major public health problem, with 2.3 million new cases diagnosed each year. Immunotherapy can be an effective treatment for breast cancer, although its benefit depends on several factors such as tumour subtype and associated prognosis. However, the efficiency of the immune system depends on the local microenvironment, so trials must be region-specific and consequently rely on a reduced number of samples. To mitigate this drawback and improve the accuracy of patient prognosis predictions, we explore several data augmentation methods, namely noise injection, oversampling techniques and generative adversarial networks. The experiments were conducted on a set of immune system gene expression samples donated by 165 breast cancer patients from the Málaga region. Results showed a 5% increase in AUC and a 23-36% increase in F1 score for subtype prediction.
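As an illustration of the simplest of the augmentation methods named above, the following is a minimal sketch of noise injection on a small tabular dataset. It is not the implementation used in this work: the function name `augment_with_noise`, the Gaussian noise model, the per-feature scaling, and the parameters `n_copies` and `noise_scale` are all assumptions made for the example, and the demo data are random numbers rather than gene expression measurements.

```python
import random
import statistics


def augment_with_noise(samples, labels, n_copies=2, noise_scale=0.05, seed=0):
    """Return the original samples plus noisy copies (hypothetical helper).

    Assumption: zero-mean Gaussian noise is added to every feature, with a
    standard deviation equal to `noise_scale` times that feature's own
    standard deviation across the dataset. Labels are copied unchanged.
    """
    rng = random.Random(seed)
    n_features = len(samples[0])

    # Per-feature spread, so that noise is proportional to each feature's scale.
    feature_std = [
        statistics.pstdev([row[j] for row in samples]) for j in range(n_features)
    ]

    # Start from the untouched originals, then append the noisy copies.
    aug_samples = [list(row) for row in samples]
    aug_labels = list(labels)
    for _ in range(n_copies):
        for row, label in zip(samples, labels):
            noisy = [
                value + rng.gauss(0.0, noise_scale * std)
                for value, std in zip(row, feature_std)
            ]
            aug_samples.append(noisy)
            aug_labels.append(label)
    return aug_samples, aug_labels


# Synthetic stand-in data: 165 samples with 20 features each (made-up values).
demo_rng = random.Random(42)
X_demo = [[demo_rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(165)]
y_demo = [demo_rng.randint(0, 1) for _ in range(165)]

X_aug, y_aug = augment_with_noise(X_demo, y_demo, n_copies=2)
```

In a sketch like this, augmentation would normally be applied to the training split only, so that evaluation metrics such as AUC and F1 are computed on unmodified samples.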