Empirical Study of the Associative Approach in the Context of Classification Problems

Authors

  • Laura Cleofas Sánchez Instituto Politécnico Nacional, Sección de Posgrado, E.S.I.M.E.
  • Anabel Pineda Briseño Tecnológico Nacional de México, Instituto Tecnológico de Matamoros
  • Rosa María Valdovinos Rosas Universidad Autónoma del Estado de México, Facultad de Ingeniería
  • José Salvador Sánchez Garreta Universidad de Jaume I, Instituto de Nuevas Tecnologías de la Imagen, Departamento de Lenguajes y Sistemas de la Informática, Castellón de la Plana
  • Vicente García Jiménez Universidad Autónoma de la Ciudad de Juárez, Departamento de Ingeniería Eléctrica y Computación, Ciudad de Juárez
  • Oscar Camacho Nieto Instituto Politécnico Nacional, Centro de Investigación en Computación
  • Héctor Pérez Meana Instituto Politécnico Nacional, Sección de Posgrado, E.S.I.M.E.
  • Mariko Nakano Miyatake Instituto Politécnico Nacional, Sección de Posgrado, E.S.I.M.E.

DOI:

https://doi.org/10.13053/cys-23-2-3026

Keywords:

Recovery, classification, associative approach, neural networks, C4.5, SVM, imbalance, overlap, atypical patterns, Wilson, selective, SMOTE

Abstract

Research carried out by the scientific community has shown that the performance of classifiers depends not only on the learning rule but also on the complexities inherent in the data sets. Some traditional classifiers have been commonly used in the context of classification problems (three neural networks, C4.5 and SVM, among others). However, the associative approach has been explored more in the recovery context than in the classification task, and its performance has scarcely been analyzed when several complexities are present in the data. The present investigation analyzes the performance of the associative approach (CHA, CHAT and original Alpha Beta) when three classification problems occur (class imbalance, overlapping and atypical patterns). The results show that the CHAT algorithm recognizes the minority class better than the rest of the classifiers in the context of class imbalance. However, the CHA model ignores the minority class in most cases. In addition, the CHAT algorithm requires well-defined decision boundaries, since its performance increases when Wilson’s method is applied. It was also noted that when a balance between the rates is emphasized, the performance of three classifiers (RB, RFBR and CHAT) increases. The original Alpha Beta model performs poorly when the data are pre-processed. In the context of class imbalance, the performance of the classifiers increases significantly when the SMOTE method is applied, which does not occur without pre-processing or with subsampling.
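
For readers unfamiliar with the two pre-processing methods named in the abstract, the following is a minimal illustrative sketch and not the authors' implementation. It assumes Euclidean distance, k = 3 neighbours for Wilson's editing and k = 2 for SMOTE, and a small made-up two-class data set; the function names (k_nearest, wilson_editing, smote) are hypothetical and chosen only for this example. The abstract does not state the settings used in the study.

```python
import math
import random
from collections import Counter


def k_nearest(i, X, k):
    # Indices of the k samples closest to X[i] (Euclidean), excluding i itself.
    others = [j for j in range(len(X)) if j != i]
    return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]


def wilson_editing(X, y, k=3):
    # Wilson's editing: drop every sample whose label disagrees with the
    # majority label of its k nearest neighbours (k=3 is an assumption).
    keep = []
    for i in range(len(X)):
        votes = Counter(y[j] for j in k_nearest(i, X, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote(X_min, n_new, k=2, seed=0):
    # SMOTE-style oversampling: each synthetic sample is a random point on
    # the segment between a minority sample and one of its k nearest
    # minority neighbours.
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(X_min))
        j = rng.choice(k_nearest(i, X_min, k))
        gap = rng.random()
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(X_min[i], X_min[j]))
        )
    return synthetic


# Made-up toy data: class 0 clusters near the origin, class 1 near (1, 1),
# plus one class-1 sample placed inside the class-0 region.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
     (1.0, 1.0), (1.1, 0.9), (0.9, 1.1)]
y = [0, 0, 0, 1, 1, 1, 1]

X_clean, y_clean = wilson_editing(X, y, k=3)
minority = [x for x, label in zip(X_clean, y_clean) if label == 1]
new_points = smote(minority, n_new=4, k=2, seed=0)
```

In this toy setting Wilson's editing removes the sample that sits inside the other class's region, which sharpens the decision boundary, while SMOTE adds synthetic minority samples between existing ones to reduce the imbalance.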

Author Biography

Laura Cleofas Sánchez, Instituto Politécnico Nacional, Sección de Posgrado, E.S.I.M.E.

I am commissioned by CONACyT (Cátedras program) as a research professor at the Instituto Politécnico Nacional, in the graduate section of ESIME Culhuacán. I hold a Doctorate in Computer Science from the CIC-IPN.

Published

2019-06-27