Empirical Study of the Associative Approach in the Context of Classification Problems
Abstract
Research by the scientific community has shown that classifier performance depends not only on the learning rule but also on the complexities inherent in the data sets. Some traditional classifiers have been commonly used in the context of classification problems (three neural networks, C4.5, and SVM, among others). The associative approach, however, has been explored more in the recovery context than in the classification task, and its performance has hardly been analyzed when the data present several complexities. The present investigation analyzes the performance of the associative approach (CHA, CHAT, and the original Alpha Beta) in the presence of three classification problems: class imbalance, overlapping, and atypical patterns. The results show that, in the context of class imbalance, the CHAT algorithm recognizes the minority class better than the rest of the classifiers, whereas the CHA model ignores the minority class in most cases. In addition, the CHAT algorithm requires well-defined decision boundaries, since its performance increases when Wilson’s method is applied. It was also noted that when a balance between the rates is emphasized, the performance of three classifiers (RB, RFBR, and CHAT) increases. The original Alpha Beta model shows poor performance when the data are pre-processed. In the context of class imbalance, the performance of the classifiers increases significantly when the SMOTE method is applied, which does not occur without pre-processing or with subsampling.
Keywords
Recovery, classification, associative approach, neural networks, C4.5, SVM, imbalance, overlap, atypical patterns, Wilson, selective, SMOTE