Learning with Online Drift Detection

Isvani Frías Blanco, Jose del Campo Ávila, Gonzalo Ramos Jiménez, Rafael Morales Bueno, Agustín Ortiz Díaz, Yailé Caballero Mota

Abstract


Learning from data streams is a problem of growing interest. The target function of a data stream may change over time, so a model induced from earlier data can become inconsistent with current data. This problem is commonly known as concept drift. A widely used strategy for handling concept drift is to monitor a chosen performance measure of the model continuously and, if performance drops, to take appropriate action to adapt the model. With this in mind, this paper proposes a new method for detecting concept drift that is independent of the learning algorithm. We use a probability inequality (Hoeffding’s inequality) to offer probabilistic guarantees for the detection of significant changes in the mean of a sequence of real values. Detection compares the averages of two samples, which are obtained by identifying a single relevant cut-point in the sequence; the method maintains a fixed number of counters and processes each value in constant time. Like some previous approaches, our method is based on ideas from statistical process control. Preliminary empirical evaluations with well-known data streams, change detectors, and various classifiers reveal advantages of the proposed method.
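
The idea summarized above can be illustrated with a minimal sketch. It is not the paper's algorithm: the class name `HoeffdingDriftSketch`, the helper `first_drift`, the default `delta`, and the rule used to pick the cut-point are all assumptions made for illustration. The sketch assumes values in [0, 1] (for example, 0/1 prediction errors), keeps four counters, and does constant work per value. For two independent samples of sizes n1 and n2, Hoeffding's inequality bounds the difference of their means by sqrt((1/n1 + 1/n2) ln(1/delta) / 2) with probability at least 1 - delta; because the cut-point here is chosen from the data, the sketch should be read as a heuristic use of that bound.

```python
import math


class HoeffdingDriftSketch:
    """Illustrative drift detector for values in [0, 1] (hypothetical, simplified)."""

    def __init__(self, delta=0.001):
        self.delta = delta  # assumed confidence parameter
        self.reset()

    def reset(self):
        # Four counters: totals for the whole sequence and for the prefix
        # that ends at the current cut-point.
        self.n = 0
        self.total = 0.0
        self.cut_n = 0
        self.cut_total = 0.0

    def _eps_one(self, n):
        # One-sample Hoeffding bound on the mean of n values in [0, 1].
        return math.sqrt(math.log(1.0 / self.delta) / (2.0 * n))

    def _eps_two(self, n1, n2):
        # Hoeffding bound on the difference of two independent sample means.
        return math.sqrt((1.0 / n1 + 1.0 / n2) * math.log(1.0 / self.delta) / 2.0)

    def update(self, x):
        """Consume one value; return True if a significant increase is flagged."""
        self.n += 1
        self.total += x
        upper_all = self.total / self.n + self._eps_one(self.n)
        # Assumed cut-point rule: keep the prefix with the lowest upper bound.
        if self.cut_n == 0 or upper_all <= (
            self.cut_total / self.cut_n + self._eps_one(self.cut_n)
        ):
            self.cut_n, self.cut_total = self.n, self.total
            return False
        n2 = self.n - self.cut_n
        mean_before = self.cut_total / self.cut_n
        mean_after = (self.total - self.cut_total) / n2
        if mean_after - mean_before >= self._eps_two(self.cut_n, n2):
            self.reset()  # start over after a detected change
            return True
        return False


def first_drift(stream, delta=0.001):
    """Return the 0-based index of the first flagged value, or None."""
    detector = HoeffdingDriftSketch(delta)
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None
```

For instance, `first_drift([0.0] * 100 + [1.0] * 100)` flags a change shortly after position 100, while a constant stream is never flagged. In a learning setting, the values fed to `update` would typically be the per-example errors of whatever classifier is being monitored, which is what makes such a detector independent of the learning algorithm.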

Keywords


Incremental learning; concept drift; concept drift detection; control chart; data stream; Hoeffding’s bound.

Full Text: PDF (Spanish)