Learning with Online Drift Detection

Authors

  • Isvani Frías Blanco, Universidad de Granma
  • Jose del Campo Ávila, Universidad de Málaga
  • Gonzalo Ramos Jiménez, Universidad de Málaga
  • Rafael Morales Bueno, Universidad de Málaga
  • Agustín Ortiz Díaz, Universidad de Granma
  • Yailé Caballero Mota, Universidad de Camagüey

DOI:

https://doi.org/10.13053/cys-18-1-1573

Keywords:

Incremental learning, concept drift, concept drift detection, control chart, data stream, Hoeffding’s bound.

Abstract

Learning from data streams is a problem of growing interest. The target function of a data stream may change over time, so a model induced from earlier data may become inconsistent with current data. This problem is commonly known as concept drift. A widely used strategy for handling concept drift is to continuously monitor a chosen performance measure of the model; if performance drops, appropriate actions are taken to adapt the model. With this in mind, our paper proposes a new method for detecting drifting concepts that is independent of the learning algorithm. We use a probability inequality (Hoeffding’s inequality) to offer probabilistic guarantees for the detection of significant changes in the mean of a sequence of real values. Detection is based on comparing the averages of two samples, which are obtained by identifying a single relevant cut-point in the sequence of real values. The method maintains a fixed number of counters and has constant time complexity. Like some previous approaches, our method is based on ideas from statistical process control. Preliminary empirical evaluations with well-known data streams, change detectors, and various classifiers reveal advantages of the proposed method.
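
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is a simplified illustration, not the authors' exact algorithm. It assumes the monitored values lie in [0, 1] (for example, a 0/1 prediction error) and that only an increase in the mean is of interest. The names `hoeffding_epsilon`, `MeanIncreaseDetector`, and the default `delta` are hypothetical. For two independent samples of sizes n and m, Hoeffding's inequality implies that the difference of their averages exceeds the difference of their expectations by more than sqrt((1/n + 1/m) ln(1/delta) / 2) with probability at most delta. The sketch uses that threshold and keeps only four counters, so each update takes constant time.

```python
import math


def hoeffding_epsilon(n: int, m: int, delta: float) -> float:
    """Hoeffding threshold for the difference of two sample means in [0, 1]."""
    return math.sqrt(0.5 * (1.0 / n + 1.0 / m) * math.log(1.0 / delta))


class MeanIncreaseDetector:
    """Hypothetical detector that flags a significant increase in a stream's mean."""

    def __init__(self, delta: float = 0.001):
        self.delta = delta  # assumed significance level for each test
        self.reset()

    def reset(self) -> None:
        # Counters for the whole window and for the prefix ending at the cut-point.
        self.n_total, self.sum_total = 0, 0.0
        self.n_cut, self.sum_cut = 0, 0.0

    def _upper(self, n: int, s: float) -> float:
        # Sample mean plus a one-sample Hoeffding bound.
        return s / n + math.sqrt(math.log(1.0 / self.delta) / (2.0 * n))

    def update(self, x: float) -> bool:
        """Add one value in [0, 1]; return True if a mean increase is detected."""
        self.n_total += 1
        self.sum_total += x
        # Move the cut-point to the prefix with the lowest upper bound seen so far.
        if self.n_cut == 0 or (
            self._upper(self.n_total, self.sum_total)
            <= self._upper(self.n_cut, self.sum_cut)
        ):
            self.n_cut, self.sum_cut = self.n_total, self.sum_total
            return False
        # Compare the average after the cut-point with the average before it.
        n_after = self.n_total - self.n_cut
        mean_after = (self.sum_total - self.sum_cut) / n_after
        mean_cut = self.sum_cut / self.n_cut
        if mean_after - mean_cut >= hoeffding_epsilon(self.n_cut, n_after, self.delta):
            self.reset()  # start a fresh window after signalling drift
            return True
        return False


# Usage: an error signal that jumps from 0 to 1 at index 200.
stream = [0.0] * 200 + [1.0] * 50
detector = MeanIncreaseDetector(delta=0.001)
alarms = [i for i, x in enumerate(stream) if detector.update(x)]
```

On this synthetic stream the sketch raises a single alarm a few observations after index 200. Because the cut-point is chosen from the data, the per-test bound is only a heuristic guide here. A production detector would also need to handle warning levels and drift in both directions.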

Published

2014-04-01