Date
2021-11-18
Subject
004 Data processing and computer science
Active machine learning; Data stream; Classification; Simulation; Labeling; Algorithm
Article
Stream-based active learning for sliding windows under the influence of verification latency
Abstract
Stream-based active learning (AL) strategies minimize the labeling effort by querying the labels that improve the classifier’s performance the most. So far, these strategies neglect the fact that an oracle or expert requires time to provide a queried label. We show that existing AL methods deteriorate or even fail under the influence of such verification latency. The problem with these methods is that they estimate a label’s utility on the currently available labeled data. However, by the time this label arrives, some of the current data may have become outdated and new labels may have arrived. In this article, we propose to simulate the data that will be available at the time the label arrives. To this end, our method Forgetting and Simulating (FS) forgets outdated information and simulates the delayed labels to obtain more realistic utility estimates. We assume that the label’s arrival date is known a priori and that the classifier’s training data is bounded by a sliding window. Our extensive experiments show that FS improves stream-based AL strategies in settings with both constant and variable verification latency.
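The forgetting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Labeled`, `Pending`, and `data_at_arrival` are hypothetical, and the sketch assumes that the sliding window is defined over instance timestamps and that the latency of the candidate query is known. The abstract does not say how FS simulates the values of delayed labels, so the sketch only identifies which pending labels would need to be simulated.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Labeled:
    t: float      # time the instance was observed
    label: int    # label that has already arrived


@dataclass
class Pending:
    t: float        # time the instance was observed and queried
    arrival: float  # time its label is expected to arrive (assumed known)


def data_at_arrival(
    labeled: List[Labeled],
    pending: List[Pending],
    now: float,
    latency: float,
    window: float,
) -> Tuple[List[Labeled], List[Pending]]:
    """Return the data expected to be in the window when a label queried now arrives."""
    arrival = now + latency      # when the candidate label would arrive
    oldest = arrival - window    # left edge of the sliding window at that time
    # Forgetting: drop labeled instances that will have left the window.
    kept = [d for d in labeled if d.t >= oldest]
    # Labels still in transit that will have arrived by then and are still in the window.
    to_simulate = [p for p in pending if p.arrival <= arrival and p.t >= oldest]
    return kept, to_simulate


if __name__ == "__main__":
    labeled = [Labeled(0, 0), Labeled(6, 1), Labeled(9, 0)]
    pending = [Pending(8.5, 12), Pending(9.5, 14), Pending(7, 11)]
    kept, to_simulate = data_at_arrival(labeled, pending, now=10, latency=3, window=5)
    print([d.t for d in kept], [p.t for p in to_simulate])
```

Under these assumptions, a utility estimate would be computed on `kept` plus simulated labels for `to_simulate`, rather than on the data available at query time.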
Citation
In: Machine Learning, Volume 111, Issue 6 (2021-11-18), pp. 2011-2036; eISSN: 1573-0565
Sponsorship
Funded as part of Projekt DEAL
Citation
@article{doi:10.17170/kobra-202206016277,
author={Pham, Tuan and Kottke, Daniel and Krempl, Georg and Sick, Bernhard},
title={Stream-based active learning for sliding windows under the influence of verification latency},
journal={Machine Learning},
year={2021}
}
Record timestamp
2022-08-16T09:17:44Z
Identifier
doi:10.17170/kobra-202206016277
URI
http://hdl.handle.net/123456789/14057
Published version
doi:10.1007/s10994-021-06099-z (publishedVersion)
Language
eng
License
Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/
Access
open access
Keywords
classification; active learning; evolving data streams; concept drift; verification latency; label delay