Date
2021-12-14
Subject
004 Data processing and computer science; 300 Social sciences; 370 Education; Active machine learning; Human-in-the-loop; Costs; Classification
Article
A Survey on Cost Types, Interaction Schemes, and Annotator Performance Models in Selection Algorithms for Active Learning in Classification
Abstract
Pool-based active learning (AL) aims to optimize the annotation process (i.e., labeling) as the acquisition of annotations is often time-consuming and therefore expensive. For this purpose, an AL strategy queries annotations intelligently from annotators to train a high-performance classification model at a low annotation cost. Traditional AL strategies operate in an idealized framework. They assume a single, omniscient annotator who never gets tired and charges uniformly regardless of query difficulty. However, in real-world applications, we often face human annotators, e.g., crowd or in-house workers, who make annotation mistakes and can be reluctant to respond if tired or faced with complex queries. Recently, many novel AL strategies have been proposed to address these issues. They differ in at least one of the following three central aspects from traditional AL: 1) modeling of (multiple) human annotators whose performances can be affected by various factors, such as missing expertise; 2) generalization of the interaction with human annotators through different query and annotation types, such as asking an annotator for feedback on an inferred classification rule; 3) consideration of complex cost schemes regarding annotations and misclassifications. This survey provides an overview of these AL strategies and refers to them as real-world AL. Therefore, we introduce a general real-world AL strategy as part of a learning cycle and use its elements, e.g., the query and annotator selection algorithm, to categorize about 60 real-world AL strategies. Finally, we outline possible directions for future research in the field of AL.
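The pool-based AL cycle described above (train a model, query the most informative unlabeled instance, obtain its annotation, repeat) can be illustrated with a traditional uncertainty-sampling strategy, i.e., the idealized single-omniscient-annotator setting the survey contrasts with real-world AL. This is a minimal sketch, not taken from the paper; the dataset, model, and budget are illustrative assumptions.

```python
# Minimal pool-based AL loop with uncertainty sampling (illustrative sketch).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Seed the labeled set with a few instances from each class.
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(20):  # annotation budget: 20 queries
    model.fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[pool])
    # Query the pool instance whose predicted class is least certain.
    uncertainty = 1.0 - proba.max(axis=1)
    query = pool[int(np.argmax(uncertainty))]
    labeled.append(query)  # the (omniscient) annotator reveals y[query]
    pool.remove(query)

accuracy = model.score(X, y)
```

Real-world AL strategies surveyed in the paper extend this loop, e.g., by selecting among multiple error-prone annotators, supporting richer query types, or weighing non-uniform annotation and misclassification costs.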
Citation
In: IEEE Access, Volume 9 (2021-12-14), pp. 166970-166989; eISSN: 2169-3536
Sponsorship
Funded by the publication fund of Universität Kassel
Citation
@article{doi:10.17170/kobra-202205036117,
  author={Herde, Marek and Huseljic, Denis and Sick, Bernhard and Calma, Adrian},
  title={A Survey on Cost Types, Interaction Schemes, and Annotator Performance Models in Selection Algorithms for Active Learning in Classification},
  journal={IEEE Access},
  volume={9},
  pages={166970--166989},
  year={2021},
  doi={10.1109/ACCESS.2021.3135514}
}
DOI
doi:10.1109/ACCESS.2021.3135514
License
Creative Commons Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/
Keywords (English)
Active learning; classification; error-prone annotators; human-in-the-loop learning; interactive learning; machine learning
Access
Open access, published version
Related items
Showing items related by title, author, creator and subject.
-
Conference publication
Wambsganss, Thiemo; Weber, Florian; Staufenberg, Tobias; Bott, Leopold; Söllner, Matthias (GITO Verlag, Berlin, 2020)