Date
2021-04-19
Article
Detecting Careless Responding in Survey Data Using Stochastic Gradient Boosting
Abstract
Careless responding is a bias in survey responses that disregards the actual item content, constituting a threat to the factor structure, reliability, and validity of psychological measurements. Different approaches have been proposed to detect aberrant responses such as probing questions that directly assess test-taking behavior (e.g., bogus items), auxiliary or paradata (e.g., response times), or data-driven statistical techniques (e.g., Mahalanobis distance). In the present study, gradient boosted trees, a state-of-the-art machine learning technique, are introduced to identify careless respondents. The performance of the approach was compared with established techniques previously described in the literature (e.g., statistical outlier methods, consistency analyses, and response pattern functions) using simulated data and empirical data from a web-based study, in which diligent versus careless response behavior was experimentally induced. In the simulation study, gradient boosting machines outperformed traditional detection mechanisms in flagging aberrant responses. However, this advantage did not transfer to the empirical study. In terms of precision, the results of both traditional and the novel detection mechanisms were unsatisfactory, although the latter incorporated response times as additional information. The comparison between the results of the simulation and the online study showed that responses in real-world settings seem to be much more erratic than can be expected from the simulation studies. We critically discuss the generalizability of currently available detection methods and provide an outlook on future research on the detection of aberrant response patterns in survey research.
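As a rough illustration of two of the traditional, data-driven indices the abstract refers to, the sketch below computes the squared Mahalanobis distance (a statistical outlier method) and a longest-run index, often called "longstring" (assumed here as one example of a response pattern function). It does not reproduce the study's gradient boosting models or its simulation design. The toy data, sample sizes, and function names are hypothetical and chosen only for illustration.

```python
import numpy as np


def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row from the sample centroid."""
    centered = X - X.mean(axis=0)
    # Pseudo-inverse keeps the sketch usable if the covariance is near-singular.
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)


def longstring(X):
    """Length of the longest run of identical consecutive responses per row."""
    out = []
    for row in X:
        longest = run = 1
        for prev, cur in zip(row[:-1], row[1:]):
            run = run + 1 if cur == prev else 1
            longest = max(longest, run)
        out.append(longest)
    return np.array(out)


rng = np.random.default_rng(0)
n_items = 20

# Hypothetical diligent respondents: one latent trait plus noise, 1-5 scale.
trait = rng.normal(size=(200, 1))
noise = rng.normal(scale=0.7, size=(200, n_items))
diligent = np.clip(np.rint(3 + trait + noise), 1, 5)

# Hypothetical careless respondents: uniformly random answers on the same scale.
careless = rng.integers(1, 6, size=(20, n_items)).astype(float)

X = np.vstack([diligent, careless])
d2 = mahalanobis_sq(X)   # larger values suggest multivariate outliers
runs = longstring(X)     # larger values suggest straight-lining
```

In this toy setup the random responders tend to receive larger distances because they ignore the correlation structure among items. The abstract cautions that real careless responding is considerably more erratic than simulated patterns, so an index that separates the groups here may not do so in empirical data.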
Citation
In: Educational and Psychological Measurement (EPM) Vol. 82 / Issue 1 (2021-04-19), pp. 29-56; eissn:1552-3888
Funding note
Funded under an open-access transformation agreement with the publisher
Cite
@article{doi:10.17170/kobra-202201045355,
author={Schroeders, Ulrich and Schmidt, Christoph and Gnambs, Timo},
title={Detecting Careless Responding in Survey Data Using Stochastic Gradient Boosting},
journal={Educational and Psychological Measurement (EPM)},
year={2021}
}
Date available
2022-01-21T13:53:00Z
Identifier
doi:10.17170/kobra-202201045355
URI
http://hdl.handle.net/123456789/13539
Publisher version
doi:10.1177/00131644211004708
Language
eng
Keywords
careless responding; gradient boosted trees; data cleaning; response times; outlier detection
Subject headings
Umfrage (survey); Daten (data); Datenauswertung (data analysis); Bias; Ausreißer <Statistik> (outlier, statistics); Fehlererkennung (error detection)
Classification
150
Version
publishedVersion
Access
open access
The following license terms are associated with this resource:
Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/