Validation and generalizability of machine learning prediction models on attrition in longitudinal studies

dc.date.accessioned	2022-04-19T12:05:04Z
dc.date.available	2022-04-19T12:05:04Z
dc.date.issued	2022-02-07
dc.description.sponsorship	Funded under an open access transformative agreement with the publisher	eng
dc.identifier	doi:10.17170/kobra-202203035823
dc.identifier.uri	http://hdl.handle.net/123456789/13763
dc.language.iso	eng	eng
dc.relation.doi	doi:10.1177/01650254221075034
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	*
dc.subject	machine learning	eng
dc.subject	attrition	eng
dc.subject	longitudinal studies	eng
dc.subject	predictive modeling	eng
dc.subject	generalizability	eng
dc.subject.ddc	150
dc.subject.swd	Maschinelles Lernen	ger
dc.subject.swd	Längsschnittuntersuchung	ger
dc.subject.swd	Prognosemodell	ger
dc.subject.swd	Fehlende Daten	ger
dc.title	Validation and generalizability of machine learning prediction models on attrition in longitudinal studies	eng
dc.type	Aufsatz
dc.type.version	publishedVersion
dcterms.abstract	Attrition in longitudinal studies is a major threat to the representativeness of the data and the generalizability of the findings. Typical approaches to addressing systematic nonresponse are either expensive and unsatisfactory (e.g., oversampling) or rely on the unrealistic assumption of data missing at random (e.g., multiple imputation). Thus, models that effectively predict who is most likely to drop out at subsequent occasions might offer the opportunity to take countermeasures (e.g., incentives). In the current study, we introduce a longitudinal model validation approach and examine whether attrition in two nationally representative longitudinal panel studies can be predicted accurately. We compare the performance of a basic logistic regression model with that of a more flexible, data-driven machine learning algorithm, gradient boosting machines. Our results show almost no difference in accuracy between the two modeling approaches, which contradicts the claims of similar studies on survey attrition. Prediction models could not be generalized across surveys and were less accurate when tested at a later survey wave. We discuss the implications of these findings for survey retention and for the use of complex machine learning algorithms, and give some recommendations for dealing with study attrition.	eng
dcterms.accessRights	open access
dcterms.creator	Jankowsky, Kristin
dcterms.creator	Schroeders, Ulrich
dcterms.source.identifier	eissn:1464-0651
dcterms.source.issue	Issue 2
dcterms.source.journal	International Journal of Behavioral Development (IJBD)	eng
dcterms.source.pageinfo	169-176
dcterms.source.volume	Volume 46
kup.iskup	false

Files

Original bundle

Name: 01650254221075034.pdf
Size: 383.44 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 3.03 KB
Format: Item-specific license agreed upon to submission

Collections

Artikel