Cross-validation for geospatial data: Estimating generalization performance in geostatistical problems

Research output: Contribution to journal › Article › peer-review

Standard

Cross-validation for geospatial data: Estimating generalization performance in geostatistical problems. / Wang, Jing; Hopkins, Laurel; Hallman, Tyler et al.
In: Transactions on Machine Learning Research, 04.10.2023.

Harvard

Wang, J, Hopkins, L, Hallman, T, Robinson, WD & Hutchinson, R 2023, 'Cross-validation for geospatial data: Estimating generalization performance in geostatistical problems', Transactions on Machine Learning Research.

APA

Wang, J., Hopkins, L., Hallman, T., Robinson, W. D., & Hutchinson, R. (2023). Cross-validation for geospatial data: Estimating generalization performance in geostatistical problems. Transactions on Machine Learning Research.

CBE

Wang J, Hopkins L, Hallman T, Robinson WD, Hutchinson R. 2023. Cross-validation for geospatial data: Estimating generalization performance in geostatistical problems. Transactions on Machine Learning Research.

MLA

Vancouver

Wang J, Hopkins L, Hallman T, Robinson WD, Hutchinson R. Cross-validation for geospatial data: Estimating generalization performance in geostatistical problems. Transactions on Machine Learning Research. 2023 Oct 4.

Author

Wang, Jing ; Hopkins, Laurel ; Hallman, Tyler et al. / Cross-validation for geospatial data: Estimating generalization performance in geostatistical problems. In: Transactions on Machine Learning Research. 2023.

RIS

TY - JOUR

T1 - Cross-validation for geospatial data: Estimating generalization performance in geostatistical problems

AU - Wang, Jing

AU - Hopkins, Laurel

AU - Hallman, Tyler

AU - Robinson, W Douglas

AU - Hutchinson, Rebecca

PY - 2023/10/4

Y1 - 2023/10/4

N2 - Geostatistical learning problems are frequently characterized by spatial autocorrelation in the input features and/or the potential for covariate shift at test time. These realities violate the classical assumption of independent, identically distributed data, upon which most cross-validation algorithms rely in order to estimate the generalization performance of a model. In this paper, we present a theoretical criterion for unbiased cross-validation estimators in the geospatial setting. We also introduce a new cross-validation algorithm to evaluate models, inspired by the challenges of geospatial problems. We apply a framework for categorizing problems into different types of geospatial scenarios to help practitioners select an appropriate cross-validation strategy. Our empirical analyses compare cross-validation algorithms on both simulated and several real datasets to develop recommendations for a variety of geospatial settings. This paper aims to draw attention to some challenges that arise in model evaluation for geospatial problems and to provide guidance for users.

AB - Geostatistical learning problems are frequently characterized by spatial autocorrelation in the input features and/or the potential for covariate shift at test time. These realities violate the classical assumption of independent, identically distributed data, upon which most cross-validation algorithms rely in order to estimate the generalization performance of a model. In this paper, we present a theoretical criterion for unbiased cross-validation estimators in the geospatial setting. We also introduce a new cross-validation algorithm to evaluate models, inspired by the challenges of geospatial problems. We apply a framework for categorizing problems into different types of geospatial scenarios to help practitioners select an appropriate cross-validation strategy. Our empirical analyses compare cross-validation algorithms on both simulated and several real datasets to develop recommendations for a variety of geospatial settings. This paper aims to draw attention to some challenges that arise in model evaluation for geospatial problems and to provide guidance for users.

M3 - Article

JO - Transactions on Machine Learning Research

JF - Transactions on Machine Learning Research

ER -
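
Illustrative sketch

The abstract notes that spatial autocorrelation breaks the i.i.d. assumption behind ordinary cross-validation. The sketch below is not the algorithm introduced in the paper, which this record does not describe. It shows generic spatial blocking, one common way to keep nearby points out of both the training and test sets at once. The function names, the block size, and the random coordinates are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def spatial_block_folds(coords, block_size, n_folds, seed=0):
    """Assign points to folds by grid block, so nearby points share a fold."""
    # Group point indices by the square grid cell (block) that contains them.
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        blocks[(int(x // block_size), int(y // block_size))].append(idx)

    # Shuffle whole blocks, then deal them out to folds round-robin.
    block_ids = sorted(blocks)
    random.Random(seed).shuffle(block_ids)
    folds = [[] for _ in range(n_folds)]
    for i, block_id in enumerate(block_ids):
        folds[i % n_folds].extend(blocks[block_id])
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


# Hypothetical data: 200 random points in a 10 x 10 region (assumption).
rng = random.Random(42)
coords = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(200)]
folds = spatial_block_folds(coords, block_size=2.5, n_folds=4)
```

Because whole blocks are held out together, a test point's close neighbours within its block are never in the training set. Points near a block boundary can still have training neighbours on the other side, so the block size would typically be chosen with the range of spatial autocorrelation in mind.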