Cross-validation for geospatial data: Estimating generalization performance in geostatistical problems

Jing Wang; Laurel Hopkins; Tyler Hallman; W Douglas Robinson; Rebecca Hutchinson

Research output: Contribution to journal › Article › peer-review

Electronic versions

Documents

1149_cross_validation_for_geospatia
Accepted author manuscript, 4.07 MB, PDF document

Jing Wang
Marine Mammal Institute, Hatfield Marine Science Center, Oregon State University, Newport, Oregon
Laurel Hopkins
Marine Mammal Institute, Hatfield Marine Science Center, Oregon State University, Newport, Oregon
Tyler Hallman
School of Environmental and Natural Sciences
W Douglas Robinson
Marine Mammal Institute, Hatfield Marine Science Center, Oregon State University, Newport, Oregon
Rebecca Hutchinson
Marine Mammal Institute, Hatfield Marine Science Center, Oregon State University, Newport, Oregon

Geostatistical learning problems are frequently characterized by spatial autocorrelation in the input features and/or the potential for covariate shift at test time. These realities violate the classical assumption of independent, identically distributed data, upon which most cross-validation algorithms rely in order to estimate the generalization performance of a model. In this paper, we present a theoretical criterion for unbiased cross-validation estimators in the geospatial setting. We also introduce a new cross-validation algorithm to evaluate models, inspired by the challenges of geospatial problems. We apply a framework for categorizing problems into different types of geospatial scenarios to help practitioners select an appropriate cross-validation strategy. Our empirical analyses compare cross-validation algorithms on both simulated and several real datasets to develop recommendations for a variety of geospatial settings. This paper aims to draw attention to some challenges that arise in model evaluation for geospatial problems and to provide guidance for users.

Original language	English
Journal	Transactions on Machine Learning Research
Status	Published - 4 Oct 2023
