Samodra, Guruh and Ngadisih, Ngadisih and Nugroho, Ferman Setia (2024) Benchmarking data handling strategies for landslide susceptibility modeling using random forest workflows. Artificial Intelligence in Geosciences, 5: 100093. ISSN 26665441
S2666544124000340.htm - Published Version
Abstract
Machine learning (ML) algorithms are frequently used in landslide susceptibility modeling. Different data handling strategies may generate variations in landslide susceptibility modeling, even when using the same ML algorithm. This research aims to compare the combinations of inventory data handling, cross validation (CV), and hyperparameter tuning strategies to generate landslide susceptibility maps. The results are expected to provide a general strategy for landslide susceptibility modeling using ML techniques. The authors employed eight landslide inventory data handling scenarios to convert a landslide polygon into a landslide point, i.e., the landslide point is located on the toe (minimum height), on the scarp (maximum height), at the center of the landslide, randomly inside the polygon (1 point), randomly inside the polygon (3 points), randomly inside the polygon (5 points), randomly inside the polygon (10 points), and 15 m grid sampling. Random forest models using CV–nonspatial hyperparameter tuning, spatial CV–spatial hyperparameter tuning, and spatial CV–forward feature selection–no hyperparameter tuning were applied for each data handling strategy. The combination generated 24 random forest ML workflows, which are applied using a complete inventory of 743 landslides triggered by Tropical Cyclone Cempaka (2017) in Pacitan Regency, Indonesia, and 11 landslide controlling factors. The results show that grid sampling with spatial CV and spatial hyperparameter tuning is favorable because the strategy can minimize overfitting, generate a relatively high-performance predictive model, and reduce the appearance of susceptibility artifacts in the landslide area. Careful data inventory handling, CV, and hyperparameter tuning strategies should be considered in landslide susceptibility modeling to increase the applicability of landslide susceptibility maps in practical application. © 2024 The Authors
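The count of 24 workflows in the abstract follows from crossing the eight inventory sampling scenarios with the three modeling strategies. As a minimal sketch of that arithmetic only, the Python snippet below enumerates the combinations. The label strings paraphrase the abstract and the variable names are hypothetical; none of this is taken from the authors' code.

```python
from itertools import product

# Labels paraphrase the abstract; they are illustrative, not the authors' identifiers.
SAMPLING_SCENARIOS = [
    "toe (minimum height)",
    "scarp (maximum height)",
    "center of landslide",
    "random inside polygon, 1 point",
    "random inside polygon, 3 points",
    "random inside polygon, 5 points",
    "random inside polygon, 10 points",
    "15 m grid sampling",
]

MODELING_STRATEGIES = [
    "CV + nonspatial hyperparameter tuning",
    "spatial CV + spatial hyperparameter tuning",
    "spatial CV + forward feature selection, no hyperparameter tuning",
]

# Every sampling scenario is paired with every modeling strategy.
workflows = list(product(SAMPLING_SCENARIOS, MODELING_STRATEGIES))

for index, (sampling, strategy) in enumerate(workflows, start=1):
    print(f"{index:2d}. {sampling} | {strategy}")
```

The combination the abstract reports as favorable corresponds to the pair of "15 m grid sampling" and "spatial CV + spatial hyperparameter tuning" in this listing.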
Item Type: | Article |
---|---|
Additional Information: | Cited by: 1; All Open Access, Green Open Access, Hybrid Gold Open Access |
Uncontrolled Keywords: | East Java; Indonesia; Pacitan; Benchmarking; Network security; Tropical cyclone; Cross validation; Hyper-parameter; Hyperparameter tuning; Landslide susceptibility; Machine-learning; Random forests; Sampling strategies; Spatial cross validations; Susceptibility; Work-flows; artifact; benchmarking; landslide; machine learning; performance assessment; polygon; species inventory; Decision trees |
Subjects: | G Geography. Anthropology. Recreation > GB Physical geography |
Divisions: | Faculty of Geography > Departemen Geografi Lingkungan |
Depositing User: | Sri Purwaningsih Purwaningsih |
Date Deposited: | 26 May 2025 07:20 |
Last Modified: | 26 May 2025 07:20 |
URI: | https://ir.lib.ugm.ac.id/id/eprint/18287 |