Abstract
In recent years, deep learning has emerged as a powerful approach in remote
sensing applications, particularly in segmentation and classification
techniques that play a crucial role in extracting significant land features
from satellite and aerial imagery. However, few studies have examined deep
learning for interactive segmentation in land cover classification tasks. In
this study, we aim to bridge the gap between
interactive segmentation and remote sensing image analysis by conducting a
benchmark study on various deep learning-based interactive segmentation models.
We assessed the performance of five state-of-the-art interactive segmentation
methods (SimpleClick, FocalClick, Iterative Click Loss (ICL), Reviving
Iterative Training with Mask Guidance for Interactive Segmentation (RITM), and
Segment Anything (SAM)) on two high-resolution aerial imagery datasets. To
improve segmentation results without requiring multiple models, we
introduced Cascade-Forward Refinement (CFR), an inference strategy for
interactive segmentation. We evaluated these interactive
segmentation methods on various land cover types, object sizes, and band
combinations in remote sensing. Surprisingly, the widely discussed SAM proved
ineffective for remote sensing images. In contrast, the point-based approach
used in the SimpleClick models consistently outperformed the other methods in
all experiments. Building upon these findings, we
developed a dedicated online tool called RSISeg for interactive segmentation of
remote sensing data. RSISeg incorporates a well-performing interactive model,
fine-tuned with remote sensing data. Additionally, we integrated the SAM model
into this tool. Compared to existing interactive segmentation tools, RSISeg
offers strong interactivity, modifiability, and adaptability to remote sensing
data.