TY - JOUR
T1 - Utilizing unsupervised learning, multi-view imaging, and CNN-based attention facilitates cost-effective wetland mapping
AU - Hu, Qiao
AU - Woldt, Wayne
AU - Neale, Christopher
AU - Zhou, Yuzhen
AU - Drahota, Jeff
AU - Varner, Dana
AU - Bishop, Andy
AU - LaGrange, Ted
AU - Zhang, Ligang
AU - Tang, Zhenghong
N1 - Publisher Copyright:
© 2021
PY - 2021/12/15
Y1 - 2021/12/15
AB - The combination of Unmanned/Unoccupied Aerial Vehicle (UAV) data and deep learning, especially convolutional neural networks (CNNs), offers robust new tools for precision land cover mapping. However, its successful application depends heavily on local experience that is rarely documented, which limits practical implementation. Cost-effective deep learning frameworks that can be deployed quickly are therefore needed. This study presents a deep learning adaptation framework, named Auto-UNet++, that aims to streamline wetland mapping tasks, including training data labeling and organization. The framework treats the mapping task as a complete semantic segmentation pipeline and integrates automatic strategies into each step to reduce human intervention. These strategies rely on standard computer vision techniques: multi-view (MV) imaging, i.e., highly overlapping UAV images of the same area (for labeling/voting); unsupervised clustering (for labeling); a multi-scale CNN (for feature extraction); and an attention mechanism, a CNN design that selects informative features from the input (for feature exploration/selection). The framework was tested on playa wetland mapping in the Rainwater Basin, Nebraska, USA, with multispectral UAV datasets. The multi-scale CNN achieved up to 87% overall accuracy and over 90% accuracy in water delineation. The results indicate that the multi-view and attention strategies can improve segmentation performance and, together with unsupervised learning, save considerable labor and expertise. Interestingly, the band/scale attention weights adapt to the land cover percentages of each input image, indicating that spatial context is captured. This finding highlights the potential of attention for automatic feature exploration, feature selection, and model interpretation. The framework illustrates a highly automated deep learning deployment on small MV datasets and thereby facilitates cost-effective wetland cover mapping. Although limitations remain, the study demonstrates where and how conventional segmentation pipelines can be improved in typical UAV wetland mapping tasks. The framework and findings are useful for similar applications (including non-UAV studies) with limited time, labor, and expertise for implementing sophisticated semantic segmentation models.
KW - Attention mechanism
KW - Automation
KW - Deep learning
KW - Feature selection
KW - Multi-view
KW - Multiscale CNN
KW - Network pruning
KW - Semantic segmentation
KW - UAV
KW - Wetland mapping
UR - http://www.scopus.com/inward/record.url?scp=85117578747&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85117578747&partnerID=8YFLogxK
U2 - 10.1016/j.rse.2021.112757
DO - 10.1016/j.rse.2021.112757
M3 - Article
AN - SCOPUS:85117578747
SN - 0034-4257
VL - 267
JO - Remote Sensing of Environment
JF - Remote Sensing of Environment
M1 - 112757
ER -