(1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; 2. Engineering School, Qinghai Institute of Technology, Xining 810016, China)
To address the insufficient understanding of small-scale feature semantics and the blurred segmentation of feature boundaries for multi-category targets in aerial remote sensing images with complex backgrounds, this paper designs a segmentation model that integrates the feature information of the backbone network and then classifies and reconstructs the features to improve segmentation. The model uses Swin-Transformer as the encoder and exploits its ability to capture global semantic information for feature extraction. The designed information grouping reconstruction convolution (IGRM) and channel classification reconstruction convolution (CRRM) classify the extracted features by the amount of information they carry and reconstruct them, which refines the segmentation of small-scale target features. Finally, through integrated up-sampling and down-sampling connections, the reconstructed features are fused with the features extracted by the encoder to form a multi-scale feature aggregation block that outputs the segmentation result. The model thereby achieves refined reconstruction of small-scale target features in multi-target scenes with complex backgrounds and generates high-quality segmentation maps with improved accuracy. Experimental results on the ISPRS Potsdam and ISPRS Vaihingen datasets show a mean intersection over union (mIoU) of 87.15% and 82.93%, respectively, and an overall accuracy (OA) of 91.53% and 91.4%, respectively. To verify the model's generalization ability for small-scale target feature extraction across multiple target categories, this paper also designs a comparative experiment on the car category in complex backgrounds. The experimental results show that the mIoU on the UAVid dataset reaches 67.86%.
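For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mIoU and OA are commonly computed from a pixel-level confusion matrix. It is not the authors' evaluation code. The function names and the toy labels are hypothetical, and the sketch assumes every class present in the data is scored, whereas benchmark protocols may exclude some classes such as clutter.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm):
    """OA: correctly classified pixels divided by all pixels."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """mIoU: average over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # class-c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp  # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example: six pixels, three classes (flattened label maps, hypothetical data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(f"OA = {overall_accuracy(cm):.4f}, mIoU = {mean_iou(cm):.4f}")
```

Because mIoU averages per-class scores, a small class such as car weighs as much as a large class such as building, while OA is dominated by the classes that cover the most pixels. This is one reason both metrics are usually reported together.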