FSSiamNet: Feature Fusion Shift Siamese Network for RGB-T Target Tracking
    Abstract:

    Existing target tracking algorithms struggle to extract deep-level features, do not fully exploit cross-modal information, and represent target features weakly. To address these problems, a feature fusion shift Siamese network (FSSiamNet) for RGB-T target tracking is proposed. First, the visible-modality SiameseRPN++ tracking framework is extended with an infrared-modality branch to obtain a multimodal tracking framework, and a ResNet50 with modified stride is designed as the feature extraction network to mine deep-level features of the target. Subsequently, a feature interactive learning module (FIM) is designed, which uses the discriminative information of one modality to guide the learning of target appearance features in the other modality; by mining cross-modal information in the feature space and channels, it strengthens the network's attention to foreground information. Thereafter, a multimodal feature fusion module (FAM) is designed, which computes the degree of feature fusion between the input visible-light and infrared images, spatially fuses the important features of the different modalities to remove redundant information, and reconstructs the multimodal image with a cascade fusion strategy to enhance the target feature representation. Finally, a feature space shift module (FSM) is designed, which splits the feature maps of the infrared-modality branch and shifts them in four different directions to enhance the edge representation of heat-source targets. Experiments on two RGB-T datasets validate the effectiveness of the proposed algorithm, and ablation experiments confirm the benefit of each individual module.
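
    The feature space shift step described in the abstract can be illustrated with a minimal sketch. The abstract only states that the infrared feature maps are split and shifted in four different directions; the channel-wise grouping, the one-pixel step, the zero fill of vacated cells, and the function names `shift_channel` and `shift_four_directions` are all assumptions made for illustration, not details confirmed by the paper. Plain Python lists are used to keep the sketch dependency-free; a real tracker would operate on tensors.

```python
def shift_channel(channel, dy, dx):
    """Shift one H x W channel by (dy, dx) pixels, filling vacated cells with 0.0."""
    height, width = len(channel), len(channel[0])
    shifted = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            # Copy only from source positions that lie inside the map.
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = channel[src_y][src_x]
    return shifted


def shift_four_directions(feature_map, step=1):
    """Split channels into four equal groups and shift each group differently.

    feature_map: list of C channels, each an H x W list of lists.
    Assumptions (not from the paper): channel-wise grouping, a shift of
    `step` pixels, and zero fill at the borders.
    """
    num_channels = len(feature_map)
    if num_channels == 0 or num_channels % 4 != 0:
        raise ValueError("number of channels must be a positive multiple of 4")
    group_size = num_channels // 4
    # (dy, dx) offset per group, in order: right, left, down, up.
    offsets = [(0, step), (0, -step), (step, 0), (-step, 0)]
    return [
        shift_channel(channel, *offsets[index // group_size])
        for index, channel in enumerate(feature_map)
    ]


# Toy infrared feature map: four identical 2 x 2 channels.
base = [[1.0, 2.0], [3.0, 4.0]]
shifted = shift_four_directions([base, base, base, base])
```

    In this toy case each channel moves one pixel in its own direction, so features at an object boundary are placed next to their neighbours from the other groups. This is one plausible reading of how shifting could emphasise edges, offered as an interpretation rather than as the authors' stated mechanism.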

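Cite this article

Li Haiyan (李海燕), Cao Yonghui (曹永辉), Lang Xun (郎恂), Li Haijiang (李海江). FSSiamNet: Feature Fusion Shift Siamese Network for RGB-T Target Tracking [J]. Journal of Hunan University (Natural Sciences), 2025, 52(4): 68-78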

History
  • Published online: 2025-04-28