Computer vision-based gesture recognition methods play a significant role in robot visual gesture interaction. Because insufficient feature representation and fusion lead to low accuracy, existing gesture segmentation and recognition methods fail to meet the requirements of practical applications. To address these issues, a lightweight two-stage end-to-end gesture recognition network called Fusing Gate Dual Stages Network (FGDSNet) is proposed. The network adopts a dual-branch structure in the segmentation stage. Existing dual-branch models often fuse detail features and semantic features directly, which causes detailed information to be obscured by blurry semantic information. In addition, the feature maps at different levels contain redundant information during network inference. Therefore, we embed a Cosine Similarity-KL Divergence Attention Module (CoSKLAM) and a Gate Filtering Module (GFM) between the local detail branch and the contextual semantic branch. These two modules facilitate the fusion of local and global features during feature extraction and filter out redundant information. Finally, the segmentation result and the original gesture image are used as inputs to the recognition network to predict gesture categories. Experiments show that the proposed network performs well in both gesture segmentation and gesture recognition, while also achieving real-time inference speed with a relatively small parameter size.
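To make the idea of similarity-guided, gated fusion between a detail branch and a semantic branch more concrete, the following is a minimal sketch in plain Python. It is not the FGDSNet implementation: the abstract does not specify how CoSKLAM or GFM are computed, so the function names (`cosine_similarity`, `kl_divergence`, `gated_fusion`), the scalar gate formula, and the use of flat feature vectors instead of feature maps are all assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)  # epsilon guards against zero vectors


def softmax(v):
    """Turn raw feature values into a probability distribution."""
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def gated_fusion(detail, semantic):
    """Hypothetical fusion rule (an assumption, not the paper's module).

    The gate is high when the two branches agree (high cosine similarity,
    low KL divergence) and low when they disagree, so that detail features
    are less likely to be swamped by semantic features.
    """
    cos = cosine_similarity(detail, semantic)
    kl = kl_divergence(softmax(detail), softmax(semantic))
    gate = 1.0 / (1.0 + math.exp(-(cos - kl)))  # sigmoid of an agreement score
    fused = [gate * s + (1.0 - gate) * d for d, s in zip(detail, semantic)]
    return fused, gate
```

Calling `gated_fusion([0.9, 0.1, 0.4], [0.2, 0.8, 0.5])` returns a fused vector of the same length together with a gate value strictly between 0 and 1. In an actual network the gate would more plausibly be a learned, per-pixel or per-channel tensor rather than a single scalar; the scalar form is used here only to keep the sketch short.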