Performer: A High-Performance Global-Local Model-Augmented with Dual Network Interaction Mechanism

Authors: Tan, Dayu; Hao, Rui; Hua, Linfeng; Xu, Qi; Su, Yansen; Zheng, Chunhou; Zhong, Weimin

Affiliations: Anhui University, Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Institutes of Physical Science and Information Technology, Hefei 230601, China; East China University of Science and Technology, Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, Shanghai 200237, China; Dalian University of Technology, School of Computer Science and Technology, Dalian 116024, China

Published in: IEEE Transactions on Cognitive and Developmental Systems (IEEE Trans. Cogn. Dev. Syst.)

Year: 2024


Subject Classification: 1205 [Management - Library, Information and Archival Science]; 0808 [Engineering - Electrical Engineering]; 08 [Engineering]

Keywords: Semantic Segmentation

Abstract: In deep learning, Convolutional Neural Networks (CNNs) focus on local information through convolutional kernels, while transformers attend to global information using self-attention mechanisms. Combining these distinct approaches enables a more comprehensive extraction of image features. However, the feature map dimensions of CNNs and transformers differ, leading to dimension mismatch issues when the two architectures are combined. Additionally, the parameter size of a hybrid model integrating both architectures remains large, making it difficult to train. To further augment the interpretation of complex image patterns, we present Performer, a dual-network architecture that seamlessly combines CNNs and transformers, resulting in a novel and efficient representation learning model. In the Performer model, we devise a unique interaction methodology for the CNN and transformer architectures so that they mutually enhance each other's image feature extraction capabilities. To counteract the issue of dimensionality mismatch, we also introduce a refined transformer block, an advancement over the transformer block of ViT. To validate the effectiveness of Performer, we conduct extensive experiments on both classification and segmentation tasks. Performer achieves an accuracy of 83.37% on the ImageNet-200 dataset. For semantic segmentation, Performer excels on the CamVid and Hippocampus datasets. On CamVid, our model achieves a mean Intersection over Union (mIoU) of 63.27% and a pixel accuracy of 92.11%, demonstrating superior performance in capturing fine details and handling complex scenes. The code is available at https://***/hlfthh/Performer. © 2016 IEEE.
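As a rough illustration of the dimension mismatch the abstract refers to, the sketch below (not taken from the paper; all module names and dimensions are hypothetical) shows how a CNN feature map of shape (B, C, H, W) can be flattened into a (B, N, D) token sequence for a transformer block and reshaped back, which is the kind of bridging a dual CNN-transformer design requires:

```python
# Minimal sketch of CNN <-> transformer feature bridging (illustrative only).
import torch
import torch.nn as nn

class CNNToTokens(nn.Module):
    """Flatten a (B, C, H, W) feature map into a (B, H*W, D) token sequence."""
    def __init__(self, in_channels: int, embed_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_channels, embed_dim)  # align channel dim to token dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.flatten(2).transpose(1, 2)          # (B, H*W, C)
        return self.proj(tokens)                       # (B, H*W, D)

class TokensToCNN(nn.Module):
    """Reshape a (B, H*W, D) token sequence back into a (B, C, H, W) feature map."""
    def __init__(self, embed_dim: int, out_channels: int):
        super().__init__()
        self.proj = nn.Linear(embed_dim, out_channels)

    def forward(self, tokens: torch.Tensor, h: int, w: int) -> torch.Tensor:
        x = self.proj(tokens)                          # (B, H*W, C)
        return x.transpose(1, 2).reshape(x.size(0), -1, h, w)

# Example: pass features from a CNN branch to a transformer branch and back.
feat = torch.randn(2, 256, 14, 14)                     # hypothetical CNN feature map
to_tokens = CNNToTokens(256, 384)
to_map = TokensToCNN(384, 256)
tokens = to_tokens(feat)                               # (2, 196, 384)
restored = to_map(tokens, 14, 14)                      # (2, 256, 14, 14)
print(tokens.shape, restored.shape)
```

This only demonstrates the shape conversion problem; the paper's actual interaction mechanism and refined transformer block are described in the publication itself.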
