A longstanding challenge in Super-Resolution (SR) is how to efficiently enhance high-frequency details in Low-Resolution (LR) images while maintaining semantic coherence. This is particularly crucial in practical applications where SR models are often deployed on low-power devices. To address this issue, we propose an innovative asymmetric SR architecture featuring the Multi-Depth Branch Module (MDBM). These MDBMs contain branches of different depths, designed to capture high- and low-frequency information simultaneously and efficiently. The hierarchical structure of the MDBM allows the deeper branch to gradually accumulate fine-grained local details under the contextual guidance of the shallower branch. We visualize this process using feature maps, and further demonstrate the rationality and effectiveness of this design using proposed novel Fourier spectral analysis methods. Moreover, our model exhibits more significant spectral differentiation between branches than existing branch networks, suggesting that the MDBM reduces feature redundancy and offers a more effective way to integrate high- and low-frequency information. Extensive qualitative and quantitative evaluations on various datasets show that our model generates structurally consistent and visually realistic HR images, achieving state-of-the-art (SOTA) results at a very fast inference speed. Our code is available at https://***/thy960112/MDBN.
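The multi-depth branch idea can be sketched as a toy 1-D example (this is an illustrative sketch, not the paper's implementation: the fixed low-pass and high-pass kernels, the branch depths, and the additive fusion are all assumptions standing in for learned components):

```python
import numpy as np

def conv1d_same(x, kernel):
    """1-D convolution with 'same' padding (toy stand-in for a conv layer)."""
    return np.convolve(x, kernel, mode="same")

def mdbm(x):
    """Toy two-branch module: a shallow low-frequency branch and a
    deeper branch whose stacked convolutions refine local detail."""
    smooth = np.array([0.25, 0.5, 0.25])    # low-pass kernel (assumed)
    sharpen = np.array([-0.5, 2.0, -0.5])   # high-pass-leaning kernel (assumed)

    # Shallow branch: one conv -> coarse, low-frequency context.
    shallow = conv1d_same(x, smooth)

    # Deep branch: several stacked convs -> progressively refined
    # high-frequency detail.
    deep = x
    for _ in range(3):
        deep = conv1d_same(deep, sharpen)

    # Fuse both branches (the paper's exact fusion scheme is not
    # specified in the abstract; simple addition is assumed here).
    return shallow + deep

signal = np.sin(np.linspace(0, 4 * np.pi, 64)) + 0.1 * np.random.randn(64)
out = mdbm(signal)
print(out.shape)  # (64,)
```

In the actual model the branch kernels are learned and operate on 2-D feature maps; the sketch only shows how branches of different depths can specialize in different frequency bands before fusion.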
The Swin Transformer is a variant of the Vision Transformer that constructs a hierarchical Transformer computing representations with shifted windows and window-based multi-head self-attention. This method handles the scale-invariance problem and performs well in many computer vision tasks. In image retrieval, high-quality feature descriptors are necessary to improve retrieval accuracy. This paper proposes a self-ensemble Swin Transformer network structure that fuses the features of different layers of the Swin Transformer network, eliminating noise points present in a single layer and improving retrieval performance. Two experiments were conducted: one on the In-shop Clothes Retrieval dataset and another on the Stanford Online Products dataset. The experiments showed that the proposed method significantly improves the retrieval performance of features extracted with the Vision Transformer, surpassing previous state-of-the-art image retrieval methods. In the second experiment, the feature maps of the trained model were visualized, revealing that the improved network focuses significantly less on noise points and more on salient image features than the original network. To effectively integrate consistent information across the multiple layers of the Swin Transformer, the model performs parameter self-ensembling of the internal blocks of the Swin ***
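A minimal sketch of the layer-fusion idea for retrieval descriptors (this is not the paper's exact procedure; the L2 normalization and uniform averaging are assumptions, and real per-layer features would come from Swin Transformer stages rather than random vectors):

```python
import numpy as np

def fuse_layer_features(layer_feats):
    """Average L2-normalized per-layer descriptors into one embedding:
    information consistent across layers is reinforced, while
    layer-specific noise tends to cancel out."""
    normed = [f / (np.linalg.norm(f) + 1e-12) for f in layer_feats]
    fused = np.mean(normed, axis=0)
    return fused / (np.linalg.norm(fused) + 1e-12)  # unit-length descriptor

# Three hypothetical per-layer feature vectors for one image.
rng = np.random.default_rng(0)
feats = [rng.normal(size=256) for _ in range(3)]
desc = fuse_layer_features(feats)
print(desc.shape)  # (256,)
```

The resulting unit-length descriptor can be compared with cosine similarity, the standard choice in retrieval pipelines.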
Some recent studies show that filters in convolutional neural networks (CNNs) have low color selectivity on datasets of natural scenes such as ImageNet. CNNs, bio-inspired by the visual cortex, are characterized by a hierarchical learning structure that appears to gradually transform the representation space. Inspired by the direct connection between the LGN and V4, which allows V4 to handle low-level information closer to the trichromatic input in addition to processed information coming from V2/V3, we propose adding a long skip connection (LSC) between the first and last blocks of the feature-extraction stage, allowing deeper parts of the network to receive information from shallower layers. This type of connection improves classification accuracy by combining simple visual and complex abstract features to create more color-selective ones. We apply this strategy to classic CNN architectures and quantitatively and qualitatively analyze the improvement in accuracy, focusing on color selectivity. The results show that, in general, skip connections improve accuracy, but the LSC improves it even more and enhances the color selectivity of the original CNN architectures. As a side result, we propose a new color-representation procedure for organizing and filtering feature maps, making their visualization more manageable for qualitative color-selectivity analysis.
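The long skip connection can be sketched abstractly as follows (an illustrative sketch only: the toy elementwise "blocks" stand in for real conv blocks, and additive fusion is an assumption, since the abstract does not specify how shallow and deep features are combined):

```python
import numpy as np

def forward_with_lsc(x, blocks, fuse=np.add):
    """Run x through a stack of blocks, then fuse the output of the
    FIRST block (shallow, near-input features) with the output of the
    LAST block via a long skip connection."""
    shallow = blocks[0](x)       # features close to the trichromatic input
    h = shallow
    for block in blocks[1:]:
        h = block(h)             # deep, abstract features
    return fuse(h, shallow)      # long skip: combine shallow + deep

# Hypothetical "blocks": simple elementwise maps standing in for conv blocks.
blocks = [lambda t: t * 2.0, lambda t: t + 1.0, lambda t: t * 0.5]
x = np.ones(4)
print(forward_with_lsc(x, blocks))  # [3.5 3.5 3.5 3.5]
```

Unlike the short residual skips of ResNet-style blocks, this single connection spans the whole feature-extraction stage, which is what lets the last block see low-level (e.g. color) information directly.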
ISBN:
(Print) 9783030814625; 9783030814618
In light of recent advancements in Artificial Intelligence (AI), the application of Machine Learning in the domains of Natural Language Processing (NLP) and Computer Vision is increasing by leaps and bounds. Machine Learning models are widely deployed in applications (apps) aiming at automation, mainly involving textual and image data: textual data brings Text Analytics (also known as NLP) into action, and image data brings Computer Vision into play. However, the performance of Machine Learning models in Text Analytics or Vision must be judged before deployment. Performance analysis of ML models is done with performance metrics, most importantly the AUC score in classification problems, but justification by numerical scores alone cannot establish the relevance of model performance to domain knowledge. In this paper, one standard NLP use-case and four Computer Vision use-cases are considered for ML model interpretability enhancement, shedding light on their relevance to the domain knowledge each use-case deals with.