Image Quality Assessment Based on Multi-Scale Representation and Shifting Transformer

Authors: Fu, Geng; Wang, Ziyu; Zhang, Cuijuan; Qi, Zerong; Hu, Mingzheng; Fu, Shujun; Zhang, Yunfeng

Affiliations: Shandong Univ Finance & Econ, Sch Comp & Artificial Intelligence, Jinan 250014, Peoples R China; Yidu Cent Hosp Weifang, Dept Intervent Therapy, Qingzhou, Peoples R China; Shandong Chengshi Elect Technol Co Ltd, Jinan 250002, Peoples R China; Shandong Univ, Sch Math, Jinan 250100, Peoples R China

Publication: IEEE Access

Year/Volume: 2025, Vol. 13

Pages: 24276-24286

Funding: National Natural Science Foundation of China [12071263, 11971269, 12171285, 12371492]; Young Taishan Scholars Program [tsqn202211321]; Distinguished Taishan Scholars Program [tstp20231251]; Innovation Ability Improvement Project of Science and Technology-Based SMEs in Shandong Province [2022TSGC2072]

Subjects: Databases; Transformers; Image quality; Computational modeling; Benchmark testing; Distortion; Spatial resolution; Residual neural networks; Predictive models; Measurement; Image quality assessment; multi-scale; no-reference; spatial pooling; shifted window transformer

Abstract: In automatic control systems, sensors and cameras are often used to capture images of the environment or processes being monitored. The quality of these images is paramount as it directly affects the system's ability to accurately interpret and respond to the visual information. Image Quality Assessment (IQA) is a crucial metric for intelligent control systems and computer vision tasks, such as surveillance, restoration, and fingerprint identification, significantly advancing algorithm development in these areas. Recently, transformer-based algorithms have excelled in computer vision, particularly in image classification, surpassing convolutional neural network (CNN) methods. To enhance IQA using transformers, we propose Swin-MIQT, a multi-scale spatial pooling transformer with shifted windows. As a no-reference (NR) IQA method, Swin-MIQT processes images at their original resolution without resizing or cropping, unlike standard vision transformers. By using shifted windows, we reduce computational load through efficient self-attention processing. Additionally, a spatial pyramid pooling layer captures diverse image quality information, improving IQA accuracy for distorted images. Comprehensive experiments show that Swin-MIQT achieves state-of-the-art performance on three synthetic distortion databases (LIVE, LIVE MD, TID2013) and competitive results on three authentic distortion databases (LIVE Challenge, KonIQ-10K, SPAQ). The outstanding performance demonstrates that Swin-MIQT possesses robust learning and generalization capabilities across all referenced distorted databases.
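
Illustrative note: the abstract states that a spatial pyramid pooling layer lets the model handle images at their original resolution. The sketch below shows the general idea of spatial pyramid pooling in NumPy, not the authors' implementation. The function name `spatial_pyramid_pool`, the max-pooling choice, and the pyramid levels `(1, 2, 4)` are assumptions made for illustration only; the record does not give the paper's actual configuration.

```python
import numpy as np


def spatial_pyramid_pool(features, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map over n x n grids for each n in levels.

    Hypothetical helper: returns a vector of length C * sum(n * n for n in levels),
    independent of H and W.
    """
    c, h, w = features.shape
    pooled = []
    for n in levels:
        # Fractional bin edges; floor/ceil below keeps every bin non-empty
        # even when n does not divide H or W evenly.
        row_edges = np.linspace(0, h, n + 1)
        col_edges = np.linspace(0, w, n + 1)
        for i in range(n):
            r0 = int(np.floor(row_edges[i]))
            r1 = int(np.ceil(row_edges[i + 1]))
            for j in range(n):
                c0 = int(np.floor(col_edges[j]))
                c1 = int(np.ceil(col_edges[j + 1]))
                # One C-dimensional descriptor per spatial bin.
                pooled.append(features[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)


rng = np.random.default_rng(0)
small = spatial_pyramid_pool(rng.standard_normal((8, 13, 17)))
large = spatial_pyramid_pool(rng.standard_normal((8, 48, 64)))
print(small.shape, large.shape)  # both (168,): 8 channels * (1 + 4 + 16) bins
```

Because the number of bins is fixed by the pyramid levels rather than by the input size, feature maps from images of different resolutions yield vectors of the same length, which is what allows a fixed-size regression head to follow without resizing or cropping the input.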
