Search Results - Inner Mongolia University Library

EfficientFCN: Holistically-Guided Decoding for Semantic Segmentation

16th European Conference on Computer Vision, ECCV 2020

Authors: Liu, Jianbo; He, Junjun; Zhang, Jiawei; Ren, Jimmy S.; Li, Hongsheng. Affiliations: CUHK-SenseTime Joint Laboratory, The Chinese University of Hong Kong, Shatin, Hong Kong; Shenzhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, China; SenseTime Research, Beijing, China

ISBN (digital): 9783030585747

ISBN (print): 9783030585730

Both performance and efficiency are important to semantic segmentation. State-of-the-art semantic segmentation algorithms are mostly based on dilated Fully Convolutional Networks (dilatedFCN), which adopt dilated convolutions in the backbone networks to extract high-resolution feature maps and achieve high segmentation performance. However, because many convolution operations are conducted on the high-resolution feature maps, such dilatedFCN-based methods incur large computational complexity and memory consumption. To balance performance and efficiency, there also exist encoder-decoder structures that gradually recover the spatial information by combining multi-level feature maps from the encoder. However, the performance of existing encoder-decoder methods is far from comparable with that of the dilatedFCN-based methods. In this paper, we propose the EfficientFCN, whose backbone is a common ImageNet-pretrained network without any dilated convolution. A holistically-guided decoder is introduced to obtain high-resolution, semantic-rich feature maps via the multi-scale features from the encoder. The decoding task is converted to novel codebook generation and codeword assembly tasks, which take advantage of the high-level and low-level features from the encoder. Such a framework achieves comparable or even better performance than state-of-the-art methods with only 1/3 of the computational cost. Extensive experiments on PASCAL Context, PASCAL VOC, and ADE20K validate the effectiveness of the proposed EfficientFCN. © 2020, Springer Nature Switzerland AG.

Keywords: Semantic Segmentation


VideoPipe 2022 Challenge: Real-World Video Understanding for Urban Pipe Inspection

arXiv, 2022

Authors: Liu, Yi; Zhang, Xuan; Li, Ying; Liang, Guixin; Jiang, Yabing; Qiu, Lixia; Tang, Haiping; Xie, Fei; Yao, Wei; Dai, Yi; Qiao, Yu; Wang, Yali. Affiliations: ShenZhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, China; Shenzhen Bwell Technology Co. Ltd, China; Shenzhen Longhua Drainage Co. Ltd, China; Shanghai AI Laboratory, Shanghai, China; SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, China

Video understanding is an important problem in computer vision. Currently, the well-studied task in this research is human action recognition, where the clips are manually trimmed from long videos and a single class of human action is assumed for each clip. However, we may face more complicated scenarios in industrial applications. For example, in the real-world urban pipe system, anomaly defects are fine-grained, multi-labeled, and domain-relevant. To recognize them correctly, we need to understand the detailed video content. For this reason, we propose to advance research areas of video understanding, with a shift from traditional action recognition to industrial anomaly analysis. In particular, we introduce two high-quality video benchmarks, namely QV-Pipe and CCTV-Pipe, for anomaly inspection in real-world urban pipe systems. Based on these new datasets, we will host two competitions: (1) Video Defect Classification on QV-Pipe and (2) Temporal Defect Localization on CCTV-Pipe. In this report, we describe the details of these benchmarks, the problem definitions of the competition tracks, the evaluation metric, and the result summary. We expect that this competition will bring new opportunities and challenges for video understanding in smart cities and beyond. The details of our VideoPipe challenge can be found at https://***. Copyright © 2022, The Authors. All rights reserved.

Keywords: Defects


A New Method for Detecting Altered Text in Document Images

2nd International Conference on Pattern Recognition and Artificial Intelligence, ICPRAI 2020

Authors: Nandanwar, Lokesh; Shivakumara, Palaiahnakote; Pal, Umapada; Lu, Tong; Lopresti, Daniel; Seraogi, Bhagesh; Chaudhuri, Bidyut B. Affiliations: Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India; National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China; Computer Science and Engineering, Lehigh University, Bethlehem, PA, United States

ISBN (digital): 9783030598303

ISBN (print): 9783030598297

As more and more office documents are captured, stored, and shared in digital format, and as image editing software becomes increasingly powerful, there is a growing concern about document authenticity. For example, text in property documents can be altered to make an illegal deal, or the date on an airline ticket can be altered to gain entry to airport terminals by breaching security. To prevent such illicit activities, this paper presents a new method for detecting altered text in a document. The proposed method explores the relationship between positive and negative coefficients of a DCT to extract the effect of distortions caused by tampering operations. Here we divide DCT coefficients into positive and negative classes, then reconstruct images from the inverse DCT of the respective positive and negative coefficients. Next, we perform Laplacian filtering over the reconstructed images to widen the gap between the values of text and other pixels. The filtered images of positive and negative coefficients are then fused by an averaging operation. For a fused image, we generate Canny and Sobel edge images in order to investigate the effect of distortion through quality measures, namely MSE, PSNR, and SSIM, which are used as features. In addition, for the fused image, the proposed method extracts features based on histograms over the residual images. The features are then passed on to a deep Convolutional Neural Network for classification. The proposed method is tested on our own dataset as well as two standard datasets, namely IMEI and the ICPR 2018 Fraud Contest dataset. The results show that the proposed method is effective and outperforms existing methods. © 2020, Springer Nature Switzerland AG.
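The first steps described in the abstract above (sign-based DCT split, inverse DCT of each class, Laplacian filtering, average fusion) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the whole-image orthonormal DCT, and the 4-neighbour Laplacian kernel are all assumptions, and the later edge-image, quality-measure, and CNN stages are omitted.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def laplacian(img):
    """4-neighbour Laplacian with edge-replicated borders (assumed kernel)."""
    p = np.pad(img, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * img

def split_and_fuse(img):
    """Split DCT coefficients by sign, invert each class, filter, and average."""
    img = np.asarray(img, dtype=float)
    rows, cols = dct_matrix(img.shape[0]), dct_matrix(img.shape[1])
    coeffs = rows @ img @ cols.T                   # forward 2-D DCT
    pos = np.where(coeffs > 0, coeffs, 0.0)        # positive class
    neg = np.where(coeffs < 0, coeffs, 0.0)        # negative class
    rec_pos = rows.T @ pos @ cols                  # inverse DCT of each class
    rec_neg = rows.T @ neg @ cols
    fused = 0.5 * (laplacian(rec_pos) + laplacian(rec_neg))
    return rec_pos, rec_neg, fused
```

Note that every operation in this literal reading is linear, so the fused result reduces to half the Laplacian of the input. The abstract does not say whether the reconstructions are clipped or rescaled to the image range before filtering; any such step would have to be added here.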

Keywords: Inverse problems


A New Journey from SDRTV to HDRTV

International Conference on Computer Vision (ICCV)

Authors: Xiangyu Chen; Zhengwen Zhang; Jimmy S. Ren; Lynhoo Tian; Yu Qiao; Chao Dong. Affiliations: ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences; SenseTime Research; Qing Yuan Research Institute, Shanghai Jiao Tong University, Shanghai; Shanghai AI Laboratory, Shanghai

ISBN (print): 9781665428132

Modern displays are capable of rendering video content with high dynamic range (HDR) and wide color gamut (WCG). However, most available resources are still in standard dynamic range (SDR). Therefore, there is an urgent demand to transform existing SDR-TV content into HDR-TV versions. In this paper, we conduct an analysis of the SDRTV-to-HDRTV task by modeling the formation of SDRTV/HDRTV content. Based on the analysis, we propose a three-step solution pipeline including adaptive global color mapping, local enhancement, and highlight generation. Moreover, the above analysis inspires us to present a lightweight network that utilizes global statistics as guidance to conduct image-adaptive color mapping. In addition, we construct a dataset using HDR videos in the HDR10 standard, named HDRTV1K, and select five metrics to evaluate the results of SDRTV-to-HDRTV algorithms. Furthermore, our final results achieve state-of-the-art performance in quantitative comparisons and visual quality. The code and dataset are available at https://***/chxy95/HDRTVNet.

Keywords: Visualization; Computer vision; Analytical models; Codes; Image color analysis; Pipelines; Transforms


Efficient Image Super-Resolution Using Pixel Attention

Workshops held at the 16th European Conference on Computer Vision, ECCV 2020

Authors: Zhao, Hengyuan; Kong, Xiangtao; He, Jingwen; Qiao, Yu; Dong, Chao. Affiliations: ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, China; SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; University of Chinese Academy of Sciences, Beijing, China

ISBN (print): 9783030670696

This work aims at designing a lightweight convolutional neural network for image super-resolution (SR). With simplicity in mind, we construct a concise and effective network with a newly proposed pixel attention scheme. Pixel attention (PA) is similar to channel attention and spatial attention in formulation. The difference is that PA produces 3D attention maps instead of a 1D attention vector or a 2D map. This attention scheme introduces fewer additional parameters but generates better SR results. On the basis of PA, we propose two building blocks for the main branch and the reconstruction branch, respectively. The first one, the SC-PA block, has the same structure as the Self-Calibrated convolution but with our PA layer. This block is much more efficient than conventional residual/dense blocks, owing to its two-branch architecture and attention scheme. The second one, the U-PA block, combines nearest-neighbor upsampling, convolution, and PA layers. It improves the final reconstruction quality with little parameter cost. Our final model, PAN, achieves performance similar to the lightweight networks SRResNet and CARN, but with only 272K parameters (17.92% of SRResNet and 17.09% of CARN). The effectiveness of each proposed component is also validated by an ablation study. The code is available at https://***/zhaohengyuan1/PAN. © 2020, Springer Nature Switzerland AG.
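To make the "3D attention map" idea in the abstract above concrete, here is a minimal NumPy sketch. It assumes the attention map is produced by a 1 × 1 convolution followed by a sigmoid; the abstract does not state this, so treat the layer choice and the names `pixel_attention`, `weight`, and `bias` as illustrative assumptions rather than the PAN code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pixel_attention(x, weight, bias):
    """Scale a (C, H, W) feature map by a (C, H, W) attention map.

    weight: (C, C) and bias: (C,) play the role of an assumed 1x1 convolution,
    i.e. the same linear map over channels applied at every pixel.
    """
    logits = np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]
    attn = sigmoid(logits)      # one coefficient per channel and per pixel
    return x * attn, attn

# For comparison, channel attention would use a map of shape (C, 1, 1)
# and spatial attention a map of shape (1, H, W).
```

With all-zero `weight` and `bias`, every attention coefficient is 0.5 and the output is simply half the input, which is a quick sanity check on the shapes and broadcasting.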

Keywords: Deep neural networks


Conditional Sequential Modulation for Efficient Global Image Retouching

16th European Conference on Computer Vision, ECCV 2020

Authors: He, Jingwen; Liu, Yihao; Qiao, Yu; Dong, Chao. Affiliations: ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, China; SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; University of Chinese Academy of Sciences, Beijing, China

ISBN (print): 9783030586003

Photo retouching aims at enhancing the aesthetic visual quality of images that suffer from photographic defects such as over/under exposure, poor contrast, and inharmonious saturation. In practice, photo retouching can be accomplished by a series of image processing operations. In this paper, we investigate some commonly used retouching operations and mathematically find that these pixel-independent operations can be approximated or formulated by multi-layer perceptrons (MLPs). Based on this analysis, we propose an extremely lightweight framework, Conditional Sequential Retouching Network (CSRNet), for efficient global image retouching. CSRNet consists of a base network and a condition network. The base network acts like an MLP that processes each pixel independently, and the condition network extracts the global features of the input image to generate a condition vector. To realize retouching operations, we modulate the intermediate features using Global Feature Modulation (GFM), whose parameters are transformed from the condition vector. Benefiting from the use of 1 × 1 convolution, CSRNet contains fewer than 37k trainable parameters, which is orders of magnitude smaller than existing learning-based methods. Extensive experiments show that our method achieves state-of-the-art performance on the benchmark MIT-Adobe FiveK dataset quantitatively and qualitatively. Code is available at https://***/hejingwenhejingwen/CSRNet. © 2020, Springer Nature Switzerland AG.
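The structure described in the abstract above (a per-pixel base network whose features are modulated by parameters derived from a global condition vector) can be sketched as below. This is a rough illustration under stated assumptions: the modulation is taken to be a per-channel scale and shift, the condition network is replaced by a global per-channel mean, and all names (`condition_vector`, `gfm`, `retouch`) are hypothetical rather than taken from the CSRNet code.

```python
import numpy as np

def condition_vector(img):
    """Stand-in condition network: global per-channel mean of an (H, W, 3) image."""
    return img.mean(axis=(0, 1))

def gfm(features, cond, w_scale, w_shift):
    """Assumed modulation: per-channel scale and shift derived from cond."""
    gamma = cond @ w_scale      # (C,)
    beta = cond @ w_shift       # (C,)
    return features * gamma + beta

def retouch(img, layers, cond):
    """Base network: a per-pixel MLP, equivalent to stacked 1x1 convolutions.

    layers is a list of (w, b, w_scale, w_shift) tuples, one per layer.
    """
    feat = img
    for i, (w, b, w_scale, w_shift) in enumerate(layers):
        feat = feat @ w + b                    # same linear map at every pixel
        feat = gfm(feat, cond, w_scale, w_shift)
        if i < len(layers) - 1:
            feat = np.maximum(feat, 0.0)       # ReLU between layers
    return feat
```

Because the only global information enters through `cond`, rearranging the pixels of the input while holding `cond` fixed rearranges the output in the same way, which matches the pixel-independent behaviour the abstract attributes to the base network.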

Keywords: Pixels


Learning Discriminative Representation For Facial Expression Recognition From Uncertainties

IEEE International Conference on Image Processing

Authors: Xingyu Fan; Zhongying Deng; Kai Wang; Xiaojiang Peng; Yu Qiao. Affiliations: Shenzhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China

ISBN (digital): 9781728163956

ISBN (print): 9781728163963

Recent progress on Facial Expression Recognition (FER) relies heavily on deep learning models trained with large-scale datasets. However, large-scale facial expression datasets always suffer from annotation uncertainties caused by ambiguous expressions, low-quality facial images, and the subjectiveness of annotators, which limits FER performance. To address this challenge, this paper introduces a novel Rayleigh loss and a weighted-softmax loss, approaching the problem from two aspects. First, we propose the Rayleigh loss to extract discriminative representations; it aims at minimizing within-class distances and maximizing inter-class distances simultaneously. Moreover, the Rayleigh loss has a Euclidean form, which makes it easy to optimize with SGD and to combine with other forms. Second, we introduce a weight to measure the uncertainty of a given sample by considering its distance to the class center. Extensive experiments on RAF-DB, FERPlus, and AffectNet show the effectiveness of our method with SOTA performance.

Keywords: Weight measurement; Emotion recognition; Uncertainty; Face recognition; Image processing; Measurement uncertainty; Task analysis


Temporal modulation network for controllable space-time video super-resolution

arXiv, 2021

Authors: Xu, Gang; Xu, Jun; Li, Zhen; Wang, Liang; Sun, Xing; Cheng, Ming-Ming. Affiliations: College of Computer Science, Nankai University, Tianjin, China; School of Statistics and Data Science, Nankai University, Tianjin, China; National Lab of Pattern Recognition, Institute of Automation, CAS, Beijing, China; Youtu Lab., Tencent, Shanghai, China

Space-time video super-resolution (STVSR) aims to increase the spatial and temporal resolutions of low-resolution and low-frame-rate videos. Recently, deformable convolution based methods have achieved promising STVSR performance, but they can only infer the intermediate frame pre-defined in the training stage. Besides, these methods undervalue the short-term motion cues among adjacent frames. In this paper, we propose a Temporal Modulation Network (TMNet) to interpolate arbitrary intermediate frame(s) with accurate high-resolution reconstruction. Specifically, we propose a Temporal Modulation Block (TMB) to modulate deformable convolution kernels for controllable feature interpolation. To exploit the temporal information effectively, we propose a Locally-temporal Feature Comparison (LFC) module, along with the Bi-directional Deformable ConvLSTM, to extract short-term and long-term motion cues in videos. Experiments on three benchmark datasets demonstrate that our TMNet outperforms previous STVSR methods. The code is available at https://***/CS-GangXu/TMNet. Copyright © 2021, The Authors. All rights reserved.

Keywords: Convolution


Survey on Deep Face Restoration: From Non-blind to Blind and Beyond

arXiv, 2023

Authors: Li, Wenjie; Wang, Mei; Zhang, Kai; Li, Juncheng; Li, Xiaoming; Zhang, Yuhang; Gao, Guangwei; Deng, Weihong; Lin, Chia-Wen. Affiliations: The Pattern Recognition and Intelligent System Laboratory, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China; The Computer Vision Lab, ETH Zürich, Zürich, Switzerland; The School of Communication and Information Engineering, Shanghai University, Shanghai, China; The Nanyang Technological University, Singapore; The Intelligent Visual Information Perception Laboratory, Institute of Advanced Technology, Nanjing University of Posts and Telecommunications, Nanjing, China; The Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan

Face restoration (FR) is a specialized field within image restoration that aims to recover low-quality (LQ) face images into high-quality (HQ) face images. Recent advances in deep learning technology have led to significant progress in FR methods. In this paper, we begin by examining the prevalent factors responsible for real-world LQ images and introduce degradation techniques used to synthesize LQ images. We also discuss notable benchmarks commonly utilized in the field. Next, we categorize FR methods based on different tasks and explain their evolution over time. Furthermore, we explore the various facial priors commonly utilized in the restoration process and discuss strategies to enhance their effectiveness. In the experimental section, we thoroughly evaluate the performance of state-of-the-art FR methods across various tasks using a unified benchmark. We analyze their performance from different perspectives. Finally, we discuss the challenges faced in the field of FR and propose potential directions for future advancements. The open-source repository corresponding to this work can be found at https://***/24wenjie-li/Awesome-Face-Restoration. Copyright © 2023, The Authors. All rights reserved.

Keywords: Restoration


Varicolored Image De-Hazing

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Akshay Dudhane; Kuldeep M. Biradar; Prashant W. Patil; Praful Hambarde; Subrahmanyam Murala. Affiliations: Computer Vision and Pattern Recognition Lab, Indian Institute of Technology Ropar, India; Indian Institute of Technology Ropar, Ropar, India

ISBN (digital): 9781728171685

ISBN (print): 9781728171692

The quality of images captured in bad weather is often affected by chromatic casts and low visibility due to the presence of atmospheric particles. Restoration of the color balance is ignored in most existing image de-hazing methods. In this paper, we propose a varicolored end-to-end image de-hazing network which restores the color balance in a given varicolored hazy image and recovers the haze-free image. The proposed network comprises 1) a Haze color correction (HCC) module and 2) a Visibility improvement (VI) module. The proposed HCC module provides the required attention to each color channel and generates a color-balanced hazy image. The proposed VI module then processes the color-balanced hazy image through a novel inception attention block to recover the haze-free image. We also propose a novel approach to generate a large-scale varicolored synthetic hazy image database. An ablation study has been carried out to demonstrate the effect of different factors on the performance of the proposed network for image de-hazing. Three benchmark synthetic datasets have been used for quantitative analysis of the proposed network. Visual results on a set of real-world hazy images captured in different weather conditions demonstrate the effectiveness of the proposed approach for varicolored image de-hazing.

Keywords: Image color analysis; Atmospheric modeling; Image restoration; Meteorology; Scattering; Feature extraction; Generators
