Search Results - Inner Mongolia University Library

arXiv, 2021

Authors: Akhtar, Anique; Gao, Wen; Li, Li; Li, Zhu; Jia, Wei; Liu, Shan. Affiliations: Department of Computer Science and Electrical Engineering, University of Missouri-Kansas City, Kansas City, MO 64110, United States; Tencent America, 661 Bryant St., Palo Alto, CA 94301, United States; CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

Photo-realistic point cloud capture and transmission are fundamental enablers of immersive visual communication. The coding of dynamic point clouds, especially video-based point cloud compression (V-PCC) developed by the MPEG standardization group, now delivers state-of-the-art compression efficiency. V-PCC projects point cloud patches onto 2D planes and encodes the sequence as 2D texture and geometry patch sequences. However, the quantization errors introduced by coding can cause compression artifacts that noticeably degrade the quality of experience (QoE). In this work, we develop a novel out-of-the-loop point cloud geometry artifact removal solution that can significantly improve reconstruction quality without additional bandwidth cost. Our framework consists of a point cloud sampling scheme, an artifact removal network, and an aggregation scheme. The sampling scheme employs cube-based neighborhood patch extraction to divide the point cloud into patches. The geometry artifact removal network then processes these patches to obtain artifact-removed patches, which the aggregation scheme merges into the final artifact-removed point cloud. We employ 3D deep convolutional feature learning for geometry artifact removal that jointly recovers both the quantization direction and the quantization noise level by exploiting projection and quantization priors. The simulation results demonstrate that the proposed method is highly effective and can considerably improve the quality of the reconstructed point cloud. Copyright © 2021, The Authors. All rights reserved.

Keywords: geometry
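
The sample, process, aggregate pipeline described in the abstract above can be illustrated with a minimal sketch. It assumes a non-overlapping axis-aligned cube grid and uses an identity placeholder where the artifact removal network would run; the function names and the `cube_size` parameter are hypothetical, and the paper's actual neighborhood extraction and aggregation may differ.

```python
from collections import defaultdict


def extract_cube_patches(points, cube_size):
    """Group 3D points into axis-aligned cubes with edge length cube_size."""
    patches = defaultdict(list)
    for x, y, z in points:
        # Floor division assigns each point to the index of its enclosing cube.
        key = (int(x // cube_size), int(y // cube_size), int(z // cube_size))
        patches[key].append((x, y, z))
    return dict(patches)


def aggregate_patches(patches):
    """Merge per-cube patches into one point list, dropping duplicate points."""
    merged = set()
    for patch_points in patches.values():
        merged.update(patch_points)
    return sorted(merged)


points = [(0, 0, 0), (1, 2, 3), (4, 0, 1), (5, 5, 5), (7, 7, 7)]
patches = extract_cube_patches(points, cube_size=4)

# Placeholder (assumption): a trained network would refine each patch here.
processed = {key: list(patch) for key, patch in patches.items()}

restored = aggregate_patches(processed)
```

Because the placeholder leaves every patch unchanged, `restored` contains exactly the input points, which makes the split and merge steps easy to check in isolation.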

An Energy Efficient Carry-Free Inner Product Unit

2019 IEEE International Conference on Signal, Information and Data Processing (ICSIDP 2019)

Authors: Yan, Wen; Ercegovac, Milos D. Affiliations: Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing, China; University of California, Computer Science Department, Los Angeles, CA, United States

ISBN: (print) 9781728123455

An energy-efficient truncated inner product unit is proposed in this paper. The proposed unit is pipelined and processes the m pairs of n-bit operands serially, so that only one unit is required and it can be reused in each recurrence. A truncated inner product is produced, so no accumulation is needed for the lower part. The array reduction unit is bit-level pipelined to improve throughput and reduce the initial delay. A two-level [4:2] adder is proposed to accumulate redundant vectors with non-uniform input arrival. The proposed inner product unit is compared with other truncated inner product units using fully merged arithmetic, partially merged arithmetic and the counter-based inner product for different precisions and vector lengths. The proposed unit achieves a significant reduction in area, power and energy. © 2019 IEEE.

Keywords: Pipelines

Dominant Physical Scattering Mechanism Analysis for GF-3 Typical Ground Objects by Polarimetric Decomposition

IEEE International Symposium on Geoscience and Remote Sensing (IGARSS)

Authors: Jin Yan; Qiu Xiaolan; Lijia Huang. Affiliations: Laboratory of Spatial Information Intelligent Processing System, Institute of Electronics, Chinese Academy of Sciences, Suzhou, China; Key Laboratory of Technology in Geo-spatial Information Processing and Application System, CAS

The dominant scattering mechanism is of great significance for ground object classification and target detection. The quality of polarimetric data can also be verified by checking the dominant scattering mechanism of known ground objects. To improve application performance, this paper studies the dominant scattering mechanism of typical GF-3 ground objects based on a large number of data slices. The GF-3 fully polarimetric data slices are classified using the MODIS global classification map, and a GF-3 slice library of typical ground objects is constructed. Based on a large number of GF-3 samples, we carry out a statistical analysis of the dominant scattering mechanism separation results for typical GF-3 ground objects (buildings, woodland, cultivated land, grassland and waters) by means of H/alpha/A decomposition. The quantitative results reveal the polarimetric scattering features of different ground objects and provide a reference for fully polarimetric SAR applications.

Keywords: Scattering; Synthetic aperture radar; Entropy; Buildings; Statistical analysis; Radar polarimetry; Sensors
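
For readers unfamiliar with the h/alpha/A decomposition named in the abstract above, the sketch below computes the three standard parameters (entropy, mean alpha angle, anisotropy) from the eigenvalues of a 3x3 coherency matrix. It assumes the eigen-decomposition and the per-eigenvector alpha angles were obtained upstream; the function name and the sample values are hypothetical and are not taken from the paper.

```python
import math


def h_alpha_a(eigenvalues, alphas_deg):
    """Return (entropy H, mean alpha in degrees, anisotropy A).

    eigenvalues: three non-negative eigenvalues of the coherency matrix.
    alphas_deg:  alpha angle (degrees) associated with each eigenvector.
    """
    # Sort so that lambda1 >= lambda2 >= lambda3, keeping angles paired.
    pairs = sorted(zip(eigenvalues, alphas_deg), reverse=True)
    total = sum(lam for lam, _ in pairs)
    probs = [lam / total for lam, _ in pairs]

    # Entropy uses a base-3 logarithm, so H lies in [0, 1].
    entropy = -sum(p * math.log(p, 3) for p in probs if p > 0)

    # Mean alpha is the probability-weighted average of the alpha angles.
    mean_alpha = sum(p * alpha for p, (_, alpha) in zip(probs, pairs))

    # Anisotropy compares the two smaller eigenvalues.
    lam2, lam3 = pairs[1][0], pairs[2][0]
    anisotropy = (lam2 - lam3) / (lam2 + lam3) if (lam2 + lam3) > 0 else 0.0
    return entropy, mean_alpha, anisotropy


# Fully random scattering: equal eigenvalues give H close to 1 and A = 0.
random_case = h_alpha_a([1.0, 1.0, 1.0], [0.0, 45.0, 90.0])

# A single dominant mechanism: H is 0 and mean alpha equals its own angle.
single_case = h_alpha_a([1.0, 0.0, 0.0], [10.0, 45.0, 90.0])
```

Low entropy with a small alpha angle is conventionally read as surface scattering, while high entropy indicates a mix of mechanisms; the thresholds used for classification are not reproduced here.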

A coarse-to-fine framework for learned color enhancement with non-local attention

arXiv, 2019

Authors: Shan, Chaowei; Zhang, Zhizheng; Chen, Zhibo. Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

Automatic color enhancement aims to adaptively adjust photos to expected styles and tones. For current learned methods in this field, it is hard for a single model to handle global harmonious perception and local details well at the same time. To address this problem, we propose a coarse-to-fine framework with non-local attention for color enhancement. Within our framework, we divide the enhancement process into channel-wise enhancement and pixel-wise refinement, performed by two cascaded Convolutional Neural Networks (CNNs). In channel-wise enhancement, our model predicts a global linear mapping for the RGB channels of input images to perform global style adjustment. In pixel-wise refinement, we learn a refining mapping using residual learning for local adjustment. Further, we adopt a non-local attention block to capture long-range dependencies from global information for subsequent fine-grained local refinement. We evaluate our proposed framework on a commonly used benchmark and conduct extensive experiments to validate each technical component within it. Copyright © 2019, The Authors. All rights reserved.

Keywords: Pixels
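
The two stages named in the abstract above (a global linear mapping on the RGB channels, followed by a pixel-wise residual refinement) can be sketched as follows. This is only an illustration under stated assumptions: the mapping is taken to be a 3x3 matrix plus bias shared by all pixels, the residual is supplied by hand rather than predicted by a CNN, and all function names are hypothetical.

```python
def _clamp(value):
    """Keep a channel value inside the [0, 1] range."""
    return min(1.0, max(0.0, value))


def apply_global_mapping(image, matrix, bias):
    """Coarse stage: apply one 3x3 linear map plus bias to every RGB pixel."""
    return [
        [
            tuple(
                _clamp(sum(w * c for w, c in zip(matrix[k], pixel)) + bias[k])
                for k in range(3)
            )
            for pixel in row
        ]
        for row in image
    ]


def add_residual(image, residual):
    """Fine stage: add a per-pixel residual to the coarse result."""
    return [
        [
            tuple(_clamp(c + r) for c, r in zip(pixel, res_pixel))
            for pixel, res_pixel in zip(row, res_row)
        ]
        for row, res_row in zip(image, residual)
    ]


# A 1x1 image; in the paper both the mapping and the residual are learned.
image = [[(0.25, 0.5, 0.75)]]
matrix = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # boost red
bias = [0.0, 0.0, 0.0]

coarse = apply_global_mapping(image, matrix, bias)
fine = add_residual(coarse, [[(0.0, 0.25, -0.25)]])
```

Splitting the work this way keeps the global style change cheap (a handful of parameters per image) and leaves only local corrections to the per-pixel stage.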

No-reference light field image quality assessment based on micro-lens image

arXiv, 2019

Authors: Luo, Ziyuan; Zhou, Wei; Shi, Likun; Chen, Zhibo. Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

Light field image quality assessment (LF-IQA) plays a significant role because it guides the acquisition, processing and application of Light Field (LF) content. The LF can be represented as a 4-D signal, and its quality depends on both angular consistency and spatial quality. However, few existing LF-IQA methods concentrate on the effects caused by angular inconsistency. In particular, no-reference methods lack effective utilization of 2-D angular information. In this paper, we focus on measuring 2-D angular consistency for LF-IQA. The Micro-Lens Image (MLI) refers to the angular domain of the LF image, which simultaneously records the angular information in both horizontal and vertical directions. Since the MLI contains 2-D angular information, we propose a No-Reference Light Field image Quality assessment model based on MLI (LF-QMLI). Specifically, we first use the Global Entropy Distribution (GED) and the Uniform Local Binary Pattern descriptor (ULBP) to extract features from the MLI, and then pool them together to measure angular consistency. In addition, the information entropy of the Sub-Aperture Image (SAI) is adopted to measure spatial quality. Extensive experimental results show that LF-QMLI achieves state-of-the-art performance. Copyright © 2019, The Authors. All rights reserved.

Keywords: Image quality
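
The abstract above relies on information entropy as a quality feature. As a minimal sketch, the Shannon entropy of an image's intensity histogram can be computed as shown below; the flat list of gray levels and the function name are assumptions for illustration, and the paper's GED and SAI features are built on top of this basic quantity in ways not reproduced here.

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy (in bits) of a flat sequence of gray levels."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


flat_patch = [5] * 8            # a constant patch carries no information
two_levels = [0, 0, 1, 1]       # two equally likely levels
four_levels = [0, 1, 2, 3]      # four equally likely levels

entropies = [image_entropy(p) for p in (flat_patch, two_levels, four_levels)]
```

Entropy grows with the number of equally likely gray levels, so a constant patch scores 0 bits and the four-level patch scores 2 bits.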

Learned fast HEVC intra coding

arXiv, 2019

Authors: Chen, Zhibo; Shi, Jun; Li, Weiping. Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

In High Efficiency Video Coding (HEVC), excellent rate-distortion (RD) performance is achieved in part by a flexible quadtree coding unit (CU) partition and a large number of intra-prediction modes. This RD performance comes at the expense of much higher computational complexity. In this paper, we propose a learned fast HEVC intra coding (LFHI) framework that takes into account the comprehensive factors of fast intra coding to reach an improved configurable tradeoff between coding performance and computational complexity. First, we design a low-complexity shallow asymmetric-kernel CNN (AK-CNN) to efficiently extract the local directional texture features of each block for both fast CU partition and fast intra-mode decision. Second, we introduce the concept of the minimum number of RDO candidates (MNRC) into fast mode decision, which uses AK-CNN to predict the minimum number of best candidates for RDO calculation and further reduce the computation of intra-mode selection. Third, an evolution optimized threshold decision (EOTD) scheme is designed to achieve configurable complexity-efficiency tradeoffs. Finally, we propose an interpolation-based prediction scheme that allows our framework to generalize to all quantization parameters (QPs) without training the network on each QP. The experimental results demonstrate that the LFHI framework has a high degree of parallelism and achieves a much better complexity-efficiency tradeoff, with up to 75.2% intra-mode encoding complexity reduction and negligible rate-distortion performance degradation, superior to existing fast intra-coding schemes. Copyright © 2019, The Authors. All rights reserved.

Keywords: Signal distortion

Deep learning-based video coding: A review and a case study

arXiv, 2019

Authors: Liu, Dong; Li, Yue; Lin, Jianping; Li, Houqiang; Wu, Feng. Affiliation: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

The past decade has witnessed the great success of deep learning technology in many disciplines, especially in computer vision and image processing. However, deep learning-based video coding remains in its infancy. This paper reviews representative works on using deep learning for image/video coding, which has been an actively developing research area since 2015. We divide the related works into two categories: new coding schemes that are built primarily upon deep networks (deep schemes), and deep network-based coding tools (deep tools) that are used within traditional coding schemes or together with traditional coding tools. For deep schemes, pixel probability modeling and the auto-encoder are the two approaches, which can be viewed as a predictive coding scheme and a transform coding scheme, respectively. For deep tools, several techniques have been proposed that use deep learning to perform intra-picture prediction, inter-picture prediction, cross-channel prediction, probability distribution prediction, transform, post- or in-loop filtering, down- and up-sampling, as well as encoding optimizations. According to the newest reports, deep schemes have achieved comparable or even higher compression efficiency than state-of-the-art traditional schemes, such as the High Efficiency Video Coding (HEVC) based scheme, for image coding; deep tools have demonstrated compression capability beyond HEVC for video coding. However, deep schemes have not yet reached the current level of HEVC for video coding, and deep tools remain largely unexplored in many aspects, including the tradeoff between compression efficiency and encoding/decoding complexity, the optimization for perceptual naturalness or semantic quality, speciality and universality, the federated design of multiple deep tools, and so on. In the hope of advocating research on deep learning-based video coding, we present a case study of our developed prototype video codec, namely Deep Learning Vi

Keywords: Video signal processing

RC-CNN: Representation-Consistent Convolutional Neural Networks for Achieving Transformation Invariance

IEEE International Conference on Systems, Man and Cybernetics

Authors: Jun Gu; Anfeng He; Xinmei Tian. Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, Anhui, China

Convolutional neural networks (CNNs) are powerful and have achieved state-of-the-art performance in many visual recognition tasks. Despite their impressive performance, CNNs still cannot remain invariant when spatial transformations are applied to images. Herein, we propose representation-consistent neural networks to solve this problem. By introducing consistency losses between the representations of transformed images in different layers, the recognition performance on transformed images is significantly improved. The model not only learns to map the transformed images to the pre-defined labels; each layer also learns to generate invariant representations when the input images are transformed. All the characteristics of transformation invariance are embedded in the model, which means that no extra parameters or computations are introduced in the well-trained model. Comparative experiments demonstrate the superiority of our model when learning invariance to rotation, translation, and scaling on large-scale image recognition and retrieval tasks.

Keywords: Computational modeling; Feature extraction; Training; Image recognition; Data models; Task analysis; Kernel

W-net: Simultaneous segmentation of multi-anatomical retinal structures using a multi-task deep neural network

arXiv, 2020

Authors: Zhao, Hongwei; Peng, Chengtao; Liu, Lei; Li, Bin. Affiliations: School of Information Science and Technology, University of Science and Technology of China, Hefei, Anhui 230022, China; Department of Precision Machinery and Instrumentation, University of Science and Technology of China, Hefei, Anhui 230022, China; CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, Anhui 230026, China

Segmentation of multiple anatomical structures is of great importance in medical image analysis. In this study, we propose a W-net to simultaneously segment both the optic disc (OD) and the exudates in retinal images based on a multi-task learning (MTL) scheme. We introduce a class-balanced loss and a multi-task weighted loss to alleviate the class imbalance problem and to improve the robustness and generalization of the W-net. We demonstrate the effectiveness of our approach with five-fold cross-validation experiments on two public datasets, e_ophtha_EX and DiaRetDb1. We achieve F1-scores of 94.76% and 95.73% for OD segmentation, and 92.80% and 94.14% for exudate segmentation. To further verify the generalization of the proposed method, we apply the trained model to the DRIONS-DB dataset for OD segmentation and to the MESSIDOR dataset for exudate segmentation. Our results demonstrate that, by choosing the optimal weights for each task, the MTL-based W-net outperforms separate models trained individually on each task. Code and pre-trained models will be available at: https://***/FundusResearch/MTL_for_OD_and_***. Copyright © 2020, The Authors. All rights reserved.

Keywords: Deep neural networks
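
The F1-scores quoted in the abstract above follow the usual definition for binary segmentation masks. A minimal sketch is given below, assuming flat 0/1 masks of equal length; the function name is hypothetical, and returning 1.0 when both masks are empty is a convention chosen here, not something stated in the paper.

```python
def f1_score(pred_mask, true_mask):
    """F1 score between two flat binary masks of equal length."""
    pairs = list(zip(pred_mask, true_mask))
    tp = sum(1 for p, t in pairs if p and t)          # true positives
    fp = sum(1 for p, t in pairs if p and not t)      # false positives
    fn = sum(1 for p, t in pairs if not p and t)      # false negatives
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 2 * tp / denom if denom else 1.0


partial = f1_score([1, 1, 0, 0], [1, 0, 1, 0])   # one hit, one miss each way
perfect = f1_score([1, 0, 1], [1, 0, 1])
empty = f1_score([0, 0], [0, 0])
```

Because F1 ignores true negatives, it is not inflated by the large background area that dominates retinal images, which is one reason it is a common choice for small structures such as exudates.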

Two-stream action recognition-oriented video super-resolution

arXiv, 2019

Authors: Zhang, Haochen; Liu, Dong; Xiong, Zhiwei. Affiliation: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

We study the video super-resolution (SR) problem for facilitating video analytics tasks, e.g., action recognition, rather than for visual quality. Popular action recognition methods based on convolutional networks, exemplified by two-stream networks, are not directly applicable to video of low spatial resolution. This can be remedied by performing video SR prior to recognition, which motivates us to improve the SR procedure for recognition accuracy. Tailored to two-stream action recognition networks, we propose two video SR methods for the spatial and temporal streams, respectively. On the one hand, we observe that regions with action are more important to recognition, and we propose an optical-flow guided weighted mean-squared-error loss for our spatial-oriented SR (SoSR) network to emphasize the reconstruction of moving objects. On the other hand, we observe that existing video SR methods incur temporal discontinuity between frames, which also worsens recognition accuracy, and we propose a siamese network for our temporal-oriented SR (ToSR) training that emphasizes the temporal continuity between consecutive frames. We perform experiments using two state-of-the-art action recognition networks and two well-known datasets, UCF101 and HMDB51. The results demonstrate the effectiveness of our proposed SoSR and ToSR in improving recognition accuracy. Copyright © 2019, The Authors. All rights reserved.

Keywords: Optical resolving power
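
The optical-flow guided weighted mean-squared-error loss mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that each pixel's weight is simply its flow magnitude, normalized over the frame, with a fallback to plain MSE when nothing moves; the paper's actual weighting function is not given in the abstract, so this form and the function name are hypothetical.

```python
import math


def flow_weighted_mse(pred, target, flow):
    """Weighted MSE where pixels with larger optical flow count more.

    pred, target: flat lists of pixel intensities.
    flow:         list of (u, v) flow vectors, one per pixel.
    """
    weights = [math.hypot(u, v) for u, v in flow]
    total = sum(weights)
    if total == 0:
        # Assumed fallback: a static frame reduces to ordinary MSE.
        weights = [1.0] * len(pred)
        total = float(len(pred))
    return sum(w * (p - t) ** 2 for w, p, t in zip(weights, pred, target)) / total


target = [0.0, 0.0]
pred = [0.0, 2.0]                       # only the second pixel is wrong

error_on_moving = flow_weighted_mse(pred, target, [(0, 0), (3, 4)])
error_on_static = flow_weighted_mse(pred, target, [(3, 4), (0, 0)])
no_motion = flow_weighted_mse(pred, target, [(0, 0), (0, 0)])
```

The same reconstruction error is penalized fully when it falls on the moving pixel and not at all when it falls on the static one, which is the emphasis on moving objects that the abstract describes.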
