Search results - Inner Mongolia University Library

W-net: Simultaneous segmentation of multi-anatomical retinal structures using a multi-task deep neural network

arXiv, 2020

Authors: Zhao, Hongwei; Peng, Chengtao; Liu, Lei; Li, Bin. Affiliations: School of Information Science and Technology, University of Science and Technology of China, Hefei, Anhui 230022, China; Department of Precision Machinery and Instrumentation, University of Science and Technology of China, Hefei, Anhui 230022, China; CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, Anhui 230026, China

Segmentation of multiple anatomical structures is of great importance in medical image analysis. In this study, we proposed a W-net to simultaneously segment both the optic disc (OD) and the exudates in retinal images based on the multi-task learning (MTL) scheme. We introduced a class-balanced loss and a multi-task weighted loss to alleviate the class imbalance problem and to improve the robustness and generalization of the W-net. We demonstrated the effectiveness of our approach with five-fold cross-validation experiments on two public datasets, e_ophtha_EX and DiaRetDb1. We achieved F1-scores of 94.76% and 95.73% for OD segmentation, and 92.80% and 94.14% for exudate segmentation. To further demonstrate the generalization of the proposed method, we applied the trained model to the DRIONS-DB dataset for OD segmentation and to the MESSIDOR dataset for exudate segmentation. Our results demonstrated that, by choosing the optimal weights of each task, the MTL-based W-net outperformed separate models trained individually on each task. Code and pre-trained models will be available at: https://***/FundusResearch/MTL_for_OD_and_***. Copyright © 2020, The Authors. All rights reserved.

Keywords: Deep neural networks
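
The class-balanced loss, the multi-task weighted loss, and the F1-score mentioned in the abstract can be illustrated with a minimal sketch. It assumes per-pixel binary cross-entropy for each task and a simple inverse-frequency class weighting; the function names, the weighting scheme, and the task weights are illustrative assumptions, not the authors' formulation.

```python
import math

def class_balanced_bce(probs, labels, eps=1e-7):
    """Binary cross-entropy with each class weighted by the frequency of
    the opposite class (an assumed balancing scheme, not the paper's)."""
    n = len(labels)
    n_pos = sum(labels)
    w_pos = (n - n_pos) / n  # rare foreground pixels get the larger weight
    w_neg = n_pos / n
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
    return total / n

def multi_task_loss(task_losses, task_weights):
    """Weighted sum of per-task losses (e.g. OD and exudate segmentation)."""
    return sum(w * loss for w, loss in zip(task_weights, task_losses))

def f1_score(pred, labels):
    """F1-score of binary predictions against binary labels."""
    tp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With one foreground pixel out of four, missing that pixel is penalized more heavily than a false alarm on a background pixel, which is the intended effect of the balancing.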

Object-based method for optical and SAR images change detection

IET International Radar Conference 2018, IRC 2018

Authors: Wan, Ling; Zhang, Tao; You, Hongjian. Affiliations: Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Beijing 100190, China; Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100039, China

This study introduces an automatic method for change detection in multi-sensor remote-sensing images (e.g. optical and synthetic aperture radar (SAR) images). Because object-based image analysis can effectively reduce spurious changes and the sensitivity to registration, multi-date segmentation is first employed to generate image objects that are homogeneous in the spectral, spatial, and temporal domains, in order to weaken the intensity variation effects of multi-sensor images. Then, modified fuzzy c-means (FCM) algorithms are employed to preliminarily classify the optical and SAR images, and a criterion based on the membership values of parcels is defined to select the sample parcels for each class and image. Finally, a change detection principle, which takes statistical properties as the feature space, is introduced to detect changes between multi-sensor images. The experimental results verify that the proposed method is able to cope with optical and SAR image change detection. © 2019 Institution of Engineering and Technology. All rights reserved.

Keywords: Change detection
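
The preliminary classification and membership-based sample selection described in the abstract can be sketched as follows. This is the standard fuzzy c-means algorithm on scalar parcel features, not the modified variant the abstract refers to; the function names, the fuzzifier `m = 2`, and the 0.9 membership threshold are assumptions made for illustration.

```python
def memberships_for(values, centers, m=2.0, eps=1e-12):
    """Standard FCM membership of every value to every cluster center."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        dists = [abs(x - c) + eps for c in centers]  # eps avoids division by zero
        rows.append([1.0 / sum((d_i / d_j) ** power for d_j in dists)
                     for d_i in dists])
    return rows

def fuzzy_c_means(values, init_centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of steps."""
    centers = list(init_centers)
    for _ in range(iters):
        u = memberships_for(values, centers, m)
        for i in range(len(centers)):
            num = sum((row[i] ** m) * x for row, x in zip(u, values))
            den = sum(row[i] ** m for row in u)
            centers[i] = num / den
    return centers, memberships_for(values, centers, m)

def select_samples(memberships, threshold=0.9):
    """Keep (parcel index, class) pairs whose top membership is confident."""
    return [(k, row.index(max(row)))
            for k, row in enumerate(memberships) if max(row) >= threshold]
```

For example, `fuzzy_c_means([1.0, 1.1, 0.9, 5.0, 5.2, 4.8], [0.0, 6.0])` separates the two groups of parcel values, and `select_samples` then returns the parcels whose class assignment is unambiguous.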

Robust Beamformer based on Magnitude Response Constraint and Sparse Constraint

2019 IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2019

Authors: Lei, Songlin; Qiu, Xiaolan; Ding, Chibiao; Zhang, Yueting. Affiliations: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Technology in Geo-spatial Information Processing and Application System, CAS, China; Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China

ISBN: (print) 9781728123455

A beamformer with a magnitude response constraint can flexibly control the response region through a specified beamwidth and response ripple, and performs well against steering vector mismatch. However, this is accompanied by a high sidelobe level, which degrades performance. To solve this problem, a novel robust beamformer based on a magnitude response constraint and a sparse constraint is proposed. The method adds a sparse constraint, namely an Lp-norm, to the beamformer with the magnitude response constraint; the non-convex cost function is then formulated as a semidefinite programming (SDP) problem, and finally matrix decomposition theory is used to obtain the array weight vector. Simulation results demonstrate that the proposed method not only produces a large controlled region against steering vector mismatch and reduces the sidelobe level of the beampattern, but also achieves good signal-to-interference-plus-noise ratio (SINR) enhancement. © 2019 IEEE.

Keywords: Vectors
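
The SINR figure of merit mentioned in the abstract can be made concrete with a small sketch. It does not implement the proposed SDP-based beamformer; it only evaluates the output SINR of a given weight vector for a uniform linear array with half-wavelength spacing, and the array geometry, function names, and power values are all assumptions.

```python
import cmath
import math

def steering_vector(theta_deg, n_elements, spacing=0.5):
    """Steering vector of a uniform linear array; spacing is in wavelengths."""
    phase = 2.0 * math.pi * spacing * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * k) for k in range(n_elements)]

def inner(w, a):
    """Hermitian inner product w^H a."""
    return sum(wk.conjugate() * ak for wk, ak in zip(w, a))

def output_sinr(w, signal_deg, interferers, noise_power=1.0, signal_power=1.0):
    """Output SINR of weights w; interferers is a list of (angle_deg, power)."""
    n = len(w)
    signal = signal_power * abs(inner(w, steering_vector(signal_deg, n))) ** 2
    interference = sum(
        power * abs(inner(w, steering_vector(deg, n))) ** 2
        for deg, power in interferers
    )
    noise = noise_power * sum(abs(wk) ** 2 for wk in w)
    return signal / (interference + noise)
```

Steering a conventional beamformer at the true direction, `output_sinr(steering_vector(0.0, 8), 0.0, [])`, gives the array gain of 8 for unit signal and noise power; an interferer or a mismatch between the assumed and actual signal direction lowers the value.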

Deep Grammatical Multi-classifier for Continuous Sign Language Recognition

IEEE International Conference on Multimedia Big Data (BigMM)

Authors: Chengcheng Wei; Wengang Zhou; Junfu Pu; Houqiang Li. Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

In this paper, we propose a novel deep architecture with multiple classifiers for continuous sign language recognition. Representing the sign video with a 3D convolutional residual network and a bidirectional LSTM, we formulate continuous sign language recognition as a grammatical-rule-based classification problem. We first split a text sentence of sign language into isolated words and n-grams, where an n-gram is a sequence of consecutive n words in a sentence. Then, we propose a word-independent classifiers (WIC) module and an n-gram classifier (NGC) module to identify the words and n-grams in a sentence, respectively. A greedy decoding algorithm is employed to integrate words and n-grams into the sentence based on the confidence scores provided by both modules. Our method is evaluated on a Chinese continuous sign language recognition benchmark, and the experimental results demonstrate its effectiveness and superiority.

Keywords: Videos; Assistive technology; Gesture recognition; Feature extraction; Task analysis; Decoding; Cats
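
The splitting of a sentence into isolated words and n-grams, as described in the abstract, can be illustrated with a short sketch. It assumes the sentence is already tokenized by whitespace; the function name `split_into_ngrams` is hypothetical and does not come from the paper, and the classifier modules and greedy decoding are not shown.

```python
def split_into_ngrams(sentence, n):
    """Return every run of n consecutive words in the sentence as a tuple."""
    words = sentence.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
```

Calling it with `n = 1` yields the isolated words, and a sentence shorter than `n` words yields an empty list.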

Edge-Guided Panoramic Video Stitching with Limited Overlap

IEEE International Conference on Signal, Information and Data Processing (ICSIDP)

Authors: Chaoyu Xie; Xuejin Chen. Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (digital) 9781728123455

ISBN: (print) 9781728123462

Video stitching remains a challenging problem in computer vision. In this paper, we propose a novel edge-guided method to stitch multiple videos that have small overlapped regions. Our algorithm consists of three steps: (1) spherical projection of the input video frames based on camera calibration, (2) edge detection and edge-guided feature matching for video registration, and (3) seam optimization to eliminate distortions and ghosts in the composited panoramic videos. The experimental results and user studies demonstrate that our method is robust to videos that have small overlapped regions and produces more visually pleasing panoramic videos than state-of-the-art techniques.

Quality Assessment of Stereoscopic 360-degree Images from Multi-viewports

Picture Coding Symposium, PCS

Authors: Jiahua Xu; Ziyuan Luo; Wei Zhou; Wenyuan Zhang; Zhibo Chen. Affiliation: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

Objective quality assessment of stereoscopic panoramic images has become a challenging problem owing to the rapid growth of 360-degree content. Unlike traditional 2D image quality assessment (IQA), 3D omnidirectional IQA involves more complex aspects, especially an unlimited field of view (FoV) and extra depth perception, which make it difficult to evaluate the quality of experience (QoE) of 3D omnidirectional images. In this paper, we propose a multi-viewport based full-reference stereo 360 IQA model. Because viewports can change freely when browsing in a head-mounted display, our proposed approach processes the image inside the FoV rather than a projected one such as the equirectangular projection (ERP). In addition, since overall QoE depends on both image quality and depth perception, we utilize features estimated from the difference map between the left and right views, which can reflect disparity. The depth perception features, along with binocular image qualities, are employed to further predict the overall QoE of 3D 360 images. The experimental results on our public Stereoscopic OmnidirectionaL Image quality assessment Database (SOLID) show that the proposed method achieves a significant improvement over some well-known IQA metrics and can accurately reflect the overall QoE of perceived images.

Reinforced Bit Allocation under Task-Driven Semantic Distortion Metrics

arXiv, 2019

Authors: Shi, Jun; Chen, Zhibo. Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

Rapidly growing intelligent applications require optimized bit allocation in image/video coding to support specific task-driven scenarios such as detection, classification, and segmentation. Some learning-based frameworks have been proposed for this purpose due to their inherent end-to-end optimization mechanisms. However, it is still quite challenging to integrate these task-driven metrics seamlessly into traditional hybrid coding frameworks. To the best of our knowledge, this paper is the first work to address this challenge using a reinforcement learning (RL) approach. Specifically, we formulate the bit allocation problem as a Markovian Decision Process (MDP) and train RL agents to automatically decide the quantization parameter (QP) of each coding tree unit (CTU) for HEVC intra coding, according to the task-driven semantic distortion metrics. This bit allocation scheme can maximize the semantic-level fidelity of the task, such as classification accuracy, while minimizing the bit-rate. We also employ gradient class activation map (Grad-CAM) and Mask R-CNN tools to extract task-related importance maps to help the agents make decisions. Extensive experimental results demonstrate the superior performance of our approach, which achieves 43.1% to 73.2% bit-rate savings over the HEVC anchor under equivalent task-related distortions. Copyright © 2019, The Authors. All rights reserved.

Keywords: Reinforcement learning

Tensor oriented no-reference light field image quality assessment

arXiv, 2019

Authors: Zhou, Wei; Shi, Likun; Chen, Zhibo. Affiliation: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027

Light field image (LFI) quality assessment is becoming increasingly important, as it helps to better guide the acquisition, processing and application of immersive media. However, due to the inherent high-dimensional characteristics of LFI, LFI quality assessment becomes a multi-dimensional problem that requires consideration of the quality degradation in both spatial and angular dimensions. Therefore, we propose a novel Tensor oriented No-reference Light Field image Quality evaluator (Tensor-NLFQ) based on tensor theory. Specifically, since the LFI is regarded as a low-rank 4D tensor, the principal components of four oriented sub-aperture view stacks are obtained via Tucker decomposition. Then, the Principal Component Spatial Characteristic (PCSC) is designed to measure the spatial-dimensional quality of LFI considering its global naturalness and local frequency properties. Finally, the Tensor Angular Variation Index (TAVI) is proposed to measure angular consistency quality by analyzing the structural similarity distribution between the first principal component and each view in the view stack. Extensive experimental results on four publicly available LFI quality databases demonstrate that the proposed Tensor-NLFQ model outperforms state-of-the-art 2D, 3D, multi-view, and LFI quality assessment algorithms. Copyright © 2019, The Authors. All rights reserved.

Keywords: Tensors

Video-based point cloud compression artifact removal

arXiv, 2021

Authors: Akhtar, Anique; Gao, Wen; Li, Li; Li, Zhu; Jia, Wei; Liu, Shan. Affiliations: Department of Computer Science and Electrical Engineering, University of Missouri-Kansas City, Kansas City, MO 64110, United States; Tencent America, 661 Bryant St., Palo Alto, CA 94301, United States; CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

Photo-realistic point cloud capture and transmission are the fundamental enablers for immersive visual communication. The coding process of dynamic point clouds, especially video-based point cloud compression (V-PCC) developed by the MPEG standardization group, is now delivering state-of-the-art performance in compression efficiency. V-PCC is based on the projection of the point cloud patches to 2D planes and encoding the sequence as 2D texture and geometry patch sequences. However, the resulting quantization errors from coding can introduce compression artifacts, which can be very unpleasant for the quality of experience (QoE). In this work, we developed a novel out-of-the-loop point cloud geometry artifact removal solution that can significantly improve reconstruction quality without additional bandwidth cost. Our novel framework consists of a point cloud sampling scheme, an artifact removal network, and an aggregation scheme. The point cloud sampling scheme employs a cube-based neighborhood patch extraction to divide the point cloud into patches. The geometry artifact removal network then processes these patches to obtain artifact-removed patches. The artifact-removed patches are then merged together using an aggregation scheme to obtain the final artifact-removed point cloud. We employ 3D deep convolutional feature learning for geometry artifact removal that jointly recovers both the quantization direction and the quantization noise level by exploiting projection and quantization prior. The simulation results demonstrate that the proposed method is highly effective and can considerably improve the quality of the reconstructed point cloud. Copyright © 2021, The Authors. All rights reserved.

Keywords: Geometry
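
The sampling and aggregation steps named in the abstract can be pictured with a minimal sketch. It assumes non-overlapping, axis-aligned cubes of a fixed edge length, which may differ from the neighborhood scheme the authors use; the function names and the cube size are hypothetical, and the artifact removal network itself is not shown.

```python
from collections import defaultdict

def cube_patches(points, cube_size):
    """Group 3D points by the index of the cube that contains them."""
    patches = defaultdict(list)
    for x, y, z in points:
        key = (int(x // cube_size), int(y // cube_size), int(z // cube_size))
        patches[key].append((x, y, z))
    return dict(patches)

def aggregate(patches):
    """Merge processed patches back into one point list, cube by cube."""
    merged = []
    for key in sorted(patches):
        merged.extend(patches[key])
    return merged
```

In a full pipeline each patch would be passed through the network between these two calls; here `aggregate(cube_patches(points, 8))` simply returns the same points regrouped by cube.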