Search results - Inner Mongolia University Library

22nd International Conference on Intelligent Data Engineering and Automated Learning

Authors: Mou, Xingang; Liu, Chang; Zhou, Xiao. Affiliation: Wuhan Univ Technol Mech & Elect Engn Wuhan Peoples R China

ISBN: (print) 9783030916077; 9783030916084

Traditional radar echo image extrapolation for rainfall nowcasting suffers from limited accuracy, incomplete analysis of the radar echo image data, and image blurring caused by stacked LSTM (Long Short-Term Memory) layers. To predict the radar echo image at a future moment more accurately and clearly, an adversarial prediction network based on a multi-scale U-shaped encoder-decoder is proposed. To overcome the lack of detail in the predicted image, the generator of the network adopts a U-shaped encoder-decoder structure with skip connections. At the same time, to capture echo movement at different scales, multi-scale convolution kernels are introduced into the encoder-decoder units. The conventional discriminator structure is then improved, and stacked ConvLSTM (Convolutional Long Short-Term Memory) layers are proposed to classify sequences. Based on predicting the next ten frames from ten given frames, this paper tests the network on the SRAD (Standardized Radar Dataset) and compares the prediction results of different networks. The test results show that the proposed model reduces image blurring and improves prediction accuracy while retaining sufficient prediction detail.

Keywords: encoder-decoder Multi-scale Prediction Rainfall nowcasting
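
The multi-scale convolution kernels mentioned in the abstract can be illustrated with a toy one-dimensional sketch. This is not the authors' network: the kernel sizes (1, 3, 5), the averaging weights, and the names `conv1d_same` and `multi_scale_features` are assumptions chosen only to show how kernels of different widths respond to the same input.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate with zero padding so the output has the input's length.

    Assumes an odd kernel length so the padding is symmetric.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def multi_scale_features(signal, kernel_sizes=(1, 3, 5)):
    """Return one feature row per kernel size (hypothetical averaging kernels)."""
    # A trained network would learn these weights; averaging is a placeholder.
    return [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]


# A single bright "echo" in the middle of a row of pixels.
echo_row = [0.0, 0.0, 3.0, 0.0, 0.0]
features = multi_scale_features(echo_row)
for size, row in zip((1, 3, 5), features):
    print(size, row)
```

The wider the kernel, the further the single peak spreads across the row, which is the sense in which larger kernels capture movement at a coarser scale.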

Research on Neural Network Models for Dense NRSfM

Author: Wang Minjie (王敏洁), Zhejiang Sci-Tech University

Degree level: Master's

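Dense non-rigid structure from motion, a special branch of non-rigid structure from motion (NRSfM), is used for accurate 3D reconstruction of objects undergoing non-rigid motion. Compared with standard NRSfM algorithms, dense NRSfM recovers the shape and surface details of non-rigid objects with higher accuracy and produces finer 3D models, so it is of considerable research significance for computer vision and real-world applications. However, because dense NRSfM must process large-scale image data with non-rigid structure and places very high demands on accuracy and efficiency, research progress has been relatively slow. In 2020, neural network techniques were applied to dense NRSfM for the first time and outperformed traditional methods, which drew attention to the approach. Existing neural-network-based dense NRSfM schemes, however, are mostly suited to semi-dense datasets, and because this line of research emerged only recently, neural network techniques have not yet been fully studied or applied in dense NRSfM. This thesis therefore carries out the following two pieces of work. (1) Inspired by template-based 3D reconstruction (Shape from Template, SfT), this thesis combines traditional rigid structure from motion (RSfM) with neural network techniques and proposes a dense NRSfM neural network model based on a self-built template (T-NRSfM). The model first uses rigid factorization to obtain a dense 3D template of the object to be reconstructed, then uses a neural network to fit the deformation of the target object, thereby achieving dense 3D reconstruction of non-rigid objects. To keep the reconstruction from depending too heavily on the quality of the initial template, an autoencoder architecture is introduced to fine-tune the template during network training. In addition, because the number of parameters a neural network needs to reconstruct deformation is closely tied to the deformation complexity of the object, a ResNet structure is introduced to adjust the number of network parameters appropriately, improving the network's ability to fit complex deformations of non-rigid objects while avoiding the vanishing gradients and slow convergence that come with deeper and wider networks. Experimental results show that T-NRSfM achieves better reconstruction accuracy than most competing algorithms and is robust. (2) Existing neural-network-based dense NRSfM methods overlook the importance of spatio-temporal data, and their reliance on traditional RSfM algorithms limits the ability of the neural network model to reconstruct objects with large deformations. To address these shortcomings, this thesis designs an end-to-end deep spatio-temporal dense NRSfM network model (DST-NRSfM) using an encoder-decoder architecture and introduces a new weighted spatial constraint to further optimize the 3D reconstruction. In addition, layer normalization is introduced into the dense NRSfM task to address vanishing gradients and speed up convergence of the neural network. Experimental results show that DST-NRSfM delivers state-of-the-art performance on both commonly used synthetic datasets and real benchmark datasets, and continues to perform well when reconstructing targets with large deformations. In summary, given that neural network techniques have not yet been fully studied or applied in dense NRSfM, this thesis proposes two neural network models for dense NRSfM based on different reconstruction ideas. Experimental results show that both models achieve more accurate dense non-rigid 3D reconstruction than existing algorithms.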

Keywords: dense NRSfM; self-built template; ResNet; encoder-decoder; weighted spatial constraint
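
Layer normalization, which the abstract uses to ease vanishing gradients and speed up convergence, rescales each feature vector to zero mean and roughly unit variance. The following is a minimal sketch of that operation alone, without the learnable scale and shift; the function name `layer_norm` and the `eps` value are assumptions, not details taken from the thesis.

```python
import math


def layer_norm(features, eps=1e-5):
    """Normalize one feature vector to zero mean and (almost) unit variance."""
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    # eps keeps the division stable when the variance is close to zero.
    return [(f - mean) / math.sqrt(var + eps) for f in features]


normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
print(normalized)
```

Because the statistics are computed per sample over its own features, the result does not depend on batch size, which is one common reason for preferring it over batch normalization in small-batch settings.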

Multi-Path U-Net Architecture for Cell and Colony-Forming Unit Image Segmentation

SENSORS, 2022, Vol. 22, No. 3, p. 990

Authors: Jumutc, Vilen; Bliznuks, Dmitrijs; Lihachev, Alexey. Affiliations: Riga Tech Univ Inst Smart Comp Technol LV-1658 Riga Latvia; Univ Latvia Inst Atom Phys & Spect LV-1586 Riga Latvia

U-Net is the most cited and widely used deep learning model for biomedical image segmentation. In this paper, we propose an enhanced version of the ubiquitous U-Net architecture, which improves upon the original in terms of generalization capabilities while addressing several inherent shortcomings, such as the constrained resolution and non-resilient receptive fields of the main pathway. Our novel multi-path architecture introduces the notion of an individual receptive field pathway, which is merged with other pathways at the bottom-most layer by concatenation and subsequent application of Layer Normalization and Spatial Dropout, which can improve generalization performance for small datasets. In general, our experiments show that the proposed multi-path architecture outperforms other state-of-the-art approaches that build on similar ideas of pyramid structures, skip-connections, and encoder-decoder pathways. A significant improvement of the Dice similarity coefficient is attained on our proprietary colony-forming unit dataset, where a score of 0.809 was achieved for the foreground class.

Keywords: U-Net skip-connections neural network encoder-decoder Layer Normalization
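
The Dice similarity coefficient reported in the abstract measures the overlap between a predicted mask and a reference mask as twice the intersection divided by the sum of the two mask sizes. The sketch below is a generic illustration with made-up masks; the function name `dice_coefficient` and the handling of two empty masks are assumptions, and the numbers are unrelated to the 0.809 score in the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened foreground masks (1 = foreground pixel).
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)
print(round(score, 3))
```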

On the role of the architecture for spring discharge prediction with deep learning approaches

HYDROLOGICAL PROCESSES, 2022, Vol. 36, No. 10, e14737

Authors: Zhou, Renjie; Zhang, Yanyan. Affiliations: Sam Houston State Univ Dept Environm & Geosci Huntsville TX 77340 USA; Texas A&M Univ Dept Elect & Comp Engn College Stn TX 77840 USA

Understanding karst spring flow is important for accommodating the increasing water demand caused by population growth and for managing freshwater resources effectively. However, due to the spatial and temporal heterogeneity and complex hydrological processes in karst systems, predicting karst spring discharge remains challenging. In this study, three deep learning-based models, including long short-term memory (LSTM), gated recurrent unit (GRU) and simple recurrent neural network (RNN), are framed with an encoder-decoder architecture to provide multiple-step-ahead spring discharge prediction. The encoder-decoder architecture includes an encoder that reads and encodes the input sequence into a vector and a decoder that deciphers the vector and outputs the predicted sequence. Three hybrid models called LSTM-ED, GRU-ED and simple RNN-ED are compared with single-step models and multiple-step models without the encoder-decoder architecture to investigate the role of the encoder-decoder architecture in multi-step-ahead prediction. The sensitivity of the karst spring discharge prediction to the selection of input time and lead time steps is evaluated. The predicted results are compared with the observed spring discharge. The results imply that: (1) the LSTM-ED, GRU-ED and simple RNN-ED models obtain similar results when predicting karst spring discharge multiple time steps ahead; (2) the three hybrid multiple-step models outperform the single-step models in making consistent and accurate spring discharge predictions; (3) the multiple-step models framed with an encoder-decoder architecture obtain better spring discharge prediction results than the single-step models and multiple-step models without the encoder-decoder structure; (4) the LSTM-ED, GRU-ED and simple RNN-ED models are sensitive to the selection of lead time and insensitive to the selection of input time step. A short lead time typically yields a more accurate spring discharge prediction.

Keywords: encoder-decoder GRU karst LSTM RNN sequence-to-sequence spring discharge
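
The encoder-decoder idea described in the abstract (an encoder compresses the input sequence into a vector, and a decoder unrolls that vector into a multi-step forecast) can be sketched with a scalar simple RNN. This is an untrained toy, not the paper's LSTM-ED, GRU-ED or RNN-ED models: all weights (`w_x`, `w_h`, `u_h`, `u_y`, `v`), the function names, and the input values are arbitrary assumptions.

```python
import math


def encode(inputs, w_x=0.5, w_h=0.8):
    """Read the input sequence step by step and return the final hidden state."""
    h = 0.0
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)
    return h  # the fixed-size "context vector" (a single number here)


def decode(context, lead_steps, u_h=0.9, u_y=0.3, v=1.0):
    """Unroll the context into lead_steps predictions, feeding each one back in."""
    h, y = context, 0.0
    outputs = []
    for _ in range(lead_steps):
        h = math.tanh(u_h * h + u_y * y)
        y = v * h
        outputs.append(y)
    return outputs


# Hypothetical normalized discharge history (input time steps = 4).
history = [0.2, 0.4, 0.3, 0.5]
forecast = decode(encode(history), lead_steps=3)
print(forecast)
```

The number of input time steps is the length of `history`, and the lead time is `lead_steps`; these are the two quantities whose sensitivity the study evaluates.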

MiniCrack: A simple but efficient convolutional neural network for pixel-level narrow crack detection

COMPUTERS IN INDUSTRY, 2022, Vol. 141

Authors: Lan, Zhi-Xiong; Dong, Xue-Mei. Affiliations: Zhejiang Gongshang Univ Sch Stat & Math Hangzhou 310018 Peoples R China; Zhejiang Gongshang Univ Collaborat Innovat Ctr Stat Data Engn Technol & Ap Hangzhou 310018 Peoples R China

With the advancement of deep learning, newly proposed neural networks are growing increasingly complicated in pursuit of high performance. In this context, we propose a simple but effective neural network called MiniCrack for narrow crack detection. We also propose a lightweight version, MiniCrack-Light, to adapt to scenarios with limited computing resources. MiniCrack and MiniCrack-Light outperform the current state-of-the-art neural networks on all three challenging testing data sets with fewer parameters and stronger robustness. PixelShuffle and PixelUnshuffle, designed for image super-resolution, are successfully applied to image segmentation, which effectively alleviates the problems caused by pooling.

Keywords: Convolutional neural network encoder-decoder Narrow crack detection PixelShuffle PixelUnshuffle
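
PixelUnshuffle trades spatial resolution for channels without discarding any pixel, and PixelShuffle reverses it, which is why the pair can stand in for pooling and upsampling. The pure-Python sketch below shows the rearrangement on nested lists; the function names and the channel index ordering (`c * r * r + i * r + j`) are assumptions following a common convention, not code from the MiniCrack paper.

```python
def pixel_unshuffle(x, r):
    """Rearrange a (C, H*r, W*r) nested list into (C*r*r, H, W)."""
    C, H, W = len(x), len(x[0]) // r, len(x[0][0]) // r
    out = [[[0] * W for _ in range(H)] for _ in range(C * r * r)]
    for c in range(C):
        for h in range(H):
            for w in range(W):
                for i in range(r):
                    for j in range(r):
                        out[c * r * r + i * r + j][h][w] = x[c][h * r + i][w * r + j]
    return out


def pixel_shuffle(x, r):
    """Inverse rearrangement: (C*r*r, H, W) back to (C, H*r, W*r)."""
    C, H, W = len(x) // (r * r), len(x[0]), len(x[0][0])
    out = [[[0] * (W * r) for _ in range(H * r)] for _ in range(C)]
    for c in range(C):
        for h in range(H):
            for w in range(W):
                for i in range(r):
                    for j in range(r):
                        out[c][h * r + i][w * r + j] = x[c * r * r + i * r + j][h][w]
    return out


# One-channel 4x4 image with distinct pixel values.
image = [[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]]
down = pixel_unshuffle(image, 2)   # 4 channels of 2x2
restored = pixel_shuffle(down, 2)  # back to 1 channel of 4x4
print(down[0], restored == image)
```

Unlike max or average pooling, the round trip is lossless, so thin structures such as narrow cracks are not averaged away during downsampling.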

A Deep Learning-Based Chemical System for QSAR Prediction

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2020, Vol. 24, No. 10, pp. 3020-3028

Authors: Hu, ShanShan; Chen, Peng; Gu, Pengying; Wang, Bing. Affiliations: Anhui Univ Minist Educ Key Lab Intelligent Comp & Signal Proc Hefei 230601 Peoples R China; Anhui Univ Sch Comp Sci & Technol Hefei 230601 Peoples R China; Civil Aviat Flight Univ China Coll Air Traff Management Guanghan 618307 Peoples R China; Anhui Univ Inst Phys Sci Hefei 230601 Peoples R China; Anhui Univ Inst Informat Technol Hefei 230601 Peoples R China; Anhui Univ Sch Internet Hefei 230601 Peoples R China; Univ Sci & Technol China Div Life Sci & Med Affiliated Hosp USTC 1 Cadres Ward South Dist Hefei 230001 Peoples R China; Anhui Univ Technol Sch Elect & Informat Engn Maanshan 243032 Peoples R China; Anhui Educ Dept Key Lab Power Elect & Mot Control Maanshan 243032 Peoples R China

Research on quantitative structure-activity relationships (QSAR) provides an effective approach to determine new hits and promising lead compounds during drug discovery. In the past decades, various works have achieved good QSAR performance with the development of machine learning. The rise of deep learning, along with massive accessible chemical databases, has further improved QSAR performance. This article proposes a novel deep-learning-based method to implement QSAR prediction by concatenating an end-to-end encoder-decoder model and a convolutional neural network (CNN) architecture. The encoder-decoder model is mainly used to generate fixed-size latent features that represent chemical molecules; these features are then fed into the CNN framework to train a robust and stable model and finally to predict active chemicals. Two models with different schemes are investigated to evaluate the validity of our proposed model on the same data sets. Experimental results show that our proposed method outperforms other state-of-the-art methods in identifying whether a chemical molecule is active.

Keywords: Chemicals Predictive models Inhibitors Biological system modeling Drugs Machine learning Compounds QSAR CNN encoder-decoder active molecule

Dense Dilated Network With Probability Regularized Walk for Vessel Detection

IEEE TRANSACTIONS ON MEDICAL IMAGING, 2020, Vol. 39, No. 5, pp. 1392-1403

Authors: Mou, Lei; Chen, Li; Cheng, Jun; Gu, Zaiwang; Zhao, Yitian; Liu, Jiang. Affiliations: Wuhan Univ Sci & Technol Sch Comp Sci & Technol Wuhan 430081 Peoples R China; Wuhan Univ Sci & Technol Hubei Prov Key Lab Intelligent Informat Proc & Re Wuhan 430081 Peoples R China; Chinese Acad Sci Cixi Inst Biomed Engn Ningbo 315201 Peoples R China; UBTech Robot Corp Ltd UBTech Res Shenzhen 518055 Peoples R China; Southern Univ Sci & Technol Dept Comp Sci & Engn Shenzhen 518055 Peoples R China

The detection of retinal vessels is of great importance in the diagnosis and treatment of many ocular diseases. Many methods have been proposed for vessel detection. However, most of the algorithms neglect the connectivity of the vessels, which plays an important role in the diagnosis. In this paper, we propose a novel method for retinal vessel detection. The proposed method includes a dense dilated network to get an initial detection of the vessels and a probability regularized walk algorithm to address the fracture issue in the initial detection. The dense dilated network integrates newly proposed dense dilated feature extraction blocks into an encoder-decoder structure to extract and accumulate features at different scales. A multi-scale Dice loss function is adopted to train the network. To improve the connectivity of the segmented vessels, we also introduce a probability regularized walk algorithm to connect the broken vessels. The proposed method has been applied to three public data sets: DRIVE, STARE and CHASE_DB1. The results show that the proposed method outperforms the state-of-the-art methods in accuracy, sensitivity, specificity and area under the receiver operating characteristic curve.

Keywords: Feature extraction Retinal vessels Image segmentation Diseases Blood vessels Biomedical imaging Vessel segmentation encoder-decoder deep learning regularized walk vessel reconnection
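
Three of the evaluation measures named in the abstract (accuracy, sensitivity and specificity) follow directly from pixel-level confusion counts. The sketch below uses the standard definitions; the function name `segmentation_metrics` and the pixel counts are hypothetical and are not results from the paper.

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Compute standard pixel-wise metrics from confusion-matrix counts."""
    return {
        # Fraction of all pixels labeled correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # Fraction of true vessel pixels that were detected.
        "sensitivity": tp / (tp + fn),
        # Fraction of true background pixels that were left as background.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for a 1000-pixel image with 100 vessel pixels.
metrics = segmentation_metrics(tp=80, fp=10, tn=890, fn=20)
for name, value in metrics.items():
    print(name, round(value, 4))
```

Because vessel pixels are a small minority, accuracy alone can look high even when many vessels are missed, which is why sensitivity is reported alongside it.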

Multi-Stream End-to-End Speech Recognition

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, Vol. 28, pp. 646-655

Authors: Li, Ruizhi; Wang, Xiaofei; Mallidi, Sri Harish; Watanabe, Shinji; Hori, Takaaki; Hermansky, Hynek. Affiliations: JHU Baltimore MD 21219 USA; Amazon Seattle WA 98121 USA; MERL Cambridge MA USA

Attention-based methods and Connectionist Temporal Classification (CTC) networks have been promising research directions for end-to-end (E2E) Automatic Speech Recognition (ASR). The joint CTC/Attention model has achieved great success by utilizing both architectures during multi-task training and joint decoding. In this article, we present a multi-stream framework based on joint CTC/Attention E2E ASR with parallel streams represented by separate encoders aiming to capture diverse information. On top of the regular attention networks, the Hierarchical Attention Network (HAN) is introduced to steer the decoder toward the most informative encoders. A separate CTC network is assigned to each stream to force monotonic alignments. Two representative frameworks are proposed and discussed: the Multi-encoder Multi-Resolution (MEM-Res) framework and the Multi-encoder Multi-Array (MEM-Array) framework. In the MEM-Res framework, two heterogeneous encoders with different architectures, temporal resolutions and separate CTC networks work in parallel to extract complementary information from the same acoustics. Experiments are conducted on Wall Street Journal (WSJ) and CHiME-4, resulting in a relative Word Error Rate (WER) reduction of 18.0-32.1% and a best WER of 3.6% on the WSJ eval92 test set. The MEM-Array framework aims at improving far-field ASR robustness using multiple microphone arrays which are activated by separate encoders. Compared with the best single-array results, the proposed framework achieves relative WER reductions of 3.7% and 9.7% on the AMI and DIRHA multi-array corpora, respectively, and also outperforms conventional fusion strategies.

Keywords: End-to-end speech recognition joint CTC attention encoder-decoder connectionist temporal classification hierarchical attention network (HAN) multi-encoder multi-resolution (MEM-Res) multi-encoder multi-array (MEM-Array)
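
The abstract reports its gains as relative WER reductions, meaning the drop in WER divided by the baseline WER rather than the absolute difference. A short worked sketch of that arithmetic follows; the function name and the baseline and new WER values are hypothetical, since the abstract does not state the baselines behind its percentages.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Return the relative reduction in percent: (baseline - new) / baseline * 100."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer * 100.0


# Hypothetical example: a baseline WER of 10.0% improved to 8.0%.
reduction = relative_wer_reduction(10.0, 8.0)
print(f"{reduction:.1f}% relative, {10.0 - 8.0:.1f} points absolute")
```

The distinction matters when reading the results: a relative reduction of around 20% on a low baseline corresponds to a much smaller absolute change in WER.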

HIFUNet: Multi-Class Segmentation of Uterine Regions From MR Images Using Global Convolutional Networks for HIFU Surgery Planning

IEEE TRANSACTIONS ON MEDICAL IMAGING, 2020, Vol. 39, No. 11, pp. 3309-3320

Authors: Zhang, Chen; Shu, Huazhong; Yang, Guanyu; Li, Faqi; Wen, Yingang; Zhang, Qin; Dillenseger, Jean-Louis; Coatrieux, Jean-Louis. Affiliations: Southeast Univ Lab Image Sci & Technol Nanjing 210096 Peoples R China; Ctr Rech Informat Biomed Sino Francais F-35000 Rennes France; Southeast Univ Minist Educ Key Lab Comp Network & Informat Integrat Nanjing 210096 Peoples R China; Chongqing Med Univ Coll Biomed Engn State Key Lab Ultrasound Engn Med Chongqing 400016 Peoples R China; Natl Engn Res Ctr Ultrasound Med Chongqing 401121 Peoples R China; Chongqing Haifu Med Technol Co Ltd Chongqing 401121 Peoples R China; Natl Inst Hlth & Med Res F-35000 Rennes France; Univ Rennes 1 Lab Traitement Signal & Image F-35000 Rennes France

Accurate segmentation of the uterus, uterine fibroids, and spine from MR images is crucial for high intensity focused ultrasound (HIFU) therapy but remains difficult to achieve because of 1) the large shape and size variations among individuals, 2) the low contrast between adjacent organs and tissues, and 3) the unknown number of uterine fibroids. To tackle this problem, in this paper, we propose a large kernel encoder-decoder network based on a 2D segmentation model. The use of this large kernel can capture multi-scale contexts by enlarging the valid receptive field. In addition, a deep multiple atrous convolution block is also employed to enlarge the receptive field and extract denser feature maps. Our approach is compared to both conventional and other deep learning methods, and experiments conducted on a large dataset show its effectiveness.

Keywords: encoder-decoder global convolutional networks HIFU MR images segmentation uterine fibroids
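
The claim that atrous (dilated) convolution enlarges the receptive field can be checked with simple arithmetic: a kernel of size k with dilation d covers k + (k - 1)(d - 1) input positions, and stride-1 layers add their coverage minus one. The sketch below compares two hypothetical three-layer stacks; the layer configurations and function names are assumptions for illustration, not the HIFUNet architecture.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span of input positions covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    layers is a list of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += effective_kernel_size(kernel_size, dilation) - 1
    return rf


# Three plain 3x3 layers versus three 3x3 layers with dilations 1, 2, 4.
plain = receptive_field([(3, 1), (3, 1), (3, 1)])
atrous = receptive_field([(3, 1), (3, 2), (3, 4)])
print(plain, atrous)
```

Both stacks use the same number of weights per layer, so the wider context of the dilated stack comes at no extra parameter cost.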

PGCNet: patch graph convolutional network for point cloud segmentation of indoor scenes

VISUAL COMPUTER, 2020, Vol. 36, No. 10-12, pp. 2407-2418

Authors: Sun, Yuliang; Miao, Yongwei; Chen, Jiazhou; Pajarola, Renato. Affiliations: Zhejiang Univ Technol Coll Comp Sci & Technol Hangzhou Peoples R China; Zhejiang Sci Tech Univ Coll Informat Sci & Technol Hangzhou Peoples R China; Univ Zurich Dept Informat CH-8050 Zurich Switzerland

Semantic segmentation of 3D point clouds is a crucial task in scene understanding and is also fundamental to indoor scene applications such as indoor navigation, mobile robotics, and augmented reality. Recently, deep learning frameworks have been successfully applied to point clouds but are limited by the size of the data. While most existing works focus on individual sampling points, we use surface patches as a more efficient representation and propose a novel indoor scene segmentation framework called patch graph convolution network (PGCNet). This framework treats patches as input graph nodes and subsequently aggregates neighboring node features with a dynamic graph U-Net (DGU) module, which consists of dynamic edge convolution operations inside a U-shaped encoder-decoder architecture. The DGU module dynamically updates graph structures at each level to encode hierarchical edge features. Using PGCNet, we can segment the input scene into two types, i.e., room layout and indoor objects, which are afterward used to carry out the final rich semantic labeling of various indoor scenes. With considerably faster training, the proposed framework achieves performance equivalent to the state of the art when segmenting a standard indoor scene dataset.

Keywords: Point cloud Scene segmentation Surface patch Graph convolutional network Edge convolution encoder-decoder
