Search results - Inner Mongolia University Library

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

Authors: Chien, Jen-Tzung; Lin, Ting-An

Affiliations: Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu, Taiwan

ISBN: (print) 9789881476883

Attention over an observed image or natural sentence operates by spotting or locating the region or position of interest for pattern classification. The attention parameter is treated as a latent variable, which is estimated indirectly by minimizing the classification loss. With such an attention mechanism, the target information may not be correctly identified. Therefore, in addition to minimizing the classification error, we can directly attend to the region of interest by minimizing the reconstruction error due to supporting data. Our idea is to learn how to attend through the so-called supportive attention when supporting information is available. A new attention mechanism is developed to conduct attentive learning for translation invariance, which is applied to image captioning. The derived information is helpful for generating a caption from an input image. Moreover, this paper presents an association network which not only implements word-to-image attention but also carries out image-to-image attention via self-attention. The relations between image and text are sufficiently represented. Experiments on the MS-COCO task show the benefit of the proposed supportive and self-attentions for image captioning with the key-value memory network.

Keywords: image caption; attention mechanism; encoder-decoder network; association network
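
The abstract above describes attention as locating the region of interest, with memory read through keys and values. The following is a minimal sketch of generic soft (dot-product) attention over a key-value memory, not the paper's supportive attention. The function names `softmax` and `attend` and the toy vectors are assumptions made for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Generic soft attention: score each key against the query, then
    return the weights and the weighted average of the values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context

# Toy memory with three slots (hypothetical numbers).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]
weights, context = attend(query, keys, values)
```

The slot whose key is most similar to the query receives the largest weight, which is the sense in which attention "locates" a region.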

MwUnet: A semantic segmentation deep learning method for the ultrasonic image of hydronephrosis in children

IEEE International Conference on Systems, Man, and Cybernetics (SMC)

Authors: Peng, Haoran; Guan, Yu; Li, Jianqiang; Xu, Xi; Wen, Pengceng; Yang, Jijiang; Jia, Yanhe; Xie, Xianghui; Li, Minglei; Wang, Xiaoman; Xin, Yue; He, Yuzhu

Affiliations: Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China; Tsinghua Univ, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China; Beijing Informat Sci & Technol Univ, Sch Econ & Management, Beijing 100192, Peoples R China; Capital Med Univ, Beijing Childrens Hosp, Natl Ctr Childrens Hlth, Beijing 100045, Peoples R China

ISBN: (print) 9781665442077

Hydronephrosis may lead to many potential diseases, and its diagnosis is time-consuming and laborious. To assist physicians in hydronephrosis diagnosis and treatment planning, an accurate and automatic kidney segmentation method is much needed in clinical practice. In recent years, deep convolutional neural networks such as Unet have played a key role in the field of image segmentation, but Unet itself cannot actively adjust the receptive field, which may result in poor attention to the characteristics of the segmented target. We propose an encoder-decoder network with weighted skip connections and the idea of hierarchical equal resolution that can manually control the receptive field. We evaluated our method by comparing it with various classical networks using a dataset of 1850 annotated images. The MPA of the model is 94.12 and the MIoU is 89.49, outperforming the other classical networks in the comparison.

Keywords: B-ultrasound image; renal ultrasound image; ureteropelvic junction obstruction (UPJO); hydronephrosis; encoder-decoder network
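
The MwUnet abstract above reports MPA and MIoU without defining them. As a rough sketch, the code below computes both from a confusion matrix using the common definitions (mean per-class pixel accuracy and mean per-class intersection over union). The paper's exact formulas are not given in the abstract, so treating these as the same definitions is an assumption, and the labels below are toy data.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def mean_pixel_accuracy(cm):
    """Average over classes of correct pixels / true pixels of that class."""
    accs = []
    for c, row in enumerate(cm):
        total = sum(row)
        if total:  # skip classes absent from the ground truth
            accs.append(row[c] / total)
    return sum(accs) / len(accs)

def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        union = tp + fp + fn
        if union:  # skip classes that never appear
            ious.append(tp / union)
    return sum(ious) / len(ious)

# Toy flattened masks: 0 = background, 1 = kidney (hypothetical labels).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]
cm = confusion_matrix(y_true, y_pred, 2)
mpa = mean_pixel_accuracy(cm)
miou = mean_iou(cm)
```

Both values lie in [0, 1]; the figures in the abstract appear to be the same quantities expressed as percentages.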

Dynamic Feature Selection for Structural Image Content Recognition

29th International Conference on MultiMedia Modeling (MMM)

Authors: Fu, Yingnan; Zheng, Shu; Cai, Wenyuan; Gao, Ming; Jin, Cheqing; Zhou, Aoying

Affiliations: East China Normal Univ, Sch Data Sci & Engn, Shanghai, Peoples R China; East China Normal Univ, Shanghai Key Lab Mental Hlth & Psychol Crisis Int, Sch Psychol & Cognit Sci, Shanghai, Peoples R China; Shanghai Hypers Data Technol Inc, Shanghai, Peoples R China

ISBN: (print) 9783031278174; 9783031278181

Structural image content recognition (SICR) aims to transcribe a two-dimensional structural image (e.g., mathematical expression, chemical formula, or music score) into a token sequence. Existing methods are mainly encoder-decoder based and overlook the importance of feature selection and spatial relation extraction in the feature map. In this paper, we propose DEAL (short for Dynamic fEAture seLection) for SICR, which contains a dynamic feature selector and a spatial relation extractor as two cornerstone modules. Specifically, we propose a novel loss function and a random exploration strategy to dynamically select useful image cells for target sequence generation. Further, we consider the positional and surrounding information of cells in the feature map to extract spatial relations. We conduct extensive experiments to evaluate the performance of DEAL. Experimental results show that DEAL significantly outperforms other state-of-the-art methods.

Keywords: structural image content recognition; mathematical expression recognition; encoder-decoder network; feature selection

3DCSCN: 3D Cascade Shape Completion Network Based on Single Depth View

3rd International Conference on Electronic Information Engineering and Computer Communication, EIECC 2023

Authors: Chen, Yali; Liu, Caixia; Zhu, Minhong; Li, Haisheng

Affiliations: School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing, China

ISBN: (print) 9798350359961

Depth images of objects can be easily obtained by depth cameras, but they provide only limited shape information. Widely used learning-based methods generate complete 3D shapes from images, but the reconstructed 3D models have low resolution and are noisy. To this end, this paper proposes a 3D Cascade Shape Completion network (3DCSCN) for predicting the complete 3D structure from a single depth view. 3DCSCN uses an encoder-decoder network to generate rough prediction results and then introduces a point refinement network to update points with high uncertainty for fine-grained prediction results. The experimental results show that, on the public ShapeNet dataset, 3DCSCN improves on current methods by 0.91% in average IoU and 3.83% in CE. © 2023 IEEE.

Keywords: 3D completion; Depth images; encoder-decoder network; Point refinement network; Voxels

EDPNet: An Encoding-Decoding Network with Pyramidal Representation for Semantic Image Segmentation

SENSORS, 2023, vol. 23, no. 6, p. 3205

Authors: Chen, Dong; Li, Xianghong; Hu, Fan; Mathiopoulos, P. Takis; Di, Shaoning; Sui, Mingming; Peethambaran, Jiju

Affiliations: Nanjing Forestry Univ, Coll Civil Engn, Nanjing 210037, Peoples R China; Natl & Kapodistrian Univ Athens, Dept Informat & Telecommun, Athens 15784, Greece; Cent South Univ, Sch Geosci & Info Phys, Changsha 410083, Peoples R China; St Marys Univ, Dept Math & Comp Sci, Halifax, NS B3P 2M6, Canada

This paper proposes an encoding-decoding network with a pyramidal representation module, which will be referred to as EDPNet, and is designed for efficient semantic image segmentation. On the one hand, during the encoding process of the proposed EDPNet, the enhancement of the Xception network, i.e., Xception+ is employed as a backbone to learn the discriminative feature maps. The obtained discriminative features are then fed into the pyramidal representation module, from which the context-augmented features are learned and optimized by leveraging a multi-level feature representation and aggregation process. On the other hand, during the image restoration decoding process, the encoded semantic-rich features are progressively recovered with the assistance of a simplified skip connection mechanism, which performs channel concatenation between high-level encoded features with rich semantic information and low-level features with spatial detail information. The proposed hybrid representation employing the proposed encoding-decoding and pyramidal structures has a global-aware perception and captures fine-grained contours of various geographical objects very well with high computational efficiency. The performance of the proposed EDPNet has been compared against PSPNet, DeepLabv3, and U-Net, employing four benchmark datasets, namely eTRIMS, Cityscapes, PASCAL VOC2012, and CamVid. EDPNet acquired the highest accuracy of 83.6% and 73.8% mIoUs on eTRIMS and PASCAL VOC2012 datasets, while its accuracy on the other two datasets was comparable to that of PSPNet, DeepLabv3, and U-Net models. EDPNet achieved the highest efficiency among the compared models on all datasets.

Keywords: semantic segmentation; semantic parsing; pyramidal representation; encoder-decoder network; convolution neural network

Abdominal computed tomography localizer image generation: A deep learning approach

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2022, vol. 214, p. 106575

Authors: Liu, Zongxi; Zhao, Huimin; Fang, Xiang; Huo, Donglai

Affiliations: Univ Wisconsin, Sheldon B Lubar Sch Business, 3202 N Maryland Ave, Milwaukee, WI 53201, USA; Univ Colorado, Sch Med, Dept Radiol, 13001 East 17th Pl, Aurora, CO 80045, USA

Background and Objective: Computed Tomography (CT) has become an important clinical imaging modality, as well as the leading source of radiation dose from medical imaging procedures. Modern CT exams are usually led by two quick orthogonal localization scans, which are used for patient positioning and diagnostic scan parameter definition. These two localization scans contribute to the patient dose but are not used for diagnosis purposes. In this study, we investigate the possibility of using deep learning models to reconstruct one localization scan image from the other, thus reducing the patient dose and simplifying the clinical workflow. Methods: We propose a modified encoder-decoder network and a scaled mixture loss function specifically for the focal task. In this study, 12,487 clinical abdominal exams were retrieved from a clinical medical imaging storage system and randomly split for training, validation, and test in the ratio of 7:1:2. Reconstructed images were compared with the ground truth in terms of location prediction error, profile prediction error, and attenuation prediction error. Results: The average location error, profile error, and attenuation error were 1.02 +/- 3.37 mm, 4.43 +/- 2.02%, and 6.2 +/- 2.94% for lateral prediction, and 6.46 +/- 6.43 mm, 3.9 +/- 2.32%, and 7.12 +/- 3.54% for AP prediction, respectively. Conclusions: We conclude that although the reconstructed abdominal CT localization images may lack some details on the internal organ structures, they could be used effectively for tube current modulation calculation and patient positioning purposes, leading to a reduction of radiation dose and scan time in clinical CT exams. (C) 2021 Elsevier B.V. All rights reserved.

Keywords: CT localizer image; Image generation; Deep learning; encoder-decoder network; Scaled mixture loss
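
The CT localizer abstract above states that 12,487 exams were randomly split for training, validation, and test in the ratio 7:1:2. The sketch below shows one way such a split could be done. The helper `split_indices`, the fixed seed, and the rounding rule are assumptions, since the abstract does not say how fractional counts were handled.

```python
import random

def split_indices(n, ratios=(7, 1, 2), seed=0):
    """Shuffle range(n) and cut it into train/validation/test parts by ratio."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    total = sum(ratios)
    n_train = round(n * ratios[0] / total)
    n_val = round(n * ratios[1] / total)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder, so every exam is used once
    return train, val, test

train, val, test = split_indices(12487)
```

Under this rounding rule the three parts hold 8741, 1249, and 2497 exams. The counts actually used in the study may differ by one or two depending on how the authors rounded.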

Microseismic First-Arrival Picking Using Fine-Tuning Feature Pyramid Networks

IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, vol. 19, p. 1

Authors: Liu, Naihao; Chen, Jiamin; Wu, Hao; Li, Fangyu; Gao, Jinghuai

Affiliations: Xi An Jiao Tong Univ, Sch Informat & Commun Engn, Xian 710049, Shaanxi, Peoples R China; China Univ Geosci, Sch Earth Resources, Wuhan 430074, Hubei, Peoples R China; Beijing Univ Technol, Fac Informat Technol, Engn Res Ctr Digital Community, Minist Educ, Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China; Beijing Univ Technol, Beijing Lab Urban Mass Transit, Beijing 100124, Peoples R China

Microseismic event picking is one of the key steps in seismic processing and imaging. Manual picking is widely used to pick microseismic events, but it is time-consuming. The standard short-term average/long-term average (STA/LTA) is a traditional method for picking microseismic first arrivals, but it leads to inaccurate first-arrival picks when the signal-to-noise ratio (SNR) is low. We developed a workflow to automatically pick the microseismic first arrivals by using feature pyramid networks (FPNs). To train the proposed model, we first randomly select part of the microseismic traces and manually pick the time index of the first arrivals. Next, we segment every selected trace into two parts based on the time index of the manual pick and then assign each part a label. Afterward, we train the proposed fine-tuning FPN model by using the training data and the corresponding labels. Note that we propose a loss function, named the point-aware loss, for the microseismic first-arrival picking problem. Finally, we predict the microseismic first arrivals by using the well-trained fine-tuning FPN model. The numerical examples demonstrate that our proposed model successfully identifies the microseismic first arrivals. The microseismic first arrivals predicted by our proposed model are more robust and more accurate than those obtained by using the STA/LTA and the encoder-decoder network.

Keywords: Data models; Predictive models; Noise measurement; Numerical models; Computational modeling; Training data; Feature extraction; Deep learning; encoder-decoder network; feature pyramid networks (FPNs); first-arrival picking
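
The microseismic abstract above uses STA/LTA as its traditional baseline. For orientation, here is a minimal sketch of a classic STA/LTA picker, which is the baseline rather than the paper's FPN method. The window lengths, the threshold of 3.0, and the synthetic trace are assumptions chosen for illustration, and the sketch assumes `n_sta` is smaller than `n_lta`.

```python
import math

def sta_lta_ratio(trace, n_sta, n_lta):
    """Ratio of short-term to long-term average energy, both windows ending at sample i."""
    prefix = [0.0]  # prefix sums of squared amplitude for O(1) window sums
    for x in trace:
        prefix.append(prefix[-1] + x * x)
    ratio = [0.0] * len(trace)
    for i in range(n_lta, len(trace)):  # start once the long window is full
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        ratio[i] = sta / lta if lta > 0 else 0.0
    return ratio

def pick_first_arrival(trace, n_sta=5, n_lta=50, threshold=3.0):
    """Return the first sample index where STA/LTA exceeds the threshold, else None."""
    for i, r in enumerate(sta_lta_ratio(trace, n_sta, n_lta)):
        if r > threshold:
            return i
    return None

# Synthetic trace: weak background oscillation, then a strong arrival at sample 200.
onset = 200
trace = [0.1 * math.sin(1.3 * i) for i in range(onset)]
trace += [2.0 * math.sin(0.5 * (i - onset) + 1.0) for i in range(onset, 400)]
pick = pick_first_arrival(trace)
```

On this clean synthetic trace the ratio stays near 1 before the arrival and jumps at the onset. With a low SNR the background energy approaches the signal energy, the jump shrinks, and the threshold is crossed late, early, or not at all, which is the weakness the abstract refers to.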

A diffeomorphic unsupervised method for deformable soft tissue image registration

COMPUTERS IN BIOLOGY AND MEDICINE, 2020, vol. 120, p. 103708

Authors: Zhang, Shuo; Liu, Peter Xiaoping; Zheng, Minhua; Shi, Wen

Affiliations: Beijing Jiaotong Univ, Sch Mech Elect & Control Engn, Beijing 100044, Peoples R China; Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON K1S 5B6, Canada

Background and Objectives: The image registration methods for deformable soft tissues utilize nonlinear transformations to align a pair of images precisely. In some situations, when there is huge gray scale difference or large deformation between the images to be registered, the deformation field tends to fold at some local voxels, which will result in the breakdown of the one-to-one mapping between images and the reduction of invertibility of the deformation field. In order to address this issue, a novel registration approach based on unsupervised learning is presented for deformable soft tissue image registration. Methods: A novel unsupervised learning based registration approach, which consists of a registration network, a velocity field integration module and a grid sampling module, is presented for deformable soft tissue image registration. The main contributions are: (1) A novel encoder-decoder network is presented for the evaluation of stationary velocity field. (2) A Jacobian determinant based penalty term (Jacobian loss) is developed to reduce the folding voxels and to improve the invertibility of the deformation field. Results and Conclusions: The experimental results show that a new pair of images can be accurately registered using the trained registration model. In comparison with the conventional state-of-the-art method, SyN, the invertibility of the deformation field, accuracy and speed are all improved. Compared with the deep learning based method, VoxelMorph, the proposed method improves the invertibility of the deformation field.

Keywords: Deformable soft tissue image registration; Unsupervised learning; Invertibility; Jacobian loss; encoder-decoder network
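
The registration abstract above penalizes folding through the Jacobian determinant of the deformation field. The sketch below illustrates the idea in 2D: for a deformation x + u(x), the determinant of I + ∇u is non-positive where the field folds. The penalty form used here (mean of max(0, -det)), the forward differences, and the function names are assumptions; the abstract does not give the paper's exact Jacobian loss.

```python
def jacobian_determinants(ux, uy):
    """det(I + grad u) on a 2D grid using forward differences.
    ux[i][j], uy[i][j]: displacement along x (columns) and y (rows) at row i, column j."""
    rows, cols = len(ux), len(ux[0])
    dets = []
    for i in range(rows - 1):
        row = []
        for j in range(cols - 1):
            dux_dx = ux[i][j + 1] - ux[i][j]
            dux_dy = ux[i + 1][j] - ux[i][j]
            duy_dx = uy[i][j + 1] - uy[i][j]
            duy_dy = uy[i + 1][j] - uy[i][j]
            row.append((1 + dux_dx) * (1 + duy_dy) - dux_dy * duy_dx)
        dets.append(row)
    return dets

def folding_penalty(ux, uy):
    """Mean of max(0, -det): zero when no cell folds, larger the more the field folds."""
    dets = [d for row in jacobian_determinants(ux, uy) for d in row]
    return sum(max(0.0, -d) for d in dets) / len(dets)

size = 4
zeros = [[0.0] * size for _ in range(size)]
# Identity deformation: no displacement, so every determinant is 1.
identity_penalty = folding_penalty(zeros, zeros)
# Mirror in x (x -> -x): displacement ux = -2x flips orientation everywhere.
mirror_ux = [[-2.0 * j for j in range(size)] for _ in range(size)]
mirror_penalty = folding_penalty(mirror_ux, zeros)
```

A penalty of this kind is zero for the identity field and positive for the mirrored field, so adding it to a registration loss pushes the network toward invertible deformations.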

Image Deblurring Using Feedback Mechanism and Dual Gated Attention Network

NEURAL PROCESSING LETTERS, 2024, vol. 56, no. 2, p. 88

Authors: Chen, Jian; Ye, Shilin; Jiang, Zhuwu; Fang, Zhenghan

Affiliations: Fujian Univ Technol, Sch Elect Elect Engn & Phys, Fuzhou 350118, Fujian, Peoples R China; Fujian Univ Technol, Sch Ecol Environm & Urban Construct, Fuzhou 350118, Fujian, Peoples R China; Johns Hopkins Univ, Dept Biomed Engn, Baltimore, MD 21218, USA

Recently, image deblurring driven by encoder-decoder networks has made tremendous progress. However, these encoder-decoder-based networks still have two disadvantages: (1) due to the lack of a feedback mechanism in the decoder design, the reconstruction results of existing networks are still sub-optimal; (2) these networks introduce multiple modules, such as the self-attention mechanism, to improve performance, which also increases the computational burden. To overcome these issues, this paper proposes a novel feedback-mechanism-based encoder-decoder network (namely, FMNet) that is equipped with two key components: (1) the feedback-mechanism-based decoder and (2) the dual gated attention module. To improve reconstruction quality, the feedback-mechanism-based decoder is proposed to leverage the feedback information via the feedback attention module, which adaptively selects useful features in the feedback path. To decrease the computational cost, an efficient dual gated attention module is proposed to perform the attention mechanism in the frequency domain twice, which improves deblurring performance while reducing the computational cost by avoiding redundant convolutions and feature channels. The superiority of FMNet in terms of both deblurring performance and computational efficiency is demonstrated via comparisons with state-of-the-art methods on multiple public datasets.

Keywords: Image deblurring; encoder-decoder network; Feedback mechanism; Gated attention

FGAM: A pluggable light-weight attention module for medical image segmentation

COMPUTERS IN BIOLOGY AND MEDICINE, 2022, vol. 146, p. 105628

Authors: Qiu, Zhongxi; Hu, Yan; Zhang, Jiayi; Chen, Xiaoshan; Liu, Jiang

Affiliations: Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen 51805, Guangdong, Peoples R China; Chinese Acad Sci, Cixi Inst Biomed Engn, Beijing, Peoples R China; Southern Univ Sci & Technol, Dept Comp Sci & Engn, Guangdong Prov Key Lab Brain inspired Intelligent, Shenzhen 51805, Guangdong, Peoples R China; Southern Univ Sci & Technol, Res Inst Trustworthy Autonomous Syst, Shenzhen 51805, Guangdong, Peoples R China

Medical image segmentation is fundamental for computer-aided diagnosis or surgery. Various attention modules have been proposed to improve segmentation results, but they have limitations for medical image segmentation, such as heavy computation and weak framework applicability. To solve these problems, we propose a new attention module named FGAM, short for Feature Guided Attention Module, which is a simple but pluggable and effective module for medical image segmentation. The FGAM tries to exploit the feature representation ability in the encoder and decoder features. Specifically, the decoder shallow layer always contains abundant information, which is taken as a queryable feature dictionary in the FGAM. The module contains a parameter-free activator and can be removed after the encoder-decoder network has been trained. The efficacy of the FGAM is demonstrated on various encoder-decoder models using five datasets, including four publicly available datasets and one in-house dataset.

Keywords: Medical image segmentation; Attention mechanism; encoder-decoder network
