ISBN:
(Print) 9781665469647
Medical image captioning is the task of generating clinically meaningful descriptions for medical images, with medical report generation being its most common application. Automatic captioning of medical images is of great interest to medical experts, since it assists in diagnosis and disease treatment and helps automate the workflow of health practitioners. Although many recent efforts aim to produce accurate descriptions, medical image captioning still yields weak or incorrect captions. To alleviate this issue, it is important to explain why a model produced a particular caption from specific features. This is the goal of Explainable Artificial Intelligence (XAI), which aims to open up the 'black box' of deep-learning-based models. In this paper, we present an explainability module for medical image captioning that provides a sound interpretation of our attention-based encoder-decoder model by exposing the correspondence between visual and semantic features. To that end, we exploit self-attention to compute the importance of each word among the semantic features, visual attention to identify the image regions corresponding to each generated word of the caption, and visualizations of the visual features extracted at each layer of the Convolutional Neural Network (CNN) encoder. Finally, we evaluate our model on the ImageCLEF medical captioning dataset.
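A rough sketch (not the authors' implementation) of the kind of visual attention used for such explanations: an additive attention module scores each spatial region of a CNN feature map against the decoder state, and the resulting weights can be reshaped into a per-word heatmap over the image. All names, dimensions, and the random projections are illustrative assumptions.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def visual_attention(features, decoder_state, W_f, W_h, v):
    """features: (R, D) flattened CNN regions; decoder_state: (H,).
    W_f: (D, A), W_h: (H, A), v: (A,) are learned projections (random here)."""
    scores = np.tanh(features @ W_f + decoder_state @ W_h) @ v   # (R,) region scores
    alpha = softmax(scores)                                      # attention over regions
    context = alpha @ features                                   # (D,) weighted visual context
    return alpha, context

# Toy usage: a 7x7 feature grid with 512 channels, attended by a 256-d decoder state.
rng = np.random.default_rng(0)
feats, state = rng.standard_normal((49, 512)), rng.standard_normal(256)
W_f, W_h, v = rng.standard_normal((512, 128)), rng.standard_normal((256, 128)), rng.standard_normal(128)
alpha, ctx = visual_attention(feats, state, W_f, W_h, v)
heatmap = alpha.reshape(7, 7)   # per-word relevance map over image regions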
Aims: Agriculture is one of the fundamental elements of human civilization. Crops and plant leaves are susceptible to many diseases when grown for agricultural purposes. There may be less possibility of further harm ...
This study investigates the efficacy of attention-based deep learning models for generating text-based medical reports from chest X-ray images. Four distinct models were developed and evaluated: a basic encoder-decoder model (Model 1), an encoder-decoder architecture using an attention mechanism (Model 2), a model incorporating spatial feature preservation (Model 3), and a model with a bidirectional GRU in the decoder (Model 4). We trained and evaluated these models using the Indiana University Chest X-ray dataset (Open-i), employing the BLEU score as the primary performance metric. Model 1, using a greedy search decoding strategy, achieved an average BLEU score of 0.619. Incorporating an attention mechanism in Model 2 resulted in a modest improvement, reaching a BLEU score of 0.667 with beam search decoding. Model 3, preserving spatial information during feature extraction, further enhanced performance, achieving a BLEU score of 0.718. Finally, Model 4, integrating a bidirectional GRU, yielded the highest performance with a BLEU score of 0.745. Our results highlight the significant impact of attention mechanisms and spatial feature preservation in generating more accurate and detailed medical reports. These findings demonstrate the potential of deep learning models for automating medical report generation, paving the way for further research and development in this domain.
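A hedged illustration of the evaluation step described above: scoring one generated report against its reference with sentence-level BLEU using NLTK (the paper reports averages over the Open-i test set; the example sentences, weighting, and smoothing choices below are assumptions, not the authors' setup).

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "the heart size and pulmonary vascularity appear within normal limits".split()
candidate = "heart size and pulmonary vascularity are within normal limits".split()

score = sentence_bleu(
    [reference], candidate,
    weights=(0.25, 0.25, 0.25, 0.25),          # cumulative BLEU-4
    smoothing_function=SmoothingFunction().method1,
)
print(f"BLEU-4: {score:.3f}")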
Attention has arguably become one of the most important concepts in the deep learning field. It is inspired by the biological systems of humans, which tend to focus on the distinctive parts when processing large amounts of information. With the development of deep neural networks, the attention mechanism has been widely used in diverse application domains. This paper aims to give an overview of the state-of-the-art attention models proposed in recent years. Toward a better general understanding of attention mechanisms, we define a unified model that is suitable for most attention structures. Each step of the attention mechanism implemented in the model is described in detail. Furthermore, we classify existing attention models according to four criteria: the softness of attention, forms of input feature, input representation, and output representation. In addition, we summarize network architectures used in conjunction with the attention mechanism and describe some of its typical applications. Finally, we discuss the interpretability that attention brings to deep learning and present its potential future trends. (c) 2021 Elsevier B.V. All rights reserved.
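A minimal sketch of the generic pipeline such unified attention models describe: (1) score query-key compatibility, (2) normalize the scores into a distribution (the "soft" case), and (3) aggregate values with those weights. Scaled dot-product scoring is used here purely as one concrete instance, not as the survey's specific formulation.

import numpy as np

def attention(query, keys, values):
    """query: (d,), keys: (n, d), values: (n, d_v)."""
    scores = keys @ query / np.sqrt(keys.shape[1])      # step 1: compatibility scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                            # step 2: soft (differentiable) distribution
    return weights @ values, weights                    # step 3: weighted aggregation of values

rng = np.random.default_rng(1)
ctx, w = attention(rng.standard_normal(64), rng.standard_normal((10, 64)), rng.standard_normal((10, 32)))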
In this article, we propose a surface- and deep-level constraint-based pan-sharpening network, termed SDPNet, to address the pan-sharpening problem. Focusing on the two primary goals of pan-sharpening, i.e., the preservation of spatial and spectral information, we first design two encoder-decoder networks to extract deep-level features from the two types of source images, in addition to surface-level characteristics, as an enhanced information representation. Unique feature maps characterizing the unique information in the source images can be obtained through this deep-level feature extraction. We further design a pan-sharpening network with densely connected blocks to strengthen feature propagation and reduce the number of parameters, where the unique feature maps are used to efficiently constrain the similarity between the pan-sharpened result and the ground truth, thus avoiding information distortion. Both qualitative and quantitative comparisons on reduced-resolution and full-resolution source images demonstrate the advantages of our method over state-of-the-art methods. Our code is publicly available at https://***/hanna-xu/SDPNet.
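A sketch of a densely connected convolutional block of the kind the abstract refers to, where each layer takes the concatenation of all previous feature maps as input; the channel counts, depth, and activations are illustrative assumptions and not the published SDPNet configuration.

import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_channels: int, growth: int = 16, layers: int = 4):
        super().__init__()
        self.convs = nn.ModuleList()
        ch = in_channels
        for _ in range(layers):
            self.convs.append(nn.Sequential(nn.Conv2d(ch, growth, 3, padding=1), nn.ReLU(inplace=True)))
            ch += growth   # each subsequent layer sees all earlier outputs

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))   # dense connectivity
        return torch.cat(feats, dim=1)

block = DenseBlock(in_channels=8)
out = block(torch.randn(1, 8, 32, 32))   # (1, 8 + 4*16, 32, 32)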
An adversarial reinforced report-generation framework for chest x-ray images is proposed. Previous medical-report-generation models are mostly trained by minimizing the cross-entropy loss or by further optimizing common image-captioning metrics such as CIDEr, ignoring diagnostic accuracy, which should be the first consideration in this area. Inspired by the generative adversarial network, an adversarial reinforcement learning approach is proposed for report generation of chest x-ray images that considers both diagnostic accuracy and language fluency. Specifically, an accuracy discriminator (AD) and a fluency discriminator (FD) are built to serve as evaluators that score a report on these two aspects. The FD checks how likely a report is to originate from a human expert, while the AD determines how much a report covers the key chest observations. The weighted score is viewed as a "reward" used for training the report generator via reinforcement learning, which addresses the problem that gradients cannot be backpropagated to the generative model when its output is discrete. Simultaneously, these two discriminators are optimized by maximum-likelihood estimation for better assessment ability. Additionally, a multi-type medical-concept-fused encoder followed by a hierarchical decoder is adopted as the report generator. Experiments on two large radiograph datasets demonstrate that the proposed model outperforms all methods to which it is compared.
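A hedged sketch of the reward shaping described above: the fluency and accuracy discriminator scores for a sampled report are blended into a scalar reward that scales the log-likelihood of the sampled tokens (a REINFORCE-style update). The function names, weighting factor, and baseline are illustrative assumptions, not the paper's exact formulation.

import torch

def reinforce_loss(token_log_probs, fd_score, ad_score, alpha=0.5, baseline=0.0):
    """token_log_probs: (T,) log-probs of the sampled report tokens under the generator.
    fd_score / ad_score: scalar outputs of the fluency / accuracy discriminators in [0, 1]."""
    reward = alpha * fd_score + (1.0 - alpha) * ad_score      # weighted "reward"
    advantage = reward - baseline                             # optional variance-reduction baseline
    return -(advantage * token_log_probs.sum())               # gradient flows only through the log-probs

# Toy usage with random values standing in for sampled-token log-probs and discriminator scores.
loss = reinforce_loss(torch.log(torch.rand(12)), fd_score=0.8, ad_score=0.6)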
As an important type of science and technology service resource, energy consumption data play a vital role in the process of value chain integration between home appliance manufacturers and the state grid. Accurate electricity consumption prediction is essential for demand response programs in smart grid planning. The vast majority of existing prediction algorithms only exploit data belonging to a single domain, i.e., historical electricity load data. However, dependencies and correlations may exist among different domains, such as the regional weather condition and local residential/industrial energy consumption profiles. To take advantage of cross-domain resources, a hybrid energy consumption prediction framework is presented in this paper. This framework combines the long short-term memory model with an encoder-decoder unit (ED-LSTM) to perform sequence-to-sequence forecasting. Extensive experiments are conducted with several of the most commonly used algorithms over integrated cross-domain datasets. The results indicate that the proposed multistep forecasting framework outperforms most of the existing approaches.
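An illustrative sketch of an encoder-decoder LSTM (ED-LSTM) for multistep forecasting as described above: the encoder summarizes the historical window (which could include cross-domain covariates such as weather), and the decoder unrolls the future horizon autoregressively. Layer sizes, horizon, and feature counts are assumptions, not the paper's configuration.

import torch
import torch.nn as nn

class EDLSTM(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64, horizon: int = 24):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.decoder = nn.LSTM(1, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, history):                       # history: (B, T, n_features)
        _, state = self.encoder(history)               # compress the input window into (h, c)
        step = history[:, -1:, :1]                     # seed with the last observed load value
        outputs = []
        for _ in range(self.horizon):                  # autoregressive multistep decoding
            out, state = self.decoder(step, state)
            step = self.head(out)                      # (B, 1, 1) next-step prediction
            outputs.append(step)
        return torch.cat(outputs, dim=1).squeeze(-1)   # (B, horizon)

model = EDLSTM(n_features=4)
forecast = model(torch.randn(8, 168, 4))               # one week of hourly history -> next 24 h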
Few studies have specifically addressed real-time semantic segmentation in rainy environments. However, demand in this area is high, and the task is challenging for lightweight networks. Therefore, this paper proposes a lightweight network designed specifically for foreground segmentation in rainy environments, named the De-raining Semantic Segmentation Network (DRSNet). By analyzing the characteristics of raindrops, the MultiScaleSE Block is designed to encode the input image: it uses multi-scale dilated convolutions to enlarge the receptive field and an SE attention mechanism to learn per-channel weights. To combine semantic information between encoder and decoder layers, an Asymmetric Skip is proposed: the higher semantic layer of the encoder is upsampled by bilinear interpolation, passed through a pointwise convolution, and then added element-wise to the lower semantic layer of the decoder. In controlled experiments, the MultiScaleSE Block and the Asymmetric Skip improve the Foreground Accuracy index to a certain degree compared with SEResNet18 and a symmetric skip, respectively. DRSNet has only 0.54M parameters and 0.20 GFLOPs of floating-point operations (FLOPs). State-of-the-art results and real-time performance are achieved on both the UESTC all-day Scenery add rain (UAS-add-rain) and the Baidu People Segmentation add rain (BPS-add-rain) benchmarks at input sizes of 192*128, 384*256 and 768*512. DRSNet is faster than all networks under 1 GFLOPs, and its Foreground Accuracy index is also the best among networks of similar size on both benchmarks.
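A simplified sketch of the two ideas named above: parallel dilated convolutions enlarge the receptive field, and a squeeze-and-excitation (SE) gate re-weights the resulting channels. The branch count, dilation rates, and reduction ratio are illustrative guesses, not the published DRSNet configuration.

import torch
import torch.nn as nn

class MultiScaleSE(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, dilations=(1, 2, 4), reduction: int = 4):
        super().__init__()
        branch_ch = out_ch // len(dilations)
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, branch_ch, 3, padding=d, dilation=d) for d in dilations
        )
        se_ch = branch_ch * len(dilations)
        self.se = nn.Sequential(                        # squeeze-and-excitation channel gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(se_ch, se_ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(se_ch // reduction, se_ch, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        y = torch.cat([b(x) for b in self.branches], dim=1)   # multi-scale dilated features
        return y * self.se(y)                                  # channel-wise re-weighting

block = MultiScaleSE(3, 48)
out = block(torch.randn(1, 3, 128, 192))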
Radiation exposure in CT imaging leads to increased patient risk. This motivates the pursuit of reduced-dose scanning protocols, in which noise reduction processing is indispensable to warrant clinically acceptable image quality. Convolutional Neural Networks (CNNs) have received significant attention as an alternative to conventional noise reduction and are able to achieve state-of-the-art results. However, the internal signal processing in such networks is often unknown, leading to sub-optimal network architectures. The need for better signal preservation and more transparency motivates the use of Wavelet Shrinkage Networks (WSNs), in which the Encoding-Decoding (ED) path is the fixed wavelet frame known as the Overcomplete Haar Wavelet Transform (OHWT) and the noise reduction stage is data-driven. In this work, we considerably extend the WSN framework by focusing on three main improvements. First, we simplify the computation of the OHWT so that it can be easily reproduced. Second, we update the architecture of the shrinkage stage by further incorporating knowledge of conventional wavelet shrinkage methods. Finally, we extensively test its performance and generalization by comparing it with the RED and FBPConvNet CNNs. Our results show that the proposed architecture achieves performance similar to the reference methods in terms of MSSIM (0.667, 0.662 and 0.657 for DHSN2, FBPConvNet and RED, respectively) and achieves excellent quality when visualizing patches of clinically important structures. Furthermore, we demonstrate the enhanced generalization and further advantages of the signal flow by showing two additional potential applications in which the new DHSN2 is used as a regularizer: (1) iterative reconstruction and (2) ground-truth-free training of the proposed noise reduction architecture. The presented results show that the tight integration of signal processing and deep learning leads to simpler models with improved generalization.
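A minimal sketch of the classical idea the WSN builds on: decompose an image with a single-level Haar transform, soft-threshold the detail coefficients, and reconstruct. In the paper the transform is overcomplete and the shrinkage stage is learned; the decimated transform and fixed threshold below are purely illustrative assumptions.

import numpy as np

def haar_dwt2(x):
    a = (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2]) / 2   # approximation
    h = (x[0::2, 0::2] - x[1::2, 0::2] + x[0::2, 1::2] - x[1::2, 1::2]) / 2   # detail (rows)
    v = (x[0::2, 0::2] + x[1::2, 0::2] - x[0::2, 1::2] - x[1::2, 1::2]) / 2   # detail (cols)
    d = (x[0::2, 0::2] - x[1::2, 0::2] - x[0::2, 1::2] + x[1::2, 1::2]) / 2   # diagonal detail
    return a, h, v, d

def haar_idwt2(a, h, v, d):
    x = np.empty((2 * a.shape[0], 2 * a.shape[1]))
    x[0::2, 0::2] = (a + h + v + d) / 2
    x[1::2, 0::2] = (a - h + v - d) / 2
    x[0::2, 1::2] = (a + h - v - d) / 2
    x[1::2, 1::2] = (a - h - v + d) / 2
    return x

def soft_shrink(c, t):
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)   # classical soft-thresholding

rng = np.random.default_rng(2)
noisy = rng.standard_normal((64, 64))
a, h, v, d = haar_dwt2(noisy)
denoised = haar_idwt2(a, *(soft_shrink(c, 0.5) for c in (h, v, d)))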
Object detectors that rely solely on image contrast struggle to detect camouflaged objects in images because of the high similarity between camouflaged objects and their surroundings. To address this issue, in this paper we investigate the role of the part-object relationship in camouflaged object detection. Specifically, we propose a Part-Object relationship and Contrast Integrated Network (POCINet) covering both search and identification stages, where each stage adopts an appropriate scheme to engage contrast information and part-object relational knowledge for camouflaged pattern decoding. Moreover, we bridge these two stages via a Search-to-Identification Guidance (SIG) module, in which the search result, as well as the decoded semantic knowledge, jointly enhances the feature-encoding ability of the identification stage. Experimental results demonstrate the superiority of our algorithm on three datasets. Notably, our algorithm improves $F_\beta$ over the best existing method by approximately 17 points on the CPD1K dataset. The source code will be released soon.
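A small sketch of the $F_\beta$ measure cited above, computed for a binarized prediction map against a ground-truth mask; the convention $\beta^2 = 0.3$ is common in the salient/camouflaged object detection literature, and the binarization threshold here is an assumption rather than the paper's protocol.

import numpy as np

def f_beta(pred, gt, beta_sq=0.3, thresh=0.5, eps=1e-8):
    p = (pred >= thresh).astype(float)                 # binarize the predicted map
    tp = (p * gt).sum()
    precision = tp / (p.sum() + eps)
    recall = tp / (gt.sum() + eps)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall + eps)

rng = np.random.default_rng(3)
gt = (rng.random((128, 128)) > 0.7).astype(float)      # toy binary ground-truth mask
pred = np.clip(gt + 0.3 * rng.standard_normal((128, 128)), 0, 1)
print(f"F_beta = {f_beta(pred, gt):.3f}")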