Search Results - Inner Mongolia University Library

KIDBA-Net: A Multi-Feature Fusion Brain Tumor Segmentation Network Utilizing Kernel Inception Depthwise Convolution and Bi-Cross Attention

INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2025, Vol. 35, No. 2

Authors: Min, Jie; Huang, Tongyuan; Huang, Boxiong; Hu, Chuanxin; Zhang, Zhixing. Affiliation: Chongqing Univ Technol, Sch Artificial Intelligence, Chongqing, Peoples R China

Automatic brain tumor segmentation technology plays a crucial role in tumor diagnosis, particularly in the precise delineation of tumor subregions. It can assist doctors in accurately assessing the type and location of brain tumors, potentially saving patients' lives. However, the highly variable size and shape of brain tumors, along with their similarity to healthy tissue, pose significant challenges in the segmentation of multi-label brain tumor subregions. This paper proposes a network model, KIDBA-Net, based on an encoder-decoder architecture, aimed at solving the issue of pixel-level classification errors in multi-label tumor subregions. The proposed Kernel Inception Depthwise Block (KIDB) employs multi-kernel depthwise convolution to extract multi-scale features in parallel, accurately capturing the feature differences between tumor types to mitigate misclassification. To ensure the network focuses more on the lesion areas and excludes the interference of irrelevant tissues, this paper adopts Bi-Cross Attention as a skip connection hub to bridge the semantic gap between layers. Additionally, the Dynamic Feature Reconstruction Block (DFRB) exploits the complementary advantages of convolution and dynamic upsampling operators, effectively aiding the model in generating high-resolution prediction maps during the decoding phase. The proposed model surpasses other state-of-the-art brain tumor segmentation methods on the BraTS2018 and BraTS2019 datasets, particularly in the segmentation accuracy of smaller and highly overlapping tumor core (TC) and enhanced tumor (ET), achieving DSC scores of 87.8%, 82.0%, and 90.2%, 88.7%, respectively, with Hausdorff distances of 2.8, 2.7 mm, and 2.7, 2.0 mm.

Keywords: Bi-Cross Attention; brain tumor segmentation; encoder-decoder architecture; Kernel Inception Depthwise Block; MRI
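
The DSC (Dice similarity coefficient) figures quoted in the KIDBA-Net abstract measure the overlap between a predicted mask and a reference mask. As a minimal sketch, assuming binary masks flattened to lists of 0/1 (the mask values below are made up for illustration and are not BraTS data), the metric can be computed as:

```python
def dice_score(pred, target):
    """Dice similarity coefficient between two binary masks (flat lists of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Overlap: positions where both masks are 1.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical toy masks, not taken from any dataset.
pred_mask = [0, 1, 1, 1, 0, 0]
true_mask = [0, 1, 1, 0, 0, 1]
print(f"DSC = {dice_score(pred_mask, true_mask):.3f}")
```

A multi-label evaluation such as TC versus ET would typically apply this per subregion, one binary mask pair at a time.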

Attribute-Driven Filtering: A new attributes predicting approach for fine-grained image captioning

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, Vol. 137, Part A

Authors: Hossen, Md. Bipul; Ye, Zhongfu; Abdussalam, Amr; Ul Hassan, Shabih. Affiliation: Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230027, Anhui, Peoples R China

Fine-grained image captioning with attribute information has garnered significant attention in the realms of computer vision and natural language processing, demanding precise and contextually relevant descriptions of visual content. While previous attribute-driven image captioning models have shown improvements, challenges remain, such as the independence of attribute predictors and caption generators and the semantic gap between images and attributes. Another common issue is the inclusion of all attributes at every time step, despite most attributes being irrelevant to the word currently being generated. This can divert the model's attention toward erroneous semantic details, resulting in a performance decline. To address these issues, we propose a novel Attribute-Driven Filtering (ADF) captioning network designed to provide rich and nuanced descriptions. This model incorporates a unique Attribute Predictor Module (APM) that dynamically predicts the most pertinent attributes in accordance with the textual context, utilizing different attributes at various time steps. The novelty of this approach lies in recognizing that not all attributes hold equal relevance at each time step, and the APM filters out irrelevant attributes to generate precise and contextually relevant captions. Furthermore, this model features a fusion mechanism that integrates visual information from a conventional attention module with attribute information predicted by the APM, aiming to reduce the visual semantic gap between images and attributes. Extensive experimentation demonstrates that the ADF model outperforms advanced models, achieving impressive CIDEr-D scores of 72.0 (Flickr30K) and 123.3 (MS-COCO) through reinforcement learning optimization. It consistently surpasses baseline models across diverse evaluation metrics, highlighting its effectiveness and robustness.

Keywords: Fine-grained captioning; Fusion mechanism; encoder-decoder architecture; Attribute predictor module
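
The ADF abstract describes keeping only the attributes relevant to the word currently being generated. The abstract does not say how the APM scores relevance, so the sketch below is purely illustrative: it assumes a dot-product score between a hypothetical decoder context vector and hypothetical attribute embeddings, followed by a softmax and a top-k cut. All names and numbers are invented for the example.

```python
import math


def filter_attributes(context, attribute_vectors, k=2):
    """Return the k (name, weight) pairs most relevant to the current context.

    Relevance is a softmax over dot products; this scoring rule is an
    illustrative assumption, not the APM described in the paper.
    """
    names = list(attribute_vectors)
    scores = [sum(c * a for c, a in zip(context, attribute_vectors[n])) for n in names]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract max for numerical stability
    norm = sum(exps)
    weights = {n: e / norm for n, e in zip(names, exps)}
    ranked = sorted(names, key=lambda n: weights[n], reverse=True)
    return [(n, weights[n]) for n in ranked[:k]]


# Hypothetical attribute embeddings.
attributes = {
    "red": [1.0, 0.0, 0.0],
    "running": [0.0, 1.0, 0.0],
    "grass": [0.0, 0.0, 1.0],
}
# Two hypothetical decoder contexts at different time steps.
step_contexts = [[2.0, 0.1, 0.5], [0.1, 2.0, 0.4]]
for t, ctx in enumerate(step_contexts):
    kept = filter_attributes(ctx, attributes, k=2)
    print(t, [name for name, _ in kept])
```

The point of the example is only that the kept set changes from one time step to the next, which is the behaviour the abstract attributes to the APM.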

Two-stage deep image restoration network with application to single image shadow removal

APPLIED SOFT COMPUTING, 2024, Vol. 167

Authors: Yeh, Chia-Hung; Zhan, Zhi-Xiang; Kang, Li-Wei. Affiliations: Natl Taiwan Normal Univ, Dept Elect Engn, Taipei, Taiwan; Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung, Taiwan

In this paper, we introduce a two-stage deep learning-based image restoration network and its application to removing shadow information from a single image, named ESCNet (encoder-decoder based Shadow removal with Colorization Network). Most existing single-image shadow removal methods struggle when the shadow covers multiple regions of different colors or rich image details. To tackle these problems, our key idea is to first remove the shadow(s) from an image and then repaint the shadow-removed region(s) in this image. To accomplish this, we present a deep two-stage network, cascading a shadow removal network (SRN) and a colorization network (CN). The presented encoder-decoder-based SRN, which fuses global and local feature information, removes the shadow(s) in the grayscale domain of the input image while recovering the image details for the shadow-removed region(s). The proposed CN then repaints the shadow-removed region(s) via re-colorization. The proposed deep model was trained and evaluated on two well-known public datasets, i.e., ISTD (Image Shadow Triplets Dataset) and SRD (Shadow Removal Dataset). Experimental results show that the proposed method outperforms the compared state-of-the-art (SOTA) shadow removal approaches quantitatively and qualitatively.

Keywords: Convolutional neural networks; Shadow removal; encoder-decoder architecture; Colorization; Deep learning

S3L: Spectrum Transformer for Self-Supervised Learning in Hyperspectral Image Classification

REMOTE SENSING, 2024, Vol. 16, No. 6, p. 970

Authors: Guo, Hufeng; Liu, Wenyi. Affiliations: North Univ China, Sch Instrument & Elect, State Key Lab Dynam Measurement Technol, Taiyuan 030051, Peoples R China; Henan Coll Transportat, Dept Transportat Informat Engn, Zhengzhou 451460, Peoples R China

In the realm of Earth observation and remote sensing data analysis, the advancement of hyperspectral imaging (HSI) classification technology is of paramount importance. Nevertheless, the intricate nature of hyperspectral data, coupled with the scarcity of labeled data, presents significant challenges in this domain. To mitigate these issues, we introduce a self-supervised learning algorithm predicated on a spectral transformer for HSI classification under conditions of limited labeled data, with the objective of enhancing the efficacy of HSI classification. The S3L algorithm operates in two distinct phases: pretraining and fine-tuning. During the pretraining phase, the algorithm learns the spatial representation of HSI from unlabeled data, utilizing a masking mechanism and a spectral transformer, thereby augmenting the sequence dependence of spectral features. Subsequently, in the fine-tuning phase, labeled data is employed to refine the pretrained weights, thereby improving the precision of HSI classification. Within the comprehensive encoder-decoder framework, we propose a novel spectral transformer module specifically engineered to synergize spatial feature extraction with spectral domain analysis. This innovative module adeptly navigates the complex interplay among various spectral bands, capturing both global and sequential spectral dependencies. Uniquely, it incorporates a gated recurrent unit (GRU) layer within the encoder to enhance its ability to process spectral sequences. Our experimental evaluations across several public datasets reveal that our proposed method, distinguished by its spectral transformer, achieves superior classification performance, particularly in scenarios with limited labeled samples, outperforming existing state-of-the-art approaches.

Keywords: hyperspectral image classification; self-supervised learning; spectral transformer; encoder-decoder architecture; limited labeled data
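
The pretraining phase in the S3L abstract relies on a masking mechanism applied to unlabeled spectra. The abstract gives no masking ratio or strategy, so the following is only a sketch under assumptions: bands are hidden uniformly at random, masked values are replaced by a constant, and the hidden bands become the reconstruction targets. The function name, ratio, and sample spectrum are all hypothetical.

```python
import random


def mask_spectrum(spectrum, mask_ratio=0.5, mask_value=0.0, seed=None):
    """Hide a random subset of bands; return (masked copy, sorted masked indices)."""
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must lie in [0, 1]")
    rng = random.Random(seed)
    n_masked = int(round(len(spectrum) * mask_ratio))
    masked_idx = sorted(rng.sample(range(len(spectrum)), n_masked))
    masked = list(spectrum)  # copy so the input stays untouched
    for i in masked_idx:
        masked[i] = mask_value
    return masked, masked_idx


# Hypothetical reflectance values for one pixel across 8 spectral bands.
pixel = [0.12, 0.15, 0.22, 0.40, 0.38, 0.31, 0.18, 0.09]
masked_pixel, hidden = mask_spectrum(pixel, mask_ratio=0.5, seed=0)
# During pretraining, a model would be asked to reconstruct these values.
targets = [pixel[i] for i in hidden]
print(masked_pixel, hidden, targets)
```

Fine-tuning would then reuse the pretrained encoder weights with the labeled samples, as the abstract describes.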

Are Graphs and GCNs necessary for short-term metro ridership forecasting?

EXPERT SYSTEMS WITH APPLICATIONS, 2024, Vol. 254

Authors: Yang, Qiong; Xu, Xianghua; Wang, Zihang; Yu, Juan; Hu, Xiaodong. Affiliations: Hangzhou Dianzi Univ, Sch Comp Sci, Hangzhou 310018, Peoples R China; Zhejiang Ind Polytech Coll, Dept Art & Design, Shaoxing 312000, Peoples R China; Zhejiang Normal Univ, Sch Comp Sci & Technol, Jinhua 321004, Peoples R China

Short-term metro ridership prediction is of great significance to the efficient and economic operation of Urban Rail Transit (URT) systems. With the popularity of Graph Convolution Networks (GCN) and Transformers, the recent notable metro ridership forecasting methods are GCN-based and Transformer-based models. However, existing methods face the following drawbacks. First, GCN-based models fail to effectively capture global spatial correlations, which are significant for accurate prediction. Second, Transformer-based models are prone to losing temporal information due to the permutation-invariant and anti-order properties of the self-attention they use for capturing temporal correlations. To overcome these drawbacks, we propose a novel sequence-to-sequence metro ridership prediction model, named SDT-GRU, with Stacked DT-GRU layers as both encoder and decoder. The core component of our model is DT-GRU, which integrates Dual-branch Transformer decoder into the GRU to effectively capture global spatial correlations and temporal correlations with Transformer decoder and GRU, respectively. In particular, the DT-GRU module uses one branch Transformer encoder layer to capture spatial correlations within the same timestamp, and adopts another Transformer encoder layer to implicitly capture spatio-temporal correlations among previous timestamps. Then, the outputs of the two Transformer encoder layers are fed into a GRU layer to capture spatio-temporal patterns. To evaluate the effectiveness of the proposed SDT-GRU, we conduct comprehensive experiments on three real-world metro ridership datasets from Beijing, Shanghai and Hangzhou. Experimental results demonstrate that our SDT-GRU achieves better prediction performance than the state-of-the-art baselines.

Keywords: Metro ridership prediction; Transformer encoder; GRU; encoder-decoder architecture; Graph convolutional networks (GCNs)
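
The SDT-GRU abstract argues that self-attention loses temporal order. A small sketch can make that property concrete: with no positional encoding, reordering the input time steps only reorders the outputs, so the layer itself carries no notion of sequence. The version below is a deliberately stripped-down assumption (identity query/key/value projections, a single head, toy vectors) and is not the DT-GRU module.

```python
import math


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections and no positions."""
    d = len(tokens[0])
    outputs = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]  # stable softmax
        z = sum(exps)
        # Output is the attention-weighted average of all token vectors.
        outputs.append([sum(e / z * v[j] for e, v in zip(exps, tokens)) for j in range(d)])
    return outputs


# Three hypothetical time steps, each a 2-dimensional feature vector.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
perm = [2, 0, 1]
shuffled = [tokens[i] for i in perm]

out = self_attention(tokens)
out_shuffled = self_attention(shuffled)
# Each shuffled row matches the original output of the same token.
for row, i in zip(out_shuffled, perm):
    print(all(abs(a - b) < 1e-9 for a, b in zip(row, out[i])))
```

A recurrent layer such as a GRU processes steps one after another, which is why the abstract pairs it with the Transformer branches to recover ordering.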

Lightweight Self-Attention Network for Semantic Segmentation

IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) / IEEE World Congress on Computational Intelligence (IEEE WCCI) / International Joint Conference on Neural Networks (IJCNN) / IEEE Congress on Evolutionary Computation (IEEE CEC)

Authors: Zhou, Yan; Zhou, Haibin; Li, Nanjun; Li, Jianxun; Wang, Dongli. Affiliations: Xiangtan Univ, Sch Automat & Elect Informat, Xiangtan 411105, Peoples R China; Xiangtan Univ, Sch Math & Computat Sci, Xiangtan 411105, Peoples R China; Shenzhen CBPM KEXIN Banking Technol CO LTD, Shenzhen 518000, Peoples R China; Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China

ISBN: (digital) 9781728186719

ISBN: (print) 9781728186719

Deep neural network models based on self-attention (SA) for obtaining rich contextual information have been widely adopted in semantic segmentation. However, the computational complexity of the standard self-attentive module is high, which partly limits the use of this module. In this work, we propose the lightweight self-attention network (LSANet) for semantic segmentation. Specifically, the Lightweight Self-Attentive Module (LSAM) captures information using a hand-designed compact feature representation and a weighted fusion of position information. In the decoder structure, an improved up-sampling module is proposed. Compared with bilinear upsampling, this method achieves better results in restoring image details. The experimental results on the PASCAL VOC 2012 and Cityscapes datasets show the effectiveness of our method, which simplifies operations and improves performance.

Keywords: Semantic segmentation; Attention module; encoder-decoder architecture

LocalBins: Improving Depth Estimation by Learning Local Distributions

17th European Conference on Computer Vision (ECCV)

Authors: Bhat, Shariq Farooq; Alhashim, Ibraheem; Wonka, Peter. Affiliations: KAUST, Thuwal, Saudi Arabia; Saudi Data & Artificial Intelligence Authority SD, Natl Ctr Artificial Intelligence NCAI, Riyadh, Saudi Arabia

ISBN: (print) 9783031197680; 9783031197697

We propose a novel architecture for depth estimation from a single image. The architecture itself is based on the popular encoder-decoder architecture that is frequently used as a starting point for all dense regression tasks. We build on AdaBins which estimates a global distribution of depth values for the input image and evolve the architecture in two ways. First, instead of predicting global depth distributions, we predict depth distributions of local neighborhoods at every pixel. Second, instead of predicting depth distributions only towards the end of the decoder, we involve all layers of the decoder. We call this new architecture LocalBins. Our results demonstrate a clear improvement over the state-of-the-art in all metrics on the NYU-Depth V2 dataset. Code and pretrained models will be made publicly available (https://***/sharigfarooq123/LocalBins).

Keywords: Single image depth estimation; encoder-decoder architecture; Deep learning; Dense regression; Histogram prediction
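
The LocalBins abstract speaks of predicting depth distributions per pixel rather than one global distribution. The abstract does not spell out how a distribution becomes a depth value, so the sketch below assumes a common histogram-style reading: each pixel has its own bin widths over a depth range, bin centers are derived from the cumulative widths, and the depth is the probability-weighted average of those centers. Function names, the depth range, and all numbers are hypothetical.

```python
def bin_centers(widths, d_min, d_max):
    """Convert relative bin widths into bin-center depths over [d_min, d_max]."""
    total = sum(widths)
    centers, left = [], 0.0
    for w in widths:
        frac = w / total  # normalise so the bins tile the whole range
        centers.append(d_min + (d_max - d_min) * (left + frac / 2.0))
        left += frac
    return centers


def expected_depth(widths, probs, d_min=0.0, d_max=10.0):
    """Depth as the probability-weighted average of bin centers."""
    centers = bin_centers(widths, d_min, d_max)
    norm = sum(probs)
    return sum(c * p for c, p in zip(centers, probs)) / norm


# Two hypothetical pixels, each with its own local bins and probabilities.
pixels = [
    {"widths": [1.0, 1.0, 2.0], "probs": [0.2, 0.5, 0.3]},
    {"widths": [2.0, 1.0, 1.0], "probs": [0.7, 0.2, 0.1]},
]
for px in pixels:
    print(round(expected_depth(px["widths"], px["probs"]), 3))
```

With a single global set of widths shared by every pixel, the same code would correspond to the global-distribution setting that the abstract contrasts against.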

Deep Learning for Flash Drought Detection: A Case Study in Northeastern Brazil

ATMOSPHERE, 2024, Vol. 15, No. 7, p. 761

Authors: Barbosa, Humberto A.; Buriti, Catarina O.; Kumar, T. V. Lakshmi. Affiliations: Univ Fed Alagoas, Lab Analise & Proc Imagens Satelites LAPIS, Inst Ciencias Atmosfer, AC Simoes Campus, BR-57072900 Maceio, Brazil; Minist Sci Technol & Innovat MCTI, Natl Semiarid Inst INSA, BR-58100000 Campina Grande, Brazil; Jawaharlal Nehru Univ, Sch Environm Sci, New Mehrauli Rd, New Delhi 110067, India

Flash droughts (FDs) pose significant challenges for accurate detection due to their short duration. Conventional drought monitoring methods have difficulty capturing this rapidly intensifying phenomenon accurately. Machine learning models are increasingly useful for detecting droughts after training the models with data. Northeastern Brazil (NEB) has been a hot spot for FD events with significant ecological damage in recent years. This research introduces a novel 2D convolutional neural network (CNN) designed to identify spatial FDs in historical simulations based on multiple environmental factors and thresholds as inputs. Our model, trained with hydro-climatic data, provides a probabilistic drought detection map across NEB in 2012 as its output. Additionally, we examine future changes in FDs using the Coupled Model Intercomparison Project Phase 6 (CMIP6) driven by outputs from Shared Socioeconomic Pathways (SSPs) under the SSP5-8.5 scenario of 2024-2050. Our results demonstrate that the proposed spatial FD-detecting model based on 2D CNN architecture and the methodology for robust learning show promise for regional comprehensive FD monitoring. Finally, considerable spatial variability of FDs across NEB was observed during 2012 and 2024-2050, which was particularly evident in the São Francisco River Basin. This research significantly contributes to advancing our understanding of flash droughts, offering critical insights for informed water resource management and bolstering resilience against the impacts of flash droughts.

Keywords: flash drought; convolutional neural network; encoder-decoder architecture; Caatinga; climate change; hydro-climatic data

Identification of Flow Pressure-Driven Leakage Zones Using Improved EDNN-PP-LCNetV2 with Deep Learning Framework in Water Distribution System

PROCESSES, 2024, Vol. 12, No. 9, p. 1992

Authors: Dong, Bo; Shu, Shihu; Li, Dengxin. Affiliations: Donghua Univ, Coll Environm Sci & Engn, State Environm Protect Engn Ctr Pollut Treatment &, Shanghai 201620, Peoples R China; Chuzhou Vocat & Tech Coll, Architectural Engn Inst, Chuzhou 239000, Peoples R China

This study introduces a novel deep learning framework for detecting leakage in water distribution systems (WDSs). The key innovation lies in a two-step process: First, the WDS is partitioned using a K-means clustering algorithm based on pressure sensitivity analysis. Then, an encoder-decoder neural network (EDNN) model is employed to extract and process the pressure and flow sensitivities. The core of the framework is the PP-LCNetV2 architecture, which keeps the model lightweight and is optimized for CPU devices. This combination ensures rapid, accurate leakage detection. Three cases are employed to evaluate the method. By applying data augmentation techniques, including demand and measurement noise, the framework demonstrates robustness across different noise levels. Compared with other methods, the results show that this method can efficiently detect over 90% of leakage across different operating conditions while maintaining a higher recognition of the magnitude of leakages. This research offers a significant improvement in computational efficiency and detection accuracy over existing approaches.

Keywords: leakage detection; water distribution systems; deep learning; network partitioning; encoder-decoder architecture
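
The first step in the leakage-detection abstract partitions the network with K-means on pressure sensitivities. The abstract does not give the feature layout, so this sketch makes simplifying assumptions: each node is summarised by one scalar sensitivity, the values are invented, and the initial centers are chosen deterministically from evenly spaced order statistics so the run is repeatable.

```python
def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's algorithm on scalars; returns (labels, centers)."""
    ordered = sorted(values)
    # Deterministic initialisation: k evenly spaced order statistics.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


# Hypothetical per-node pressure sensitivities (arbitrary units).
sensitivities = [0.10, 0.12, 0.11, 0.55, 0.60, 0.58, 0.95, 0.99]
zones, zone_centers = kmeans_1d(sensitivities, k=3)
print(zones)
```

In the framework described above, each resulting zone would then be a candidate leakage region for the EDNN stage to classify.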

Advancing Digital Image-Based Recognition of Soil Water Content: A Case Study in Bailu Highland, Shaanxi Province, China

WATER, 2024, Vol. 16, No. 8, p. 1133

Authors: Zhang, Yaozhong; Zhang, Han; Lan, Hengxing; Li, Yunchuang; Liu, Honggang; Sun, Dexin; Wang, Erhao; Dong, Zhonghong. Affiliations: Changan Univ, Key Lab Highway Construct Technol & Equipment, Minist Educ, Xian 710064, Peoples R China; Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China; Changan Univ, Sch Geol Engn & Geomat, Xian 710064, Peoples R China; China Construct First Grp Corp Ltd, Xian 710075, Peoples R China

Soil water content (SWC) plays a vital role in agricultural management, geotechnical engineering, hydrological modeling, and climate research. Image-based SWC recognition methods show great potential compared to traditional methods. However, as a nascent approach, their limited accuracy and efficiency hinder wide application. To address this, we design the LG-SWC-R3 model based on an attention mechanism to leverage its powerful learning capabilities. To enhance efficiency, we propose a simple yet effective encoder-decoder architecture (PVP-Transformer-ED) designed on the principle of eliminating redundant spatial information from images. This architecture involves masking a high proportion of soil images and predicting the original image from the unmasked area to aid the PVP-Transformer-ED in understanding the spatial information correlation of the soil image. Subsequently, we fine-tune the SWC recognition model on the pre-trained encoder of the PVP-Transformer-ED. Extensive experimental results demonstrate the excellent performance of our designed model (R2 = 0.950, RMSE = 1.351%, MAPE = 0.081, MAE = 1.369%), surpassing traditional models. Although this method involves processing only a small fraction of original image pixels (approximately 25%), which may impact model performance, it significantly reduces training time while maintaining model error within an acceptable range. Our study provides valuable references and insights for the popularization and application of image-based SWC recognition methods.

Keywords: soil water content (SWC); image processing; deep learning; attention mechanism; encoder-decoder architecture
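
The soil water content abstract reports R2, RMSE, MAPE, and MAE. As a minimal sketch using the standard textbook definitions of these metrics (the function name and the sample values are hypothetical and are not the study's data), they can be computed from paired measurements and predictions as follows:

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and MAPE (as a fraction) for paired values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                     # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)      # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        # MAPE assumes no true value is zero.
        "MAPE": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Hypothetical soil water contents in percent: measured vs. predicted.
measured = [10.0, 15.0, 20.0, 25.0]
predicted = [11.0, 14.0, 21.0, 24.0]
for name, value in regression_metrics(measured, predicted).items():
    print(f"{name} = {value:.3f}")
```

RMSE and MAE carry the unit of the target (here percent water content), while R2 and MAPE are unitless.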
