Real-time semantic segmentation is an important task in computer vision, widely used in fields such as autonomous driving and medical imaging. Existing lightweight networks usually improve inference speed at the sacrifice of segmentation accuracy, and achieving a balance between accuracy and speed remains a challenging problem for real-time semantic segmentation. In this paper, we propose an attention-based lightweight asymmetric network (ALANet) to address this problem. Specifically, in the encoder, a channel-wise attention-based depth-wise asymmetric block (CADAB) is designed to extract sufficient features with a small number of parameters. In the decoder, a spatial attention-based pyramid pooling (SAPP) module is presented to aggregate multi-scale context information using only a few convolutions and pooling operations, and a pixel-wise attention-based multi-scale feature fusion (PAMFF) module is developed to fuse features from different scales and generate pixel-wise attention for improving image restoration. ALANet has only 1.32M parameters. Experimental results on the Cityscapes and CamVid datasets show that ALANet obtains segmentation accuracy (mIoU) of 74.4% and 69.5% and inference speeds of 115.6 FPS and 113.2 FPS, respectively. These results demonstrate that ALANet achieves a good balance between accuracy and speed.
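To make the encoder idea above concrete, the following PyTorch sketch shows one plausible form of a channel-wise attention-based depth-wise asymmetric block: factorized 3x1/1x3 depth-wise convolutions keep the parameter count low, and a squeeze-and-excitation style gate supplies the channel attention. Kernel sizes, the reduction ratio, and the residual layout are assumptions; the abstract does not give CADAB's exact configuration.

```python
# Hypothetical sketch of a channel-attention depth-wise asymmetric block
# (the actual CADAB design in ALANet is not specified by the abstract).
import torch
import torch.nn as nn

class CADAB(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        # Asymmetric depth-wise convolutions: a 3x3 kernel factorized into 3x1 and 1x3,
        # applied per channel (groups=channels) to keep the parameter count small.
        self.dw_v = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0), groups=channels)
        self.dw_h = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1), groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)          # point-wise channel mixing
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)
        # Squeeze-and-excitation style channel attention.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        out = self.act(self.bn(self.pw(self.dw_h(self.dw_v(x)))))
        out = out * self.attn(out)          # re-weight channels
        return out + x                      # residual connection

x = torch.randn(1, 64, 128, 256)
print(CADAB(64)(x).shape)  # torch.Size([1, 64, 128, 256])
```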
ISBN: 9781728186719 (digital); 9781728186719 (print)
Deep neural network models based on self-attention (SA) for obtaining rich contextual information have been widely adopted in semantic segmentation. However, the computational complexity of the standard self-attention module is high, which partly limits its use. In this work, we propose the lightweight self-attention network (LSANet) for semantic segmentation. Specifically, the lightweight self-attention module (LSAM) captures contextual information using a hand-designed compact feature representation and a weighted fusion of position information. In the decoder, an improved up-sampling module is proposed; compared with bilinear upsampling, it achieves better results in restoring image details. Experimental results on the PASCAL VOC 2012 and Cityscapes datasets show the effectiveness of our method, which simplifies operations and improves performance.
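The abstract does not spell out LSAM's compact feature representation, but a common way to make self-attention lightweight is to shrink the key/value map before computing attention. The PyTorch sketch below pools keys and values to an 8x8 grid, so the attention cost is O(HW·64) rather than O((HW)^2); the pooling size and projection layout are illustrative assumptions, not the paper's design.

```python
# Illustrative lightweight self-attention with pooled keys/values.
import torch
import torch.nn as nn

class LightSelfAttention(nn.Module):
    def __init__(self, channels, pooled=8):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.kv = nn.Conv2d(channels, channels * 2, 1)
        self.pool = nn.AdaptiveAvgPool2d(pooled)   # compact key/value map (pooled x pooled)
        self.scale = channels ** -0.5

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)            # (B, HW, C)
        k, v = self.kv(self.pool(x)).chunk(2, dim=1)         # (B, C, p, p) each
        k, v = k.flatten(2), v.flatten(2).transpose(1, 2)    # (B, C, S), (B, S, C)
        attn = torch.softmax(q @ k * self.scale, dim=-1)     # (B, HW, S): cost O(HW*S)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return out + x                                       # residual connection

print(LightSelfAttention(64)(torch.randn(1, 64, 32, 32)).shape)
```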
ISBN (print): 9783031197680; 9783031197697
We propose a novel architecture for depth estimation from a single image. The architecture itself is based on the popular encoder-decoder architecture that is frequently used as a starting point for all dense regression tasks. We build on AdaBins which estimates a global distribution of depth values for the input image and evolve the architecture in two ways. First, instead of predicting global depth distributions, we predict depth distributions of local neighborhoods at every pixel. Second, instead of predicting depth distributions only towards the end of the decoder, we involve all layers of the decoder. We call this new architecture LocalBins. Our results demonstrate a clear improvement over the state-of-the-art in all metrics on the NYU-Depth V2 dataset. Code and pretrained models will be made publicly available (https://***/sharigfarooq123/LocalBins).
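A minimal per-pixel bin-regression head in the spirit of AdaBins/LocalBins is sketched below in PyTorch: every pixel predicts bin widths over a fixed depth range plus a distribution over those bins, and the depth is the probability-weighted sum of the bin centres. The bin count, depth range, and layer sizes are placeholders, not the paper's configuration.

```python
# Toy per-pixel depth-bin head: expected depth = sum of bin centres weighted by probabilities.
import torch
import torch.nn as nn

class LocalBinHead(nn.Module):
    def __init__(self, channels, n_bins=64, d_min=0.1, d_max=10.0):
        super().__init__()
        self.d_min, self.d_max = d_min, d_max
        self.bin_widths = nn.Conv2d(channels, n_bins, 1)   # per-pixel bin widths
        self.bin_logits = nn.Conv2d(channels, n_bins, 1)   # per-pixel bin probabilities

    def forward(self, feats):
        # Normalised positive widths that partition [d_min, d_max] at every pixel.
        w = torch.softmax(self.bin_widths(feats), dim=1) * (self.d_max - self.d_min)
        edges = self.d_min + torch.cumsum(w, dim=1)
        centers = edges - 0.5 * w                            # per-pixel bin centres
        probs = torch.softmax(self.bin_logits(feats), dim=1)
        depth = (probs * centers).sum(dim=1, keepdim=True)   # expected depth per pixel
        return depth

feats = torch.randn(2, 128, 60, 80)
print(LocalBinHead(128)(feats).shape)  # torch.Size([2, 1, 60, 80])
```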
Flash droughts (FDs) pose significant challenges for accurate detection due to their short duration, and conventional drought monitoring methods have difficulty capturing this rapidly intensifying phenomenon. Machine learning models trained on data are increasingly useful for drought detection. Northeastern Brazil (NEB) has been a hot spot for FD events, with significant ecological damage in recent years. This research introduces a novel 2D convolutional neural network (CNN) designed to identify spatial FDs in historical simulations, using multiple environmental factors and thresholds as inputs. Our model, trained with hydro-climatic data, outputs a probabilistic drought detection map across NEB for 2012. Additionally, we examine future changes in FDs using Coupled Model Intercomparison Project Phase 6 (CMIP6) outputs under the Shared Socioeconomic Pathway SSP5-8.5 scenario for 2024-2050. Our results demonstrate that the proposed spatial FD-detection model, based on a 2D CNN architecture and a methodology for robust learning, shows promise for comprehensive regional FD monitoring. Finally, considerable spatial variability of FDs across NEB was observed during 2012 and 2024-2050, particularly in the São Francisco River Basin. This research contributes to advancing our understanding of flash droughts, offering critical insights for informed water resource management and bolstering resilience against their impacts.
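As a toy illustration of the detection model described above, the PyTorch snippet below maps a multi-channel grid of environmental factors to a per-pixel flash-drought probability map with a small fully convolutional network. The number of input factors, layer widths, and grid size are placeholders, not the paper's architecture.

```python
# Toy fully convolutional network: environmental-factor grids in, FD probability map out.
import torch
import torch.nn as nn

n_factors = 6   # e.g. soil moisture, evapotranspiration, precipitation, temperature anomalies
model = nn.Sequential(
    nn.Conv2d(n_factors, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 1), nn.Sigmoid(),        # probability of flash drought per grid cell
)
grid = torch.randn(1, n_factors, 96, 96)       # one time step over a regional domain
print(model(grid).shape)                       # torch.Size([1, 1, 96, 96])
```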
This study introduces a novel deep learning framework for detecting leakage in water distribution systems (WDSs). The key innovation lies in a two-step process: first, the WDS is partitioned using a K-means clustering algorithm based on pressure sensitivity analysis; then, an encoder-decoder neural network (EDNN) model is employed to extract and process the pressure and flow sensitivities. The core of the framework is the PP-LCNetV2 architecture, which keeps the model lightweight and is optimized for CPU devices. This combination ensures rapid, accurate leakage detection. Three cases are employed to evaluate the method. By applying data augmentation techniques, including demand and measurement noise, the framework demonstrates robustness across different noise levels. Compared with other methods, the results show that this method can efficiently detect over 90% of leakages across different operating conditions while better recognizing the magnitude of leakages. This research offers a significant improvement in computational efficiency and detection accuracy over existing approaches.
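The first step of the framework, partitioning the network by K-means on pressure sensitivities, can be sketched as follows. The sensitivity matrix here is filled with random placeholders; in practice it would come from hydraulic simulations of the WDS, and the number of clusters is an arbitrary choice.

```python
# Sketch: partition WDS nodes by K-means clustering on a pressure-sensitivity matrix.
import numpy as np
from sklearn.cluster import KMeans

n_nodes, n_sensors = 300, 12
# sensitivity[i, j]: pressure change at sensor j when demand at node i is perturbed
# (random placeholder values; real values would come from hydraulic simulation runs)
sensitivity = np.random.rand(n_nodes, n_sensors)

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(sensitivity)
partition = kmeans.labels_          # cluster id per node -> candidate leakage zones
print(np.bincount(partition))       # number of nodes in each zone
```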
Soil water content (SWC) plays a vital role in agricultural management, geotechnical engineering, hydrological modeling, and climate research. Image-based SWC recognition methods show great potential compared to traditional methods; however, as a nascent approach, their limitations in accuracy and efficiency hinder wide application. To address this, we design the LG-SWC-R3 model based on an attention mechanism to leverage its powerful learning capabilities. To enhance efficiency, we propose a simple yet effective encoder-decoder architecture (PVP-Transformer-ED) designed on the principle of eliminating redundant spatial information from images: a high proportion of each soil image is masked, and the original image is predicted from the unmasked area, which helps the PVP-Transformer-ED learn the spatial correlations of soil images. Subsequently, we fine-tune the SWC recognition model on the pre-trained encoder of the PVP-Transformer-ED. Extensive experimental results demonstrate the excellent performance of our designed model (R² = 0.950, RMSE = 1.351%, MAPE = 0.081, MAE = 1.369%), surpassing traditional models. Although this method processes only a small fraction of the original image pixels (approximately 25%), which may impact model performance, it significantly reduces training time while keeping model error within an acceptable range. Our study provides valuable references and insights for the popularization and application of image-based SWC recognition methods.
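The pre-training idea described above, masking a high proportion of soil-image patches and reconstructing the image from the visible ones, can be sketched along the lines of masked autoencoding. The PyTorch snippet below uses a generic transformer encoder as a stand-in for the PVP-Transformer-ED (whose internals the abstract does not specify); the patch size, embedding width, and reuse of the encoder as a toy decoder are assumptions, and positional embeddings are omitted for brevity.

```python
# Masked-image-modelling sketch: encode ~25% of patches, reconstruct the masked ones.
import torch
import torch.nn as nn

B, patch, dim, mask_ratio = 4, 16, 256, 0.75
img = torch.randn(B, 3, 224, 224)                            # batch of soil images

# Patchify into (B, N, patch*patch*3).
p = img.unfold(2, patch, patch).unfold(3, patch, patch)
patches = p.permute(0, 2, 3, 1, 4, 5).reshape(B, -1, 3 * patch * patch)
N = patches.shape[1]
keep = int(N * (1 - mask_ratio))

perm = torch.rand(B, N).argsort(dim=1)                       # random patch order per image
vis_idx, mask_idx = perm[:, :keep], perm[:, keep:]
gather = lambda t, i: torch.gather(t, 1, i.unsqueeze(-1).expand(-1, -1, t.shape[-1]))

embed = nn.Linear(3 * patch * patch, dim)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=4)
mask_token = nn.Parameter(torch.zeros(1, 1, dim))
head = nn.Linear(dim, 3 * patch * patch)                     # predict raw pixels of masked patches

latent = encoder(embed(gather(patches, vis_idx)))            # encode only the visible ~25%
masked_queries = mask_token.expand(B, N - keep, dim)
# The same encoder is reused as a toy decoder over visible latents + mask tokens.
recon = head(encoder(torch.cat([latent, masked_queries], dim=1))[:, keep:])
loss = ((recon - gather(patches, mask_idx)) ** 2).mean()     # reconstruction loss on masked patches
print(loss.item())
```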
For monaural speech enhancement, contextual information is important for accurate speech estimation. However, commonly used convolutional neural networks (CNNs) are weak at capturing temporal contexts since their building blocks process only one local neighborhood at a time. To address this problem, we draw on human auditory perception to introduce a two-stage trainable reasoning mechanism, referred to as the global-local dependency (GLD) block. GLD blocks capture long-term dependencies of time-frequency bins at both the global and local levels from the noisy spectrogram to help detect correlations among the speech part, the noise part, and the whole noisy input. Furthermore, we construct a monaural speech enhancement network called GLD-Net, which adopts an encoder-decoder architecture and consists of a speech object branch, an interference branch, and a global noisy branch. The extracted global-level and local-level speech features are efficiently reasoned over and aggregated in each of the branches. We compare the proposed GLD-Net with existing state-of-the-art methods on the WSJ0 and DEMAND datasets. The results show that GLD-Net outperforms the state-of-the-art methods in terms of PESQ and STOI.
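The GLD block is only outlined above, but one common way to combine global and local dependency modelling over time-frequency bins is to pair a non-local (self-attention) branch with a small depth-wise convolution branch, as in the PyTorch sketch below. The actual reasoning mechanism in GLD-Net may differ; this is an illustrative approximation.

```python
# Rough sketch of fusing global (non-local) and local context over T-F bins.
import torch
import torch.nn as nn

class GlobalLocalBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Global branch: self-attention over all time-frequency bins.
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        # Local branch: depth-wise conv over a small T-F neighbourhood.
        self.local = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)

    def forward(self, x):                               # x: (B, C, F, T) spectrogram features
        b, c, f, t = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)        # (B, FT, C)
        k = self.k(x).flatten(2)                        # (B, C, FT)
        v = self.v(x).flatten(2).transpose(1, 2)        # (B, FT, C)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)  # dependencies between all bins
        glob = (attn @ v).transpose(1, 2).reshape(b, c, f, t)
        return x + glob + self.local(x)                 # fuse global and local context

print(GlobalLocalBlock(32)(torch.randn(1, 32, 32, 50)).shape)
```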
Enhancement of low-light images is a challenging task due to low brightness, low contrast, and high noise. The difficulty of collecting naturally labeled data intensifies this problem further. Many researchers have attempted to solve this problem using learning-based approaches; however, most models ignore the impact of noise in low-lit images. In this paper, an encoder-decoder architecture made up of separable convolution layers is proposed to address the issues encountered in low-light image enhancement. The architecture is trained end-to-end on a custom low-light image dataset (LID) comprising both clean and noisy images. We introduce a unique multi-context feature extraction module (MC-FEM) where the input first passes through a feature pyramid of dilated separable convolutions for hierarchical-context feature extraction, followed by separable convolutions for feature compression. The model is optimized using a novel three-part loss function that focuses on high-level contextual features, structural similarity, and patch-wise local information. We conducted several ablation studies to determine the optimal model for low-light image enhancement under noisy and noiseless conditions. We use performance metrics such as peak signal-to-noise ratio, structural similarity index measure, visual information fidelity, and average brightness to demonstrate the superiority of the proposed work against state-of-the-art algorithms. Qualitative results presented in this paper prove the strength and suitability of our model for real-time applications.
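The MC-FEM description above, a feature pyramid of dilated separable convolutions followed by separable convolutions for compression, can be sketched as follows in PyTorch. The dilation rates and channel counts are assumptions rather than the paper's configuration.

```python
# Illustrative multi-context feature extraction with dilated separable convolutions.
import torch
import torch.nn as nn

def separable(cin, cout, dilation=1):
    return nn.Sequential(
        nn.Conv2d(cin, cin, 3, padding=dilation, dilation=dilation, groups=cin),  # depth-wise
        nn.Conv2d(cin, cout, 1),                                                  # point-wise
        nn.ReLU(inplace=True),
    )

class MCFEM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Pyramid of dilated separable convs captures context at several receptive fields.
        self.branches = nn.ModuleList([separable(channels, channels, d) for d in (1, 2, 4, 8)])
        self.compress = separable(channels * 4, channels)     # fuse and compress features

    def forward(self, x):
        return self.compress(torch.cat([b(x) for b in self.branches], dim=1))

print(MCFEM(32)(torch.randn(1, 32, 64, 64)).shape)  # torch.Size([1, 32, 64, 64])
```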
Droughts pose significant challenges for accurate monitoring due to their complex spatiotemporal characteristics. Data-driven machine learning (ML) models have shown promise in detecting extreme events when enough well-annotated data is available. However, droughts do not have a unique and precise definition, which leads to noise in human-annotated events and presents an imperfect learning scenario for deep learning models. This article introduces a 3-D convolutional neural network (CNN) designed to address the complex task of drought detection, considering spatiotemporal dependencies and learning with noisy and inaccurate labels. Motivated by the shortcomings of traditional drought indices, we leverage supervised learning with labeled events from multiple sources, capturing the shared conceptual space among diverse definitions of drought. In addition, we employ several strategies to mitigate the negative effect of noisy labels (NLs) during training, including a novel label correction (LC) method that relies on model outputs, enhancing the robustness and performance of the detection model. Our model significantly outperforms state-of-the-art drought indices when detecting events in Europe between 2003 and 2015, achieving an AUROC of 72.28%, an AUPRC of 7.67%, and an ECE of 16.20%. When applying the proposed LC method, these performances improve by +5%, +15%, and +59%, respectively. Both the proposed model and the robust learning methodology aim to advance drought detection by providing a comprehensive solution to label noise and conceptual variability.
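A generic form of output-based label correction, in the spirit of the LC method described above, is sketched below: where the model is highly confident and disagrees with the noisy annotation, the annotation is replaced by the model's prediction. The confidence threshold and the hard-relabelling rule are illustrative choices, not the paper's exact procedure.

```python
# Sketch of output-based label correction for noisy binary drought labels.
import torch

def correct_labels(probs, noisy_labels, threshold=0.9):
    """probs: (N,) model probability of 'drought'; noisy_labels: (N,) 0/1 annotations."""
    preds = (probs > 0.5).float()
    confident = (probs > threshold) | (probs < 1 - threshold)   # model is very sure either way
    disagree = preds != noisy_labels
    corrected = noisy_labels.clone()
    corrected[confident & disagree] = preds[confident & disagree]   # trust the model here
    return corrected

probs = torch.tensor([0.97, 0.40, 0.05, 0.75])
labels = torch.tensor([0.0, 1.0, 0.0, 0.0])
print(correct_labels(probs, labels))   # tensor([1., 1., 0., 0.])
```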
The Temporal Convolutional Network (TCN) and the TCN combined with an encoder-decoder architecture (TCN-ED) are proposed to forecast runoff in this study. Both models are trained and tested using hourly data from the Jianxi basin, China. The results indicate that the forecast horizon has a great impact on forecast ability, and the concentration time of the basin is a critical threshold for the effective forecast horizon of both models. Both models perform poorly on low flow and well on medium and high flow at most forecast horizons, while their performance on peak flow depends on the forecast horizon. TCN-ED performs better than TCN in runoff forecasting, with higher accuracy, better stability, and insensitivity to fluctuations in the rainfall process. Therefore, TCN-ED is an effective deep learning solution for runoff forecasting within an appropriate forecast horizon.
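The building unit of a TCN is a dilated causal convolution; stacking blocks with exponentially growing dilation lets the model cover long rainfall histories with few layers. The PyTorch sketch below illustrates this; the kernel size, channel width, and dilation schedule are placeholders, not the configuration used for the Jianxi basin.

```python
# Minimal dilated causal convolution block, the building unit of a TCN.
import torch
import torch.nn as nn

class TCNBlock(nn.Module):
    def __init__(self, channels, dilation):
        super().__init__()
        self.pad = (3 - 1) * dilation                  # left-pad so the conv stays causal
        self.conv = nn.Conv1d(channels, channels, kernel_size=3, dilation=dilation)
        self.act = nn.ReLU()

    def forward(self, x):                              # x: (B, C, T) rainfall/runoff series
        out = self.conv(nn.functional.pad(x, (self.pad, 0)))
        return self.act(out) + x                       # residual connection

# Stack blocks with exponentially growing dilation to cover long input histories.
tcn = nn.Sequential(*[TCNBlock(16, d) for d in (1, 2, 4, 8, 16)])
print(tcn(torch.randn(1, 16, 168)).shape)              # one week of hourly inputs
```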