Search results - Inner Mongolia University Library

Predicting water quality in municipal water management systems using a hybrid deep learning model

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, vol. 133, Part E

Authors: Luo, Wenxian; Huang, Leijun; Shu, Jiabin; Feng, Hailin; Guo, Wenjie; Xia, Kai; Fang, Kai; Wang, Wei. Affiliations: Zhejiang A&F Univ 666 Wusu St Hangzhou 311300 Zhejiang Peoples R China; Quzhou Digital Rural Construct Ctr 139 Fushi Rd Quzhou 324000 Zhejiang Peoples R China; Shenzhen MSU BIT Univ Artificial Intelligence Res Inst Guangdong Hong Kong Macao Joint Lab Emot Intellige Shenzhen 518172 Guangdong Peoples R China

Increasing municipal waste generation puts more and more municipal water resources at high risk. Accurate prediction of water quality becomes critical for effective protection of the water resources. Due to the nonlinear and non-stationary characteristics of water quality data of the municipal water resources, it is challenging to achieve high prediction accuracy, especially for medium-term and long-term predictions. To address this issue, we propose a novel hybrid deep learning model to predict water quality multiple steps ahead. The proposed model adopts the encoder-decoder structure in the form of two long short-term memory (LSTM) networks, integrated with the attention mechanism and a convolutional neural network (CNN). The model extracts the complex correlation between multiple water quality features through the CNN, and uses the two LSTM networks to transfer historical information to predictions, with an attention layer assigning different weights to the different parts of the historical information. Using three years of water quality data collected from an urban river, we experimentally show that the proposed model outperforms the baseline models by 11%-34% in root mean squared error (RMSE) when predicting dissolved oxygen multiple steps ahead, and by 1%-7% when predicting total phosphorus. Similar improvement has also been found in Nash-Sutcliffe efficiency (NSE) and mean absolute error (MAE). The proposed model is a feasible solution for multi-step medium-term water quality prediction.

Keywords: Water quality; Multi-step prediction; encoder-decoder structure; Long short-term memory; Convolutional neural network; Attention mechanism
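
For readers unfamiliar with the three metrics named in the abstract above, the following is a minimal sketch of how RMSE, MAE and NSE are commonly defined. The function names and the sample readings are illustrative assumptions and are not taken from the paper.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error: lower is better."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error: lower is better."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

# Hypothetical dissolved-oxygen readings (mg/L), made up for illustration only.
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 7.0, 7.5, 9.0]

print(rmse(observed, predicted), mae(observed, predicted), nse(observed, predicted))
```

RMSE and MAE are in the units of the measured quantity, while NSE is dimensionless, which is one reason the three are often reported together.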

A Crowd Counting and Localization Network Based on Adaptive Feature Fusion and Multi-Scale Global Attention Up Sampling

IEEE ACCESS, 2024, vol. 12, pp. 12919-12939

Authors: Wang, Min; Huang, Li; Yan, Jingke; Huang, Jin; Yang, Tao. Affiliations: Sichuan Technol & Business Univ Coll Elect & Informat Engn Chengdu 611745 Peoples R China; Southwest Jiaotong Univ State Key Lab Rail Transit Vehicle Syst Chengdu 610031 Peoples R China; Southwest Jiaotong Univ Sch Elect Engn Chengdu 611756 Peoples R China

Crowd counting is an important research topic in the fields of computer vision and image processing, with monitoring and management of crowded scenes becoming an increasingly prominent issue. Existing methods still suffer from the problem of severe overlap in density maps within dense areas, leading to inadequate counting and localization accuracy. This paper presents innovative research on crowd counting and localization. Firstly, addressing the limitations of density maps in localization performance in existing algorithms, we optimize the generation method of FIDT maps, decoupling the counting and localization tasks. By avoiding the problem of overlap in dense areas, the optimized label maps achieve a good balance between counting accuracy and localization, with MAE and MSE reaching 64.1 and 103.9 in SHHA, and 10.9 and 17.4 in SHHB, ***, to address the scale insensitivity of the encoder and the potential loss of critical features during the encoding process, we propose the Adaptive Feature Fusion Module and the Multi-Scale Global Attention Upsampling Module, constructing the CALNET network. By reducing redundant features inside and outside the separable branch, the model achieves global fusion of shallow features during the decoding process. The F1-m scores obtained on the SHHA and SHHB datasets reach 72.9% and 79.4% respectively, significantly improving the model's ***, this paper extends the application of crowd counting and localization algorithms to different domains such as citrus orchards, vehicles, and campus crowds. Through experiments, the robustness and transferability of the network are validated, expanding the application areas of crowd counting and localization algorithms and providing a broader space for future research.

Keywords: Label map; encoder-decoder structure; adaptive feature fusion; multi-scale global attention upsampling; crowd counting and localization

SMART: Supervised multi-class image retargeting generative model based on a long-range sampling strategy

DIGITAL SIGNAL PROCESSING, 2024, vol. 154

Authors: Cui, Jia; Jiang, Hao; Qi, Meng; Gu, Zhenyu; Lu, Hongju. Affiliations: State Key Lab Subtrop Bldg Sci Guangzhou 510640 Peoples R China; South China Univ Technol Sch Design Guangzhou 510006 Peoples R China; HCruiser Informat sgesellschaft mbH D-80807 Munich Germany; Shandong Normal Univ Sch Informat Sci & Engn Jinan 250300 Peoples R China; Shanghai Jiao Tong Univ Sch Design Shanghai 205530 Peoples R China; Guangzhou City Univ Technol Sch Management Guangzhou 510800 Peoples R China

Content-aware image retargeting (CAIR) techniques are crucial in multimedia processing for displaying images on various devices while preserving visually salient contents with desirable visual effects. There are discrete and continuous algorithms. For the former, artefacts occur when the foreground proportion is larger than the retargeting ratio; for the latter, the salient regions are prone to be squeezed. In this paper, we reformulate the retargeting process into sampling the salient signal and reconstruction under aesthetic supervision, the supervised multi-class image retargeting reconstruction (SMART) framework. The target images can be represented as complementary parts, the masked and unmasked ones, according to the saliency influences in the encoder phase. The long-range sampling algorithm is proposed to calculate similarities through an 8-connected planar path while considering spatial distance and feature correlation. The sampled embeddings in latent space reconstruct the retargeted images under supervised signals for aesthetic quality. The semantic loss Lsem from the pretrained CLIP model can maintain consistency for both content and semantics. The supervised loss, Lir, is introduced to ensure the retargeted qualities are close to the preferred labels. Then, we release a new retargeting dataset comprising seven image classes (animal, building, car, flower, indoor, landscape and people) with supervised labels collected from designers for further aesthetic retargeting study. The ablation studies are conducted to confirm the effectiveness of the new dataset, and comparative experiments with state-of-the-art baselines demonstrate the advantages of the proposed method.

Keywords: Image retargeting; Long-range sampling; Multi-class image reconstruction; Supervised signals; encoder-decoder structure

Efficient real-time semantic segmentation: accelerating accuracy with fast non-local attention

VISUAL COMPUTER, 2024, vol. 40, no. 8, pp. 5783-5796

Authors: Lan, Tianye; Dou, Furong; Feng, Ziliang; Zhang, Chengfang. Affiliations: Sichuan Univ Coll Comp Sci Chengdu Peoples R China; Sichuan Police Coll Intelligent Policing Key Lab Sichuan Prov Luzhou Peoples R China

As an essential aspect of semantic segmentation, real-time semantic segmentation poses a significant challenge in achieving a trade-off between segmentation accuracy and inference speed. The standard non-local block can effectively capture the long-range dependencies that are critical to semantic segmentation, but its huge computational cost is unacceptable for real-time semantic segmentation. To confront this issue, we propose a fast non-local attention network (FNANet) with an encoder-decoder structure for real-time semantic segmentation. FNANet relies on a fast non-local attention module and a fast non-local attention fusion module. These modules serve the dual purpose of reducing computational demands and capturing essential contextual information, thereby balancing enhanced segmentation accuracy against minimized computational overhead. Furthermore, improved non-local attention is incorporated to augment feature representation, consequently facilitating precise class label prediction. Experimental results demonstrate that FNANet outperforms state-of-the-art methods in terms of segmentation accuracy and speed on Cityscapes and CamVid.

Keywords: Semantic segmentation; Fast non-local attention; Attentional feature fusion; Real-time speed; encoder-decoder structure

Learning Feature Embedding Refiner for Solving Vehicle Routing Problems

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, vol. 35, no. 11, pp. 15279-15291

Authors: Li, Jingwen; Ma, Yining; Cao, Zhiguang; Wu, Yaoxin; Song, Wen; Zhang, Jie; Chee, Yeow Meng. Affiliations: Sichuan Normal Univ Dept Comp Sci Chengdu 610101 Peoples R China; Natl Univ Singapore Dept Ind Syst Engn & Management Coll Design & Engn Singapore 119077 Singapore; Singapore Management Univ Sch Comp & Informat Syst Singapore 188065 Singapore; Eindhoven Univ Technol Fac Ind Engn & Innovat Sci NL-5600 MB Eindhoven Netherlands; Shandong Univ Inst Marine Sci & Technol Qingdao 266237 Peoples R China; Nanyang Technol Univ Sch Comp Sci & Engn Singapore 639798 Singapore

While the encoder-decoder structure is widely used in the recent neural construction methods for learning to solve vehicle routing problems (VRPs), they are less effective in searching solutions due to deterministic feature embeddings and deterministic probability distributions. In this article, we propose the feature embedding refiner (FER) with a novel and generic encoder-refiner-decoder structure to boost the existing encoder-decoder structured deep models. It is model-agnostic in that the encoder and the decoder can be from any pretrained neural construction method. Regarding the introduced refiner network, we design its architecture by combining the standard gated recurrent units (GRU) cell with two new layers, i.e., an accumulated graph attention (AGA) layer and a gated nonlinear (GNL) layer. The former extracts dynamic graph topological information of historical solutions stored in a diversified solution pool to generate aggregated pool embeddings that are further improved by the GRU, and the latter adaptively refines the feature embeddings from the encoder with the guidance of the improved pool embeddings. To this end, our FER allows current neural construction methods to not only iteratively refine the feature embeddings for broader search range but also dynamically update the probability distributions for more diverse search. We apply FER to two prevailing neural construction methods including attention model (AM) and policy optimization with multiple optima (POMO) to solve the traveling salesman problem (TSP) and the capacitated VRP (CVRP). Experimental results show that our method achieves lower gaps and better generalization than the original ones and also exhibits competitive performance to the state-of-the-art neural improvement methods.

Keywords: Probability distribution; Decoding; Search problems; Vehicle routing; Routing; Logic gates; Costs; encoder-decoder structure; neural combinatorial optimization; reinforcement learning; vehicle routing problems (VRPs)

RAFNet: Reparameterizable Across-Resolution Fusion Network for Real-Time Image Semantic Segmentation

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, vol. 34, no. 2, pp. 1212-1227

Authors: Chen, Lei; Dai, Huhe; Zheng, Yuan. Affiliations: Inner Mongolia Univ Coll Elect Informat Engn Hohhot 010021 Peoples R China; Inner Mongolia Univ Coll Comp Sci Natl & Local Joint Engn Res Ctr Intelligent Inform Hohhot 010021 Peoples R China

The demand to implement semantic segmentation networks on mobile devices has increased dramatically. However, existing real-time semantic segmentation methods still suffer from a large number of network parameters, making them unsuitable for mobile devices with limited memory resources. The main reason is that most existing methods take the backbone networks (e.g., ResNet-18 and MobileNet) as an encoder. To alleviate this problem, we propose a novel Reparameterizable Channel & Dilation (RCD) block and construct a considerably lightweight yet effective encoder by stacking several RCD blocks according to three guidelines. The strengths of the proposed encoder lie in its ability not only to extract discriminative feature representations via channel convolutions and dilated convolutions, but also to reduce computational burdens while maintaining segmentation accuracy with the help of the re-parameterization technique. Besides the encoder, we also present a simple but effective decoder that adopts an across-resolution fusion strategy to fuse multi-scale feature maps generated from the encoder instead of a bottom-up pathway fusion. With such an encoder and a decoder, we provide a Reparameterizable Across-resolution Fusion Network (RAFNet) for real-time semantic segmentation. Extensive experiments demonstrate that our RAFNet achieves a promising trade-off between segmentation accuracy, inference speed and network parameters. Specifically, our RAFNet with only 0.96M parameters obtains 75.3% mIoU at 107 FPS and 75.8% mIoU at 195 FPS on Cityscapes and CamVid test sets for full-resolution inputs, respectively. After quantization and deployment on a Xilinx ZCU104 device, our RAFNet obtains a favorable segmentation performance with only 1.4W power.

Keywords: Real-time image segmentation; encoder-decoder structure; lightweight network; hardware deployment

FashionSegNet: a model for high-precision semantic segmentation of clothing images

VISUAL COMPUTER, 2024, vol. 40, no. 3, pp. 1711-1727

Authors: Xiang, Zhong; Zhu, Chenglin; Qian, Miao; Shen, Yujia; Shao, Yizhou. Affiliations: Zhejiang Sci Tech Univ Sch Mech Engn 928-2 Main St Xiasha Higher Educ Pk Hangzhou 310018 Zhejiang Peoples R China

Clothing image segmentation is a method to predict the clothing category label of each pixel in the input image. We reduced the influence of the variability of image shots, the similarity of clothing categories, and the complexity of boundaries on the segmentation accuracy of clothing images by developing an advanced ResNet50-based semantic segmentation model in this study whose primary structure is the encoder-decoder. An improved spatial pyramid pooling module combined with a global feature extraction branch of a large convolution kernel is developed to achieve multi-scale feature fusion and improve the model's ability to identify clothing and its boundary features in different shots. Furthermore, to balance the clothing shape and category information in the model, a spatial and semantic information enhancement module is proposed, which can enhance the circulation of the information between different stages of the network through cross-stage connection technology. The model was finally trained and tested on the Deepfashion2 dataset. The comparison experiment demonstrates that the proposed model obtained the highest mIoU and Boundary IoU of 74.55% and 57.51%, respectively, compared with the DeepLabv3+, PSPNet, and other networks.

Keywords: Clothing image segmentation; Deep learning; Spatial pyramid pooling; encoder-decoder structure; Large convolution kernel; Information enhancement

A Dual-encoder-Single-decoder Based Low-Dose CT Denoising Network

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, vol. 26, no. 7, pp. 3251-3260

Authors: Han, Zefang; Shangguan, Hong; Zhang, Xiong; Zhang, Pengcheng; Cui, Xueying; Ren, Huiying. Affiliations: Taiyuan Univ Sci & Technol Shanxi Prov China Taiyuan 030024 Peoples R China; North Univ China Shanxi Prov China Taiyuan 030051 Peoples R China

Generative adversarial networks (GAN) have shown great potential for image quality improvement in low-dose CT (LDCT). In general, the shallow features of the generator include more shallow visual information such as edges and texture, while the deep features of the generator contain more deep semantic information such as organization structure. To improve the network's ability to categorically deal with different kinds of information, this paper proposes a new type of GAN with a dual-encoder-single-decoder structure. In the structure of the generator, firstly, a pyramid non-local attention module in the main encoder channel is designed to improve the feature extraction effectiveness by enhancing the features with self-similarity; secondly, another encoder with a shallow feature processing module and a deep feature processing module is proposed to improve the encoding capabilities of the generator; finally, the final denoised CT image is generated by fusing the main encoder's features, shallow visual features, and deep semantic features. The quality of the generated images is improved due to the use of feature complementation in the generator. In order to improve the adversarial training ability of the discriminator, a hierarchical-split ResNet structure is proposed, which improves the features' richness and reduces the features' redundancy in the discriminator. The experimental results show that compared with the traditional single-encoder-single-decoder based GAN, the proposed method performs better in both image quality and medical diagnostic acceptability. Code is available at https://***/hanzefang/DESDGAN.

Keywords: Feature extraction; Noise reduction; Computed tomography; Image edge detection; Bioinformatics; Testing; X-ray imaging; encoder-decoder structure; Generative adversarial network; hierarchical-split ResNet; image denoising; LDCT

Research on Building Energy Consumption Prediction Based on Hybrid GRU Neural Network

ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2025, pp. 1-15

Authors: Gao, Zhiyuan; Zhang, Xuewei; Wang, Changsheng; Xing, Jianchun; Deng, Zhongkai; Chen, Tao. Affiliations: Army Engn Univ PLA Natl Def Engn Coll Nanjing Peoples R China; Natl Def Mobilizat Off Command Informat Assurance Zhangjiakou Peoples R China

The accurate prediction of building energy consumption provides technology and data support for the construction of intelligent building energy systems. Moreover, it is also a crucial means of responding to the national "Carbon Peaking and Carbon Neutrality Goals." Traditional methods can yield poor results because they fail to consider the nonlinear, nonstationary, and multi-seasonal characteristics of the building energy consumption data. To overcome these limitations, this paper proposes an asymmetric energy consumption prediction approach based on the encoder-decoder architecture. The proposed approach employs the CEEMDAN algorithm for data preprocessing to enhance the reliability of building energy consumption data. Subsequently, the convolutional gated recurrent unit (Conv-GRU) model is utilized to extract high-dimensional features and capture nonlinear relationships from the input energy consumption data. Finally, by employing the GRU-Attention algorithm to assign feature weights, this approach enhances the accuracy of building energy consumption prediction. Experimental evaluations conducted on real datasets demonstrate the superiority of the proposed approach over the existing classic methods.

Keywords: Building energy consumption prediction; Deep learning; encoder-decoder structure; Attention mechanism; Building energy analysis

Depth feature fusion based surface defect region identification method for steel plate manufacturing

COMPUTERS & ELECTRICAL ENGINEERING, 2024, vol. 116

Authors: Bai, Dongxu; Li, Gongfa; Jiang, Du; Tao, Bo; Yun, Juntong; Hao, Zhiqiang; Zhou, Dalin; Ju, Zhaojie. Affiliations: Wuhan Univ Sci & Technol Key Lab Met Equipment & Control Technol Minist Educ Wuhan 430081 Peoples R China; Wuhan Univ Sci & Technol Hubei Key Lab Mech Transmiss & Mfg Engn Wuhan 430081 Peoples R China; Wuhan Univ Sci & Technol Res Ctr Biomimet Robot & Intelligent Measurement & Wuhan 430081 Peoples R China; Wuhan Univ Sci & Technol Precis Mfg Res Inst Wuhan 430081 Peoples R China; Univ Portsmouth Sch Comp Portsmouth PO1 3HE England

Computers and electrical engineering have made great strides in steel plate manufacturing. Defect recognition techniques have also evolved. However, due to the large scale of defects, diverse features and sample imbalance problems of steel plates, the general algorithms often suffer from low recognition accuracy and weak robustness in practical detection. Aiming at the problems in recognition, this study proposes an improved defect segmentation network with coder-decoder structure to realize multi-scale interaction of features. Using the split-attention feature extraction module, defect features are learned adaptively. Meanwhile, combined with the group normalization module, a surface defect region recognition model based on depth feature fusion is established. The model was trained for comparative ablation using a transfer learning approach. The experimental results confirm the efficiency of the technique: 89.11% IoU and 94.24% Dice can be achieved on the Severstal dataset using this method. The research can be applied as an intelligent system for quality monitoring throughout the production process, guiding its rational decision-making and control to realize the improvement of strip steel product quality.

Keywords: Steel surface defect identification; Depth feature fusion; Defect segmentation network; Defect detection; encoder-decoder structure
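
As background for the IoU and Dice scores reported in the abstract above, here is a minimal sketch of how the two overlap measures are commonly computed for a pair of binary segmentation masks. The function name, the empty-mask convention and the toy masks are assumptions made for illustration; they do not come from the paper or from the Severstal dataset.

```python
def iou_and_dice(pred_mask, true_mask):
    """Return (IoU, Dice) for two equally sized binary masks given as nested lists of 0/1."""
    intersection = 0
    pred_area = 0
    true_area = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += p & t   # pixel counted only when both masks mark it
            pred_area += p
            true_area += t
    union = pred_area + true_area - intersection
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is an assumed convention.
        return 1.0, 1.0
    iou = intersection / union
    dice = 2 * intersection / (pred_area + true_area)
    return iou, dice

# Toy 2x3 masks, made up for illustration only.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

iou, dice = iou_and_dice(pred, truth)
print(iou, dice)
```

For any single pair of masks the two scores are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.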
