The p-Median Problem (PMP) is a classical discrete facility location problem with significant implications for optimizing the placement of urban public service facilities. Improvement heuristics, a well-established class of methods for solving the PMP, iteratively enhance solution quality through efficient neighborhood exploration. In this study, we model the neighborhood exploration process as a Markov decision process and propose a novel deep reinforcement learning approach to solving the PMP, achieving higher problem-solving efficiency and quality. The proposed method introduces an encoder-decoder structure, consisting of an Interactive Attention encoder (IAE), a Node Removal decoder (NRD), and a Node Insertion decoder (NID), aimed at learning an optimal strategy for node selection. The experimental results demonstrate that our approach outperforms genetic algorithms in terms of both accuracy and computational efficiency. While the solution time is slightly longer than that of the Attention Model (AM), our method achieves a smaller gap to the optimal solution. Furthermore, ablation studies confirm that the proposed adaptive interactive encoder and the two decoders significantly enhance model performance. Finally, we applied the Adaptive Interactive Attention Model (AIAM) to a real-world scenario, demonstrating its practical utility in guiding medical facility location decisions.
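To make the neighborhood being explored concrete, the following is a minimal sketch of the p-median objective and of one exhaustive remove-then-insert (swap) step. It is not the learned policy from the abstract: the function names `pmedian_cost` and `best_swap` and the toy instance are assumptions made for illustration, and the brute-force search simply stands in for what the removal and insertion decoders would choose.

```python
from itertools import product

def pmedian_cost(dist, medians):
    """Sum over all nodes of the distance to the nearest selected median."""
    return sum(min(row[m] for m in medians) for row in dist)

def best_swap(dist, medians):
    """Try every (remove, insert) pair and keep the cheapest resulting set.

    Returns the current set unchanged if no swap lowers the cost.
    """
    current = set(medians)
    best_set, best_cost = current, pmedian_cost(dist, current)
    candidates = [j for j in range(len(dist)) if j not in current]
    for out_node, in_node in product(sorted(current), candidates):
        trial = (current - {out_node}) | {in_node}
        cost = pmedian_cost(dist, trial)
        if cost < best_cost:  # strict improvement only
            best_set, best_cost = trial, cost
    return best_set, best_cost

# Toy instance (hypothetical): five points on a line, distance = |a - b|.
points = [0, 1, 2, 10, 11]
dist = [[abs(a - b) for b in points] for a in points]
start = {0, 1}  # a deliberately poor initial solution with p = 2
improved, cost = best_swap(dist, start)
```

Repeating `best_swap` until the set stops changing gives a plain local search; a learned approach would replace the exhaustive loop with a policy that proposes which node to remove and which to insert.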
The encoder-decoder architecture is widely adopted in convolutional neural network (CNN)-based road extraction methods. However, because the decoder's upsampling process lacks long-range dependencies and occluded road regions are globally spatially correlated, existing methods often produce fragmented roads with poor connectivity. In addition, because of their limited receptive field, shallow encoder layers may capture large amounts of unnecessary low-level texture that acts as noise and blurs road boundaries. To enhance connectivity and refine road boundaries, the connectivity enhancement module (CEM) and the boundary refinement module (BRM) are proposed, respectively. Our CEM captures global relations and effectively extracts the linear features of roads, which are critical for topological correctness. By leveraging the output of the CEM as a source of semantic guidance, the BRM filters out local noise from encoder features. We then construct an interactive global-local decoder (IGD) based on the two proposed modules to enable dynamic interactions between them, resulting in further mutually beneficial improvements. Based on these innovations, our CEBRNet method is proposed for road detection from satellite imagery. Through extensive experiments on two public benchmarks, the DeepGlobe and SpaceNet datasets, we demonstrate the superiority of our method over other state-of-the-art methods.
Deep learning-based medical image segmentation requires a large amount of labeled data to train the model. Obtaining large-scale labeled medical image datasets is time-consuming and expensive. In contrast, unlabeled data are easy to obtain and deserve to be exploited to improve segmentation quality. To solve this problem, we proposed a semi-supervised deep learning method based on a Generative Adversarial Network (GAN) in combination with a pyramid attention mechanism and transfer learning (TP-GAN). In this work, TP-GAN consisted of a generator (segmentation network) and a discriminator (evaluation network). The generator adopted the encoder-decoder architecture for image segmentation (its output was called the predicted map), and the discriminator adopted a convolutional neural network (CNN) to evaluate the quality of the predicted map. Through adversarial training between the generator and the discriminator, TP-GAN could achieve high segmentation quality, since the discriminator guides the generator to produce more accurate segmentation maps whose distribution is closer to that of the ground truth for unlabeled data in semi-supervised learning. Furthermore, the encoder in the generator used the VGG16 model, which had been trained for image classification on ImageNet data, and together with the decoder constituted a new segmentation model. The transfer learning strategy could reduce the training time and overcome the limitation of small-scale labeled data in semi-supervised learning. The generator also used an image pyramid attention mechanism to extract more detailed features and enrich the information in the feature maps. The proposed TP-GAN model and other segmentation models were trained and tested on two different datasets (Hippocampus and Spleen). The results demonstrated that TP-GAN could achieve higher segmentation accuracy on the Hippocampus and Spleen datasets than other semi-supervised segmentation methods based on different evaluation metrics (Dice, IoU, HD, and RVE). The propos
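Two of the evaluation metrics named in this abstract, Dice and IoU, have standard definitions for binary masks. The sketch below computes both for flattened 0/1 masks; the function name `dice_and_iou`, the example masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the paper.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two equal-length binary masks (0/1 values)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_sum = sum(1 for p in pred if p)
    truth_sum = sum(1 for t in truth if t)
    union = pred_sum + truth_sum - inter
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    dice = 2 * inter / (pred_sum + truth_sum)
    iou = inter / union
    return dice, iou

# Hypothetical flattened masks with six pixels each.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
dice, iou = dice_and_iou(pred, truth)
```

Dice weights the overlap twice relative to the two mask sizes, so it is never smaller than IoU on the same pair of masks.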
An essential process in prognostics and health management (PHM) is remaining useful life (RUL) prediction. Traditional Recurrent Neural Networks (RNNs) and their variants are not very efficient at solving the regression problems of RUL prediction. To address this problem, an attention-based Gated Recurrent Unit (ABGRU) for RUL prediction is proposed in this paper. First, the dataset is preprocessed, and the RUL labels are modeled using the piecewise linear degradation method. Then, a GRU network based on an encoder-decoder framework with an attention mechanism is proposed. The network can assign weights according to the importance of feature information and use that information effectively to predict RUL. The validity of the proposed framework is verified on the NASA C-MAPSS benchmark dataset. The results show that the presented method outperforms existing state-of-the-art approaches and provides a new solution for RUL prediction. © 2023 Elsevier B.V. All rights reserved.
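The piecewise linear degradation labeling mentioned above can be sketched as follows: the RUL target is held at a constant cap early in an engine's life and then decreases linearly to zero at failure. The function name `piecewise_rul` and the cap of 125 cycles are assumptions for illustration; the abstract does not state the cap used.

```python
def piecewise_rul(total_cycles, max_rul=125):
    """Piecewise linear RUL labels for one run-to-failure trajectory.

    Cycle numbers run from 1 to total_cycles. The true remaining life at a
    cycle is (total_cycles - cycle); it is clipped at max_rul so that the
    early, healthy phase gets a constant label.
    max_rul=125 is an assumed value, not one taken from the paper.
    """
    return [min(max_rul, total_cycles - cycle)
            for cycle in range(1, total_cycles + 1)]

# Hypothetical engine that fails after 200 cycles.
labels = piecewise_rul(200)
```

With these inputs the label stays at 125 through cycle 75 and then drops by one per cycle until it reaches 0 at the final cycle.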
Fine particulate matter (PM2.5) concentration in ambient air has become a major concern across the globe. All major cities of India have reported elevated concentrations of PM2.5, with severe consequences for the health, economy, and ecosystem of the region. As a result, it is imperative to develop adequate tools for forecasting particulate matter concentration. Most prior studies have focused on a single-step prediction horizon, which limits their use. In the present work, a hybrid model is proposed to forecast multi-step-ahead concentrations of PM2.5 in ambient air across India, covering different agroclimatic zones. The hybrid model architecture was an encoder-decoder-based sequence-to-sequence framework built with convolutional long short-term memory (LSTM), bidirectional LSTM, and a 3D convolutional neural network. The model was tested across 26 Indian cities covering 13 major agroclimatic zones of India. The performance of the model was also analysed for consecutive-hour sequential prediction, taking the last 24 h of data as input to the model. The model output was also compared with the signal-to-noise ratio to explore the reason for variations in model performance. A distinct trend was found between the signal-to-noise ratio and model output: as noise increases, model performance suffers. Overall, the model was found to be stable, as its performance errors vary little across different time horizons. The proposed model has the potential to be used for long-term forecasting by incorporating other predictor variable series.
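The multi-step setup described here, with the last 24 h of data as input and several future hours as targets, is usually prepared with a sliding window over the hourly series. The sketch below shows that data preparation only, not the model itself; the function name `make_windows` and the 3-hour output horizon are assumptions, since the abstract does not state the horizon length.

```python
def make_windows(series, n_in=24, n_out=3):
    """Split an hourly series into (input, target) pairs.

    Each input holds n_in consecutive hours and each target holds the
    n_out hours that immediately follow it. n_out=3 is an assumed horizon.
    """
    samples = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        samples.append((x, y))
    return samples

# Hypothetical series: 30 hourly readings, using the hour index as the value.
hourly = list(range(30))
samples = make_windows(hourly)
first_x, first_y = samples[0]
last_x, last_y = samples[-1]
```

A sequence-to-sequence model would consume each 24-value input in its encoder and emit the target hours from its decoder.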
Authors: Sun, Jie; Yan, Senbo; Song, Xiaowen
Zhejiang Univ, Sch Mech Engn, State Key Lab Fluid Power & Mechatron Syst, Hangzhou 310027, Peoples R China
Zhejiang Univ, Sch Mech Engn, Key Lab Adv Mfg Technol Zhejiang Prov, Hangzhou 310027, Peoples R China
Zhejiang Univ, Coll Comp Sci & Technol, State Key Lab CAD & CG, Hangzhou 310027, Peoples R China
Building upon fully convolutional networks (FCNs), deep learning-based salient object detection (SOD) methods achieve strong performance in many vision tasks, including surface defect detection. However, most existing FCN-based methods still suffer from coarse object edge predictions. State-of-the-art methods employ intricate feature aggregation techniques to refine boundaries, but they are often too computationally costly to deploy in real applications. This paper proposes a semantics-guided detection paradigm for salient object detection. A guided atrous pyramid module is first applied to the top feature to segment complete salient semantics. Query context modules are then used to build relation maps between saliency and structural information from the top-down pathway. These two modules allow the semantic features to flow throughout the decoder phase, yielding detail-enriched saliency predictions. Experimental results demonstrate that the proposed method performs favorably against state-of-the-art methods on surface defect detection and SOD benchmarks. In addition, the method runs at 27 FPS in a fully convolutional fashion without any post-processing, which gives it potential for real-time detection.
Summarization generates a brief and concise summary that conveys the main idea of the source text. There are two forms of summarization: abstractive and extractive. Extractive summarization selects important sentences from the text to form a summary, whereas abstractive summarization paraphrases the text in more advanced, closer-to-human language by adding novel words or phrases. For a human annotator, producing a summary of a document is time-consuming and expensive because it requires reading the long document and composing a short summary. An automatic feature-rich model for text summarization is proposed that can reduce the amount of labor and produce a quick summary by using both extractive and abstractive approaches. A feature-rich extractor highlights the important sentences in the text, and linguistic characteristics are used to enhance the results. The extracted summary is then fed to an abstracter that provides further information using features such as named entity tags, part-of-speech tags, and term weights. Furthermore, a loss function is introduced to normalize the inconsistency between word-level and sentence-level attentions. The proposed two-stage network achieved a ROUGE score of 37.76% on the benchmark CNN/DailyMail dataset, outperforming earlier work. Human evaluation is also conducted to measure the comprehensiveness, conciseness, and informativeness of the generated summary.
Images captured under low-illumination conditions usually suffer from severe degradations, such as fading and low contrast, which drastically affect the performance of systems that rely on such images. To address these problems, this study proposes a linear contrast enhancement network (LCENet) for low-illumination image enhancement. It consists of three subnets: two encoder-decoder-based subnets for gradient map restoration and brightness enhancement, respectively, and a backbone network for adaptive brightness and contrast adjustment. In addition, a linear contrast enhancement adaptive instance normalization (LCEAIN) module is proposed in the backbone network, which avoids the problem of neglecting contrast enhancement when enhancing image brightness. Extensive evaluations on both synthetic and real low-illumination images show that the proposed method performs favorably against other existing similar methods. Moreover, our method can handle complex low-illuminance conditions and generalizes well to low-illuminance scenes with backlighting, night scenes with light sources, and underwater scenes with low illuminance. Code: https://***/zhouzhaorun/LCENet.
As a technically challenging topic, visual storytelling aims at generating an imaginative and coherent multi-sentence narrative story from a group of relevant images. Existing methods often generate direct and rigid descriptions of apparent image-based contents, because they are not capable of exploring implicit information beyond the images. Hence, these schemes cannot capture consistent dependencies from a holistic representation, impairing the generation of reasonable and fluent stories. To address these problems, a novel knowledge-enriched attention network with a group-wise semantic model is proposed. Three main novel components are designed and supported by substantial experiments to reveal practical advantages. First, a knowledge-enriched attention network is designed to extract implicit concepts from an external knowledge system, and these concepts are followed by a cascade cross-modal attention mechanism to characterize imaginative and concrete representations. Second, a group-wise semantic module with second-order pooling is developed to explore globally consistent guidance. Third, a unified one-stage story generation model with an encoder-decoder structure is proposed to simultaneously train and infer the knowledge-enriched attention network, group-wise semantic module, and multi-modal story generation decoder in an end-to-end fashion. Substantial experiments on visual storytelling datasets with both objective and subjective evaluation metrics demonstrate the superior performance of the proposed scheme compared with other state-of-the-art methods. The source code of this work can be found at https://***.
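Second-order pooling, named in the group-wise semantic module above, generally means summarizing a set of feature vectors by the average of their outer products rather than by a plain mean. The abstract does not give the exact form used, so the sketch below shows only the generic, uncentered version; the function name `second_order_pool` and the example features are assumptions.

```python
def second_order_pool(features):
    """Average outer product of the feature vectors: (1/n) * sum_i x_i x_i^T.

    features is a list of n vectors, each of length d; the result is a
    symmetric d x d matrix capturing pairwise channel interactions.
    """
    n = len(features)
    d = len(features[0])
    pooled = [[0.0] * d for _ in range(d)]
    for x in features:
        for a in range(d):
            for b in range(d):
                pooled[a][b] += x[a] * x[b] / n
    return pooled

# Hypothetical group of three image features with two channels each.
group = [[1.0, 0.0], [0.0, 2.0], [1.0, 2.0]]
pooled = second_order_pool(group)
```

Unlike first-order (mean) pooling, the off-diagonal entries record how often channels are active together across the whole image group.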
Land cover segmentation has been a significant research area because of its many applications, including infrastructure development, forestry, agriculture, urban planning, and climate change research. In this paper, we propose a novel segmentation method, called Frequency-guided Position-based Attention Network (FPA-Net), for land cover image segmentation. Our method is based on an improved encoder-decoder U-Net architecture with a position-based attention mechanism and a frequency-guided component. The position-based attention block is used to capture the spatial dependency among different feature maps and obtain the relationship among relevant patterns across the image. The frequency-guided component provides additional support with high-frequency features. Our model is simple and efficient in terms of time and space complexity. Experimental results on the Deep Globe, GID-15, and Land Cover AI datasets show that the proposed FPA-Net achieves the best performance in both quantitative and qualitative measures compared with other existing approaches.