Wind energy is a clean energy source that is characterised by significant uncertainty. The electricity generated from wind power also exhibits strong unpredictability, which when integrated can have a substantial impact on the security of the power grid. In the context of integrating wind power into the grid, accurate prediction of wind power generation is crucial in order to minimise damage to the grid system. This paper proposes a novel composite model (MLL-MPFLA) that combines a multilayer perceptron (MLP) and an LSTM-based encoder-decoder network for short-term prediction of wind power generation. In this model, the MLP first extracts multidimensional features from wind power data. Subsequently, an LSTM-based encoder-decoder network explores the temporal characteristics of the data in depth, combining multidimensional features and temporal features for effective prediction. During decoding, an improved focused linear attention mechanism called multi-point focused linear attention is employed. This mechanism enhances prediction accuracy by weighting predictions from different subspaces. A comparative analysis against the MLP, LSTM, LSTM-Attention-LSTM, LSTM-Self_Attention-LSTM, and CNN-LSTM-Attention models demonstrates that the proposed MLL-MPFLA model outperforms the others in terms of MAE, RMSE, MAPE, and R2, thereby validating its predictive performance.
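The four metrics named in the comparison (MAE, RMSE, MAPE, and R2) can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' evaluation code; the function name `regression_metrics` and the sample values are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent) and R2 for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE divides by the true value, so it is undefined when a true value is 0
    # (for example, when the measured power output is zero).
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}

# Hypothetical measured vs. forecast power values (arbitrary units).
actual = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(actual, forecast)
print(metrics)
```

Lower MAE, RMSE, and MAPE indicate smaller errors, while R2 closer to 1 indicates that the forecast explains more of the variance in the measured series.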
This paper proposes a novel two-stream encoder-decoder network that uses both high-level and low-level image features to precisely localize forged regions in a manipulated image. This is motivated by the fact that the forgery creation process generally introduces both high-level artefacts (e.g., unnatural contrast) and low-level artefacts (e.g., noise inconsistency) into the forged images. In the proposed two-stream network, one stream learns the low-level manipulation-related features on the encoder side by extracting noise residuals through a set of high-pass filters in the first layer. In the second stream, the encoder learns the high-level image manipulation features from the RGB values of the input image. The coarse feature maps of each encoder are upsampled by the corresponding decoder network to produce dense feature maps. The dense feature maps of the two streams are concatenated and fed to a final convolutional layer with sigmoid activation to produce the pixel-wise prediction. We have carried out experimental analyses on multiple standard forensics datasets to evaluate the performance of the proposed method. The experimental results show the efficacy of the proposed method with respect to the state of the art.
Land cover segmentation is an important and challenging task in the field of remote sensing. Even though convolutional neural networks (CNNs) provide great support for semantic segmentation, standard models still struggle to capture global information and long-range dependencies in remote sensing images. To overcome these limitations, we propose an attention-guided encoder-decoder network with multi-scale context aggregation to achieve more accurate segmentation of land cover. Based on the structure of the encoder-decoder network, we add a multi-scale feature fusion module with two attention modules to the top of the encoder. The multi-scale feature fusion module is employed to aggregate multi-scale features and capture global correlations. The attention modules are used to exploit the long-range dependencies and the interdependence between channels from the spatial and channel perspectives, respectively. The experimental results on GF-2 images show that our proposed method achieves state-of-the-art performance, with an OA of 84.1% and an mIoU of 62.3%. Compared with the baseline network, our method improves the OA by 3.3% and the mIoU by 4.4%. The comparative experiments also demonstrate that the proposed approach segments land cover significantly more accurately than the other compared methods.
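The two reported scores, overall accuracy (OA) and mean intersection over union (mIoU), are commonly derived from a class confusion matrix. The sketch below uses the usual definitions and assumes rows are ground-truth classes and columns are predicted classes; the function names and the 3-class matrix are hypothetical and are not taken from the paper.

```python
def overall_accuracy(cm):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    k = len(cm)
    ious = []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # class-c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(k)) - tp  # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)

# Hypothetical confusion matrix: rows = ground truth, columns = prediction.
cm = [
    [50, 5, 5],
    [10, 20, 0],
    [0, 0, 10],
]
oa = overall_accuracy(cm)
miou = mean_iou(cm)
print(f"OA = {oa:.3f}, mIoU = {miou:.3f}")
```

Because mIoU averages over classes rather than pixels, it penalises errors on rare classes more heavily than OA does, which is one reason the two scores can differ widely on the same result.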
The panoramic dental X-ray images are an essential diagnostic tool used by dentists to detect the symptoms in an early stage and develop appropriate treatment plans. In recent years, deep learning methods have been applied to achieve tooth segmentation of dental X-rays, which aims to assist dentists in making clinical decisions. Because the original images contain plenty of useless information, it is necessary to extract the region-of-interest (ROI) to obtain more accurate results by focusing on the maxillofacial region. However, a fast and accurate maxillofacial segmentation without hand-crafted features is challenging due to the poor image quality. In this study, we create a large maxillofacial dataset and propose an efficient encoder-decoder network model named EED-Net to solve this problem. This dataset consists of 2602 panoramic dental X-ray images and corresponding segmentation masks annotated by the trained experts. Based on the original structure of U-Net, our model structure contains three major modules: a feature encoder, a corresponding decoder, and a multipath feature extractor that connects the encoding path and the decoding path. In order to obtain more semantic features from the depth and breadth, we replace the convolution layer with the residual block in the encoder and adopt Inception-ResNet block in the multipath feature extractor. Inspired by the skip connection in FCN-8s, the lightweight decoder has the same channel dimension as the number of segmented objects. Besides, a weighted loss function is used to enhance segmentation accuracy. The comprehensive experimental results on the new dataset demonstrate that our model achieves better accuracy and speed trade-offs for maxillofacial segmentation than the latest methods.
Nuclei segmentation is a prerequisite and an essential step in cancer detection and prognosis. Automatic nuclei segmentation from histopathological images is challenging due to nuclear overlap, disease types, chromatic stain variability, and cytoplasmic morphology differences. Furthermore, it is difficult to develop a single accurate method for segmenting nuclei of different organs because of the diversity in nuclei size, shape, and appearance across organs. To address these challenges, we developed a robust encoder-decoder network for nuclei segmentation from multi-organ histopathological images. In this approach, we use a pre-trained EfficientNet-B4 as the encoder subnetwork and design a new decoder subnetwork architecture. Additionally, we apply morphological operation-based post-processing to improve the segmentation results. The performance of our approach has been evaluated on three public datasets, namely Kumar, TNBC, and CPM-17, which contain histopathological images of seven organs, one organ, and four organs, respectively. The proposed method achieved an aggregated Jaccard index of 0.636, 0.611, and 0.706 on the Kumar, TNBC, and CPM-17 datasets, respectively. Our proposed approach also shows superiority over the existing methods.
According to the characteristics of the road features, an encoder-decoder deep semantic segmentation network is designed for the road extraction of remote sensing ***, as the features of the road target are rich in local details and simple in semantic features, an encoder-decoder network with shallow layers and high resolution is designed to improve the ability to represent detail ***, as the road area is a small proportion in remote sensing images, the cross-entropy loss function is improved, which solves the imbalance between positive and negative samples in the training *** on large road extraction datasets show that the proposed method achieves a recall of 83.9%, a precision of 82.5% and an F1-score of 82.9%, which can extract the road targets in remote sensing images completely and *** encoder-decoder network designed in this paper performs well in the road extraction task and requires less manual intervention, so it has a good application prospect.
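The abstract does not say how the cross-entropy loss is modified. One common way to counter the imbalance between road (positive) and background (negative) pixels is to up-weight the positive term of binary cross-entropy. The sketch below shows that generic technique only, as an assumption and not as the authors' loss; `weighted_bce` and `pos_weight` are hypothetical names.

```python
import math

def weighted_bce(probs, labels, pos_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy with the positive-class term scaled by pos_weight.

    probs  : predicted road probabilities in [0, 1], one per pixel
    labels : ground-truth labels, 1 for road and 0 for background
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

# Hypothetical patch: one road pixel that the model under-predicts, three background pixels.
probs = [0.3, 0.1, 0.2, 0.1]
labels = [1, 0, 0, 0]
plain = weighted_bce(probs, labels)                     # standard BCE
weighted = weighted_bce(probs, labels, pos_weight=3.0)  # missed road pixels cost 3x more
print(plain, weighted)
```

With `pos_weight` above 1, a missed road pixel contributes more to the loss than a false alarm on background, which pushes training toward higher recall on the minority class.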
Text recognition in natural scene images has always been a hot topic in the field of document-image related visual sensors. The previous literature mostly solved the problem of horizontal text recognition, but the text in the natural scene is usually inclined and irregular, and there are many unsolved problems. For this reason, we propose a scene text recognition algorithm based on a text position correction (TPC) module and an encoder-decoder network (EDN) module. Firstly, the slanted text is modified into horizontal text through the TPC module, and then the content of horizontal text is accurately identified through the EDN module. Experiments on the standard data set show that the algorithm can recognize many kinds of irregular text and get better results. Ablation studies show that the proposed two network modules can enhance the accuracy of irregular scene text recognition.
Road extraction from remote sensing images is of great significance to urban planning, navigation, disaster assessment, and other applications. Although deep neural networks have shown a strong ability in road extraction, it remains a challenging task due to complex circumstances and factors such as occlusion. To improve the accuracy and connectivity of road extraction, we propose an inner convolution integrated encoder-decoder network with the post-processing of directional conditional random fields. Firstly, we design an inner convolutional network which can propagate information slice-by-slice within feature maps, thus enhancing the learning of road topology and linear features. Additionally, we present the directional conditional random fields to improve the quality of the extracted road by adding the direction of roads to the energy function of the conditional random fields. The experimental results on the Massachusetts road dataset show that the proposed approach achieves high-quality segmentation results, with the F1-score of 84.6%, which outperforms other comparable "state-of-the-art" approaches. The visualization results prove that the proposed approach is able to effectively extract roads from remote sensing images and can solve the road connectivity problem produced by occlusions to some extent.
Compressive sensing (CS) technology is introduced into the acquisition stage of space optical remote sensing images, which allows wireless image sensor network nodes to obtain images quickly and accurately under the two constraints of limited battery power and expensive sensor costs. On this basis, in order to further improve the quality of CS image reconstruction, we propose a fused features and perceptual loss encoder-decoder residual network (FFPL-EDRNet) for image reconstruction. FFPL-EDRNet consists of a convolution layer and a reconstruction network. We train FFPL-EDRNet end-to-end, thus greatly simplifying pre-processing and post-processing and eliminating the block effect in reconstructed images. The reconstruction network is based on a residual network, which introduces multi-scale feature extraction, multi-scale feature combination and multi-level feature combination. Feature fusion integrates low-level information with high-level information to reduce reconstruction error. The perceptual loss function, based on a pretrained InceptionV3, uses the weighted mean square error to define the loss between the reconstructed image features and the label image features, which makes the reconstructed image more semantically similar to the label image. In the measurement procedure, we use convolution to achieve block compression measurement, so as to obtain full image measurements. For image reconstruction, we first use a deconvolution layer to produce an initial reconstruction and then use the residual network to refine it. The experimental results show that, at measurement rates (MRs) of 0.25, 0.10, 0.04 and 0.01, the images reconstructed by FFPL-EDRNet reach peak signal-to-noise ratios (PSNR) of 27.502, 26.804, 24.593 and 21.359 and structural similarity (SSIM) values of 0.842, 0.816, 0.720 and 0.568, respectively. Therefore, our FFPL-EDRNet could enhance the quality of image reconstruction.
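Two quantities in this abstract lend themselves to a short worked example: the measurement rate, which fixes how many measurements are kept per image block, and PSNR, which scores the reconstruction. The sketch below uses the standard PSNR definition for 8-bit images; the 32x32 block size and both function names are assumptions for illustration, since the abstract does not state the block size.

```python
import math

def measurements_per_block(block_size, measurement_rate):
    """Number of CS measurements kept for one block_size x block_size block."""
    return round(block_size * block_size * measurement_rate)

def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(reconstructed):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Assumed 32x32 blocks at the four measurement rates quoted in the abstract.
for mr in (0.25, 0.10, 0.04, 0.01):
    print(mr, measurements_per_block(32, mr))

# Hypothetical two-pixel example with a mean squared error of 100.
print(psnr([100, 100], [110, 90]))
```

A lower measurement rate keeps fewer measurements per block, so less information reaches the reconstruction network, which is consistent with the PSNR and SSIM values in the abstract falling as the MR decreases.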
Sclera segmentation is of considerable importance for ocular biometrics. The key step in biometric recognition methods is the segmentation of the area of interest, i.e., the sclera in our case. The sclera segmentation process plays a pivotal part in maintaining the accuracy of sclera-based recognition schemes by limiting errors. However, accurate sclera segmentation in images from various sensors in a real environment is quite challenging due to saturated and/or defocused vessel patterns and the vessel structure, which has complex nonlinear deformations due to the multilayered sclera. With the development of deep learning algorithms, studies based on sclera segmentation using convolutional neural networks (CNNs) have achieved promising results for sclera recognition. However, previous CNN-based methods rely on repeated subsampling through strided convolution or spatial pooling, which loses much of the finer image structure and significantly decreases overall performance in tasks such as semantic segmentation. In this paper, we present Sclera-Net, a residual encoder and decoder network that exploits identity and non-identity mapping residual skip connections to take advantage of the high-frequency information from the prior layers of both the encoder and decoder networks to determine the accurate sclera region as well as other ocular regions. In this way, the finer image structure that was being lost due to repeated subsampling during convolution and pooling can be reused through residual skip connections to enhance overall performance. Furthermore, the proposed Sclera-Net does not improve performance at the cost of increased depth, complexity, or number of parameters. We performed comprehensive experiments and obtained optimum performance not only on sclera datasets but also on iris datasets. In particular, we achieved an equal error rate and mean F1-score of 0.0093 and 96.2421, respectively.