Electricity price forecasting (EPF) is a complex task due to market volatility and nonlinearity, which cause rapid, unpredictable fluctuations and introduce heteroscedasticity into the forecasting problem. These factors result in prediction errors that vary over time, making it difficult for models to capture stable patterns and leading to poor performance. This study introduces the Heteroscedastic Temporal Convolutional Network (HeTCN), a novel encoder-decoder framework designed for day-ahead EPF. HeTCN utilizes a Temporal Convolutional Network (TCN) to capture long-term dependencies and cyclical patterns in electricity prices. A key innovation is the heteroscedastic output layer, which directly models time-varying predictive uncertainty, enhancing performance under fluctuating market conditions. Additionally, a multi-view feature selection algorithm identifies the factors most relevant to specific periods, improving forecast precision. The framework employs an improved loss function based on maximum likelihood estimation (MLE), which accounts for the heteroscedastic nature of electricity prices by predicting both the mean and the variance of the price distribution. This approach mitigates the impact of extreme price spikes and reduces overfitting, resulting in robust and reliable predictions. Comprehensive evaluations demonstrate HeTCN's superiority over existing solutions such as DeepAR and the Temporal Fusion Transformer (TFT), with average improvements of 25.3%, 24.9%, and 17.4% in mean absolute error (MAE), symmetric mean absolute percentage error (sMAPE), and root mean squared error (RMSE) relative to DeepAR, and 17.6%, 14.4%, and 13.6% relative to TFT, across five distinct electricity markets. These results underscore HeTCN's effectiveness in managing volatility and heteroscedasticity, marking a significant advancement in electricity price forecasting.
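The abstract describes but does not spell out the MLE-based loss. Below is a minimal sketch, assuming a Gaussian likelihood whose mean and log-variance are both predicted at each time step; the names HeteroscedasticHead and gaussian_nll are illustrative, not from the paper:

```python
import torch
import torch.nn as nn

class HeteroscedasticHead(nn.Module):
    """Output layer mapping decoder features to a per-step mean and
    log-variance, so the network models its own predictive uncertainty."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.mean = nn.Linear(hidden_dim, 1)
        self.log_var = nn.Linear(hidden_dim, 1)  # log-variance keeps the variance positive

    def forward(self, h: torch.Tensor):
        return self.mean(h), self.log_var(h)

def gaussian_nll(y: torch.Tensor, mu: torch.Tensor, log_var: torch.Tensor):
    """Negative log-likelihood of y under N(mu, exp(log_var)), up to a constant.

    The 1/variance weighting down-weights the time steps the model flags as
    volatile (e.g. price spikes), which is the mechanism the abstract credits
    for robustness to heteroscedasticity."""
    return 0.5 * (log_var + (y - mu) ** 2 / log_var.exp()).mean()
```

Training would then minimize gaussian_nll over the day-ahead horizon instead of a plain MSE, letting the variance head absorb the heteroscedastic noise rather than forcing the mean prediction to chase it.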
The contribution of deep learning to medical image diagnosis has gained extensive interest due to its excellent performance. Interest has also grown in digital pathology, since it is considered the gold standard for tumor detection and diagnosis in digital Whole Slide Images (WSIs). This paper proposes an end-to-end cone-shaped encoder-decoder framework called the Multi-scale 3-stacked-Layer coned U-Net (Ms3LcU-Net). It boosts performance by integrating several enhancements, such as blended mutual attention, dilated fusion, edge enhancement, and atrous pooling. Furthermore, morphological post-processing and test-time augmentation are used in Ms3LcU-Net to refine and smooth the generated segmentations. Experiments on the public PAIP 2019 and DigestPath datasets, evaluated quantitatively with multiple metrics and qualitatively by visualizing the generated segmentation predictions, demonstrate the effectiveness and competitiveness of the proposed model for tumor segmentation in WSIs. The proposed framework yielded an average clipped Jaccard index of 0.7211 on the validation set of the PAIP 2019 dataset; on the DigestPath dataset, it achieved an average Dice coefficient of 0.833 and an F1-score of 0.897. The code will be made publicly available upon acceptance of the paper at https://***/Heba-AbdeNabi/Ms3LcU-Net-.
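As one illustration of the refinement stages mentioned above, here is a hedged sketch of test-time augmentation for a segmentation model. It assumes a model mapping a (B, C, H, W) image tensor to per-pixel logits; the flip-based transform set is an assumption, as the abstract does not list the exact augmentations used:

```python
import torch

@torch.no_grad()
def tta_predict(model, image: torch.Tensor) -> torch.Tensor:
    """Average segmentation predictions over flipped views of the input.

    Each prediction is un-flipped back to the original orientation before
    averaging, which smooths the resulting masks."""
    # Dims to flip for a (B, C, H, W) tensor; () is the identity view.
    flips = [(), (3,), (2,), (2, 3)]
    probs = []
    for dims in flips:
        aug = torch.flip(image, dims) if dims else image
        out = model(aug).sigmoid()               # per-pixel tumor probability
        out = torch.flip(out, dims) if dims else out
        probs.append(out)
    return torch.stack(probs).mean(dim=0)        # averaged probability map
```

A morphological opening/closing pass over the thresholded output of tta_predict would then play the role of the post-processing step the abstract describes.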
Automatic natural language interpretation of medical images is an emerging field of Artificial Intelligence (AI). The task combines two fields of AI: computer vision and natural language processing. It is a challenging task that goes beyond object detection, segmentation, and classification, because it also requires understanding the relationships between the different objects in an image and the actions performed by these objects as visual representations. Image interpretation is helpful in many tasks, such as assisting visually impaired persons, information retrieval, early childhood learning, producing human-like natural interaction with robots, and many more applications. Recently, this work has motivated researchers to apply the same approach to more complex biomedical images, from generating single-sentence captions to multi-sentence paragraph descriptions. Medical image captioning can assist and speed up the diagnostic process of medical professionals, and the generated reports can be used for many further tasks. This is a comprehensive review of recent research on medical image captioning published in international conferences and journals. Common parameters are extracted to compare methods, performance, strengths, and limitations, and our recommendations are discussed. Publicly available datasets and evaluation measures used for deep-learning-based captioning of medical images are also discussed.
Remote sensing images contain a wealth of Earth-observation information. Efficient extraction and application of the knowledge hidden in these images will greatly promote the development of resource and environment monitoring, urban planning, and other related fields. Remote sensing image captioning (RSIC) involves obtaining textual descriptions from remote sensing images by accurately capturing and describing the semantic-level relationships between objects and attributes in the images. However, there is currently no comprehensive review summarizing the progress in RSIC based on deep learning. After defining the scope of the papers to be discussed and summarizing them, the paper provides a comprehensive review of recent advancements in RSIC, covering six key aspects: the encoder-decoder framework, attention mechanisms, reinforcement learning, learning with auxiliary tasks, large visual language models, and few-shot learning. Subsequently, the datasets and evaluation metrics for RSIC are briefly explained. Furthermore, we compare and analyze the results of the latest models and the pros and cons of different deep learning methods. Lastly, future directions for RSIC are suggested. The primary objective of this review is to offer researchers a more profound understanding of RSIC.
Visual understanding has become more significant for gathering information in many real-life applications. For a human, understanding the content of a visual is a trivial task; for a machine, it is challenging. Generating captions for images and videos to better understand a situation is gaining importance, with wide applications in assistive technologies, automatic video captioning, video summarization, subtitling, blind navigation, and so on. A visual understanding framework analyses the content present in a video to generate a semantically accurate caption for the visual. Apart from the visual understanding of the situation, the gained semantics must be represented in a natural language such as English, for which a language model is required. Hence, the semantics and grammar of the generated English sentences are yet another challenge. The description of a video is supposed to capture not just the objects contained in the scene but also how these objects are related to each other through the activity described in the scene, making the entire process a complex task for a machine. This work examines the various methods for video captioning using deep learning, the datasets widely used for these tasks, and the evaluation metrics used for performance comparison. The insights gained from our preliminary work and the extensive literature review enable us to propose a practical, efficient video captioning architecture using deep learning that utilizes audio cues, external knowledge, and attention context to improve the captioning process. Quantum deep learning architectures can bring about extraordinary results in object recognition tasks and feature extraction using convolutions.
ISBN: 9781728121901 (digital), 9781728121918 (print)
Remote sensing (RS) image captioning has recently been attracting the attention of the community, as it provides more semantic information than traditional tasks such as scene classification. Image captioning aims to generate a coherent and comprehensive description that summarizes the content of an image. The description can be obtained directly from the ground-truth descriptions of similar images (retrieval-based image captioning) or can be generated through an encoder-decoder framework. The former has the limitation of not generating new descriptions; the latter may be affected by misrecognition of scenes or semantic objects. In this paper, we address these issues by proposing a new framework that combines generation-based and retrieval-based image captioning. First, a CNN-RNN framework combined with beam search generates multiple captions for a target image. Then, the best caption is selected on the basis of its lexical similarity to the reference captions of the most similar images. Experimental results on the RSICD dataset are reported and discussed.
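A minimal sketch of the re-ranking step described above, assuming beam search has already produced several candidate captions and the reference captions of the retrieved most-similar images are available; the Jaccard token-overlap score below is a stand-in for whatever lexical similarity measure the paper actually uses:

```python
def lexical_similarity(candidate: str, reference: str) -> float:
    """Jaccard overlap between token sets -- a simple proxy for the
    paper's lexical similarity measure."""
    a, b = set(candidate.lower().split()), set(reference.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def select_caption(candidates: list[str], references: list[str]) -> str:
    """Pick the beam-search candidate closest, on average, to the
    reference captions of the retrieved most-similar images."""
    return max(
        candidates,
        key=lambda c: sum(lexical_similarity(c, r) for r in references),
    )

# Example: candidates from beam search, references from retrieved images.
best = select_caption(
    ["many buildings are around a square", "a road crosses a green field"],
    ["several buildings surround a large square", "buildings near a square"],
)
print(best)  # -> "many buildings are around a square"
```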
Generating a natural language description of an image is a challenging but meaningful task. The task combines two significant artificial intelligence fields: computer vision and natural language processing. It is valuable for many applications, such as searching images and assisting visually impaired people to view the world. Many approaches adopt an encoder-decoder framework, and some subsequent methods are improved on the basis of this framework. In these methods, image features are extracted by a VGG net or other networks, but the feature map loses important information during extraction. In this paper, we fuse different kinds of image features extracted by two networks, VGG19 and ResNet50, and feed them into the neural network for training. We also add an attention mechanism to a basic neural encoder-decoder model for generating natural sentence descriptions: at each time step, our model attends to the image features and picks up the most meaningful parts to generate words. We test our model on the benchmark dataset IAPR TC-12; compared with other methods, we validate that our model achieves state-of-the-art performance.
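A hedged sketch of the feature-fusion idea, assuming VGG19 and ResNet50 feature maps are projected to a common dimension and concatenated before being handed to an attention-equipped decoder; the class name, layer shapes, and dimensions are illustrative, not from the paper:

```python
import torch
import torch.nn as nn

class FusedImageEncoder(nn.Module):
    """Project VGG19 and ResNet50 feature maps to a shared dimension and
    concatenate them along the spatial axis, so the decoder's attention can
    pick the most informative regions from either backbone."""
    def __init__(self, vgg_dim: int = 512, resnet_dim: int = 2048, fused_dim: int = 512):
        super().__init__()
        self.proj_vgg = nn.Linear(vgg_dim, fused_dim)
        self.proj_resnet = nn.Linear(resnet_dim, fused_dim)

    def forward(self, vgg_feats: torch.Tensor, resnet_feats: torch.Tensor):
        # vgg_feats: (B, N1, 512) and resnet_feats: (B, N2, 2048),
        # where N1/N2 are flattened spatial locations of each backbone.
        fused = torch.cat(
            [self.proj_vgg(vgg_feats), self.proj_resnet(resnet_feats)], dim=1
        )
        return fused  # (B, N1 + N2, fused_dim), attended over per decoding step
```

The decoder's attention then scores all N1 + N2 locations at every time step, which is one plausible way to let the caption draw on whichever backbone preserved the relevant detail.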