ISBN: 9781665464956 (print)
Image captioning develops a relationship between visual and textual information to generate a sequence of words as a caption. Transformers perform machine translation and language comprehension together using an encoder-decoder structure. With the aim of building a lightweight, deployment-friendly model, we present the Lightweight Transformer with a GRU-integrated decoder for image captioning. In the presented model, the stacks of encoders and decoders in the standard architecture are reduced to a single encoder and a GRU-integrated decoder. In addition, multi-level rich visual features from InceptionV3 improve the encoding performance of the single-unit encoder. To validate the efficiency of the proposed Lightweight Transformer architecture, extensive experiments are carried out on the MSCOCO image captioning dataset. The model achieves appreciable performance in comparison with other state-of-the-art models.
With the rapid development of the Internet of Things industry today, it is imperative to classify Internet of Things scenarios to adapt to the needs of the Internet of Things in different scenarios. In recent year...
Division is one of the most commonly sought-after algorithms for performing image processing operations such as normalization, filtering, enhancement, and deconvolution. Hence, the design of efficient division algorithm ...
ISBN: 9781665464956 (print)
This work proposes an interpretable classifier for automatic Covid-19 classification using chest X-ray images. It is based on a deep learning model, specifically a triplet network, devoted to finding an effective image embedding. This embedding is a non-linear projection of the images into a space of reduced dimension, in which the homogeneity and separation of the classes, as measured by a predefined metric, are improved. A K-Nearest Neighbor classifier is the interpretable model used for the final classification. Results on public datasets show that the proposed methodology reaches accuracy comparable with the state of the art, with the advantage of making the classification interpretable, a characteristic that can be very useful in the medical domain, e.g. in a decision support system.
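The pipeline described above (an embedding trained with a triplet objective, followed by a nearest-neighbor vote) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the loss form, the metric, or the value of k, so the hinge-style triplet loss, the Euclidean distance, `margin=1.0`, `k=3`, and the toy 2-D "embeddings" below are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def knn_predict(query, embeddings, labels, k=3):
    """Majority vote among the k stored embeddings nearest to `query`.

    Returns the predicted label and the indices of the neighbors used,
    so the supporting examples can be inspected.
    """
    order = sorted(range(len(embeddings)), key=lambda i: euclidean(query, embeddings[i]))
    nearest = order[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0], nearest


# Hypothetical, hand-made 2-D embeddings standing in for network outputs.
toy_embeddings = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
toy_labels = ["normal", "normal", "normal", "covid", "covid", "covid"]

label, neighbors = knn_predict((0.2, 0.1), toy_embeddings, toy_labels, k=3)
print(label, neighbors)
```

Returning the neighbor indices alongside the label is what makes this kind of classifier inspectable: the retrieved training images themselves serve as the evidence for a prediction.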
The Power Industrial Internet serves as an essential platform for the digital transformation of the electric power industry. During its implementation, the integration of Information Technology (IT) and Operational Te...
ISBN: 9781665464956 (print)
The electrocardiogram (ECG) signal is an important tool for the analysis of cardiovascular diseases. However, acquisition devices still produce noisy signals that degrade the quality of the information by corrupting important features. To improve the quality of the acquired data, a filtering process is mandatory. Moreover, real-time filtering of ECGs, in order to obtain a diagnosis as quickly as possible, is an interesting challenge. In this paper, we consider the Savitzky-Golay method as the denoising filter and propose a parallel algorithm implementing it. The procedure exploits the computational power of Graphics Processing Units (GPUs). Results in terms of performance and quality are provided.
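As a rough illustration of the Savitzky-Golay filter named above, the sketch below applies the standard 5-point quadratic/cubic smoothing coefficients (-3, 12, 17, 12, -3)/35 on the CPU in plain Python. It is not the paper's GPU algorithm: the window length, the pass-through edge handling, and the function name `savgol_smooth` are assumptions made for this example. Each output sample depends only on a fixed window of input samples, which is the property that makes the filter a natural fit for one-thread-per-sample parallel execution.

```python
# Standard 5-point Savitzky-Golay smoothing coefficients (quadratic/cubic fit).
SG5_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_smooth(signal, coeffs=SG5_COEFFS):
    """Smooth a 1-D sequence with a symmetric Savitzky-Golay window.

    Samples near the edges that lack a full window are copied through
    unchanged (an assumed, simple boundary policy).
    """
    half = len(coeffs) // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        # Each output sample is an independent weighted sum of its window.
        window = signal[i - half : i + half + 1]
        out[i] = sum(c * v for c, v in zip(coeffs, window))
    return out


# A quadratic is reproduced exactly by a quadratic-fit filter,
# while an isolated spike is attenuated.
quadratic = [float(x * x) for x in range(10)]
spike = [0.0, 0.0, 0.0, 35.0, 0.0, 0.0, 0.0]

print(savgol_smooth(quadratic))
print(savgol_smooth(spike))
```

For real ECG data, a longer window and a tested library routine would normally be preferred over this hand-rolled loop.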
With the continuous increase in the number of internet users, the explosion of multimodal data on the web has led to a growing demand for image retrieval. Current text-to-image retrieval systems often rely on keyword ...
In order to solve the problem of low precision and low stability of deep learning network for industrial equipment fault recognition under strong background noise, a method of equipment fault image recognition based o...
The development of semantic communication has promoted the research on image transmission. Content and style are two crucial characteristics of image information. Existing studies have explored image information trans...
The early recognition of retinal disorders such as cataracts, glaucoma, and retinal problems is highly dependent on high-resolution retinal imaging, which is a complex issue because of poor image quality and the lack ...