ISBN (print): 9781577358008
Automatic caption generation of an image requires both computer vision and natural language processing techniques. Despite advanced research in English caption generation, research on generating Arabic descriptions of an image is extremely limited. Semitic languages like Arabic are heavily influenced by root-words. We leverage this critical dependency of Arabic and, in this paper, are the first to generate captions of an image directly in Arabic using root-word based recurrent neural networks and deep neural networks. We report the first BLEU score for direct Arabic caption generation. Experimental results confirm that generating image captions directly in Arabic using root-words significantly outperforms English-Arabic translated captions produced with state-of-the-art methods.
ISBN (print): 0819424412
The redundancy of the multiresolution representation has been clearly demonstrated in the case of fractal images, but has not been fully recognized and exploited for general images. This paper presents a new image coder in which the similarity among blocks of different subbands is exploited by block prediction based on a neural network. After a pyramid subband decomposition, the detail subbands are partitioned into a set of uniform non-overlapping blocks. To speed up the coding procedure and improve the coding efficiency, a new classification criterion is presented: the blocks are classified into two sets, the simple block set and the edge block set. In the proposed method, the edge blocks are predicted from blocks in the lower-scale subband with the same orientation through a neural network. The simple blocks and the prediction error blocks of the edge blocks are coded with an arithmetic coder. Simulation results show that the method presented in this paper is a promising coding technique that merits further research.
ISBN (print): 0819405787
This conference proceedings contains 43 papers. The topics included are applications of neural networks to medical imaging, image processing, coding, speech processing, aerospace, character recognition, physics, communications, and computers. Learning algorithms and machine learning are discussed in detail.
Deep neural networks (DNNs) have been widely and successfully employed in various artificial intelligence and machine learning applications (e.g., image processing and natural language processing). As DNNs become deeper and include more filters per layer, they incur high computational costs and require large amounts of memory to store their many parameters. Moreover, current processing platforms (e.g., CPU, GPU, and FPGA) do not have enough internal memory, so external memory storage is needed. Deploying DNNs in mobile applications is therefore difficult, given the limited storage space, computation power, energy supply, and real-time processing requirements. In this work, network parameters were compressed using a method based on tensor decomposition, thereby reducing access to external memory. This compression method decomposes the weight tensor of the network layers into a limited number of principal vectors such that (i) almost all the initial parameters can be retrieved, (ii) the network structure does not change, and (iii) the network quality after reproducing the parameters is almost the same as that of the original network in terms of detection accuracy. To optimize the realization of this method on FPGA, the tensor decomposition algorithm was modified without affecting its convergence, and the reproduction of network parameters on FPGA was straightforward. The proposed algorithm reduced the parameters of ResNet50, VGG16, and VGG19 networks trained with Cifar10 and Cifar100 by almost 10 times. (C) 2022 Elsevier Ltd. All rights reserved.
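The abstract does not give the decomposition itself, so the following is only a minimal sketch of the general idea of storing a layer's weights as a few principal vectors. It swaps in a plain truncated SVD of the flattened weight tensor, which is not the modified tensor decomposition algorithm of the paper. The function names, the choice of flattening along the first axis, and the `rank` parameter are all assumptions made for illustration.

```python
import numpy as np


def compress_weights(weight, rank):
    """Factor a weight tensor into two thin matrices (hypothetical helper).

    Assumption: the tensor is flattened to (out_channels, everything else)
    and approximated by a truncated SVD. This stands in for the paper's
    tensor decomposition, which the abstract does not specify.
    """
    w = np.asarray(weight, dtype=np.float64)
    matrix = w.reshape(w.shape[0], -1)
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    left = u[:, :rank] * s[:rank]   # scale each kept column by its singular value
    right = vt[:rank, :]            # the kept principal row vectors
    return left, right, w.shape


def reconstruct_weights(left, right, shape):
    """Reproduce the (approximate) weight tensor from the stored factors."""
    return (left @ right).reshape(shape)


def compression_ratio(shape, rank):
    """Original parameter count divided by the count of stored factor entries."""
    out_channels = shape[0]
    rest = int(np.prod(shape[1:]))
    return (out_channels * rest) / (rank * (out_channels + rest))
```

Only `left` and `right` would need to be kept in external memory; the layer shape is unchanged after `reconstruct_weights`, which mirrors property (ii) of the abstract in this simplified setting.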
ISBN (print): 0819412015
Near-simultaneous, multispectral, coregistered imagery of ground target and background signatures was collected over a full diurnal cycle in the MWIR, LWIR, near-infrared, blue, green, and red wavebands using Battelle's portable sensor suite. The imagery data were processed with classical statistical algorithms and artificial neural networks to discriminate target signatures from background clutter and to investigate automatic target detection and recognition schemes.
ISBN (print): 0819439835
Morphological neural networks (MNN) have been proposed as an alternative neural computation paradigm. In this paper we explore the potential of Heteroassociative MNN (HMNN) for a practical vision-based task: self-localization in a vision-based navigation framework for mobile robots. HMNN have great potential for real-time applications because their recall process is very fast. We present some experimental results that illustrate the proposed approach.
ISBN (print): 0819424412
The determination of the regularization parameter is an important sub-problem in optimizing the performance of image restoration systems. The parameter controls the relative weightings of the data-conformance and model-conformance terms in the restoration cost function. A small parameter value leads to a noisy appearance in the smooth image regions due to over-emphasis of the data term, while a large value results in blurring of the textured regions due to dominance of the model term. Based on the principle of adopting small parameter values in the highly textured regions for detail emphasis while using large values for noise suppression in the smooth regions, a spatially adaptive regularization scheme is derived in this paper. An initial segmentation based on the local image activity is performed and a distinct regularization parameter is associated with each segmented component. The regional value is estimated by viewing the parameter as a set of learnable neuronal weights in a model-based neural network. A stochastic gradient descent algorithm based on the regional spatial characteristics and the specific functional form of the neuronal weights is derived to optimize the regional parameter values. The efficacy of the algorithm is demonstrated by the emergence of small parameter values in textured regions and large values in smooth regions.
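The abstract does not state the cost function, so the following is only one common way to write a regionally weighted restoration cost that matches the description above. All symbols are assumptions introduced for illustration: $g$ is the degraded image, $f$ the restored estimate, $H$ the blur operator, $D$ a high-pass (smoothness) operator, $R_r$ the $r$-th segmented region, and $\lambda_r$ its regularization parameter.

```latex
E(f) \;=\; \underbrace{\tfrac{1}{2}\,\lVert g - Hf \rVert^{2}}_{\text{data-conformance term}}
\;+\; \underbrace{\tfrac{1}{2}\sum_{r} \lambda_{r} \sum_{i \in R_{r}} \bigl[(Df)_{i}\bigr]^{2}}_{\text{model-conformance term}}
```

In this form a small $\lambda_r$ lets the data term dominate in region $R_r$ (detail is kept, but so is noise), while a large $\lambda_r$ lets the model term dominate (noise is suppressed, at the cost of blurring).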
ISBN (print): 0819424412
This paper describes a Markov random field (MRF) approach to image segmentation. Unlike most previous MRF techniques, which are based on pixel classification, this approach groups pixels that are similar. This removes the need to know the number of image classes. Mean field theory and multigrid processing are used in the subsequent optimization to find a good segmentation and to alleviate local minimum problems. Variations of the MRF approach are investigated by incorporating features/schemes motivated by characteristics of the human vision system (HVS). Preliminary results are promising and indicate that multigrid and HVS-based features/schemes can significantly improve segmentation results.
ISBN (print): 0819424412
A practical approach to continuous-tone color image segmentation is proposed. Unlike traditional image segmentation algorithms, which tend to use threshold methods, we show how a neural network technique can be successfully applied to this problem. We used a backpropagation network architecture in this work. It was assumed that each image pixel has its own color, which is somehow correlated with those of its nearest neighborhood. To describe the color properties of a given neighborhood, we propose a nine-component feature vector for every image pixel. This set of feature components is applied to the network input neurons. In this way, every image pixel is described by the following values: R, G and B (color intensities), Mr, Mg and Mb (averages of the intensities in the nearest neighborhood), and sigma(r), sigma(g) and sigma(b) (r.m.s. deviations of the color intensities). A scalar criterion was proposed to estimate the efficiency of the algorithm. A comparative experiment showed that neural segmentation is more efficient than traditional threshold-based methods.
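The nine-component feature vector described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the neighborhood, so a square window of `radius` pixels that includes the center pixel is assumed, borders are handled by edge replication, and the function name `nine_component_features` is hypothetical.

```python
import numpy as np


def nine_component_features(image, radius=1):
    """Return an (H, W, 9) array of per-pixel features.

    Channels 0-2: R, G, B intensities of the pixel.
    Channels 3-5: Mr, Mg, Mb, the mean intensities over the window.
    Channels 6-8: r.m.s. deviations from the window mean, per color channel.
    Assumptions: (2*radius + 1)^2 window including the center pixel,
    edge-replicated borders.
    """
    img = np.asarray(image, dtype=np.float64)
    h, w, _ = img.shape
    size = 2 * radius + 1
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    # Stack every shifted copy of the image: shape (size * size, H, W, 3).
    windows = np.stack([
        padded[dy:dy + h, dx:dx + w, :]
        for dy in range(size)
        for dx in range(size)
    ])
    mean = windows.mean(axis=0)
    # r.m.s. deviation from the local mean (population standard deviation).
    rms_dev = np.sqrt(((windows - mean) ** 2).mean(axis=0))
    return np.concatenate([img, mean, rms_dev], axis=-1)
```

Each row of `nine_component_features(image).reshape(-1, 9)` would then be one input vector for a network with nine input neurons.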