In such computerized systems, such as voice control units, personal identification, IP-telephony, weapon control commands, accepting applications for reference services, automated stenography, recognition of individua...
Nowadays there are many computer vision algorithms dedicated to solving the problem of object detection from many different perspectives. Many of these algorithms take a considerable processing time even for low resolu...
Owing to their high accuracy, inherent redundancy, and embarrassingly parallel nature, neural networks are fast becoming mainstream machine learning algorithms. However, these advantages come at the cost of high memory and processing requirements, which can be met by GPUs, FPGAs, or ASICs. For embedded systems, the requirements are particularly challenging because of tight power and timing budgets. Thanks to the availability of efficient mapping tools, GPUs are an appealing platform for implementing neural networks. While there is significant work on implementing image recognition (in particular Convolutional Neural Networks) on GPUs, only a few works deal with the efficient implementation of speech recognition on GPUs, and those that do so do not address embedded systems. To tackle this issue, this paper presents SPEED (Open-source framework to accelerate speech recognition on embedded GPUs). We have used the Eesen speech recognition framework because it is considered the most accurate speech recognition technique. Experimental results reveal that the proposed techniques offer a 2.6X speedup compared to the state of the art.
ISBN: (print) 9783319238142; 9783319238135
Reading text from natural images is much more difficult than reading it from scanned text documents, since the text may appear in any color, in different sizes and typefaces, and often with distorted geometry or applied textures. The paper presents the idea of high-speed image preprocessing algorithms that use quasi-local histogram-based methods, such as binarization, ROI filtering, and line and corner detection, which can be helpful for this task. Their low computational cost comes from reducing the amount of processed information by means of simple random sampling. The approach presented in the paper helps to mitigate some of the problems of implementing OCR algorithms that operate on natural images on devices with low computing power (e.g. mobile or embedded). Because the computational effort is relatively small, it is possible to test multiple hypotheses, e.g. about the possible location of the text in the image. Their verification can be based on the analysis of images in various color spaces. An additional advantage of the discussed algorithms is that their construction allows an efficient parallel implementation, further reducing the computation time.
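The cost reduction through random sampling can be illustrated with a minimal sketch. It is not the paper's quasi-local histogram method: it assumes a simpler stand-in in which a global binarization threshold is estimated as the mean of a few hundred randomly drawn pixels instead of the whole image. The function names, the sample count, and the synthetic test image are all hypothetical.

```python
import random


def sampled_mean_threshold(image, n_samples=500, seed=0):
    """Estimate a global threshold as the mean of randomly sampled pixels.

    Assumption: a plain mean threshold stands in for the histogram-based
    methods mentioned in the abstract.
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    total = 0
    for _ in range(n_samples):
        y = rng.randrange(height)
        x = rng.randrange(width)
        total += image[y][x]
    return total / n_samples


def binarize(image, threshold):
    """Map every pixel to 1 (above the threshold) or 0 (otherwise)."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


# Synthetic 64x64 image: dark left half (40), bright right half (200).
image = [[40 if x < 32 else 200 for x in range(64)] for _ in range(64)]
threshold = sampled_mean_threshold(image)
binary = binarize(image, threshold)
```

Here only 500 of the 4096 pixels are read to obtain the threshold. The estimate becomes noisier as the sample shrinks, which is the trade-off such sampling schemes accept.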
ISBN: (print) 9781509009411
This paper proposes a novel framework called concatenated image completion via tensor augmentation and completion (ICTAC), which recovers missing entries of color images with high accuracy. Typical images are second- or third-order tensors (2D/3D), depending on whether they are grayscale or color; hence tensor completion algorithms are well suited to their recovery. The proposed framework performs image completion by concatenating copies of a single image that has missing entries into a third-order tensor, applying a dimensionality augmentation technique to the tensor, using a tensor completion algorithm to recover its missing entries, and finally extracting the recovered image from the tensor. The solution relies on two key components that have recently been proposed to take advantage of the tensor train (TT) rank: a tensor augmentation tool called ket augmentation (KA), which represents a low-order tensor by a higher-order tensor, and the algorithm tensor completion by parallel matrix factorization via tensor train (TMac-TT), which has been demonstrated to outperform state-of-the-art tensor completion algorithms. Simulation results for color image recovery show the clear advantage of our framework over current state-of-the-art tensor completion algorithms.
The MPS approach (Minimal Path Selection) has been shown in [1] to provide robust and accurate segmentation of cracks within pavement images compared to other algorithms. On the other hand, MPS suffers from a long computing time. In this paper, we present three ongoing improvements that reduce the computing time and improve the overall segmentation performance. Most of the work focuses on the first three steps of the algorithm, which segment the crack skeleton. The improvements are, first, a refinement of the MPS methodology in Matlab; second, a C-language version of MPS; and finally, a first attempt to parallelize MPS on the GPU. The results on pavement images illustrate the achieved improvements in terms of better segmentation and shorter computing time.
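The idea underlying minimal-path methods can be sketched as follows. This is not the MPS algorithm itself, whose steps the abstract does not detail. It assumes the common formulation in which pixel intensity is the cost of stepping onto a pixel, so that a dark crack forms a cheap path, and it uses Dijkstra's algorithm on a 4-connected grid. The function name and the toy image are hypothetical.

```python
import heapq


def minimal_path(cost, start, goal):
    """Return the cheapest 4-connected path from start to goal and its cost.

    The path cost is the sum of the cost values of all visited pixels,
    including the start pixel. Assumes the goal is reachable.
    """
    rows, cols = len(cost), len(cost[0])
    dist = {start: cost[start[0]][start[1]]}
    prev = {}
    heap = [(dist[start], start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist[node]:
            continue  # stale queue entry
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = node
                    heapq.heappush(heap, (nd, (nr, nc)))
    # Walk back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[goal]


# Toy pavement image: a dark crack (10) running through bright pavement (200).
image = [
    [200, 200, 200, 200],
    [10, 10, 200, 200],
    [200, 10, 10, 10],
    [200, 200, 200, 200],
]
path, total = minimal_path(image, (1, 0), (2, 3))
```

The recovered path follows the dark pixels. Running such searches between many candidate endpoints is the kind of repeated work that makes minimal-path approaches expensive and attractive to parallelize.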
ISBN: (print) 9783319238142; 9783319238135
The main goal of the work described in this paper is to test and select algorithms to be implemented in the 'SM4Public' security system for public spaces. The paper describes the use of cascading approaches in a scenario concerning the detection of vehicles in static images. Three feature extractors were used along with benchmark datasets to prepare eight different cascades of classifiers. The algorithms selected for feature extraction are Histogram of Oriented Gradients, Local Binary Patterns, and Haar-like features. AdaBoost was used as the classifier. The paper briefly introduces the characteristics of the 'SM4Public' system, describes the employed algorithms, and presents sample experimental results.
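Of the three feature extractors named above, Local Binary Patterns is the simplest to show in a few lines. The sketch below computes the basic 8-neighbor LBP code of a single 3x3 patch. The clockwise bit order starting at the top-left neighbor and the "greater than or equal" comparison are common conventions assumed here, and the paper may use a different variant. The function name and the sample patch are hypothetical.

```python
def lbp_code(patch):
    """Compute the basic LBP code of the center pixel of a 3x3 patch.

    Each neighbor contributes one bit: 1 if it is >= the center value.
    Neighbors are read clockwise starting at the top-left corner.
    """
    center = patch[1][1]
    neighbors = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    code = 0
    for bit, value in enumerate(neighbors):
        if value >= center:
            code |= 1 << bit
    return code


patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
code = lbp_code(patch)  # bits 1, 3 and 4 are set: 2 + 8 + 16 = 26
```

In a detector, histograms of such codes over image cells would form the feature vector passed to the boosted classifier.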
ISBN: (print) 9781509033492
A shadow is formed by the interaction of light with an object. The effect of shadows is crucial in satellite image processing, where roads, buildings, trees, etc. are detected for various applications, and the interference of shadows causes these objects to be mismatched. Several algorithms are being developed to detect and reconstruct the shadow region. This paper presents a shadow detection technique based on Niblack segmentation. Niblack segmentation gives better shadow regions than Otsu's thresholding method and Sauvola-based thresholding. Reconstruction of the shadow region is done by a Bayesian classifier. This classifier generates a training vector and reconstructs the non-shadow region from the shadow region. The posterior probability is determined in order to reconstruct the non-shadow image intensity level. The algorithm was successfully tested on VHSR images.
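Niblack's method sets a local threshold T = m + k·s for each pixel, where m and s are the mean and standard deviation of the intensities in a window around it. The sketch below applies that rule and marks pixels darker than their local threshold as shadow candidates. The window radius, the value k = -0.2, the "darker means shadow" decision rule, and the function names are assumptions for illustration, not parameters reported in the abstract.

```python
import math


def niblack_threshold(image, y, x, radius=1, k=-0.2):
    """Return the Niblack threshold m + k*s for the window around (y, x).

    The window is clipped at the image borders.
    """
    rows, cols = len(image), len(image[0])
    window = [
        image[r][c]
        for r in range(max(0, y - radius), min(rows, y + radius + 1))
        for c in range(max(0, x - radius), min(cols, x + radius + 1))
    ]
    mean = sum(window) / len(window)
    variance = sum((v - mean) ** 2 for v in window) / len(window)
    return mean + k * math.sqrt(variance)


def shadow_mask(image, radius=1, k=-0.2):
    """Mark pixels darker than their local Niblack threshold with 1."""
    return [
        [
            1 if image[y][x] < niblack_threshold(image, y, x, radius, k) else 0
            for x in range(len(image[0]))
        ]
        for y in range(len(image))
    ]


# One dark pixel (50) surrounded by bright ground (200).
image = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
mask = shadow_mask(image)
```

Because the threshold adapts to each neighborhood, the dark pixel is flagged while its bright neighbors are not, even though they share windows with it.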
ISBN: (print) 9788362065271
This paper presents an application for the Android mobile operating system that generates an HDR image from two consecutive images, one underexposed and one overexposed. The implemented software preserves a great deal of image detail while maintaining a low execution time. These features are particularly important for pictures taken with mobile devices in emergency situations. Such photos may constitute evidence that a threat occurred, that it was properly recognized, or that someone committed a crime. HDR images can also be used in mobile systems that support pedestrians or drivers. The obtained results indicate the high effectiveness of the presented solution.
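The abstract does not say how the two exposures are merged, so the following is only an illustrative sketch of one simple possibility: a per-pixel weighted average in which each exposure is weighted by how close its value is to mid-gray. This is a basic exposure-fusion scheme, not radiance-map HDR recovery and not necessarily the application's method. The function names, the hat-shaped weight, and the sample values are hypothetical.

```python
def exposure_weight(value):
    """Hat-shaped weight: highest at mid-gray, near zero at 0 and 255.

    The small epsilon keeps the sum of weights strictly positive.
    """
    return 1.0 - abs(value - 127.5) / 127.5 + 1e-6


def fuse_exposures(under, over):
    """Blend two 8-bit grayscale exposures pixel by pixel."""
    fused = []
    for row_under, row_over in zip(under, over):
        row = []
        for u, o in zip(row_under, row_over):
            wu, wo = exposure_weight(u), exposure_weight(o)
            row.append((wu * u + wo * o) / (wu + wo))
        fused.append(row)
    return fused


# First pixel: crushed to black when underexposed, usable when overexposed.
# Second pixel: usable when underexposed, clipped to white when overexposed.
under = [[0, 100]]
over = [[120, 255]]
fused = fuse_exposures(under, over)
```

Each output pixel is dominated by whichever exposure recorded it without clipping, which is how detail from both inputs is preserved at a cost of a few arithmetic operations per pixel.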
ISBN: (print) 9781509041183
In this paper, we propose a new framework for human action representation, which leverages the strengths of convolutional neural networks (CNNs) and the linear dynamical system (LDS) to represent both the spatial and the temporal structures of actions in videos. We make two principal contributions. First, we incorporate image-trained CNNs to detect action clip concepts, taking advantage of different levels of information by combining two layers of CNNs trained on images. Second, we propose adopting a linear dynamical system (LDS) to model the relationships between these clip concepts, which captures the temporal structures of actions. We have applied the proposed method to two challenging, realistic benchmark datasets, and it achieves accuracies of up to 86.16% on the YouTube dataset and 82.76% on the UCF50 dataset, outperforming most state-of-the-art algorithms that use more sophisticated techniques.