ISBN: (print) 9789897580413
Reading a visual code consists of two steps: localization and data decoding. This paper presents a novel method for QR code localization using deep rectifier neural networks trained directly in the JPEG DCT domain, which makes image decompression unnecessary. The approach is efficient in both storage and computation, and it is convenient because camera hardware can often provide a JPEG stream as its output. The structure of the neural networks, regularization, and training data parameters such as input vector length and compression level are evaluated and discussed. The proposed approach is not limited to QR codes; it can be adapted to Data Matrix codes or other two-dimensional code types as well.
Shape matching dynamics (SMD) is a robust and efficient elastic model based on geometric constraints. This article introduces our study [1], which applies SMD to the visual simulation of cardiac beating motion. In our tech...
We investigate the use of intrinsic spectral analysis (ISA) for query-by-example spoken term detection (QbE-STD). In the task, spoken queries and test utterances in an audio archive are converted to ISA features, and ...
The use of computer-readable visual codes has become common in everyday life, in both industrial environments and private use. Reading a visual code consists of two steps: localization and data decoding. This paper introduces a new method for QR code localization using conventional and deep rectifier neural networks. The structure of the neural networks, regularization, and training parameters, such as input vector properties, the amount of overlap between samples, and the effect of different block sizes, are evaluated and discussed. Results are compared with localization algorithms from the literature.
This paper presents a deep neural network (DNN) approach to sentence boundary detection in broadcast news. We extract prosodic and lexical features at each inter-word position in the transcripts and learn a sequential...
In this contribution, we present a segmentation algorithm based on thresholding that subdivides an intensity image into object and background regions. The optimal threshold is found by maximizing a likelihood function derived from a novel intensity probability density function model, which consists of the sum of two weighted four-parameter gamma distributions. This is a more flexible alternative to the currently used models, which consist of the sum of two weighted two-parameter Gaussian distributions. In our experiments with 132 images, the proposed algorithm is on average slightly better than the best found in the scientific literature, and it performs particularly well on low-contrast images. The additional parameters and complexity of its likelihood function increased the processing time by a factor of 3, from 0.003 s/image to 0.009 s/image.
This paper presents a multimodal biometric system for authentication, based on the fusion of iris and palmprint. We propose an approach for feature extraction of each modality using wavelet packet decomposition at four levels. This gives 256 packets, from which a compact binary code can be generated. The code is obtained by computing an adaptive threshold from the three highest energy peaks and assigning 0 or 1 to each wavelet packet. Different fusion strategies were tested at different levels: feature level, score level, and error level. The first fusion is a simple concatenation of the iris and palmprint codes. The second applies a weighted sum rule to the matching scores. The third applies the Hamacher t-norm to the errors. The proposed approach and each fusion strategy were tested for accuracy on the CASIA iris database fused with the CASIA palmprint database, and then with the PolyU database. The proposed multimodal biometric approach achieves a recognition improvement with each fusion method.
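The three fusion rules named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the default weight `w_iris`, and the assumption that scores and errors are normalized to [0, 1] are all hypothetical. The Hamacher t-norm is taken here in its common product form, T(a, b) = ab / (a + b - ab), with T(0, 0) = 0.

```python
def concat_codes(code_iris, code_palm):
    """Feature-level fusion: concatenate two binary codes (hypothetical helper)."""
    return list(code_iris) + list(code_palm)


def weighted_sum(score_iris, score_palm, w_iris=0.5):
    """Score-level fusion: weighted sum rule; w_iris is an assumed weight in [0, 1]."""
    return w_iris * score_iris + (1.0 - w_iris) * score_palm


def hamacher_product(a, b):
    """Error-level fusion: Hamacher product t-norm on values assumed to lie in [0, 1]."""
    if a == 0.0 and b == 0.0:
        return 0.0  # defined by convention, avoids 0/0
    return (a * b) / (a + b - a * b)


if __name__ == "__main__":
    print(concat_codes([1, 0, 1], [0, 1]))
    print(weighted_sum(0.2, 0.8, w_iris=0.25))
    print(hamacher_product(0.5, 0.5))
```

Like any t-norm, the Hamacher product has 1 as its neutral element, so fusing an error value with 1 leaves it unchanged, while fusing two small errors yields a value no larger than either.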
ISBN: (print) 9781479951192
Mutual occlusions among targets can cause track loss or target position deviation, because the observation likelihood of an occluded target may vanish even when the estimated location of the target is available. This paper presents a novel probabilistic framework for multitarget tracking with mutual occlusions. The primary contribution of this work is the introduction of a vectorial occlusion variable as part of the solution. The occlusion variable describes the occlusion states of the targets. This forms the basis of the proposed framework, with the following further contributions: 1) Likelihood: A new observation likelihood model is presented, in which the likelihood of an occluded target is computed by referring to both the occluded and the occluding targets. 2) Prior: A Markov random field (MRF) is used to model the occlusion prior such that less likely "circular" or "cascading" types of occlusions have lower prior probabilities. Both the occlusion prior and the motion prior take the state of occlusion into consideration. 3) Optimization: A real-time RJMCMC-based algorithm with a new move type called "occlusion state update" is presented. Experimental results show that the proposed framework handles occlusions well, including long-duration full occlusions, which may cause tracking failures in traditional methods.
Change detection is now regarded as a valuable tool for urban planning and design. The main concern of this paper is the non-stationary character of multi-temporal time series. To address it, we propose an adaptive multiplicative decomposition of non-stationary multi-temporal satellite images, which decomposes the series into three components (trend, seasonal, and random) to properly model the evolution of land cover. We carried out several experiments to validate our approach on Landsat images covering the region of “Tres Cantos-Madrid” in Spain. The results show the effectiveness of the proposed method compared with some conventional methods.
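As a rough illustration of what a multiplicative decomposition into trend, seasonal, and random components looks like, the sketch below applies the classical moving-average scheme to a one-dimensional series, for example the values of one pixel over time. It is not the adaptive method proposed in the paper; the function name `multiplicative_decompose`, the fixed `period`, and the assumption of strictly positive values are all hypothetical choices made for this example.

```python
import numpy as np


def multiplicative_decompose(series, period):
    """Classical decomposition x = trend * seasonal * random (illustrative only).

    Assumes strictly positive values and at least two full periods of data.
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    # Centered moving average; even periods use half weights at both ends.
    if period % 2 == 0:
        kernel = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        kernel = np.ones(period) / period
    half = len(kernel) // 2
    trend = np.full(n, np.nan)  # undefined near the series boundaries
    trend[half:n - half] = np.convolve(x, kernel, mode="valid")
    # Average the detrended ratios per phase, then normalize to mean 1.
    detrended = x / trend
    index = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    index /= index.mean()
    seasonal = np.tile(index, n // period + 1)[:n]
    random = x / (trend * seasonal)
    return trend, seasonal, random


if __name__ == "__main__":
    pattern = np.array([0.8, 1.2, 1.0, 1.0])
    x = 10.0 * np.tile(pattern, 6)
    trend, seasonal, random = multiplicative_decompose(x, period=4)
    print(seasonal[:4])
```

The trend is left as NaN at the first and last `period // 2` samples, where the centered window does not fit; an adaptive scheme for non-stationary data would need to replace the fixed window and the fixed seasonal index.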