Many off-line handwritten word recognition systems have been proposed since the early nineties. Most systems reported high recognition rates; however, they overlooked a very important factor in the process; speed facto...
Co-articulation is one of the main reasons speech recognition is difficult. However, traditional hidden Markov models (HMMs) cannot model co-articulation because they depend on the first-order assumption. To model co-articulation, this paper proposes an HMM that improves on the traditional first-order HMM, building on the authors' previous work (1997, 1998), and gives a method for using this HMM in continuous speech recognition by means of multilayer perceptrons (MLPs), i.e. a hybrid HMM/MLP method with a triple-MLP structure. Experimental results show that the new hybrid HMM/MLP method reduces the error rate compared with the authors' previous work.
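For reference, the first-order assumption mentioned above is usually written as follows for a standard HMM. The symbols q_t (hidden state at time t) and o_t (observation at time t) are notation introduced here, not taken from the paper. The abstract does not give the form of the proposed model, so only the traditional assumptions are shown.

```latex
% First-order (Markov) assumption on the hidden state sequence q_1, ..., q_T
P(q_t \mid q_{t-1}, q_{t-2}, \ldots, q_1) = P(q_t \mid q_{t-1})

% Output-independence assumption on the observations o_1, ..., o_T
P(o_t \mid q_1, \ldots, q_T,\; o_1, \ldots, o_{t-1}) = P(o_t \mid q_t)
```

Under these assumptions each observation depends only on the current state, so context effects that span neighbouring sounds, such as co-articulation, are not captured directly.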
This paper proposes a genetic algorithm for reconstructing the surfaces of three-dimensional (3-D) objects from a group of contours representing their cross-sections. The algorithm optimizes the triangulation of the object surface using a multi-objective optimization function, so that it can meet the needs of a wide range of applications. A new crossover operator for triangulation and a new 3-D quadrilateral mutation operator are also introduced.
This paper presents the overall scheme of an indexing system for broadcast video designed for very large databases. We discuss features that can be extracted from a video sequence for use in "queries by example". We present examples of semantic features (query by content) such as text reading, face localisation and classification. As this study is still in its infancy, we point out the key features and present some preliminary results.
We study the problem of representing images within a multimedia Database Management System (DBMS), in order to support fast retrieval operations without compromising storage efficiency. To achieve this goal, we propose new image coding techniques which combine a wavelet representation, embedded coding of the wavelet coefficients, and segmentation of image-domain regions in the wavelet domain. A bitstream is generated in which each image region is encoded independently of other regions, without having to explicitly store information describing the regions. Simulation results show that our proposed algorithms achieve coding performance which compares favorably, both perceptually and objectively, to that achieved using state-of-the-art image/video coding techniques while additionally providing region-based support.
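As a rough illustration of the wavelet-representation step only, the sketch below applies one level of an orthonormal Haar transform to a 1-D signal and inverts it. The choice of the Haar wavelet and the function names `haar_forward` and `haar_inverse` are assumptions made here for brevity; the abstract does not say which wavelet is used, and the embedded coding and region segmentation stages are not shown.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (approx, detail): scaled pairwise sums and differences.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, rebuilding the original samples pair by pair."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


if __name__ == "__main__":
    samples = [4.0, 4.0, 2.0, 6.0]
    approx, detail = haar_forward(samples)
    print(approx, detail)
    print(haar_inverse(approx, detail))
```

Smooth stretches of the signal produce detail coefficients at or near zero, which is what makes wavelet coefficients attractive for compact coding.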
Individual cues from visual modules are fallible and often ambiguous. As a result, only integrated vision systems can be expected to perform reliably in practice. The design of such systems is challenging since each vision module works under different and possibly conflicting sets of assumptions. We have proposed and implemented a multiresolution system which integrates perceptual grouping, segmentation, stereo, shape from shading, and line labelling modules. The output of the integrated system is shown to be relatively insensitive to the constraints imposed by the individual modules.
We investigate the performance of selected texture models for the purpose of land use classification. The texture models are evaluated based on the resulting classification error rates. Three classes of texture models...
Due to the numerous applications of boundary maps and occlusion orientation maps (ORI-maps) in high-level vision problems, accurate estimation of these maps is a crucial task. The existing deep networks employ a single-stream network to estimate the relation between boundary map and ORI-map estimation. However, these networks fail to explore significant individual information separately. To resolve this problem, in this paper, we propose a novel two-stream generative adversarial network (GAN) for boundary map and ORI-map estimation, named OBP-GAN. The proposed OBP-GAN consists of two streams known as BP-GAN and OR-GAN. The BP-GAN estimates the boundary map, and the OR-GAN predicts the ORI-map. The boundary and ORI-map can also be useful cues for the task of depth-map refinement from single images. Therefore, in this work, we propose a transformer-based depth-map refinement network (TRANSDMR-GAN) for refining the depth estimated from monocular images using boundary and ORI-map. We conducted extensive analyses on indoor and outdoor datasets to validate our proposed OBP-GAN and TRANSDMR-GAN. The extensive experimental analysis and ablation study demonstrate the ability of the proposed OBP-GAN to generate state-of-the-art occlusion boundary maps. Furthermore, we show that the proposed network, TRANSDMR-GAN, can generate an edge-enhanced depth map without degrading the accuracy of the initial depth map.
ISBN (digital): 9783540393931
ISBN (print): 9783540393924
We would like to welcome you to the proceedings of MRCS 2006, Workshop on Multimedia Content Representation, Classification and Security, held September 11–13, 2006, in Istanbul, Turkey. The goal of MRCS 2006 was to provide an erudite but friendly forum where academic and industrial researchers could interact, discuss emerging multimedia techniques and assess the significance of content representation and security techniques within their problem domains. We received more than 190 submissions from 30 countries. All papers were subjected to thorough peer review. The final decisions were based on the criticisms and recommendations of the reviewers and the relevance of papers to the goals of the conference. Only 52% of the papers submitted were accepted for inclusion in the program. In addition to the contributed papers, four distinguished researchers agreed to deliver keynote speeches, namely Ed Delp on multimedia security, Pierre Moulin on data hiding, John Smith on multimedia content-based indexing and search, and Mário A. T. Figueiredo on semi-supervised learning.