ISBN: 9783642411892; 9783642411908 (print)
Historical spoken documents represent a unique segment of national cultural heritage. To open the large Czech Radio audio archive to the research community and the public, we have been developing a system that automatically transcribes the archive files, indexes them and makes them searchable. The transcription of contemporary (one or two decades old) documents is based on a lexicon and a statistical language model (LM) built from a large amount of recent texts available in electronic form. For older periods (before 1990), however, digital texts do not exist. We therefore needed a) to find resources that represent the language of those times, b) to convert them from their original form to text, c) to use this text to create epoch-specific lexicons and LMs, and finally d) to apply them in the speech recognition system under development. In our case, the main resources were scanned historical newspapers, shorthand notes from the national parliament and subtitles from retro TV programs. Once converted into text, they allowed us to build a more appropriate lexicon and to produce a preliminary version of the transcriptions. These transcriptions were reused for unsupervised retraining of the final LM. In this way, we significantly improved the accuracy of automatically transcribed radio news broadcast in the 1969-1989 era, from an initial 83% to 88%.
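To put the reported gain in perspective, the following minimal sketch converts the two accuracy figures into a relative error reduction. It assumes that the error rate is simply one minus the reported accuracy; the abstract does not name the exact metric, and the function name `relative_error_reduction` is hypothetical.

```python
def relative_error_reduction(acc_before: float, acc_after: float) -> float:
    """Fraction of the baseline errors removed when accuracy rises.

    Assumption: error rate = 1 - accuracy (the abstract does not state the metric).
    """
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    if err_before <= 0.0:
        raise ValueError("baseline accuracy must be below 1.0")
    return (err_before - err_after) / err_before


# Accuracy figures quoted in the abstract: 83% before, 88% after retraining.
reduction = relative_error_reduction(0.83, 0.88)
print(f"relative error reduction: {reduction:.1%}")
```

Under this assumption, the five-point accuracy gain corresponds to removing roughly 29% of the baseline errors (5/17).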
ISBN: 9781467361279 (print)
Human action recognition in video streams is a fast-developing field in pattern recognition and machine learning. Local image representations, e.g. space-time interest points [1], have proven to be currently the most reliable choice of feature in sequences in which the region of interest is difficult to determine [2]. However, the question of how to deal with more severe occlusions has largely been ignored [2]. This work proposes a new approach that directly addresses heavy occlusions by modeling skeleton-based features using a probability density function (PDF) defined over graphs. We integrated the proposed density into a hidden Markov model (HMM) to model sequences of graphs of arbitrary sizes, i.e. the occlusion setting may change over time. The approach is evaluated on a dataset comprising three action classes, studying six different types of occlusion (each involving the removal of subgraphs from the graphical representation of the action sequence). The study clearly shows that actions can be reliably recognized even from heavily occluded sequences.
ISBN: 9783642353253 (print)
The proceedings contain 59 papers. The special focus in this conference is on Rough Sets and Applications, Machine Learning in Pattern Recognition and Image Processing, Machine Learning in Multimedia Computing, Bioinformatics and Cheminformatics, Data Classification and Clustering, Cloud Computing and Recommender Systems, Case-Based Reasoning and Data Processing, Authentication, Digital Forensics and Plagiarism Detection. The topics include: rough sets-based machine learning over non-deterministic data; learning a table from a table with non-deterministic information; parameterised fuzzy Petri nets for approximate reasoning in decision support systems; rough sets-based rules generation approach; automatic color image segmentation based on illumination invariant and superpixelization; wavelet based statistical adapted local binary patterns for recognizing avatar faces; solving avatar captchas automatically; comparative analysis of image fusion techniques in remote sensing; density based fuzzy thresholding for image segmentation; subjectivity and sentiment analysis of Arabic; support vector machine approach for detecting events in video streams; study of feature categories for musical instrument recognition; a genetic-CBR approach for cross-document relationship identification; improved action recognition using an efficient boosting method; towards smart Egypt - the role of large scale WSNs; language for writing descriptors of outline shape of molecules; web service based approach for viral hepatitis ontology sharing and diagnosing; sampleboost for capsule endoscopy categorization and abnormality detection; advanced parallel genetic algorithm with gene matrix for global optimization; semi-possibilistic biclustering applied to discrete and continuous data; a comparative study of localization algorithms in WSNs; test cases automatic generator (TCAG); support vector machines with weighted powered kernels for data classification; an enhanced cloud-based view materialization approach for peer-to-
ISBN: 9789898425980 (print)
This paper presents a new method for facial expression modelling and recognition based on diffeomorphic image registration parameterised via stationary velocity fields in the Log-Euclidean framework. Validation and comparison are carried out using different statistical shape models (SSM) built using the Point Distribution Model (PDM), velocity fields, and deformation fields. The results show that a facial expression representation based on stationary velocity fields can be successfully used in facial expression recognition, and that this parameterisation yields a higher recognition rate than a representation based on deformation fields.
ISBN: 9789898425980 (print)
Outdoor urban scenes typically contain many planar surfaces, which are useful for tasks such as scene reconstruction, object recognition, and navigation, especially when only a single image is available. In such situations the lack of 3D information makes finding planes difficult; but motivated by how humans use their prior knowledge to interpret new scenes with ease, we develop a method which learns from a set of training examples in order to identify planar image regions and estimate their orientation. Because it does not rely explicitly on rectangular structures or the assumption of a 'Manhattan world', our method can generalise to a variety of outdoor environments. From only one image, our method reliably distinguishes planes from non-planes and estimates their orientation accurately; this is fast and efficient, with application to a real-time system in mind.
ISBN: 9789898425980 (print)
Object detection and localization is a challenging task. Among several approaches, hierarchical methods of feature-based object recognition have recently been developed and have demonstrated high-end performance. Inspired by knowledge about the architecture and function of the primate visual system, the computational HMAX model has been proposed. At the same time, robust visual object recognition has been proposed using feature distributions, e.g. histograms of oriented gradients (HOGs). Since both models build upon an edge representation of the input image, the question arises whether one kind of approach might be superior to the other. Introducing a new biologically inspired, attention-steered processing framework, we demonstrate that the combination of both approaches yields the best results.
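As an illustration of the feature-distribution idea mentioned above, the following minimal sketch computes a magnitude-weighted histogram of unsigned gradient orientations for a single image cell. It is not the paper's pipeline: a full HOG descriptor adds bin interpolation and block normalisation, both omitted here, and the function name `orientation_histogram` and the choice of 9 bins are assumptions made for this example.

```python
import math


def orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 degrees).

    `cell` is a list of rows of grey values; border pixels are skipped so that
    central differences are always defined.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# A vertical edge: dark left half, bright right half.
cell = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
hist = orientation_histogram(cell)
print(hist)
```

For this vertical edge, all gradient energy falls into the first bin (orientations near 0 degrees), which is the kind of edge-based evidence both HOG and HMAX-style models build on.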
In digital image forensics, camera model identification seeks to determine the source camera model of the images under investigation. To achieve this goal, one of the popular approaches is extracting from t...
ISBN: 9789898425980 (print)
Understanding images in terms of hierarchical and logical structures is crucial for many semantic tasks, including image retrieval, scene understanding and robot vision. This paper combines compositional hierarchies, qualitative spatial relations, relational instance-based learning and robust feature extraction in one framework. For each layer in the hierarchy, substructures in the images are detected, classified and then employed one layer up the hierarchy to obtain higher-level semantic structures by making use of qualitative spatial relations. The approach is applied to street view images. We employ a four-layer hierarchy in which corners, windows and doors, and individual houses are detected in succession.
ISBN: 9784990644109; 9781467322164 (print)
Automatic age estimation is the process of using a computer to predict a person's age from a given facial image. While this problem has numerous real-world applications, the high variability of aging patterns and the sparsity of available data present challenges for model training. Here, instead of training one global aging function, we train an individual function for each person using a multi-task learning approach so that the variety of human aging processes can be modelled. To deal with the sparsity of training data, we propose a similarity measure for clustering the aging functions. During the testing stage, which involves a new person with no data used for model training, we propose a feature-based similarity measure for characterizing the test case. We conduct simulation experiments on the FG-NET and MORPH databases and compare our method with other state-of-the-art methods.
ISBN: 9783037853443 (print)
To support date-based management of postmark information, this paper puts forward a machine-vision method for postmark date recognition that meets the needs of personal postmark collectors. Drawing on the theory of machine vision, image processing and pattern recognition, the paper describes the overall process from postmark image acquisition to date recognition. First, a threshold method generates a binary image from the smoothed postmark image, and the regions containing date digits are extracted from the binary image according to their region features. Next, digit regions that are connected or broken are processed with binary mathematical morphology to obtain individual digit regions for recognition. Finally, a support vector machine classifies the digits, completing the date recognition.
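The first stages described above (thresholding, then extracting candidate digit regions) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the fixed threshold of 128, the 4-connectivity rule and the helper names `binarize` and `connected_regions` are all assumptions, and the morphology and support vector machine stages are omitted.

```python
from collections import deque


def binarize(gray, threshold):
    """Mark dark (ink) pixels as 1 and light background pixels as 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_regions(binary):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width:
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy "postmark" with two dark strokes on a white background.
gray = [
    [255, 255, 255, 255, 255, 255, 255],
    [255,  20, 255, 255,  30,  30, 255],
    [255,  20, 255, 255,  30,  30, 255],
    [255, 255, 255, 255, 255, 255, 255],
]
boxes = connected_regions(binarize(gray, threshold=128))
print(boxes)
```

Each bounding box would then be cropped and passed to a classifier; touching or broken digits are the cases the abstract addresses with mathematical morphology before this step.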