Multimedia information plays an increasingly important role in humans' daily activities. Given a set of web multimedia objects (images with corresponding texts), a challenging problem is how to group these images into several clusters using the available information. Previous research focuses on either using one modality alone or simply combining image and text information for clustering. In this paper, we propose a novel approach (Dynamic Weighted Clustering) to separate images under the "supervision" of text descriptions. We also provide a comparative experimental investigation of using text and image information to tackle web image clustering. Empirical experiments on a manually collected dataset of web multimedia objects (related to post-disaster events) demonstrate the efficacy of the proposed method.
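As a rough illustration of weighting two modalities during clustering (not the paper's actual Dynamic Weighted Clustering algorithm), the following Python sketch fuses image and text feature vectors and re-weights each modality by its within-cluster scatter; the feature matrices and the weight-update rule are assumptions.

# Illustrative sketch only: weighted fusion of image and text features for k-means,
# with modality weights updated from within-cluster scatter after each pass.
import numpy as np
from sklearn.cluster import KMeans

def dynamic_weighted_clustering(img_feats, txt_feats, k, iters=5):
    """img_feats, txt_feats: (n_samples, d_img) and (n_samples, d_txt) arrays."""
    w_img, w_txt = 0.5, 0.5                      # initial modality weights
    for _ in range(iters):
        fused = np.hstack([w_img * img_feats, w_txt * txt_feats])
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(fused)
        # within-cluster scatter per modality (smaller scatter -> higher weight)
        def scatter(x):
            return sum(((x[labels == c] - x[labels == c].mean(0)) ** 2).sum()
                       for c in range(k))
        s_img = scatter(img_feats) + 1e-9
        s_txt = scatter(txt_feats) + 1e-9
        total = 1 / s_img + 1 / s_txt
        w_img, w_txt = (1 / s_img) / total, (1 / s_txt) / total
    return labels, (w_img, w_txt)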
ISBN (Print): 9781457713507
This paper presents an objective comparative evaluation of layout analysis methods for scanned historical documents. It describes the competition (modus operandi, dataset and evaluation methodology) held in the context of ICDAR2011 and the International Workshop on Historical Document Imaging and Processing (HIP2011), presenting the results of the evaluation of four submitted methods. A commercial state-of-the-art system is also evaluated for comparison. Two scenarios are reported in this paper, one evaluating the ability of methods to accurately segment regions and the other evaluating the whole pipeline of segmentation and region classification (with a text extraction goal). The results indicate that there is a convergence to a certain methodology with some variations in the approach. However, there is still a considerable need to develop robust methods that deal with the idiosyncrasies of historical documents.
Automatic speech recognition is carried out using Mel-frequency cepstral coefficients (MFCC). Linearly spaced filters at low frequencies and logarithmically spaced filters at higher frequencies are used to capture the characteristics of speech. Multi-layer perceptrons (MLP) approximate continuous and non-linear functions. High-dimensional patterns are not permitted because of the eigen-decomposition in high-dimensional image space and the degeneration of scatter matrices when the sample size is small. Generalization, dimensionality reduction and margin maximization are controlled by minimizing the weight vectors. Results show good pattern recognition by the SVM algorithm with a Mercer kernel.
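A minimal Python sketch of the MFCC pipeline mentioned above: mel-spaced triangular filters (roughly linear at low frequencies, logarithmic at high ones) applied to a power spectrum, followed by a log and a DCT. The sampling rate, FFT size and filter counts are illustrative assumptions, not values from the paper.

# Minimal MFCC sketch: mel filterbank energies -> log -> DCT.
import numpy as np
from scipy.fft import dct

def hz_to_mel(f):  return 2595.0 * np.log10(1.0 + f / 700.0)
def mel_to_hz(m):  return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(frame, sr=16000, n_fft=512, n_filters=26, n_ceps=13):
    power = np.abs(np.fft.rfft(frame, n_fft)) ** 2
    # filter centre frequencies equally spaced on the mel scale
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):                 # triangular filters
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_energy = np.log(fbank @ power + 1e-10)
    return dct(log_energy, type=2, norm='ortho')[:n_ceps]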
In order to model older people's behaviour in the home we must first understand it. In this paper we examine data from eight purpose-built aware homes over a six-month period, looking at presence in rooms to try to determine patterns amongst the older residents. We look for homes that have similar movement patterns using cluster analysis. We also examine how movement over days clusters within individual homes. Our analysis begins to show the possibilities of distinguishing between residents in their homes based on patterns of movement.
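A minimal sketch, assuming presence data arrives as (home_id, room, timestamp) events, of how homes could be grouped by the similarity of their room-occupancy profiles; hierarchical clustering is used here purely as an example and the study's actual analysis may differ.

# Sketch: summarise each home as a room-by-hour occupancy profile, then cluster.
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_homes(events: pd.DataFrame, n_clusters=3):
    """events: columns home_id, room, timestamp (one row per presence event)."""
    events = events.assign(hour=pd.to_datetime(events["timestamp"]).dt.hour)
    # fraction of a home's events falling in each (room, hour) cell
    profile = (events.groupby(["home_id", "room", "hour"]).size()
                     .unstack(["room", "hour"], fill_value=0))
    profile = profile.div(profile.sum(axis=1), axis=0)
    labels = fcluster(linkage(profile.values, method="ward"),
                      t=n_clusters, criterion="maxclust")
    return dict(zip(profile.index, labels))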
ISBN (Print): 9781457713507
Historical documents frequently suffer from arbitrary geometric distortions (warping and folds) due to storage conditions, use and, to some extent, the printing process of the time. In addition, page curl can be prominent due to the scanning technique used. Such distortions adversely affect OCR and print-on-demand quality. Previous approaches to geometric restoration either focus only on the correction of page curl or require supplementary information obtained by additional scanning hardware, which is not practical for existing scans. This paper presents a new approach to detect and restore arbitrary warping and folds, in addition to page curl. Warped text lines and the smooth deformation between them are precisely modelled as primary and secondary flow lines that are then restored to their original linear shape. Preliminary, but representative, experimental results, in comparison to a leading page curl removal method and an industry-standard commercial system, demonstrate the effectiveness of the proposed method.
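A much-simplified sketch of the flow-line idea: fit a smooth curve to a sampled warped baseline and shift image columns so that the line becomes straight. The interpolation between primary and secondary flow lines described in the paper is omitted, and the baseline samples are assumed to be given by an earlier detection step.

# Simplified dewarping sketch for a single warped text line.
import numpy as np

def straighten_line(img, baseline_xy, degree=3):
    """img: 2-D grayscale array; baseline_xy: (N, 2) array of (x, y) samples."""
    x, y = baseline_xy[:, 0], baseline_xy[:, 1]
    coeffs = np.polyfit(x, y, degree)            # smooth model of the warped baseline
    cols = np.arange(img.shape[1])
    fitted = np.polyval(coeffs, cols)
    shift = np.round(fitted - fitted.mean()).astype(int)   # per-column correction
    out = np.full_like(img, img.max())           # assume a light background
    for c in cols:                               # move each column up/down
        src = np.arange(img.shape[0])
        dst = src - shift[c]
        keep = (dst >= 0) & (dst < img.shape[0])
        out[dst[keep], c] = img[src[keep], c]
    return out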
Robust human pose estimation from visual observations has attracted much attention over the past two decades. However, the problem remains challenging because, in real-world applications, observations are often corrupted by partial occlusion, noise, or both. In this paper, we propose to estimate human pose using robust silhouette matching in the original rectangle-coordinate space. In addition, a human action model is employed to determine reasonable matching results. Experimental results on the robustness sequences of the Weizmann dataset show that the proposed approach estimates human pose robustly and reasonably when pose observations are corrupted by partial occlusion or noise.
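A minimal sketch of silhouette matching against pose templates using a truncated chamfer distance, where truncation limits the influence of occluded or noisy pixels; this is a generic illustration, not the paper's matching scheme or its action model.

# Sketch: score pose templates against an observed silhouette by truncated chamfer distance.
import numpy as np
from scipy.ndimage import distance_transform_edt

def match_pose(observed, templates, trunc=10.0):
    """observed: binary silhouette (H, W); templates: list of binary (H, W) masks."""
    # distance from every pixel to the nearest foreground pixel of the observation
    dist = distance_transform_edt(~observed.astype(bool))
    scores = []
    for tmpl in templates:
        pts = tmpl.astype(bool)
        d = np.minimum(dist[pts], trunc)          # truncation adds robustness
        scores.append(d.mean() if pts.any() else np.inf)
    return int(np.argmin(scores)), scores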
ISBN (Print): 9781457705380
In this paper we deal with the problem of feature selection by introducing a new approach based on the Gravitational Search Algorithm (GSA). The proposed algorithm combines the optimization behavior of GSA with the speed of the Optimum-Path Forest (OPF) classifier in order to provide a fast and accurate framework for feature selection. Experiments on datasets obtained from a wide range of applications, such as vowel recognition, image classification and fraud detection in power distribution systems, are conducted in order to assess the robustness of the proposed technique against Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and a Particle Swarm Optimization (PSO)-based algorithm for feature selection.
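A simplified wrapper-style sketch in the spirit of GSA-based feature selection: agents hold continuous positions squashed into binary feature masks, fitness-derived masses attract agents towards better solutions, and a k-NN classifier stands in for the OPF classifier used in the paper. All parameters and the update rule are simplified assumptions.

# Simplified GSA-flavoured feature selection (illustrative, not the paper's algorithm).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def fitness(mask, X, y):
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(3), X[:, mask], y, cv=3).mean()

def gsa_feature_selection(X, y, n_agents=10, iters=20, g0=1.0, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pos = rng.uniform(-1, 1, (n_agents, d))
    vel = np.zeros_like(pos)
    best_mask, best_fit = None, -1.0
    for t in range(iters):
        masks = rng.random((n_agents, d)) < 1 / (1 + np.exp(-pos))   # sigmoid -> binary
        fits = np.array([fitness(m, X, y) for m in masks])
        if fits.max() > best_fit:
            best_fit, best_mask = fits.max(), masks[fits.argmax()].copy()
        # masses proportional to normalised fitness; gravity decays over time
        m = (fits - fits.min()) / (fits.max() - fits.min() + 1e-12)
        m = m / (m.sum() + 1e-12)
        g = g0 * np.exp(-2.0 * t / iters)
        acc = np.zeros_like(pos)
        for i in range(n_agents):
            for j in range(n_agents):
                if i == j:
                    continue
                diff = pos[j] - pos[i]
                r = np.linalg.norm(diff) + 1e-12
                acc[i] += rng.random() * g * m[j] * diff / r
        vel = rng.random((n_agents, d)) * vel + acc
        pos = pos + vel
    return best_mask, best_fit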
Current OCR engines cannot segment words and characters in video images due to complex backgrounds and the low resolution of video frames. To achieve better accuracy, this paper presents a new gradient-based method for word and character segmentation from text lines of any orientation in video frames for recognition. We propose a Max-Min clustering concept to obtain the text cluster from the normalized absolute gradient feature matrix of the video text line image. Union of the text cluster with the Canny edge output of the input video text line is proposed to restore missing text candidates. Then a run-length algorithm is applied to the text candidate image to identify word gaps. We propose a new idea for segmenting characters from the restored word image based on the fact that the text height difference at a character boundary column is smaller than that of the other columns of the word image. We have conducted experiments on a large dataset at two levels (word and character level) in terms of recall, precision and f-measure. Our experimental setup involves 3527 characters of English and Chinese, selected from the TRECVID 2005 and 2006 databases.
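A minimal sketch of the run-length step for word-gap identification: in a binarised, restored text-line image, word gaps are taken as runs of ink-free columns longer than a threshold. The threshold is an assumption; the paper's exact rule may differ.

# Sketch: find word gaps as long runs of empty columns in a binarised text line.
import numpy as np

def word_gaps(binary_line, min_gap=8):
    """binary_line: 2-D array, nonzero where text candidates were restored."""
    empty = (binary_line.sum(axis=0) == 0).astype(int)   # 1 where a column has no ink
    gaps, start = [], None
    for x, e in enumerate(empty):
        if e and start is None:
            start = x
        elif not e and start is not None:
            if x - start >= min_gap:
                gaps.append((start, x))                  # [first, last+1) gap columns
            start = None
    if start is not None and len(empty) - start >= min_gap:
        gaps.append((start, len(empty)))
    return gaps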
Activity modelling and discovery play a critical role in smart-home-based assisted living. Existing approaches to pattern recognition using data-intensive analysis suffer from various drawbacks. To address these shortcomings, this paper introduces a novel ontology-based approach to activity modelling, activity discovery and evolution. In this approach, activity modelling is undertaken through ontological engineering by leveraging domain knowledge and heuristics. The generated activity models evolve from the initial "seed" activity models through continuous activity discovery and learning. Activity discovery is performed through ontological reasoning. The paper describes the approach in the context of smart homes, with special emphasis placed on the activity discovery algorithms and the evolution mechanism. The approach has been implemented in a feature-rich assistive living system in which new daily activities can be detected and further used to evolve the underlying activity models.
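A deliberately simplified, non-ontological sketch of the seed-model / discovery / evolution loop: activity models are plain sets of sensor events here, whereas the paper encodes them as ontology classes and performs discovery through ontological reasoning. The model names, sensor ids and similarity threshold are assumptions for illustration.

# Sketch: recognise an observed sensor window against seed models, or discover a new activity.
def recognise_or_discover(seed_models, observed_window, threshold=0.75):
    """seed_models: dict name -> set of sensor ids; observed_window: set of sensor ids."""
    best_name, best_score = None, 0.0
    for name, required in seed_models.items():
        score = len(required & observed_window) / len(required | observed_window)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name                                   # recognised activity
    # no model explains the observation well: treat it as a discovered activity
    new_name = f"discovered_{len(seed_models)}"
    seed_models[new_name] = set(observed_window)           # evolve the model set
    return new_name

# usage (hypothetical sensor ids)
models = {"make_tea": {"kettle", "cup", "tea_jar"}, "watch_tv": {"tv", "sofa_sensor"}}
print(recognise_or_discover(models, {"kettle", "cup", "tea_jar", "fridge"}))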
SVMs with the general-purpose RBF kernel are widely considered state-of-the-art supervised learning algorithms due to their effectiveness and versatility. However, in practice, SVMs often require more training data than is readily available. Prior knowledge may be available to compensate for this shortcoming, provided such knowledge can be effectively passed on to SVMs. In this paper, we propose a method for the incorporation of prior knowledge via an adaptation of the standard RBF kernel. Our practical and computationally simple approach allows prior knowledge in a variety of forms, ranging from regions of the input space as crisp or fuzzy sets to pseudo-periodicity. We show that this method is effective and that the amount of required training data can be substantially decreased, opening the way for new usages of SVMs. We validate our approach on pattern recognition and classification tasks with publicly available datasets from different application domains.
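A hedged illustration of one way prior knowledge about a region of the input space could be folded into an RBF kernel (not the paper's exact adaptation): the standard RBF kernel is scaled by agreement on a fuzzy membership function, and since the product of two RBF-type terms is still a valid kernel, the result can be passed to scikit-learn's SVC as a callable.

# Illustrative prior-knowledge kernel: RBF scaled by agreement on a fuzzy region membership.
import numpy as np
from sklearn.svm import SVC

def fuzzy_membership(X, centre, width):
    """Assumed prior: a Gaussian-shaped fuzzy region around `centre`."""
    return np.exp(-np.sum((X - centre) ** 2, axis=1) / (2 * width ** 2))

def make_prior_rbf_kernel(gamma, beta, centre, width):
    def kernel(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        mu_a = fuzzy_membership(A, centre, width)[:, None]
        mu_b = fuzzy_membership(B, centre, width)[None, :]
        return np.exp(-gamma * sq) * np.exp(-beta * (mu_a - mu_b) ** 2)
    return kernel

# usage: clf = SVC(kernel=make_prior_rbf_kernel(0.5, 2.0, centre=np.zeros(4), width=1.0))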