Purpose - The need for tools for content analysis, information extraction and retrieval of multimedia objects in their native form is strongly felt in the judicial domain: digital videos are a fundamental source of information about events occurring during judicial proceedings, and they should be stored, organized and retrieved quickly and at low cost. This paper seeks to address these issues. Design/methodology/approach - In this context the JUMAS system, which stems from the homonymous European project (***), takes up the challenge of exploiting semantics and machine learning techniques to improve the usability of multimedia judicial folders. Findings - This paper describes one of the most challenging issues addressed by the JUMAS project: extracting meaningful abstracts of judicial debates in order to access salient content efficiently. In particular, the authors present an ontology-enhanced multimedia summarization environment able to derive a synthetic representation of judicial media content with limited loss of meaningful information while overcoming the information overload problem. Originality/value - The adoption of ontology-based query expansion has made it possible to improve the performance of multimedia summarization algorithms with respect to traditional statistics-based approaches. The effectiveness of the proposed approach has been evaluated on real media content, highlighting good potential for extracting key events in the challenging area of judicial proceedings.
ISBN: 9780819484185 (print)
In this paper, we present a method for video category classification using only social metadata from websites like YouTube. In place of content analysis, we utilize the communicative and social contexts surrounding videos as a means to determine a categorical genre, e.g., Comedy or Music. We hypothesize that video clips belonging to different genre categories have distinct signatures and patterns that are reflected in their collected metadata. In particular, we define and describe social metadata as usage or action to aid in classification. We trained a Naive Bayes classifier to predict categories from a sample of 1,740 YouTube videos representing the top five genre categories. Using just a small number of the available metadata features, we compare the classifications produced by our Naive Bayes classifier with those provided by the uploader of each video. Compared to random predictions with the YouTube data (21% accurate), our classifier attained a mediocre 33% accuracy in predicting video genres. However, we found that the accuracy of our classifier improves significantly with nominal factoring of the explicit data features. By factoring the ratings of the videos in the dataset, the classifier was able to accurately predict the genres of 75% of the videos. We argue that the patterns of social activity found in the metadata are not just meaningful in their own right, but are indicative of the meaning of the shared video content. The results presented by this project represent a first step in investigating the potential meaning and significance of social metadata and its relation to the media experience.
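The classification step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `CategoricalNaiveBayes`, the helper `rating_bucket`, its 0-5 rating scale and cut-offs, the "many comments" feature, and the toy records are all assumptions made for illustration. The sketch shows a Naive Bayes classifier over nominal features, with a numeric rating first factored into a small set of nominal levels.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over discrete (nominal) features with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.n_features = len(rows[0])
        # value_counts[label][feature_index][value] -> count
        self.value_counts = defaultdict(lambda: defaultdict(Counter))
        # Distinct values seen per feature, used in the smoothing denominator.
        self.vocab = [set() for _ in range(self.n_features)]
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[label][i][value] += 1
                self.vocab[i].add(value)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total)
            for i, value in enumerate(row):
                seen = self.value_counts[label][i][value]
                denom = count + self.alpha * len(self.vocab[i])
                score += math.log((seen + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def rating_bucket(rating):
    """Map a numeric rating (assumed 0-5 scale) to a nominal level (hypothetical cut-offs)."""
    if rating < 2.5:
        return "low"
    if rating < 4.0:
        return "mid"
    return "high"


# Made-up training records: (average rating, has many comments?) -> genre.
train = [
    ((4.8, True), "Music"),
    ((4.5, True), "Music"),
    ((4.6, False), "Music"),
    ((3.1, True), "Comedy"),
    ((2.9, True), "Comedy"),
    ((3.5, False), "Comedy"),
]
rows = [(rating_bucket(r), c) for (r, c), _ in train]
labels = [genre for _, genre in train]

model = CategoricalNaiveBayes().fit(rows, labels)
prediction = model.predict((rating_bucket(4.7), True))
print(prediction)
```

Bucketing the rating before training is one plausible reading of "nominal factoring"; the abstract does not specify which levels or thresholds were used.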
ISBN: 9783319487434; 9783319487427 (print)
Today multimedia content comprising both text and images is growing at a rapid pace. There has been a body of work on summarizing text content, but to the best of our knowledge, no method has been developed to summarize multimedia content. We propose two methods for summarizing multimedia content. Our novel approach explicitly recognizes two desirable, normative characteristics of a summary: good coverage and diversity of the respective text and images, and coherence between the text and the images. Two methods are examined: a graph-based method and a modification of the submodular approach. Moreover, we propose a metric to measure the quality of a multimedia summary, which captures coverage and diversity of text and images as well as coherence between the text and images in the summary. We experimentally demonstrate that the proposed methods achieve good-quality multimedia summaries.
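To make the submodular idea concrete, here is a minimal sketch of greedy selection under a facility-location coverage objective, a standard monotone submodular function. It is not the modified objective proposed in the paper: the text-image coherence term is omitted, and the function names and the similarity matrix are hypothetical.

```python
def facility_location(selected, sim):
    """Coverage score: each item is credited with its best match in the summary."""
    if not selected:
        return 0.0
    return sum(max(sim[i][j] for j in selected) for i in range(len(sim)))


def greedy_summary(sim, budget):
    """Pick up to `budget` items, each time adding the item with the largest marginal gain."""
    selected = []
    candidates = set(range(len(sim)))
    while candidates and len(selected) < budget:
        current = facility_location(selected, sim)
        best = max(
            sorted(candidates),  # sorted for a deterministic scan order
            key=lambda j: facility_location(selected + [j], sim) - current,
        )
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical pairwise similarities among five items (sentences or images).
# Items 0-2 form one cluster and items 3-4 another.
similarity = [
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.7, 0.0, 0.1],
    [0.8, 0.7, 1.0, 0.1, 0.1],
    [0.1, 0.0, 0.1, 1.0, 0.9],
    [0.0, 0.1, 0.1, 0.9, 1.0],
]
summary = greedy_summary(similarity, budget=2)
print(summary)
```

With this toy matrix the greedy pass picks item 0 first and then an item from the second cluster, because a second item from the first cluster adds little coverage. Diversity here emerges from diminishing returns rather than from an explicit penalty.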
Authors: Ciocca, G.; Cusano, C.; Schettini, R.
DISCo, Dipartimento di Informatica, Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Viale Sarca 336, 20126 Milano, Italy
Although traditional content-based retrieval systems have been successfully employed in many multimedia applications, the need for explicit association of higher concepts to images has been a pressing demand from user...
ISBN: 0819426628 (print)
This paper addresses the segmentation of a video sequence into shots, the specification of edit effects, and the subsequent characterization of shots in terms of color and motion content. The proposed scheme uses DC images extracted from MPEG compressed video and performs unsupervised clustering to extract camera shots. The specification of edit effects, such as fade-in/out and dissolve, is based on an analysis of the distribution of the mean values of the luminance components. This step is followed by the representation of the visual content of temporal segments in terms of key frames selected by similarity analysis of mean color histograms. To characterize similar temporal segments, motion and color characteristics are classified into different categories using a set of features derived from the motion vectors of triangular meshes and the mean histograms of video shots.
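One plausible reading of key-frame selection by similarity to a mean color histogram can be sketched as follows. The abstract does not give the actual similarity measure or selection rule, so the L1 distance, the function names, and the toy three-bin histograms are assumptions.

```python
def mean_histogram(histograms):
    """Bin-wise average of the per-frame histograms of one shot."""
    n = len(histograms)
    bins = len(histograms[0])
    return [sum(h[b] for h in histograms) / n for b in range(bins)]


def l1_distance(a, b):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_key_frame(histograms):
    """Return the index of the frame whose histogram is closest to the shot mean."""
    mean = mean_histogram(histograms)
    return min(
        range(len(histograms)),
        key=lambda i: l1_distance(histograms[i], mean),
    )


# Hypothetical normalized 3-bin color histograms for the four frames of one shot.
shot = [
    [0.70, 0.20, 0.10],
    [0.50, 0.30, 0.20],
    [0.45, 0.35, 0.20],
    [0.20, 0.30, 0.50],
]
key_index = select_key_frame(shot)
print(key_index)
```

Choosing the frame nearest the mean histogram favors a representative frame over an outlier such as the first or last frame of a transition.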
This dissertation presents a solution to problems arising from the demand for fast information access and sharing in real-time multimedia transmission over the Internet. Our solution exploits software agents that are placed throughout the network environment. These hierarchical video analysis agents process multimedia streams in real time, and automatically decompose and interpret the multimedia content so as to facilitate information access and sharing. Multimedia content contains both perceptual content, such as color, motion, or acoustic features, and conceptual content, which is specified in terms of concepts or semantics that can be expressed by text descriptions. Both types of content are embedded simultaneously in multimedia streams and are usually complementary to each other. This dissertation adaptively analyzes both kinds of video content by combining mixed media cues from audio, video and text. First, a high-performance module for on-line video segmentation based on scene-change detection is developed. It serves as the first step of any video stream construction and analysis. To meet the high computational demand, our proposed video scene-change detection algorithms are very efficient while maintaining high accuracy and recall rates for fast on-line video analysis. Second, the perceptual features of audio and video data are analyzed in a bottom-up manner and integrated so as to discriminate effectively among the different events in any video stream. An efficient decision-tree learning algorithm is used to induce a set of if-then rules that link perceptual features with the conceptual semantic content of the video. These rules not only serve as a video classifier, but also guide on-line real-time video/audio feature extraction and data redistribution. A novel knowledge-based system, in which knowledge is stored as learned rules, is proposed to serve as a video semantic inference/classification engine. Third, we propose a hierarchical video categorizatio
ISBN: 0780388348 (print)
Many cross-layer design approaches for wireless multimedia transmission over various networks have been proposed. Truly cross-layer design solutions require joint optimization of multimedia coding, transport/network layer, medium access layer, and physical layer transmission strategies and protocols. Such a solution often becomes impractical because: i) its computational complexity is high, and ii) it needs to be recomputed whenever some inputs, such as the content characteristics or network conditions, change. To this end, we propose a novel framework for cross-layer design with three features: i) the joint design strategy selection adapts not only to network conditions but also to multimedia content characteristics, ii) the joint design strategy selection is based on off-line learning to address the computational complexity issue, and iii) the off-line learning stage employs a new multiple-objective optimization solution.
ISBN: 9780819479334 (print)
Although many XML-based document formats are available for printing or publishing on the Internet, none of them is well designed to support both high-quality printing and web publishing. To address this problem, we propose in this paper a novel XML-based document format for web publishing, called CEBX. The proposed format is a fixed-layout document format with printing quality, whose document content organization, physical structure and protection scheme are optimized to support web publishing. CEBX documents have four noteworthy features: (1) CEBX provides the original fixed layout through graphic units for high-quality rendering. (2) The content of a CEBX document can be reflowed to fit the display device based on the content blocks and additional fluid information. (3) The XML Document Archiving model (XDA), the packaging model used in CEBX, supports document linearization and incremental editing well. (4) By introducing a segment-based content protection scheme into CEBX, some parts of a document can be previewed directly while the remaining parts are protected effectively, so that readers only need to purchase the parts of a book that they are interested in. This is very helpful for document distribution and supports flexible business models such as try-before-buy, on-demand reading, and superdistribution.
The article focuses on digital video analysis and recognition. Digital video media analysis and recognition (DVMAR) has become an active research topic in the multimedia systems and computer vision areas. On the one hand, progress in computer and communication technologies has created strong demand for many applications of digital video in a wide variety of areas, many of which require managing video in a database or information system environment; on the other hand, indexing and retrieval schemes for traditional databases and video manipulation tools in current television systems cannot manage video in an effective, interactive, and content-based manner. The goal of DVMAR is to develop algorithms, tools, and systems to extract and analyze the basic elements, features and structures of video so as to make content-based access and transmission of video data feasible and more effective. In general, video media is considered to have the following basic structural elements: shots, scenes, sequences and segments.
Author: Lin, Xiaofan
Vobile Inc., 4699 Old Ironsides Drive, Ste. 430, Santa Clara, CA 95054, United States
Content-based image retrieval (CBIR) has been studied for nearly two decades since IBM's research on QBIC (Query by Image Content) [1]. In the past decade, another related but different area, video fingerprinting,...