ISBN: 0819442429 (print)
We present standards-compliant visible watermarking schemes for digital images and video in DCT-based compressed formats. The watermarked data is in the same compressed format as the original and can be viewed with standard tools and applications. Moreover, for most of the schemes presented, the watermarked data has exactly the same compressed size as the original. The watermark can be inserted and removed using a key for applications requiring content protection. The watermark application and removal algorithms are very efficient and exploit some features of compressed data formats (such as JPEG and MPEG) which allow most of the work to be done in the compressed domain.
The proceedings contain 40 papers from 2001 SPIE on Multimedia Systems and Applications. Topics discussed include: universal multimedia access; multimedia content management; selective retransmission protocols for multimedia on the Internet; adaptive multimedia service provisioning in wireless communication networks; issues in the design of multimedia user interfaces for distance learning; a multidifferential video coding algorithm for video conferencing applications; and fast encryption methods for audiovisual data confidentiality.
This dissertation presents a solution to problems arising from the demand for fast information access and sharing in real-time multimedia transmission over the Internet. Our solution exploits software agents that are placed throughout the network environment. These hierarchical video analysis agents process multimedia streams in real time, and automatically decompose and understand the multimedia content so as to facilitate information access and sharing. Multimedia content contains both perceptual content, such as color, motion, or acoustic features, and conceptual content, which is specified by concepts or semantics that can be expressed in text descriptions. Both types of content are embedded simultaneously in multimedia streams, and they are usually complementary to each other. This dissertation adaptively analyzes both kinds of video content by combining mixed media cues from audio, video and text. First, a high-performance module for on-line video segmentation based on scene-change detection is developed. It serves as the first step of any video stream construction and analysis. To meet the high computational demand of fast on-line video analysis, our proposed scene-change detection algorithms are very efficient while maintaining high accuracy and recall rates. Second, the perceptual features of audio and video data are analyzed in a bottom-up manner and integrated so as to discriminate effectively among the different events in any video stream. An efficient decision-tree learning algorithm is used to induce a set of if-then rules that link perceptual features with the conceptual semantic content of the video. These rules not only serve as a video classifier, but also guide on-line real-time video/audio feature extraction and data redistribution. A novel knowledge-based system, in which knowledge is stored as learned rules, is proposed to serve as a video semantic inference/classification engine. Third, we propose a hierarchical video categorizatio
The volume of multimedia data generated nowadays is exploding. To efficiently access and retrieve desired information, tools that enable automated content-based analysis are becoming indispensable. Multimedia content is defined at both perceptual and conceptual levels. The former refers to content characterized purely by intrinsic perceptual properties such as color, motion, or acoustic features. The latter refers to content that is specified by concepts or semantics such as sunset, anchors, or news headline stories. At both levels, the content is embedded in multiple forms that are usually complementary to each other. The main objective of this thesis is to adaptively analyze multimedia content by integrating cues from multiple modalities, including audio, video, and text, mainly in the scope of news broadcast. At the perceptual level, news broadcast data is segmented and classified into different video events such as news reporting and commercials. Audio and visual features are developed and integrated, aiming at discriminating different events effectively. Various classification mechanisms, including linear fuzzy threshold, maximum likelihood using Gaussian Mixture Model and Hidden Markov Model, Neural Network, and Support Vector Machine, are benchmarked. At the conceptual level, algorithms and demonstration systems for three applications are developed. In the News Broadcast Browsing System, recovery and presentation of the embedded hierarchical structure of news broadcast are addressed. Important semantic objects such as hosting characters and headline news stories are adaptively extracted using audio/visual models that are bootstrapped from on-line data. The problem of efficient search and retrieval of segmented multimedia objects based on audio is discussed in the Query-by-example in Audio System. A distance metric framework is proposed to determine the difference of mixture type Probability Density Functions, and is applied in measuring
ISBN: 0819439886 (print)
In semantic content-based image/video browsing and navigation systems, efficient mechanisms to represent and manage a large collection of digital images/videos are needed. Traditional keyword-based indexing describes the content of multimedia data through annotations such as text or keywords extracted manually by the user from a controlled vocabulary. This textual indexing technique lacks the flexibility to satisfy the various kinds of queries requested by database users and also requires a huge amount of work to update the information. Current content-based retrieval systems often extract a set of features such as color, texture, shape, motion, speed, and position from the raw multimedia data automatically and store them as content descriptors. This content-based metadata differs from text-based metadata in that it supports a wider variety of queries and can be extracted automatically, thus providing a promising approach for efficient database access and management. When the raw data volume grows very large, explicitly extracting the content information and storing it as metadata along with the images will improve querying performance, since metadata requires much less storage than the raw image data and is thus easier to manipulate. In this paper we maintain that storing metadata together with images will enable effective information management and efficient remote query. We also show, using a texture classification example, that this side information can be compressed while guaranteeing that the desired query accuracy is satisfied. We argue that the compact representation of the image contents not only significantly reduces the storage and transmission rate requirements, but also facilitates certain types of queries. Algorithms are developed for optimized compression of this texture feature metadata, given that the goal is to maximize the classification performance for a given rate budget.
ISBN: 0780370414 (print)
The new MPEG-4 Audio standard provides two toolsets for synthetic audio generation, audio processing and multimedia content description, called Structured Audio (SA) and BInary Format for Scenes (BIFS). Starting from a systematic analysis of SA and from the implementation of an efficient SA decoder, this paper describes the design of a virtual DSP architecture able to exploit the data-level parallelism contained in many typical audio processing algorithms. The proposed virtual DSP architecture shows good performance on general-purpose platforms and can be easily adapted and optimized for parallel superscalar devices. The porting to a VLIW DSP device and the results obtained on it confirm the effectiveness and flexibility of the approach, which is particularly suitable for standalone embedded solutions.
ISBN: 081943874X (print)
MPEG-7 is a standard that provides a standardized way of describing the content of multimedia information. Developing MPEG-7-related technologies and applications has been one of our main research interests for the last two years. TV broadcasting is one specific area of interest among the various applications that can benefit from MPEG-7. The Video-Gadget (TM), an indexing and retrieval engine for multimedia data, is designed to provide a user-friendly environment for Digital TV as well as conventional analogue TV. By enabling automatic feature extraction and indexing of received programs with the help of a low-cost random-access storage device, a dull, conventional TV can turn into an interactive, personalized entertainment center. With various user-friendly functionalities such as non-linear browsing of received programs, structure-based navigation, and searching/filtering of programs, the user can view a received program whenever and however the user prefers. Even though the Video-Gadget (TM) project, which is currently in progress, targets a Digital TV broadcast environment with MPEG-7 descriptions, a version of the Video-Gadget (TM) is under development to actually generate MPEG-7-like descriptions while the analogue stream is received. Real-time generation of detailed content descriptions raises many problems, such as limited I/O bandwidth and computing power and extensive sharing of system resources. Because of these limited resources, some features cannot be extracted in real time, and the functionality that relies on them must be redesigned to use simpler features. Once Digital TV broadcasts with MPEG-7 descriptions become available, the feature extraction part of the Video-Gadget (TM) will be employed by the MPEG-7 description generation server or the service provider, and only the retrieval engine, including search, browsing, and filtering functionality, will remain in the client part.
ISBN: 0780370414 (print)
This paper analyzes the asymptotic performance of Maximum Likelihood (ML) channel estimation algorithms in wideband code division multiple access (WCDMA) scenarios. We concentrate on systems with periodic spreading sequences (period larger than or equal to the symbol span) with high spreading factors, where the transmitted signal contains a code division multiplexed pilot for channel estimation purposes. Assuming randomized training and code sequences, we derive and compare the asymptotic covariances of the training-only (TO), semi-blind conditional ML (CML) and semi-blind Gaussian ML (GML) channel estimators.
ISBN: 0780370414 (print)
The problem of controlling access to multimedia multicasts requires the distribution and maintenance of keying information. The conventional approach to distributing keys is to use a channel independent of the multimedia content. We propose a second approach that involves the use of a data-dependent channel, and can be achieved for multimedia by using data embedding techniques. Using data embedding to convey rekeying messages can provide an additional layer of security when compared with the traditional approach. We then introduce multicast key distribution, and employ a recent tree-based key distribution scheme to exhibit the factors involved in transmitting keys using data embedding.
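As an illustration of why a tree-based scheme matters when rekeying messages must fit into an embedded channel, the sketch below counts the messages needed when one member leaves a complete binary key tree. It is a generic logical-key-hierarchy example, not necessarily the scheme used in the paper; the function name `rekey_on_leave` and the heap-style node numbering are assumptions made for illustration.

```python
def rekey_on_leave(num_members: int, leaving: int) -> list[tuple[int, int]]:
    """List the rekey messages sent when one member leaves the group.

    Hypothetical sketch: a complete binary key tree with heap numbering
    (root is node 1, children of node k are 2k and 2k+1, and member i
    holds the leaf key at node num_members + i). Each message is a pair
    (updated_node, encrypting_node): the new key for updated_node is
    sent encrypted under the key currently held at encrypting_node.
    """
    if num_members < 2 or num_members & (num_members - 1):
        raise ValueError("num_members must be a power of two, at least 2")
    if not 0 <= leaving < num_members:
        raise ValueError("leaving must index an existing member")

    leaf = num_members + leaving
    messages = []
    node = leaf // 2
    # Walk from the leaf's parent up to the root. Every key on this path
    # was known to the departing member and must be replaced.
    while node >= 1:
        for child in (2 * node, 2 * node + 1):
            if child == leaf:
                continue  # never encrypt under the departing member's key
            messages.append((node, child))
        node //= 2
    return messages


if __name__ == "__main__":
    msgs = rekey_on_leave(8, 3)
    print(len(msgs), msgs)
```

Under these assumptions the number of messages grows as 2·log2(n) − 1 rather than linearly in the group size n, which bounds the payload that the embedded channel has to carry per departure.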
ISBN: 0780370414 (print)
A promising class of nonlinear multiuser detectors is introduced for CDMA systems. These "iterated-decision" multiuser detectors use optimized multipass algorithms to successively cancel multiple-access interference (MAI) from received data and generate symbol decisions whose reliability increases monotonically with each iteration. They significantly outperform decorrelating detectors and linear minimum mean-square error (MMSE) multiuser detectors, but have the same order of computational complexity. When the ratio of the number of users to the spreading factor is below a certain threshold, iterated-decision multiuser detectors asymptotically achieve the performance of the "optimum" multiuser detector, i.e., maximum-likelihood (ML) decoding.
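To make the baseline concrete, the sketch below shows the decorrelating detector that the abstract compares against, for a noiseless two-user synchronous channel. It does not implement the iterated-decision detector itself, whose details are not given here; the function names, the two-user restriction, and the numerical values are assumptions chosen for illustration.

```python
def matched_filter_outputs(amplitudes, bits, rho):
    """Noiseless matched-filter outputs for two synchronous users.

    rho is the cross-correlation between the two (unit-energy)
    spreading sequences; each output contains the desired symbol plus
    multiple-access interference from the other user.
    """
    a1, a2 = amplitudes
    b1, b2 = bits
    y1 = a1 * b1 + rho * a2 * b2
    y2 = rho * a1 * b1 + a2 * b2
    return y1, y2


def decorrelate(y, rho):
    """Apply the inverse of the 2x2 correlation matrix [[1, rho], [rho, 1]]."""
    y1, y2 = y
    det = 1.0 - rho * rho
    if det == 0.0:
        raise ValueError("spreading sequences are fully correlated")
    return (y1 - rho * y2) / det, (y2 - rho * y1) / det


def sign(x):
    """Hard decision on a BPSK soft value."""
    return 1 if x >= 0 else -1


if __name__ == "__main__":
    # Assumed near-far scenario: user 2 is three times stronger than user 1.
    y = matched_filter_outputs((1.0, 3.0), (1, -1), 0.6)
    print("matched filter:", [sign(v) for v in y])
    print("decorrelator:  ", [sign(v) for v in decorrelate(y, 0.6)])
```

In this assumed scenario the plain matched filter decides user 1's bit incorrectly because the interference from the stronger user dominates, while the decorrelator removes the interference entirely at the cost of noise enhancement, which this noiseless sketch does not model.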