ISBN: 0819438758 (print)
Tarsys is a video archive system that combines the flexible organization of multimedia databases, the efficiency of real-time filesystems, and the scalability of tertiary storage (magnetic tape libraries and optical jukeboxes). Heavy data transfers over the network are common between video servers and their clients. Tarsys reduces network traffic through a remote manipulation protocol, so that only the required fragments of multimedia data are transferred. Tarsys provides a suitable platform for the automatic extraction of content information from multimedia data. It also manages content-based queries and gives efficient access to the video fragments those queries find. These facilities for accessing archived videos make it well suited to large digital TV archives and to scientific databases, where it serves as a platform for the quick development of custom video analysis.
ISBN: 0780365364 (print)
Collaborative filtering is an important technology for creating user-adapting Web sites. In general, efforts to improve filtering algorithms are decoupled from efforts to use the predictions in presenting the filtered objects. Therefore, common measures (or metrics) for evaluating collaborative filtering (recommender) systems focus mainly on the prediction algorithm. It is hard to relate these classic measurements to actual user satisfaction, because the way the user interacts with the recommendations, which is determined by their representation, influences the benefits for the user. We propose an abstract access paradigm that can be applied to the design of filtering systems and that at the same time formalizes access to filtering results via multi-corridors (based on content-based categories). This leads to new measures that relate better to user satisfaction. We use these measures to evaluate various kinds of multi-corridors for our prototype user-adapting Web site, the Active WebMuseum.
Author: Coorg, SR
IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
ISBN: 0780365364 (print)
Combining 3-D graphics and video to generate a seamless visual stream is an important problem in producing multimedia content. Difficulties in dealing with perspective, occlusion, and illumination make this problem challenging. In this paper, I propose techniques based on standard computer vision algorithms that address the perspective and occlusion problems effectively. The key insight is that, while the general vision problem is difficult to solve, the nature of this application permits simple, effective solutions. I present results of the proposed algorithms on a sample test video sequence.
ISBN: 0780365364 (print)
Video summarization is a key component in giving Internet users a way to quickly browse a video clip at different levels of detail, without the need to view the entire clip. We present a hybrid approach to video summary generation, which automatically processes the video to create a multimedia video summary while providing easy-to-use interfaces for verification, correction, and augmentation of the automatically generated story segments and extracted multimedia content. Algorithms are developed to solve the sub-problems of story segmentation, story boundary refinement, and video summary generation. Combining automatic processing with input from the user allows the user to produce meaningful video summaries efficiently.
Indexing video data is essential for providing content-based access. In this paper, we consider how database technology can offer an integrated framework for modeling and querying video data. As many concerns in video (e.g., modeling and querying) are also found in databases, databases provide an interesting angle from which to attack many of the problems. From a video applications perspective, database systems provide a sound basis for future video systems. More generally, database research will provide solutions to many video issues, even if these are partial or fragmented. From a database perspective, video applications provide attractive challenges. Next-generation database systems will need to support multimedia data (e.g., image, video, audio). These data types require new techniques for their management (i.e., storing, modeling, querying, etc.). Hence, new solutions are significant. This paper develops a data model and a rule-based query language for video content-based indexing and retrieval. The data model is designed around the object and constraint paradigms. A video sequence is split into a set of fragments. Each fragment can be analyzed to extract the information (symbolic descriptions) of interest, which can be put into a database. This database can then be searched to find information of interest. Two types of information are considered: 1) the entities (objects) of interest in the domain of a video sequence, and 2) the video frames that contain these entities. To represent this information, our data model allows facts as well as objects and constraints. The model consists of two layers: 1) the Feature & Content Layer (or Audiovisual Layer), intended to contain visual features of the video such as colors, contours, etc., and 2) the Semantic Layer, which provides the (conceptual) content dimension of videos. We present a declarative, rule-based, constraint query language that can be used to infer relationships about information represented in the model. Queries can refer to th
ISBN: 0819438758 (print)
In this paper, we present e-Clips, a framework for the evaluation of content-based indexing and retrieval techniques applied to music video clips. The e-Clips framework integrates different video and audio feature extraction tools, whether automatic or manual. Its goal is to compare the relevance of each type of feature for: providing a structured index that can be browsed, finding similar videos, retrieving videos that correspond to a query, and pushing music videos to the user according to his preferences. Currently, over 100 distinct music video clips have been indexed. For each video, shot boundaries were detected and key frames were extracted from each shot. Each key frame image was segmented into visual objects. The sound track was analyzed for basic features. Textual data, such as a song title and its performer, was added by hand. The e-Clips framework is based on a client-server architecture that can stream VHS-quality video through a 100 Mb/s intranet. It should help evaluate the relevance of the descriptors generated by content-based indexing tools and suggest appropriate graphical user interfaces for non-specialist end users.
A multicode direct-sequence code-division multiple-access system experiences large envelope variations because its signal is the sum of many independently spread signals. Large envelope variations are problematic because they reduce the spectral efficiency, the efficiency of the power amplifiers, and the performance. All these effects depend on the non-linear amplifiers that are generally used in handsets. It would of course be possible to use a linear amplifier, but then the power efficiency is drastically reduced. In this paper we analyze the envelope variations of a multicode signal in terms of the crest factor and find that it increases as the square root of the number of codes used. As a consequence, the multicode scheme can be used only with a few parallel codes. To reduce the envelope variations, a precoder is introduced. This precoder is a non-linear high-rate block code specially designed for the set of spreading codes used. However, the precoder can be made independent of the spreading codes if a user-specific spreading code is concatenated with a set of Hadamard or conference sequences. The resulting spreading codes are orthogonal. Also, the precoder is independent of the user-specific spreading code and can thus be used for all users. After precoding, the crest factor is significantly reduced and the performance, owing to the coding gain introduced, is improved. Algorithms for the design of precoders with both reduced envelope variation and good performance are presented. Furthermore, simulations show that a precoded multicode system outperforms an uncoded multicode system in a single-user as well as in a multiuser environment.
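The square-root growth of the crest factor can be illustrated with a minimal sketch. It is not the paper's analysis: it assumes real-valued chip-rate signals without pulse shaping, Sylvester-type Hadamard spreading codes, and the data pattern in which every parallel code carries the symbol +1. The function names `hadamard`, `multicode_signal`, and `crest_factor` are hypothetical. Because the codes are orthogonal, the RMS value of the sum of N codes is sqrt(N), while the chips can add coherently to a peak of N, so the crest factor (peak over RMS) for this pattern is sqrt(N).

```python
import math


def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def multicode_signal(codes, bits):
    """Chip-by-chip sum of the spreading codes, each weighted by its +/-1 data bit."""
    return [sum(b * c[n] for b, c in zip(bits, codes)) for n in range(len(codes[0]))]


def crest_factor(signal):
    """Ratio of peak amplitude to RMS amplitude over one symbol period."""
    peak = max(abs(x) for x in signal)
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return peak / rms


H = hadamard(16)
for n_codes in (1, 4, 16):
    # Assumed data pattern: all parallel codes send +1, so the first chips add up.
    signal = multicode_signal(H[:n_codes], [1] * n_codes)
    print(n_codes, crest_factor(signal), math.sqrt(n_codes))
```

For other data patterns the peak, and hence the crest factor, can be lower; the RMS value stays at sqrt(N) as long as the codes are orthogonal.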
Techniques for content-based image or video retrieval are not mature enough to recognize visual semantics completely. Whereas retrieval based on color, size, texture, and shape is within the state of the art, our investigations of human factors indicate that content access to visual data also needs the captions or text annotations associated with photos and videos. In this paper, a framework for integrating textual and visual content searching mechanisms is presented. The framework includes ontology-based semantic query expansion, database navigation in a conceptual hierarchy, and a computational model for calculating the degree of term similarity. The proposed method is embedded and evaluated in our novel content-based image database system called PicDB™.
ISBN: 0819438758 (print)
Indexing, retrieval, and delivery of the visual and spatio-temporal properties of video objects require efficient data models and sound operations on those models. However, most object-based video data models address only a single aspect of those properties. In this paper, we present an efficient video object representation method that captures the visual, spatial, and temporal properties of objects in a video in the form of a unified abstract data type. The proposed data type is a polygon mesh, named the video object mesh, which is defined in a spatio-temporal domain. Depending on the application's needs, the contour of an object is modeled with a polygonal contour. With the contour and color information of the object, content-based triangularization is performed. A video object in a frame is modeled with a two-dimensional polygon mesh. Color information is embedded in each vertex of the mesh for further use. Using motion analysis, the corresponding vertex in the adjacent frame is identified and connected to the vertex being analyzed. These processes continue until the video object disappears. The result is a three-dimensional polygon mesh that models location-variant and location-invariant motion that cannot be captured by traditional trajectory-based motion models. The proposed model is also useful for camera motion analysis, since the surface shape of a video object mesh carries partial information about camera motion.
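The structure described above can be sketched as a small data type. This is an illustrative assumption, not the authors' implementation: the class names `Vertex`, `FrameMesh`, and `VideoObjectMesh` are hypothetical, a simple fan triangulation stands in for the content-based triangularization, and the vertex correspondences that motion analysis would produce are supplied by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    x: float
    y: float
    color: tuple  # (r, g, b) embedded at the vertex


@dataclass
class FrameMesh:
    frame: int
    vertices: list   # Vertex objects along the polygonal contour
    triangles: list  # triples of indices into `vertices`


@dataclass
class VideoObjectMesh:
    frames: list = field(default_factory=list)
    # links[k] maps a vertex index in frames[k] to its match in frames[k + 1]
    links: list = field(default_factory=list)

    def append(self, mesh, link_from_previous=None):
        """Add the next frame's mesh together with the links from the previous frame."""
        if self.frames:
            self.links.append(dict(link_from_previous or {}))
        self.frames.append(mesh)

    def trajectory(self, start_index):
        """Follow one vertex through time and return its (frame, x, y) points."""
        points, idx = [], start_index
        for k, mesh in enumerate(self.frames):
            v = mesh.vertices[idx]
            points.append((mesh.frame, v.x, v.y))
            if k == len(self.links) or idx not in self.links[k]:
                break  # last frame reached, or the vertex has no match
            idx = self.links[k][idx]
        return points


def fan_triangulate(n):
    """Stand-in triangulation of a convex polygon with n vertices."""
    return [(0, i, i + 1) for i in range(1, n - 1)]


def square(frame, dx):
    """Hypothetical test object: a unit square shifted dx to the right."""
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    verts = [Vertex(x + dx, y, (255, 0, 0)) for x, y in corners]
    return FrameMesh(frame, verts, fan_triangulate(len(verts)))


vom = VideoObjectMesh()
vom.append(square(0, 0.0))
vom.append(square(1, 2.0), link_from_previous={i: i for i in range(4)})
print(vom.trajectory(0))
```

Stacking the per-frame meshes and joining matched vertices gives the spatio-temporal mesh; reading along the links recovers per-vertex motion, which is the kind of information a single centroid trajectory would not hold.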