ISBN: 081941767X (print)
The usefulness of a collection of scanned graphical documents can be measured by the facilities available for their retrieval. We present an approach for automatically indexing a collection of line drawings. The indexing is based on the textual and graphical content of the drawings. This approach has been developed to facilitate 'retrieval by example' in heterogeneous collections of graphical documents. No a priori knowledge about the application domain is assumed. Starting with a raster image, candidate character patterns and graphical primitives (i.e., line segments and arcs) are extracted. Candidate character patterns are classified by an OCR method and grouped into word hypotheses. Graphical features of various types are computed from groupings of graphical primitives (e.g., sequences of adjacent lines, pairs of parallel lines). Retrieval is performed with a weighted information retrieval system. Each document in the collection and each query is described by a set of indexing features with corresponding weights. The weight of an indexing feature reflects the descriptive nature of the feature and is computed from the number of occurrences of the indexing feature in the document (feature frequency ff) and the number of documents containing the indexing feature (document frequency df).
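The abstract does not give the exact weighting formula. As a minimal sketch, assuming a tf-idf style weight of the form ff * log(N / df) (an assumption, not necessarily the authors' formula), the feature weights could be computed as follows. The function name `feature_weights` and the feature labels are hypothetical.

```python
import math
from collections import Counter

def feature_weights(documents):
    """Weight each indexing feature of each document.

    documents: list of feature lists, one list per document.
    Assumed formula (not from the source): weight = ff * log(N / df).
    """
    n_docs = len(documents)
    # Document frequency df: number of documents containing each feature.
    df = Counter()
    for features in documents:
        df.update(set(features))
    weights = []
    for features in documents:
        # Feature frequency ff: occurrences of the feature in this document.
        ff = Counter(features)
        weights.append({f: ff[f] * math.log(n_docs / df[f]) for f in ff})
    return weights

# Hypothetical textual and graphical features for three drawings.
docs = [
    ["word:valve", "parallel_lines", "parallel_lines"],
    ["word:pump", "parallel_lines"],
    ["word:valve", "arc_chain"],
]
weights = feature_weights(docs)
```

Under this assumed formula, a feature that is frequent within one drawing but rare across the collection receives a high weight, which matches the abstract's description of a weight that reflects how descriptive a feature is.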
A highly integrated wavelet-based image management system is proposed. Solutions for three key aspects of image management are derived: content-based image retrieval (CBIR); image compression/decompression; and image transmission. By exploiting the properties of wavelets and integrating these key aspects of image management, the system shows high overall performance.
A nine-direction lower-triangular (9DLT) matrix describes the relative spatial relationships among the objects in a symbolic image. In this paper, the 9DLT matrix is transformed into a linear string, called a 9DLT string. Based on the 9DLT string, two similarity metrics for image matching, simpler but more precise, are provided to solve the subimage and similar image retrieval problems. Moreover, a common component binary tree (CCBT) structure is refined to store a set of 9DLT strings. The revised CCBT structure not only eliminates the redundant information among those 9DLT strings, but also reduces the processing time for determining the image matching distances between query frames and video frames. Experiments indicate that the storage space and the processing time are greatly reduced through the revised CCBT structure. A fast dynamic programming approach is also proposed to handle the problem of sequence matching between a query frame sequence and a video frame sequence. (C) 2001 Academic Press.
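The abstract does not define the direction codes or the exact string layout. The following is only an illustrative sketch, assuming code 0 for the same position and codes 1 to 8 counted counterclockwise from east, with the y axis pointing up and the matrix flattened row by row. All of these choices, and the function names, are assumptions rather than the paper's definitions.

```python
def direction_code(a, b):
    """Code for where point b lies relative to point a.

    Assumed coding (not from the source): 0 = same position,
    1..8 = counterclockwise from east, y axis pointing up.
    """
    dx = (b[0] > a[0]) - (b[0] < a[0])  # sign of the x offset
    dy = (b[1] > a[1]) - (b[1] < a[1])  # sign of the y offset
    codes = {
        (0, 0): 0, (1, 0): 1, (1, 1): 2, (0, 1): 3, (-1, 1): 4,
        (-1, 0): 5, (-1, -1): 6, (0, -1): 7, (1, -1): 8,
    }
    return codes[(dx, dy)]

def nine_dlt_matrix(objects):
    """Build the lower-triangular matrix for objects sorted by name."""
    names = sorted(objects)
    rows = [
        [direction_code(objects[names[j]], objects[names[i]]) for j in range(i)]
        for i in range(1, len(names))
    ]
    return names, rows

def to_string(rows):
    """Flatten the lower-triangular matrix row by row into one string."""
    return "".join(str(code) for row in rows for code in row)

# Hypothetical symbolic image: object name -> (x, y) centroid.
objects = {"A": (1, 1), "B": (3, 1), "C": (1, 4)}
names, rows = nine_dlt_matrix(objects)
dlt_string = to_string(rows)
```

Because only entries below the diagonal are stored, each pair of objects contributes exactly one code, so the string has n(n-1)/2 characters for n objects.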
ISBN: 081941767X (print)
Searching in image databases using image content has made the transition from the laboratory to consumer software. Storm Software is a pioneer in bringing these techniques to shrink-wrapped software applications, and this presentation describes some of the methods we use in our products and some of the experiences we have had in bringing this new technology to consumers. We describe the scope of the problem we are trying to solve as well as some of the algorithms and interfaces we used. We also describe some of the rationales (based on theory as well as on user testing) behind the various design decisions we made. Finally, we describe some of the challenges and opportunities we see ahead. Descriptions and screen shots of two software products implementing image searching (EasyPhoto and Apple PhotoFlash) are provided. Both products were developed by Storm Software.
Multimedia data are generally stored in compressed form in order to efficiently utilize the available storage facilities. Access to multimedia archives is thus dependent on our ability to browse compressed information. In this paper, a novel approach to multiple object tracking from compressed multimedia databases is presented. This approach is intended to operate in a distributed environment, where users initiate video searches and retrieve relevant video information simultaneously from multiple compressed video archives. The system operates on the compressed video to find and track objects of interest and determine their positions in the image. This enables more complex query formulations in terms of the relative positions of the target objects in the image. Filtering and analysis of motion information (motion vectors) are used to track objects in the video bit stream. Once the search has terminated, the system may decompress and display the query-relevant video sequences upon request. (C) 2000 Academic Press.
With rigorous long-term archival of endoscopic surgeries, vast amounts of video and image data accumulate. Surgeons are not able to spend their valuable time manually searching within endoscopic multimedia databases (EMDBs) or manually maintaining links to interesting sections in order to quickly retrieve relevant surgery sections. To enable surgeons to quickly access the relevant surgery scenes, we utilize the fact that surgeons record external images in addition to the surgery video and aim to link them to the appropriate video sequence in the EMDB using a query-by-example approach. We propose off-the-shelf binary Convolutional Neural Network (CNN) features and compare them to several baselines: pixel-based comparison (PSNR), image structure comparison (SSIM), hand-crafted global features (CEDD and feature signatures), as well as the CNN baselines Histograms of Class Confidences (HoCC) and Neural Codes (NC). For evaluation, we use 5.5 h of endoscopic video material and 69 query images selected by medical experts and compare the performance of the aforementioned image matching methods in terms of video hit rate and distance to the true playback time stamp (PTS) for correct video predictions. Our evaluation shows that binary CNN features are compact yet powerful image descriptors for retrieval in the endoscopic imaging domain. They are able to maintain state-of-the-art performance while providing the benefit of low storage space requirements, and hence provide the best compromise.
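The abstract does not say how the CNN features are binarized or compared. As a rough sketch of why binary descriptors are compact and cheap to match, the example below assumes thresholding each feature value at zero and ranking frames by Hamming distance. Both choices, the function names, and the toy feature vectors are assumptions, not the authors' method.

```python
def binarize(features, threshold=0.0):
    """Pack a real-valued feature vector into an integer bit code.

    Assumption: one bit per dimension, set when the value exceeds threshold.
    """
    bits = 0
    for value in features:
        bits = (bits << 1) | (1 if value > threshold else 0)
    return bits

def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")

def best_match(query_code, frame_codes):
    """Index of the frame code closest to the query in Hamming distance."""
    return min(range(len(frame_codes)),
               key=lambda i: hamming(query_code, frame_codes[i]))

# Toy 4-dimensional feature vectors standing in for CNN activations.
frames = [binarize(v) for v in (
    [0.9, -0.2, 0.4, -0.7],
    [-0.3, 0.8, 0.1, 0.5],
    [0.2, 0.1, -0.6, -0.9],
)]
query = binarize([-0.1, 0.6, 0.3, 0.2])
match_index = best_match(query, frames)
```

Each descriptor here occupies one bit per dimension, and matching reduces to an XOR followed by a bit count, which illustrates the low storage cost the abstract highlights.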
ISBN: 0819414808 (print)
Computer-assisted content-based indexing is a critical enabling technology and currently a bottleneck in the productive use of video resources. This paper presents the Video Classification Project, an effort toward automating content-based video indexing and retrieval, at the Institute of Systems Science of the National University of Singapore. We discuss in detail three goals of the project: image processing tools for video parsing, feature extraction and retrieval; a knowledge-based approach to representing video content; and stratified tools which allow greater flexibility in browsing a video resource, either before or after performing specific retrieval operations.
The aim of this report is to be controversial and to stimulate debate within the research community. We discuss whether some of the work in image and video databases has been directed at solutions in search of a problem. Important applications in the area of media-based digital libraries that will enhance human experience are also detailed.
ISBN: 081941767X (print)
This paper describes the extended model for information retrieval (EMIR), designed for complex information description and retrieval and particularly well suited for image modeling. A main object in the proposed model has a three-part specification: a description that is a list of attributes; a composition that is a list of component objects; and a topology that is a list of semantic relationships between component objects, expressing more semantic aspects of the main object structure. The model is well suited for image modeling for two complementary reasons. On one hand, it can distinguish between an object structure and its contents. This is achieved by relaxing the classical class-object instantiation link, thus allowing objects to have individual non-categorized contents rather than those predicted in their classes. On the other hand, images typically have very different individual contents and therefore cannot be easily modeled within a structured database model such as the relational model. The query language is organized according to the three-part organization of the model. A simple query has three parts: description, being constraints on attribute values; composition, being a set of sub-queries on the composition part of objects; and topology, being the specification of required links between the results of the composition sub-queries.
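The three-part object and query structure described above can be sketched as a small data structure. This is an illustrative reading of the abstract, not the EMIR implementation: the class name `EmirObject`, the function `matches`, the tuple encoding of relationships, and the greedy binding of sub-queries to components are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class EmirObject:
    description: dict = field(default_factory=dict)  # attribute -> value
    composition: list = field(default_factory=list)  # component EmirObjects
    topology: list = field(default_factory=list)     # (relation, i, j) on components

def matches(obj, description=None, composition=None, topology=None):
    """Check a three-part query against an object.

    Simplification: each sub-query binds to the first component that
    satisfies it, so this sketch does not backtrack over bindings.
    """
    description = description or {}
    composition = composition or []
    topology = topology or []
    # Description part: constraints on attribute values.
    if any(obj.description.get(k) != v for k, v in description.items()):
        return False
    # Composition part: each sub-query must match some component.
    bound = []
    for sub in composition:
        hit = next((i for i, c in enumerate(obj.composition)
                    if matches(c, **sub)), None)
        if hit is None:
            return False
        bound.append(hit)
    # Topology part: required links between the matched components.
    return all((rel, bound[a], bound[b]) in obj.topology
               for rel, a, b in topology)

# Hypothetical image object with two components and one relationship.
house = EmirObject(description={"name": "house"})
tree = EmirObject(description={"name": "tree"})
image = EmirObject(description={"type": "photo"},
                   composition=[house, tree],
                   topology=[("left_of", 0, 1)])

sub_queries = [{"description": {"name": "house"}},
               {"description": {"name": "tree"}}]
found = matches(image, {"type": "photo"}, sub_queries, [("left_of", 0, 1)])
not_found = matches(image, {"type": "photo"}, sub_queries, [("left_of", 1, 0)])
```

Components carry their own free-form descriptions rather than inheriting a fixed schema, which loosely mirrors the relaxed class-object link mentioned in the abstract.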
The image histogram is an image feature widely used in content-based image retrieval and video segmentation. It is simple to compute yet very effective as a feature in detecting image-to-image similarity or frame-to-frame dissimilarity. While the image histogram captures the global distribution of different intensities or colors well, it does not contain any information about the spatial distribution of pixels. In this paper, we propose to incorporate spatial information into the image histogram by computing features from the spatial distance between pixels belonging to the same intensity or color. In addition to the frequency count of the intensity or color, the mean, variance, and entropy of the distances are computed to form an Augmented Image Histogram. Using the new feature, we performed experiments on a set of color images and a color video sequence. Experimental results demonstrate that the Augmented Image Histogram performs significantly better than the conventional color histogram, both in image retrieval and video shot segmentation.
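The abstract does not specify which pixel distances are used or how their entropy is estimated. The sketch below assumes pairwise Euclidean distances between all pixels of the same intensity and an entropy computed over a fixed number of equal-width distance bins. These choices, the function name `augmented_histogram`, and the parameter `n_bins` are assumptions for illustration.

```python
import math
from itertools import combinations

def augmented_histogram(image, n_bins=4):
    """Return {level: (count, mean, variance, entropy)} for a 2D image.

    image: list of rows of intensity levels.
    Assumptions: all pairwise Euclidean distances per level, and entropy
    (in bits) of those distances binned into n_bins equal-width bins.
    """
    # Group pixel coordinates by intensity level.
    positions = {}
    for y, row in enumerate(image):
        for x, level in enumerate(row):
            positions.setdefault(level, []).append((x, y))
    result = {}
    for level, points in positions.items():
        dists = [math.dist(p, q) for p, q in combinations(points, 2)]
        if not dists:
            # A single pixel has no distances to summarize.
            result[level] = (len(points), 0.0, 0.0, 0.0)
            continue
        mean = sum(dists) / len(dists)
        variance = sum((d - mean) ** 2 for d in dists) / len(dists)
        # Bin distances relative to the largest one, then take the entropy.
        max_d = max(dists)
        counts = [0] * n_bins
        for d in dists:
            counts[min(int(d / max_d * n_bins), n_bins - 1)] += 1
        entropy = -sum(c / len(dists) * math.log2(c / len(dists))
                       for c in counts if c)
        result[level] = (len(points), mean, variance, entropy)
    return result

# Tiny 2x2 image with two intensity levels.
hist = augmented_histogram([[0, 0], [1, 0]])
```

The pairwise computation is quadratic in the number of pixels per level, so a practical version would likely need sampling or a different distance definition. The first element of each tuple is the conventional histogram count, and the remaining three carry the spatial information.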