The aim of this report is to be controversial and to spark debate within the research community. It discusses whether some of the work on image and video databases has been directed at solutions in search of a problem. It also details important applications in the area of media-based digital libraries that will enhance human experience.
With rigorous long-term archival of endoscopic surgeries, vast amounts of video and image data accumulate. Surgeons cannot spend their valuable time manually searching endoscopic multimedia databases (EMDBs) or manually maintaining links to interesting sections in order to quickly retrieve relevant surgery sections. To enable surgeons to quickly access the relevant surgery scenes, we exploit the fact that surgeons record external images in addition to the surgery video, and we aim to link these images to the appropriate video sequence in the EMDB using a query-by-example approach. We propose off-the-shelf binary Convolutional Neural Network (CNN) features and compare them to several baselines: pixel-based comparison (PSNR), image structure comparison (SSIM), hand-crafted global features (CEDD and feature signatures), and the CNN baselines Histograms of Class Confidences (HoCC) and Neural Codes (NC). For evaluation, we use 5.5 h of endoscopic video material and 69 query images selected by medical experts, and we compare the performance of the aforementioned image matching methods in terms of video hit rate and distance to the true playback time stamp (PTS) for correct video predictions. Our evaluation shows that binary CNN features are compact yet powerful image descriptors for retrieval in the endoscopic imaging domain. They maintain state-of-the-art performance while requiring little storage space, and hence provide the best compromise.
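The query-by-example linking described in this abstract can be illustrated with a minimal sketch. It assumes that every sampled video frame and every query image has already been reduced to a fixed-length binary code, and that codes are compared by Hamming distance. The binarization rule (thresholding at zero), the function names, the video identifiers, and the toy feature values are all assumptions made for illustration; the abstract does not specify how the authors binarize or match their CNN features.

```python
from typing import List, Sequence, Tuple


def binarize(features: Sequence[float], threshold: float = 0.0) -> int:
    """Pack a real-valued feature vector into an integer bit code.

    Assumption: one bit per dimension, set when the value exceeds `threshold`.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")


def link_query(query_code: int,
               index: List[Tuple[str, float, int]]) -> Tuple[str, float, int]:
    """Return (video_id, timestamp_seconds, distance) of the closest indexed frame.

    `index` holds one (video_id, timestamp_seconds, code) entry per sampled frame.
    """
    best = min(index, key=lambda entry: hamming(query_code, entry[2]))
    return best[0], best[1], hamming(query_code, best[2])


# Toy index: hypothetical video ids, timestamps and 8-dimensional features.
index = [
    ("surgery_A", 12.0, binarize([0.9, -0.2, 0.4, -0.7, 0.1, -0.3, 0.8, -0.5])),
    ("surgery_A", 48.0, binarize([-0.6, 0.3, -0.1, 0.5, -0.9, 0.2, -0.4, 0.7])),
    ("surgery_B", 30.0, binarize([0.2, 0.6, -0.8, -0.1, 0.4, 0.9, -0.3, -0.2])),
]
query = binarize([0.8, -0.1, 0.5, -0.6, -0.2, -0.4, 0.7, -0.5])
video_id, timestamp, distance = link_query(query, index)
```

Storing one bit per feature dimension is what makes such descriptors compact: a code with a few thousand dimensions fits in a few hundred bytes per frame, and the comparison reduces to an XOR followed by a bit count.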
ISBN: 0819411418 (print)
Content-based retrieval is founded on neural networks. This technology allows automatic filing of images and a wide range of possible queries of the resulting database. This is in contrast to methods such as entering SQL keys manually for each image as it is filed and later correctly re-entering those keys to retrieve the same image. An SQL-based approach does not take into account information that is hard to describe with text, such as sounds and images. Neural networks can be trained to translate 'noisy' or chaotic image data into simpler, more reliable feature sets. Once the images have been converted to the level of abstraction necessary for symbolic processing, standard database indexing methods can be applied, or the results can be used directly in layers of associative database neural networks.
ISBN: 0819448214 (print)
Video contains multiple types of audio and visual information, which are difficult to extract, combine, or trade off in general video information retrieval. This paper evaluates the effects of different types of information used for video retrieval from a video collection. A number of different sources of information are present in most typical broadcast video collections and can be exploited for information retrieval. We discuss the contributions of automatically recognized speech transcripts, image similarity matching, face detection, and video OCR in the context of experiments performed as part of the 2001 TREC Video Retrieval Track evaluation conducted by the National Institute of Standards and Technology. For the queries used in this evaluation, image matching and video OCR proved to be the deciding aspects of video information retrieval.
The image histogram is an image feature widely used in content-based image retrieval and video segmentation. It is simple to compute yet very effective as a feature for detecting image-to-image similarity or frame-to-frame dissimilarity. While the image histogram captures the global distribution of different intensities or colors well, it does not contain any information about the spatial distribution of pixels. In this paper, we propose to incorporate spatial information into the image histogram by computing features from the spatial distance between pixels belonging to the same intensity or color. In addition to the frequency count of the intensity or color, the mean, variance, and entropy of the distances are computed to form an Augmented Image Histogram. Using the new feature, we performed experiments on a set of color images and a color video sequence. Experimental results demonstrate that the Augmented Image Histogram performs significantly better than the conventional color histogram, both in image retrieval and in video shot segmentation.
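The idea of augmenting each histogram bin with distance statistics can be sketched as follows. This is only one plausible reading of the abstract: it assumes Euclidean distances between every pair of pixels that share an intensity, and an entropy (in bits) taken over those distances rounded to the nearest integer. The abstract does not state which distance or entropy definition the authors use, and the function name and toy images are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, List, Tuple


def augmented_histogram(
    image: List[List[int]],
) -> Dict[int, Tuple[int, float, float, float]]:
    """Map each intensity to (count, mean, variance, entropy) of pairwise distances.

    Assumptions: Euclidean distance between every pair of same-intensity pixels,
    and entropy (in bits) over distances rounded to the nearest integer.
    """
    positions = defaultdict(list)
    for row_idx, row in enumerate(image):
        for col_idx, value in enumerate(row):
            positions[value].append((row_idx, col_idx))

    features = {}
    for value, points in positions.items():
        distances = [math.dist(p, q) for p, q in combinations(points, 2)]
        if not distances:  # a single pixel has no pairwise distances
            features[value] = (len(points), 0.0, 0.0, 0.0)
            continue
        mean = sum(distances) / len(distances)
        variance = sum((d - mean) ** 2 for d in distances) / len(distances)
        bins = Counter(round(d) for d in distances)
        entropy = -sum(
            (n / len(distances)) * math.log2(n / len(distances))
            for n in bins.values()
        )
        features[value] = (len(points), mean, variance, entropy)
    return features


# Two 2x4 images with the same conventional histogram (four 0s, four 1s)
# but different spatial layouts.
clustered = [[0, 0, 1, 1],
             [0, 0, 1, 1]]
scattered = [[0, 1, 0, 1],
             [1, 0, 1, 0]]
f_clustered = augmented_histogram(clustered)
f_scattered = augmented_histogram(scattered)
```

The two toy images are indistinguishable by frequency count alone, while the mean pairwise distance separates them. Enumerating all pixel pairs is quadratic in the number of pixels per bin, so a practical version would need sampling or a cheaper distance summary.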
Image retrieval systems that compare the query image exhaustively with each individual image in the database are not scalable to large databases. A scalable search system should ensure that the search time does not increase linearly with the number of images in the database. We present a clustering-based indexing technique in which the images in the database are grouped into clusters of images with similar color content using a hierarchical clustering algorithm. At search time, the query image is compared not with all the images in the database but only with a small subset. Experiments show that this clustering-based approach offers superior response time with high retrieval accuracy. Experiments with different database sizes indicate that, for a given retrieval accuracy, the search time does not increase linearly with the database size.
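A minimal sketch of cluster-then-search indexing follows. It assumes each image is represented by a small color histogram, uses a simple agglomerative scheme that repeatedly merges the two clusters with the closest centroids, and at query time searches only the cluster whose centroid is nearest to the query. The linkage rule, the distance measure, the function names, and the toy histograms are assumptions; the abstract only states that a hierarchical clustering algorithm over color content is used.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def build_clusters(features: Dict[str, Vector], num_clusters: int) -> List[List[str]]:
    """Agglomerative clustering: merge the two closest centroids until done."""
    clusters = [[name] for name in features]
    while len(clusters) > max(1, num_clusters):
        best_pair, best_dist = (0, 1), math.inf
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci = centroid([features[n] for n in clusters[i]])
                cj = centroid([features[n] for n in clusters[j]])
                d = math.dist(ci, cj)
                if d < best_dist:
                    best_pair, best_dist = (i, j), d
        i, j = best_pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def search(query: Vector, features: Dict[str, Vector],
           clusters: List[List[str]]) -> Tuple[str, int]:
    """Return (best match, number of image comparisons inside the chosen cluster)."""
    nearest = min(
        clusters,
        key=lambda c: math.dist(query, centroid([features[n] for n in c])),
    )
    best = min(nearest, key=lambda n: math.dist(query, features[n]))
    return best, len(nearest)


# Toy database: hypothetical 3-bin (red, green, blue) color histograms.
features = {
    "sunset1": [0.8, 0.1, 0.1], "sunset2": [0.7, 0.2, 0.1],
    "forest1": [0.1, 0.8, 0.1], "forest2": [0.2, 0.7, 0.1],
    "ocean1": [0.1, 0.1, 0.8], "ocean2": [0.1, 0.2, 0.7],
}
clusters = build_clusters(features, num_clusters=3)
best, compared = search([0.75, 0.2, 0.05], features, clusters)
```

In this toy run the query is compared with three centroids and then with two images rather than with all six, which is the source of the sublinear search time the abstract reports. A real index would cache the centroids instead of recomputing them per query.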
ISBN: 0819424331 (print)
Video databases can be searched for visual content by searching over automatically extracted key frames rather than the complete video sequence. Many video materials used in the humanities and social sciences contain a preponderance of shots of people. In this paper, we describe our work on semantic image retrieval of person-rich scenes (key frames) for video databases and libraries. We use an approach called retrieval through segmentation. A key-frame image is first segmented into human subjects and background. We developed a specialized segmentation technique that utilizes both human flesh-tone detection and contour analysis. Experimental results show that this technique can segment images effectively with low time complexity. Once the image has been segmented, we can extract features or pose queries about both the people and the background. We propose a retrieval framework that is based on the segmentation results and the extracted features of people and background.
The ever-increasing number of radiology images produced in medical centers makes the storage, classification, and retrieval of these images a vital issue in the management of medical centers' databases. In this paper, we employ the H.264/AVC standard for coding radiology images and study its performance in the storage, classification, and retrieval of such images. The experiments indicate that coding radiology images with the H.264/AVC standard generally reduces the size of the coded image to less than half of its size in PNG format. Moreover, we employ a compressed-domain indexing and retrieval method for H.264/AVC coded images, which avoids full decompression of the coded image and in turn reduces the indexing and retrieval time. Experimental results indicate that this compressed-domain feature vector achieves, on average, 93% accuracy in classification and 85% overall precision in retrieval of radiology images.
This paper is concerned with estimating a probability density function of human skin color using a finite Gaussian mixture model whose parameters are estimated through the EM algorithm. Hawkins' statistical test for the normality and homoscedasticity (common covariance matrix) of the estimated Gaussian mixture models is performed, and McLachlan's bootstrap method is used to test the number of components in a mixture. Experimental results show that the estimated Gaussian mixture model fits skin images from a large database. Applications of the estimated density function in image and video databases are presented.
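The EM fitting step mentioned in this abstract can be sketched for the simplest possible case. The example below is deliberately reduced: it fits a two-component mixture to one-dimensional values, whereas skin color is multivariate, and it does not implement Hawkins' test or McLachlan's bootstrap. The initialization, the variance floor, the function names, and the sample values are assumptions for illustration and are not taken from the paper.

```python
import math
from typing import List, Tuple


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(
    data: List[float], iterations: int = 50
) -> Tuple[List[float], List[float], List[float]]:
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns (weights, means, variances). Assumes the two modes are reasonably
    separated so that neither component ends up with zero total responsibility.
    """
    var_floor = 1e-6  # keeps a component from collapsing onto a single point
    overall_mean = sum(data) / len(data)
    overall_var = max(
        var_floor, sum((x - overall_mean) ** 2 for x in data) / len(data)
    )
    weights = [0.5, 0.5]
    means = [min(data), max(data)]  # deterministic initialization (assumption)
    variances = [overall_var, overall_var]

    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [
                w * gaussian_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            variances[k] = max(
                var_floor,
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk,
            )
    return weights, means, variances


# Hypothetical normalized chrominance samples drawn from two groups.
samples = [0.10, 0.12, 0.09, 0.11, 0.50, 0.52, 0.48, 0.51]
weights, means, variances = fit_two_component_gmm(samples)
```

Once fitted, the mixture density at a pixel's color value can be thresholded to decide whether the pixel is skin, which is one way such a density estimate could be applied in image and video databases.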