ISBN: 9781424450541 (print)
This article presents a color-based image retrieval technique for RGB image databases. The proposed CBIR system uses the query-by-example approach together with a relevance feedback mechanism in the retrieval process. Features are extracted by computing a global color histogram for each image, and the resulting feature vectors are compared using the histogram intersection difference metric.
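The comparison step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the bin count, the function names, and the representation of an image as a list of (r, g, b) tuples are all assumptions, and the relevance feedback loop is omitted.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Global RGB histogram, normalized to sum to 1.

    `pixels` is a list of (r, g, b) tuples with 8-bit channels.
    `bins_per_channel` is an illustrative choice and must divide 256.
    """
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel ** 2
               + (g // step) * bins_per_channel
               + (b // step))
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalized histograms (1 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def intersection_distance(h1, h2):
    """Difference form of the metric: 0 means identical histograms."""
    return 1.0 - histogram_intersection(h1, h2)


def rank_by_example(query_pixels, database):
    """Return database image names ordered from most to least similar."""
    q = color_histogram(query_pixels)
    scored = [(intersection_distance(q, color_histogram(px)), name)
              for name, px in database.items()]
    return [name for _, name in sorted(scored)]
```

With normalized histograms the intersection is 1 for identical color distributions and 0 for disjoint ones, so subtracting it from 1 gives a distance that can be sorted directly.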
ISBN: 9780769551449 (print)
In this paper we focus on the unsupervised discovery of acoustic classes suitable for use in pattern recognition applications. Our approach is based on a two-level clustering of an initial acoustic segmentation of the audio data, which allows complex acoustic classes to be discovered and modeled correctly. At the first level, the acoustic space is densely clustered to provide a first layer of acoustic variance reduction. At the second level, we use the acoustic segmentation to infer a smaller number of super-clusters, taking advantage of the intra-segment relationships between the first-level clusters. We compare three clustering methods for obtaining super-clusters as subsets or linear combinations of first-level clusters. Results indicate that the proposed two-level approach improves the balance between the Purity and inverse Purity evaluation measures, while significantly improving the stability of the transcriptions obtained when the resulting models are used to transcribe the same acoustic events in different spoken utterances.
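The two evaluation measures named above can be sketched under their standard definitions: Purity credits each cluster with its most frequent reference class, and inverse Purity credits each reference class with the cluster that holds most of it. The function names and the parallel-list input format are assumptions for illustration; the abstract does not say how the authors computed them.

```python
from collections import Counter


def purity(clusters, labels):
    """Fraction of items that fall in the majority class of their cluster.

    `clusters` and `labels` are parallel lists: clusters[i] is the cluster
    assigned to item i and labels[i] is its reference class.
    """
    pair_counts = Counter(zip(clusters, labels))
    best = {}
    for (cluster, _label), count in pair_counts.items():
        best[cluster] = max(best.get(cluster, 0), count)
    return sum(best.values()) / len(clusters)


def inverse_purity(clusters, labels):
    """Purity with the roles of clusters and reference classes swapped."""
    return purity(labels, clusters)
```

Many small clusters push Purity toward 1 while lowering inverse Purity, and a few large clusters do the opposite, which is why the balance between the two is reported.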
ISBN: 9781509009817 (print)
The H-KWS 2016 competition, organized in the context of the ICFHR 2016 conference, aims to set up an evaluation framework for benchmarking handwritten keyword spotting (KWS), examining both the query-by-example (QbE) and the query-by-string (QbS) approaches. The two KWS approaches were hosted in two different tracks, each of which was split into two distinct challenges, namely a segmentation-based and a segmentation-free challenge, to accommodate the different perspectives adopted by researchers in the KWS field. In addition, the competition aims to evaluate the submitted training-based methods under different amounts of training data. Four participants submitted at least one solution to one of the challenges, according to the capabilities and/or restrictions of their systems. The data used in the competition consisted of historical German and English documents with their own characteristics and complexities. This paper presents the details of the competition, including the data, the evaluation metrics, and the results of the best run of each participating method.
ISBN: 9781479980581 (print)
This paper proposes a robust, time-stretching-invariant audio fingerprinting method based on landmarks in the audio spectrogram. Audio clips or songs are time-stretched to evade copyright detection, since most fingerprinting techniques are time dependent. Time-stretching is also used in the music industry to produce remixes and song mash-ups, and in multimedia broadcasting to fit content within the required duration. The proposed algorithm is based on hashing the frequency peaks in the spectrogram. It is scalable and tolerant of time-stretching. The experimental results show that the method is more tolerant of time-stretching than Shazam's state-of-the-art audio fingerprinting.
ISBN: 9783031065552; 9783031065545 (print)
Modern keyword spotting systems rely on deep learning approaches to build effective neural networks that provide state-of-the-art results. Despite their evident success, these deep models have proven to be sensitive to the input images: a small deformation, almost indistinguishable to the human eye, may considerably alter the resulting retrieval list. To address this issue, we propose a novel "on-the-fly" approach that deforms an input image to better match the query image, aiming to reduce this sensitivity. Results on the IAM dataset verify the effectiveness of the proposed method, which outperforms existing query-by-example approaches.
Advances in technology have made the acquisition and storage of multimedia data easy and inexpensive for the end user. However, effective use of the information available in multimedia requires efficient and accurate retrieval methods. Multimedia retrieval systems have made extensive use of texture-based approaches to interpret and recognize a scene image. The texture of an image provides clues to the orientation, smoothness, symmetry, shape, regularity and coarseness of the surface. The Local Binary Pattern Variance (LBPV) is a texture feature in which the variance in contrast acts as an adaptive factor during computation of the local binary pattern (LBP) histogram. The LBP combines both structural and statistical approaches to texture analysis. This paper proposes an LBP Variance based approach to visual content-based video retrieval. The proposed approach uses the query-by-example paradigm for retrieving similar clips from the video. Experiments conducted on the TRECVID dataset show the efficacy of the proposed approach. (C) 2016 The Authors. Published by Elsevier B.V.
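The idea of letting local variance weight the LBP histogram can be sketched as below. This is a simplified illustration and not the paper's feature: it assumes a plain 3x3 neighbourhood on a grayscale image stored as a list of rows, with no circular interpolation, rotation invariance, or uniform-pattern mapping, and all names are hypothetical.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code_and_variance(img, y, x):
    """Return the 8-bit LBP code and the neighbour variance at (y, x)."""
    center = img[y][x]
    neighbors = [img[y + dy][x + dx] for dy, dx in OFFSETS]
    # Bit i is set when neighbour i is at least as bright as the centre.
    code = sum(1 << i for i, v in enumerate(neighbors) if v >= center)
    mean = sum(neighbors) / len(neighbors)
    variance = sum((v - mean) ** 2 for v in neighbors) / len(neighbors)
    return code, variance


def lbpv_histogram(img):
    """256-bin histogram where each pixel votes with its local variance."""
    hist = [0.0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            code, variance = lbp_code_and_variance(img, y, x)
            hist[code] += variance
    return hist
```

A plain LBP histogram would add 1 per pixel; adding the variance instead means high-contrast regions contribute more to the descriptor than flat ones.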
ISBN: 9781538646588 (print)
In this paper, we focus on the problem of content-based retrieval for audio, which aims to retrieve all semantically similar audio recordings for a given audio clip query. This problem is similar to query by example for audio, which aims to retrieve media samples from a database that are similar to a user-provided example. We propose a novel approach that encodes the audio into a vector representation using Siamese Neural Networks. The goal is to obtain similar encodings for files belonging to the same audio class, thus allowing retrieval of semantically similar audio. Using simple similarity measures such as Euclidean distance and cosine similarity, we show that these representations can be used very effectively for retrieving recordings with similar audio content.
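The retrieval step over such encodings can be sketched as follows. The embedding vectors here are hypothetical hand-written stand-ins for the output of a trained Siamese encoder, which is not shown, and the function names are assumptions for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def euclidean_distance(u, v):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def retrieve(query, index, k=3):
    """Return the names of the k indexed clips most similar to the query."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical 2-D embeddings; real encodings would be higher-dimensional.
index = {
    "dog_bark_1": [0.9, 0.1],
    "siren_1": [0.1, 0.9],
    "dog_bark_2": [0.8, 0.2],
}
top = retrieve([1.0, 0.0], index, k=2)
```

Ranking by ascending `euclidean_distance` instead is a one-line change to the sort key; for length-normalized vectors the two orderings agree.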
ISBN: 0819439908 (print)
This paper describes the development of a prototype Web Image Search Engine (WISE), which allows users to search for images on the WWW by image example, in a similar fashion to current search engines that allow users to find related Web pages using text matching on keywords. The system takes an image specified by the user and finds similar images available on the WWW by comparing the image contents using low-level image features. The current version of the WISE system consists of a graphical user interface (GUI), an autonomous Web agent, an image comparison program and a query processing program. The user specifies the URL of a target image and the URL of the starting Web page from which the program will "crawl" the Web, finding images along the way and retrieving those that satisfy certain constraints. The program then computes the visual features of the retrieved images and performs a content-based comparison with the target image. The results of the comparison are sorted according to a certain similarity measure and, along with thumbnails and information associated with the images, such as the URLs and image sizes, are written to an HTML page. The resulting page is stored on a Web server and is displayed in the user's Web browser once the search process is complete. A unique feature of the current version of WISE is its image content comparison algorithm. It is based on the comparison of image palettes [6] and is therefore very efficient at retrieving one of the two universally accepted image formats on the Web, "gif". In gif images the colour palette is contained in the header, so it is only necessary to retrieve the header information rather than the whole image.
In this paper, we study the effect of taking the user into account in a query-by-example handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in th...
ISBN: 9781450380676 (print)
We propose using learning-to-rank for entity set expansion (ESE) from unstructured data, the task of finding "sibling" entities within a corpus that are from the set characterized by a small set of seed entities. We present a two-channel neural re-ranking model, NESE, that jointly learns exact and semantic matching of entity contexts through entity interaction features. Although entity set expansion has drawn increasing attention in the IR and NLP communities for its various applications, the lack of massive annotated entity sets has hindered the development of neural approaches. We describe DBPEDIA-SETS, a toolkit that automatically extracts entity sets from a plain text collection, thus providing a large amount of distant supervision data for neural model training. Experiments on real datasets of different scales from different domains show that NESE outperforms state-of-the-art approaches in terms of precision and MAP. Furthermore, evaluation through human annotations shows that the knowledge learned from the training data is generalizable.
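The two reported measures, precision and MAP, can be sketched under their standard definitions for a ranked expansion list. The entity names, the cutoff, and the input format are illustrative assumptions; the abstract does not give the exact evaluation protocol.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked entities that are in the gold set."""
    return sum(1 for entity in ranked[:k] if entity in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a gold entity occurs."""
    hits = 0
    total = 0.0
    for rank, entity in enumerate(ranked, start=1):
        if entity in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Average of average_precision over (ranked, relevant) query pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical expansion of a seed set of European capitals.
ranked = ["Paris", "Banana", "Rome", "Berlin"]
gold = {"Paris", "Rome", "Berlin"}
ap = average_precision(ranked, gold)
```

In this example the gold entities appear at ranks 1, 3 and 4, so the average precision is (1/1 + 2/3 + 3/4) / 3.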