ISBN: 9781450347532 (print)
Music transcription refers to the process of analyzing a piece of music to generate a sequence of its constituent notes and their durations. Transcription of music from audio signals is fraught with problems due to auditory interference such as ambient noise, multiple instruments playing simultaneously, accompanying vocals or polyphonic sounds. For several instruments, additional information for music transcription can be derived from a video sequence of the instrument as it is being played. This paper proposes a method that uses this visual information, for the case of keyboard-like instruments, to generate a transcript automatically by analyzing the video frames. We present encouraging results under varying lighting conditions on different song sequences played on a keyboard.
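As a rough illustration of the last step of such a pipeline, the sketch below turns per-frame key states into note events with onsets and durations. It assumes that an earlier, unspecified vision stage has already produced a boolean frames-by-keys matrix; the function name `key_states_to_notes`, the matrix layout and the `fps` parameter are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

import numpy as np


def key_states_to_notes(pressed: np.ndarray, fps: float) -> List[Tuple[int, float, float]]:
    """Convert a (frames x keys) boolean matrix into (key, onset_s, duration_s) events.

    `pressed[f, k]` is assumed to be True when key k looks pressed in frame f.
    """
    events = []
    n_frames, n_keys = pressed.shape
    for key in range(n_keys):
        # Pad with False on both ends so every press has a rising and a falling edge.
        column = np.concatenate(([False], pressed[:, key].astype(bool), [False]))
        edges = np.diff(column.astype(int))
        onsets = np.flatnonzero(edges == 1)    # first pressed frame of each run
        offsets = np.flatnonzero(edges == -1)  # first released frame after each run
        for start, end in zip(onsets, offsets):
            events.append((key, float(start / fps), float((end - start) / fps)))
    # Order events by onset time, then by key index.
    events.sort(key=lambda e: (e[1], e[0]))
    return events
```

Each event is reported in seconds, so the same routine works for any frame rate as long as `fps` matches the video.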
ISBN: 9781450347532 (print)
We present a novel algorithm to remove near-regular, fence- or wire-like foreground patterns from an image. The fence detection and fence removal algorithms developed so far perform poorly at detecting the fence. We use signal demixing to exploit the sparsity and regularity of fences in order to detect them. Results demonstrate the effectiveness of our technique compared to other state-of-the-art techniques.
ISBN: 9781450347532 (print)
Image hallucination has many applications in areas such as image processing, computational photography and image fusion. In this paper, we present an image hallucination technique based on template (patch) matching from a database of time-lapse images and a learned locally affine model. Template-based techniques suffer from blocky artifacts, so we propose two approaches for imposing consistency criteria across neighbouring patches in the form of regularization. We validate our color transfer technique by hallucinating a variety of natural images at different times of the day. We compare the proposed approach with other state-of-the-art techniques for example-image-based color transfer and show that the images obtained using our approach look more plausible and natural.
ISBN: 9781450347532 (print)
Bishnupur is an attractive tourist destination in West Bengal, India, known for its terracotta temples. It is one of the prospective candidates for inclusion in the list of UNESCO World Heritage sites. We intend to preserve this heritage site digitally and also to offer virtual interaction for tourists and researchers. In this paper, we present an image dataset of different temples in Bishnupur (namely Jor Bangla, Kalachand, Madan Mohan, Radha Madhav, Rasmancha, Shyamrai and Nandalal) for evaluating different types of computer vision and image processing algorithms, such as 3D reconstruction, image inpainting, texture classification and content-specific image retrieval. The dataset is captured using four different cameras with different parameter settings. Some subsets are extracted and earmarked for particular applications, such as texture classification, image inpainting and content-specific image retrieval. Example results of baseline methods are also shown for these applications to evaluate the usefulness of the dataset. To the best of our knowledge, this is the first attempt at a combined dataset for evaluating various types of problems for a heritage site in India.
ISBN: 9781450347532 (print)
Skin colour detection under poor or varying illumination conditions is a major challenge for various image processing and human-computer interaction applications. In this paper, a novel skin detection method utilizing the image pixel distribution in a given colour space is proposed. The pixel distribution of an image can provide better localization of the actual skin colour distribution of that image. Hence, a local skin distribution model (LSDM) is derived using the image pixel distribution model and its similarity with the global skin distribution model (GSDM). A fusion-based skin model is then obtained using both the GSDM and the LSDM. Subsequently, a dynamic region growing method is employed to improve the overall detection rate. Experimental results show that the proposed skin detection method can significantly improve detection accuracy in the presence of varying illumination conditions.
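To make the region growing step more concrete, here is a minimal sketch of seeded region growing on a skin probability map. It is not the paper's dynamic method: the two fixed thresholds, the 4-connected neighbourhood and the function name `grow_skin_region` are all assumptions chosen for illustration, and the input is assumed to be a per-pixel skin probability already produced by some fused model.

```python
from collections import deque

import numpy as np


def grow_skin_region(prob, seed_thresh=0.8, grow_thresh=0.5):
    """Grow a skin mask from confident seeds into weaker, connected pixels.

    Assumption: `prob` is a 2-D array of skin probabilities in [0, 1].
    Pixels >= seed_thresh start the region; 4-connected neighbours are
    added while their probability is >= grow_thresh.
    """
    prob = np.asarray(prob, dtype=float)
    rows, cols = prob.shape
    mask = prob >= seed_thresh
    queue = deque(zip(*np.nonzero(mask)))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr, nc] and prob[nr, nc] >= grow_thresh:
                mask[nr, nc] = True
                queue.append((nr, nc))
    return mask
```

Weak pixels that are not connected to any seed stay out of the mask, which is the usual reason to prefer region growing over a single global threshold.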
ISBN: 9789811021077 (digital)
ISBN: 9789811021060 (print)
This edited volume contains technical contributions in the field of computer vision and image processing presented at the First International Conference on Computer Vision and Image Processing (CVIP 2016). The contributions are thematically divided based on their relation to operations at the lower, middle and higher levels of vision systems, and their applications. The technical contributions in the areas of sensors, acquisition, visualization and enhancement are classified as related to low-level operations. They discuss various modern topics, including reconfigurable image system architecture, Scheimpflug camera calibration, real-time autofocusing, climate visualization, tone mapping, super-resolution and image resizing. The technical contributions in the areas of segmentation and retrieval are classified as related to mid-level operations. They discuss some state-of-the-art techniques, including non-rigid image registration, iterative image partitioning, egocentric object detection and video shot boundary detection. The technical contributions in the areas of classification and retrieval are categorized as related to high-level operations. They discuss some state-of-the-art approaches, including extreme learning machines, and target, gesture and action recognition. A non-regularized state-preserving extreme learning machine is presented for natural scene classification. An algorithm for human action recognition through dynamic frame warping based on depth cues is given. Target recognition in night vision through a convolutional neural network is also presented. The use of a convolutional neural network in detecting static hand gestures is also discussed. Finally, the technical contributions in the areas of surveillance, coding and data security, and biometrics and document processing are considered as applications of computer vision and image processing. They discuss some contemporary applications. A few of them are a system for tackling blind curves, a quick reaction target acquisition and tracking sys
ISBN: 9789811021046 (digital)
ISBN: 9789811021039; 9789811021046 (print)
This edited volume contains technical contributions in the field of computer vision and image processing presented at the First International Conference on Computer Vision and Image Processing (CVIP 2016). The contributions are thematically divided based on their relation to operations at the lower, middle and higher levels of vision systems, and their applications. The technical contributions in the areas of sensors, acquisition, visualization and enhancement are classified as related to low-level operations. They discuss various modern topics, including reconfigurable image system architecture, Scheimpflug camera calibration, real-time autofocusing, climate visualization, tone mapping, super-resolution and image resizing. The technical contributions in the areas of segmentation and retrieval are classified as related to mid-level operations. They discuss some state-of-the-art techniques, including non-rigid image registration, iterative image partitioning, egocentric object detection and video shot boundary detection. The technical contributions in the areas of classification and retrieval are categorized as related to high-level operations. They discuss some state-of-the-art approaches, including extreme learning machines, and target, gesture and action recognition. A non-regularized state-preserving extreme learning machine is presented for natural scene classification. An algorithm for human action recognition through dynamic frame warping based on depth cues is given. Target recognition in night vision through a convolutional neural network is also presented. The use of a convolutional neural network in detecting static hand gestures is also discussed. Finally, the technical contributions in the areas of surveillance, coding and data security, and biometrics and document processing are considered as applications of computer vision and image processing. They discuss some contemporary applications. A few of them are a system for tackling blind curves, a quick reaction target acquisition and tracking sys
ISBN: 9781450347532 (print)
Rotation invariance has been studied in the computer vision community primarily in the context of small in-plane rotations. This is usually achieved by building invariant image features. However, the problem of achieving invariance for large rotation angles remains largely unexplored. In this work, we tackle this problem by directly compensating for large rotations, as opposed to building invariant features. This is inspired by the neuroscientific concept of mental rotation, which humans use to compare pairs of rotated objects. Our contributions are threefold. First, we train a Convolutional Neural Network (CNN) to detect image rotations. We find that generic CNN architectures are not suitable for this purpose, so we introduce a convolutional template layer, which learns representations for canonical 'unrotated' images. Second, we use Bayesian Optimization to quickly sift through a large number of candidate images to find the canonical 'unrotated' image. Third, we use this method to achieve robustness to large angles in an image retrieval scenario. Our method is task-agnostic and can be used as a pre-processing step in any computer vision system.
ISBN: 9781450347532 (print)
Understanding crowd dynamics is an interesting problem in computer vision owing to its various applications. We propose a dynamical system to model the collective motion of a crowd. The model learns the spatio-temporal interaction pattern of the crowd from track data captured over a time period. The model is trained under a least-squares formulation with spatial and temporal constraints. The spatial constraint allows the model to consider only the neighbors of a particular agent, and the temporal constraint enforces temporal smoothness in the model. We also propose an effective group detection algorithm that utilizes the eigenvectors of the interaction matrix of the model. The group detection is cast as a spectral clustering problem. Extensive experimentation demonstrates that our group detection algorithm outperforms state-of-the-art methods.
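The idea of reading groups off the eigenvectors of an interaction matrix can be sketched with a textbook two-way spectral partition. This is only an illustration under stated assumptions, not the paper's algorithm: the interaction matrix is symmetrized by taking absolute values and averaging with its transpose, the unnormalized graph Laplacian is used, the interaction graph is assumed to be connected, and the function name `detect_two_groups` is hypothetical.

```python
import numpy as np


def detect_two_groups(interaction):
    """Split agents into two groups using the Fiedler vector.

    Assumption: `interaction[i, j]` measures how strongly agent j
    influences agent i; magnitudes are treated as affinities.
    """
    W = np.abs(np.asarray(interaction, dtype=float))
    W = 0.5 * (W + W.T)          # symmetrize so the Laplacian is symmetric
    np.fill_diagonal(W, 0.0)     # ignore self-interaction
    laplacian = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]      # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int)
```

The sign of an eigenvector is arbitrary, so the labels 0 and 1 may swap between runs or platforms; only the partition itself is meaningful. Detecting more than two groups would typically use several eigenvectors followed by a clustering step such as k-means.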
ISBN: 9781450347532 (print)
Compressed sensing magnetic resonance imaging (CSMRI) has demonstrated that it is possible to accelerate MRI scan time by reducing the number of measurements in k-space without significant loss of anatomical detail. The number of k-space measurements required is roughly proportional to the sparsity of the MR signal under consideration. Recently, a few works on CSMRI have revealed that the sparsity of the MR signal can be enhanced by suitable weighting of different regularization priors. In this paper, we propose an efficient adaptive weighted reconstruction algorithm for enhancing the sparsity of the MR image. Experimental results show that the proposed algorithm gives better reconstructions with fewer measurements, without a significant increase in computational time compared to existing algorithms of this kind.
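A weighted sparsity prior can be illustrated with a small proximal-gradient (ISTA) loop on a 1-D toy problem. This sketch is not the paper's adaptive algorithm: it assumes the signal is sparse in its own domain, uses a fixed weight vector rather than adaptive weights, models k-space sampling as a 0/1 mask applied to an orthonormal FFT, and all function names are hypothetical.

```python
import numpy as np


def weighted_soft_threshold(x, thresholds):
    """Shrink each entry of x toward zero by its own threshold (complex-safe)."""
    magnitude = np.abs(x)
    scale = np.maximum(1.0 - thresholds / np.maximum(magnitude, 1e-12), 0.0)
    return x * scale


def objective(x, y, mask, lam, weights):
    """0.5 * ||mask * F x - y||^2 + lam * sum_i weights_i * |x_i|."""
    residual = mask * np.fft.fft(x, norm="ortho") - y
    return 0.5 * np.sum(np.abs(residual) ** 2) + lam * np.sum(weights * np.abs(x))


def weighted_ista(y, mask, lam, weights, n_iter=50):
    """Reconstruct x from undersampled k-space samples y = mask * F x.

    The masked orthonormal FFT has operator norm at most 1, so a unit
    step size is a valid proximal-gradient step.
    """
    x = np.zeros_like(y, dtype=complex)
    for _ in range(n_iter):
        residual = mask * np.fft.fft(x, norm="ortho") - y
        grad = np.fft.ifft(mask * residual, norm="ortho")  # adjoint of mask * F
        x = weighted_soft_threshold(x - grad, lam * weights)
    return x
```

Larger entries in `weights` penalize the corresponding coefficients more heavily, which is how a weighting scheme can push the solution toward a sparser estimate; an adaptive method would update `weights` between iterations.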