ISBN: 9781479915880 (print)
Book flipping scanning refers to the process of recording a book while the user flips through its pages. In recent years it has gained much attention because it significantly reduces the workload of book digitization. It is a challenging task because flipping at random speed and direction makes it difficult to identify the distinct open page images (OPI) that represent each page of the book. In this paper, we propose a fast technique for removing duplicate open pages introduced in the video stream by erroneous flipping. We present an algorithm that exploits cues from the edge information of flipping pages. The nature of the cues extracted from the region of interest (ROI) of the frame determines whether a page is in a flipping or an open state, whereas the temporal position of a flipping page determines the direction of the flipping. Combining this information, we decide whether an open page image is a duplicate or not. Experiments on video documents recorded with a standard-resolution camera validate the duplicate open page removal algorithm, which achieves 95% accuracy.
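The duplicate-removal decision described above can be illustrated with a minimal sketch. It assumes that each frame has already been labelled as "open", "fwd" (flipping forward) or "bwd" (flipping backward); how those labels are derived from the ROI edge cues is not shown, and the label names and the function `unique_open_pages` are hypothetical, not taken from the paper.

```python
def unique_open_pages(labels):
    """Return the frame index of the first open-page frame for each distinct page.

    labels: per-frame states, each "open", "fwd" or "bwd" (hypothetical names).
    A run of consecutive identical labels is treated as one event.
    """
    page = 0          # current page position, relative to the first open page
    seen = set()      # page positions for which an open page image was kept
    kept = []         # frame indices of the retained open page images
    prev = None
    for i, label in enumerate(labels):
        if label != prev:                 # react only when the state changes
            if label == "fwd":
                page += 1                 # one forward flip advances one page
            elif label == "bwd":
                page -= 1                 # one backward flip goes back one page
            elif label == "open" and page not in seen:
                seen.add(page)            # first time this page is seen open
                kept.append(i)
        prev = label
    return kept


labels = ["open", "open", "fwd", "fwd", "open", "bwd",
          "open", "fwd", "open", "fwd", "open"]
print(unique_open_pages(labels))
```

In this example the open states reached after the backward flip and after the repeated forward flip revisit pages that were already captured, so only three frames are kept.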
This paper addresses the problem of reconstructing specular surfaces using a combination of dynamic programming and a Markov Random Field (MRF) formulation. Unlike traditional methods that require the exact position of environment points to be known, our method requires only the relative position of the environment points in order to compute approximate normals and infer shape from them. We present an approach that estimates depth from a dynamic programming routine and from MRF stereo matching, and uses MRF optimization to fuse the results into a robust estimate of shape. We use a smooth color gradient image as the environment texture so that shape can be recovered from a single shot. We evaluate our method with synthetic experiments on 3D models such as the Stanford bunny and show real experimental results on a golden statue and a silver-coated statue.
High-quality depth map estimation is required for better visualization of 3D views, as depth map quality has a great impact on overall 3D image quality. If depth is estimated in the conventional way from two or more images, defects appear, mostly in regions without texture. We used the Microsoft Kinect RGBD dataset to obtain input color images and depth maps, which also contain some noise. We propose a method to remove this noise and obtain quality depth images. First, the color and depth images are aligned to each other using intensity-based image registration. This method of image alignment is mostly used in the medical field, but we apply it to correct Kinect depth maps, which avoids the cumbersome task of feature-based point correspondence between images. Intensity-based image alignment requires no preprocessing or segmentation steps. Second, we propose an algorithm to fill the unwanted gaps in Kinect depth maps and upsample them using the corresponding high-resolution color image. Finally, we apply 9x9 median filtering to the results and obtain improved, high-quality depth maps.
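The last two steps can be sketched in a few lines of plain Python. The abstract does not describe the gap-filling algorithm, so `fill_holes` below is only a stand-in that assumes invalid pixels are stored as 0 and replaces them with the median of their valid neighbours; `median_filter` is a straightforward median filter whose window is clipped at the image border. Both function names and the `invalid` convention are assumptions made for illustration.

```python
from statistics import median_low


def _window(depth, r, c, half):
    """Collect the values of the window centred on (r, c), clipped at the borders."""
    rows, cols = len(depth), len(depth[0])
    return [
        depth[rr][cc]
        for rr in range(max(0, r - half), min(rows, r + half + 1))
        for cc in range(max(0, c - half), min(cols, c + half + 1))
    ]


def fill_holes(depth, size=3, invalid=0):
    """Stand-in gap filling: replace invalid pixels by the median of valid neighbours."""
    out = [row[:] for row in depth]
    for r in range(len(depth)):
        for c in range(len(depth[0])):
            if depth[r][c] != invalid:
                continue
            valid = [v for v in _window(depth, r, c, size // 2) if v != invalid]
            if valid:                      # leave the hole if no neighbour is valid
                out[r][c] = median_low(valid)
    return out


def median_filter(depth, size=9):
    """Median filter over a size x size window (9x9 by default, as in the abstract)."""
    return [
        [median_low(_window(depth, r, c, size // 2)) for c in range(len(depth[0]))]
        for r in range(len(depth))
    ]


noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 0     # a gap in the depth map
noisy[1][3] = 99    # an isolated noisy reading
smooth = median_filter(fill_holes(noisy), size=3)
```

`median_low` is used so that every output value is an actual depth reading from the window rather than an average of two readings.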
Compact representation of visual content has emerged as an important topic in the context of large-scale image/video retrieval. The recently proposed Vector of Locally Aggregated Descriptors (VLAD) has been shown to outperform other existing techniques for retrieval. In this paper, we propose two spatio-temporal features for constructing VLAD vectors for videos in the context of large-scale video retrieval. Given a particular query video, our aim is to retrieve similar videos from the database. Experiments are conducted on the UCF50 and HMDB51 datasets, which pose challenges in the form of camera motion, view-point variation, large intra-class variation, etc. The paper proposes the following two spatio-temporal features for constructing VLADs: i) Local Histogram of Oriented Optical Flow (LHOOF), and ii) Space-Time Invariant Points (STIP). The performance of these proposed features is compared with that of a SIFT-based spatial feature. The mean average precision (MAP) indicates better retrieval performance for the proposed spatio-temporal features than for the spatial feature.
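For readers unfamiliar with VLAD, the aggregation step can be sketched as follows. This is the generic VLAD construction (sum of residuals to the nearest codebook centre, followed by L2 normalisation), not the paper's specific pipeline; the local descriptors and the codebook `centers` are assumed to be given, for example from a feature extractor and k-means.

```python
import math


def vlad(descriptors, centers):
    """Aggregate local descriptors into one L2-normalised VLAD vector.

    descriptors: list of d-dimensional local descriptors (lists of floats).
    centers: list of k codebook centres of the same dimension d.
    Returns a flat list of length k * d.
    """
    k, d = len(centers), len(centers[0])
    acc = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Assign the descriptor to its nearest centre (squared Euclidean distance).
        j = min(range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])))
        # Accumulate the residual between the descriptor and that centre.
        for t in range(d):
            acc[j][t] += x[t] - centers[j][t]
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


centers = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [10.0, 13.0]]
video_vector = vlad(descriptors, centers)
```

Two videos can then be compared by the Euclidean distance or dot product of their VLAD vectors.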
The performance of an OCR system is badly affected by the presence of hand-drawn annotation lines in various forms, such as underlines, circular lines, and other text-surrounding curves. Such annotation lines are usually drawn freehand by a reader in order to summarize some text or to mark keywords within a document page. In this paper, we propose a generalized scheme for the detection and removal of these hand-drawn annotations from a scanned document page. An underline drawn by hand is roughly horizontal or has a tolerable undulation, whereas for a hand-drawn curved line, the slope usually changes at a gradual pace. Based on this observation, we detect the cover of an annotation object, be it straight or curved, as a sequence of straight edge segments. The novelty of the proposed method lies in its ability to compute the exact cover of the annotation object, even when it touches or passes through a text character. After obtaining the annotation cover, an effective inpainting method is used to quantify the regions where text reconstruction is needed. We have experimented with various documents written in English, and some results are presented here to show the efficiency and robustness of the proposed method.
Online handwriting recognition research has recently received a significant boost. For Indian scripts in particular, handwriting recognition received little attention until the recent past. However, thanks to generous Government funding through the group on Technology Development for Indian Languages (TDIL) of the Ministry of Communication & Information Technology (MC&IT), Govt. of India, research in this area has received due attention, and several groups are now engaged in research and development work on online handwriting recognition in different Indian scripts. A major bottleneck to the desired progress in this area is the difficulty of collecting large sample databases of online handwriting in various scripts. To address this, a user-friendly tool for the Android platform has recently been developed to collect data on handheld devices. This tool is called ISIgraphy and is available on Google Play as a free download. The application is designed to store large numbers of handwritten data samples under user-given file names for distinct users. Its use is script independent: it can collect and store handwriting samples written in any language, not necessarily an Indian script. It has an additional module for retrieval and display of stored data. Moreover, it can send the collected data directly to others via electronic mail.
Real-time anomaly detection is the need of the hour for security applications. In this paper, we propose a real-time anomaly detection algorithm that utilizes cues from the motion vectors in the H.264/AVC compressed domain. The work is principally motivated by the observation that motion vectors (MVs) exhibit different characteristics during an anomaly. We have observed that the H.264 motion vector magnitude contains relevant information that can be used to model the usual behavior (UB) effectively. This is subsequently extended to detect abnormality/anomaly based on the probability of occurrence of a behavior. Additionally, we suggest a hierarchical approach using a Motion Pyramid for high-resolution videos to further increase the detection rate. The proposed algorithm performs extremely well on the UMN and Peds anomaly detection video datasets, with detection speeds of more than 150 and 65-75 frames per second on the respective datasets, resulting in more than 200x speedup with accuracy comparable to state-of-the-art pixel-domain algorithms.
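The idea of modelling usual behavior from motion vector magnitudes and flagging low-probability behavior can be sketched as below. The abstract does not give the actual model, so this sketch assumes a simple smoothed histogram of magnitudes and a fixed probability threshold; the class name, bin settings and threshold are all hypothetical, and the motion vectors are assumed to have been extracted from the H.264 stream already.

```python
import math


class UsualBehaviorModel:
    """Histogram of motion vector magnitudes learned from normal footage (a sketch)."""

    def __init__(self, bin_width=1.0, num_bins=16):
        self.bin_width = bin_width
        self.num_bins = num_bins
        self.counts = [0] * num_bins
        self.total = 0

    def _bin(self, mv):
        # Magnitude of the (dx, dy) motion vector, clipped to the last bin.
        magnitude = math.hypot(mv[0], mv[1])
        return min(int(magnitude / self.bin_width), self.num_bins - 1)

    def fit(self, frames):
        """frames: iterable of frames, each a list of (dx, dy) motion vectors."""
        for frame in frames:
            for mv in frame:
                self.counts[self._bin(mv)] += 1
                self.total += 1

    def probability(self, mv):
        # Laplace-smoothed relative frequency of this magnitude bin.
        return (self.counts[self._bin(mv)] + 1) / (self.total + self.num_bins)

    def is_anomalous(self, frame, threshold=0.05):
        """Flag a frame whose motion vectors are, on average, improbable."""
        if not frame:
            return False
        mean_p = sum(self.probability(mv) for mv in frame) / len(frame)
        return mean_p < threshold


model = UsualBehaviorModel()
model.fit([[(1, 0), (0, 1), (1, 1)] for _ in range(20)])   # slow, usual motion
print(model.is_anomalous([(1, 0), (0, 1)]))    # usual motion
print(model.is_anomalous([(8, 6), (6, 8)]))    # sudden fast motion
```

Because only per-vector lookups are involved, a model of this kind stays cheap per frame, which is consistent with the compressed-domain motivation of the abstract.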
In this paper, a shape recognition method is proposed for a few common geometrical shapes including the straight line, circle, ellipse, triangle, quadrilateral, pentagon and hexagon. In the present work, two indices, namely Unique Shape Signature (USS) and Condensibility (C), are employed for shape recognition of an object. Using the USS index, all the above-mentioned non-circular shapes are neatly recognized, whereas the C index recognizes the circular objects. An added advantage of the proposed method is that it can further differentiate triangles, quadrilaterals and both symmetric and non-symmetric pentagons and hexagons using the distance variance (Var(d(si))) parameter calculated from the USS. Applying the proposed method to the above-mentioned shapes, an overall recognition rate of 98.80% is achieved on several simulated and real objects of different shapes. The proposed method has also been compared with two existing methods and gives better results. Its performance is illustrated by applying it to underwater images, where it is observed to perform satisfactorily on all the images under test.
Face recognition under varying background and pose is challenging, and extracting background- and pose-invariant features is an effective approach to this problem. This paper proposes a skin detection-based approach for enhancing the performance of a Face Recognition (FR) system, employing a unique combination of skin-based background removal, Discrete Wavelet Transform (DWT), Adaptive Multi-Level Threshold Binary Particle Swarm Optimization (ABPSO) and an Error Control Feedback (ECF) loop. Skin-based background removal removes the background efficiently, and the ABPSO-based feature selection algorithm searches the feature space for the optimal feature subset. The ECF loop is used to neutralize pose variations. Experimental results, obtained by applying the proposed algorithm to the Color FERET and CMUPIE face databases, show that the proposed system outperforms other FR systems. A significant increase in the recognition rate and a substantial reduction in the number of features are observed.
We present a simple and powerful scheme for CSG of implicit surfaces on the GPU. We decompose the boolean expression of surfaces into sum-of-products form. Our algorithm then renders each product term; the sum of products is obtained automatically by enabling the depth test. Our approximate CSG uses the adaptive marching points algorithm to find ray-surface intersections. Once root isolation finds an interval in which a root exists, this is taken as evidence of an intersection. We perform root refinement only for the uncomplemented terms in the product. Exact CSG uses the discriminant of the ray-surface intersection to test for the presence of a root. We can then evaluate the product expression by checking that all uncomplemented terms are true and all complemented terms are false. If this condition is met, we take the maximum of the roots among the uncomplemented terms as the solution. Our algorithm is linear in the number of terms, O(n). We achieve real-time rates for 4-5 terms in the product for approximate CSG, and more than real-time rates for exact CSG. Because our primitives are implicit surfaces, we can achieve fairly complex results with fewer terms.
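The product-term evaluation can be illustrated on the CPU with spheres as the implicit primitives. This is a sketch under several assumptions rather than the paper's GPU implementation: the abstract does not spell out how a term is judged "true", so the code assumes a term holds when the candidate hit point lies inside the primitive; the function names are hypothetical; and the ray origin is assumed to lie outside all primitives.

```python
import math


def sphere_entry_root(origin, direction, center, radius):
    """Nearest root of |o + t*d - c|^2 = r^2, or None if the discriminant is negative."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None                       # the ray misses this sphere
    return (-b - math.sqrt(disc)) / (2.0 * a)


def inside(point, center, radius, eps=1e-9):
    """True if the point is inside or on the sphere (small tolerance for the boundary)."""
    return sum((p - c) ** 2 for p, c in zip(point, center)) <= radius * radius + eps


def eval_product(origin, direction, uncomplemented, complemented):
    """Evaluate one product term; spheres are given as (center, radius) pairs."""
    roots = []
    for center, radius in uncomplemented:
        t = sphere_entry_root(origin, direction, center, radius)
        if t is None:
            return None                   # an uncomplemented term has no root
        roots.append(t)
    if not roots:
        return None
    t_hit = max(roots)                    # maximum root among uncomplemented terms
    point = [o + t_hit * d for o, d in zip(origin, direction)]
    if not all(inside(point, c, r) for c, r in uncomplemented):
        return None
    if any(inside(point, c, r) for c, r in complemented):
        return None                       # a complemented term is true here
    return t_hit


def eval_sum_of_products(origin, direction, products):
    """Nearest hit over all product terms, mimicking what the depth test does."""
    hits = [eval_product(origin, direction, pos, neg) for pos, neg in products]
    hits = [t for t in hits if t is not None]
    return min(hits) if hits else None


A = ((0.0, 0.0, 0.0), 2.0)
B = ((0.0, 0.0, 1.0), 2.0)
ray_o, ray_d = (0.0, 0.0, -10.0), (0.0, 0.0, 1.0)
print(eval_product(ray_o, ray_d, [A, B], []))   # intersection of A and B
print(eval_product(ray_o, ray_d, [A], [B]))     # A minus B
```

One known limitation of this simplified version: when the entry point of an uncomplemented primitive lies inside a complemented one, the visible surface is actually the exit of the complemented primitive, which this sketch does not search for.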