In this paper, we present two large multi-modal video datasets for RGB and RGB-D gesture recognition: the ChaLearn LAP RGB-D Isolated Gesture Dataset (IsoGD) and the Continuous Gesture Dataset (ConGD). Both datasets are derived from the ChaLearn Gesture Dataset (CGD), which contains more than 50000 gestures in total and was built for the "one-shot-learning" competition. To increase the potential of the old dataset, we designed new, well-curated datasets comprising 249 gesture labels and 47933 gestures, with the begin and end frames of each gesture manually labeled in the sequences. Using these datasets, we will open two competitions on the CodaLab platform so that researchers can test and compare their methods for "user independent" gesture recognition. The first challenge is designed for gesture spotting and recognition in continuous sequences of gestures, while the second is designed for gesture classification from segmented data. A baseline method based on the bag of visual words model is also presented.
With the growing cosmopolitan culture of modern cities, the need for robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been greater. With the goal to systematically benchmark and pu...
Offline signature verification is one of the most challenging tasks in biometrics and document forensics. Unlike other verification problems, it needs to model minute but critical details between genuine and forged s...
In this work, we propose a more realistic and efficient face-based mobile authentication technique using CNNs. This paper explores an unavoidable problem in using face images for mobile authentication: the images are taken from varying distances with the front (selfie) camera of the mobile phone. Once an individual comes within a certain distance of the camera, the face appears over-sized in the image. At the same time, the sharp features of some portions of the face, such as the forehead, cheeks, and chin, change completely. As a result, the face features change, and the impact increases exponentially as the individual crosses that distance and keeps approaching the front camera. To mitigate this gap, this work proposes a solution in which face images are cropped and aligned around their close bounding box, which yields better accuracy and facial features. The work investigated different state-of-the-art face detection and recognition techniques to justify the proposed solution. Among all the methods evaluated, CNNs worked best. For a quantitative comparison of the proposed method, manually cropped face images and annotations of the face images along with their close boundary were prepared. In turn, we have developed a database covering the above-mentioned scenario for 40 individuals, which will be publicly available for academic research purposes. The experimental results indicate a successful implementation of the proposed method, and the performance of the proposed technique is found to be superior to the existing state of the art.
In this paper, we address the problem of multimodal registration of coronary vessels by developing a 3D parametric model of vessel trees from computed tomography data and registering it to angiography images during intervention. The interventionist thus benefits from 3D data that would otherwise be available only before the intervention. This facilitates orientation in ambiguous radiographs, interactive visualization of all vessel structures to estimate their mutual position, and navigation within the vessel system, and it ultimately reduces the radiation to which the patient and the physicians are exposed. The model is built by exploring the branching vessel tree from a single starting position and successively expanding through the vessels, guided by a local deformable surface. The result is a tree of cylindrical segments, each adapted to the vessel walls, that is registered to angiography images in a fast and robust way. Validation on 8 patients confirms the robustness of our method.
ISBN: (print) 9781479901937
This paper presents the results of the Handwriting Segmentation Contest that was organized in the context of ICDAR2013. The general objective of the contest was to use well-established evaluation practices and procedures to record recent advances in off-line handwriting segmentation. Two benchmarking datasets, one for text line segmentation and one for word segmentation, were created in order to test and compare all submitted algorithms, as well as some state-of-the-art methods for handwritten document image segmentation, in realistic circumstances. Handwritten document images were produced by many writers in two Latin-based languages (English and Greek) and in one Indian language (Bangla, the second most popular language in India). These images were manually annotated to produce the ground truth, which corresponds to the correct text line and word segmentation results. The datasets of previously organized contests (the ICDAR2007, ICDAR2009 and ICFHR2010 Handwriting Segmentation Contests), along with a dataset of Bangla document images, were used as the training dataset. Eleven methods were submitted to this competition. A brief description of the submitted algorithms, the evaluation criteria, and the segmentation results obtained from the submitted methods are also provided in this manuscript.
Recent works have shown that the computational efficiency of 3D medical image (e.g. CT and MRI) segmentation can be impressively improved by dynamic inference based on slice-wise complexity. As a pioneering work, a dy...
An algorithm is presented to answer window queries in a quadtree-based spatial database environment by retrieving all of the quadtree blocks in the underlying spatial database that cover the quadtree blocks that comprise the window. It works by decomposing the window operation into sub-operations over smaller window partitions. These partitions are the quadtree blocks corresponding to the window. Although a block b in the underlying spatial database may cover several of the smaller window partitions, b is only retrieved once rather than multiple times. This is achieved by using an auxiliary main memory data structure called the active border which requires O(n) additional storage for a window query of size n × n. As a result, the algorithm generates an optimal number of disk I/O requests to answer a window query (i.e., one request per covering quadtree block). A proof of correctness and an analysis of the algorithm's execution time and space requirements are given, as are some experimental results.
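The decomposition described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes an integer grid whose side is a power of two, represents the database as an in-memory set of leaf blocks `(x, y, size)`, and replaces the active border with a plain `set` for deduplication, so it does not reproduce the O(n) storage bound. The names `window_blocks`, `covering_blocks`, and `window_query` are hypothetical.

```python
def window_blocks(x0, y0, x1, y1, bx=0, by=0, size=8):
    """Yield the maximal quadtree blocks (x, y, size) that exactly tile the
    half-open window [x0, x1) x [y0, y1) inside a size x size space."""
    if bx >= x1 or by >= y1 or bx + size <= x0 or by + size <= y0:
        return  # block lies entirely outside the window
    if x0 <= bx and y0 <= by and bx + size <= x1 and by + size <= y1:
        yield (bx, by, size)  # block lies entirely inside the window
        return
    half = size // 2  # partial overlap: split into four quadrants
    for dy in (0, half):
        for dx in (0, half):
            yield from window_blocks(x0, y0, x1, y1, bx + dx, by + dy, half)


def covering_blocks(db, block, world):
    """Return the database leaves that cover `block` (assumed layout:
    `db` is a set of leaf blocks partitioning the world x world space)."""
    x, y, size = block
    s = size
    while s <= world:  # look for a leaf of the same size or larger
        candidate = (x - x % s, y - y % s, s)
        if candidate in db:
            return [candidate]
        s *= 2
    if size == 1:
        return []  # nothing stored here (db is not a full partition)
    half = size // 2  # the database subdivides this block further
    found = []
    for dy in (0, half):
        for dx in (0, half):
            found.extend(covering_blocks(db, (x + dx, y + dy, half), world))
    return found


def window_query(db, window, world):
    """Return each covering database block once, in first-seen order."""
    seen = set()  # stand-in for the active border (an assumption)
    requests = []
    for wb in window_blocks(*window, 0, 0, world):
        for b in covering_blocks(db, wb, world):
            if b not in seen:
                seen.add(b)
                requests.append(b)
    return requests


# Hypothetical 8 x 8 database: three 4 x 4 leaves and one subdivided quadrant.
db = {(0, 0, 4), (4, 0, 4), (0, 4, 4),
      (4, 4, 2), (6, 4, 2), (4, 6, 2), (6, 6, 2)}
print(window_query(db, (1, 1, 5, 5), 8))
```

In this example the 4 × 4 window decomposes into 13 window blocks, but only four database blocks are requested, because each covering block is fetched once no matter how many window partitions it covers.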
The recognition of unconstrained handwriting images is usually based on vectorial representation and statistical classification. Despite their high representational power, graphs are rarely used in this field due to a...
ISBN: (print) 9781479951192
Mutual occlusions among targets can cause track loss or target position deviation, because the observation likelihood of an occluded target may vanish even when we have the estimated location of the target. This paper presents a novel probability framework for multitarget tracking with mutual occlusions. The primary contribution of this work is the introduction of a vectorial occlusion variable as part of the solution. The occlusion variable describes the occlusion states of the targets. This forms the basis of the proposed probability framework, with the following further contributions: 1) Likelihood: A new observation likelihood model is presented, in which the likelihood of an occluded target is computed by referring to both the occluded and the occluding targets. 2) Prior: A Markov random field (MRF) is used to model the occlusion prior such that less likely "circular" or "cascading" types of occlusions have lower prior probabilities. Both the occlusion prior and the motion prior take the state of occlusion into consideration. 3) Optimization: A real-time RJMCMC-based algorithm with a new move type called "occlusion state update" is presented. Experimental results show that the proposed framework can handle occlusions well, even long-duration full occlusions, which may cause tracking failures in traditional methods.