Search results - Inner Mongolia University Library

IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Authors: Jun Wan; Stan Z. Li; Yibing Zhao; Shuai Zhou; Isabelle Guyon; Sergio Escalera. Affiliations: National Laboratory of Pattern Recognition, Chinese Academy of Sciences, China; Macau University of Science and Technology, Macau; UPSud and INRIA, Université Paris-Saclay; ChaLearn; University of Barcelona; Computer Vision Center; ChaLearn.

In this paper, we present two large video multi-modal datasets for RGB and RGB-D gesture recognition: the ChaLearn LAP RGB-D Isolated Gesture Dataset (IsoGD) and the Continuous Gesture Dataset (ConGD). Both datasets are derived from the ChaLearn Gesture Dataset (CGD), which has a total of more than 50000 gestures for the "one-shot-learning" competition. To increase the potential of the old dataset, we designed new well-curated datasets composed of 249 gesture labels and including 47933 gestures whose begin and end frames in the sequences were manually labeled. Using these datasets, we will open two competitions on the CodaLab platform so that researchers can test and compare their methods for "user independent" gesture recognition. The first challenge is designed for gesture spotting and recognition in continuous sequences of gestures, while the second one is designed for gesture classification from segmented data. A baseline method based on the bag of visual words model is also presented.

Keywords: Gesture recognition; Training; Indexes; Computer vision; Testing; Conferences

ICDAR2019 robust reading challenge on multi-lingual scene text detection and recognition - RRC-MLT-2019

15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019

Authors: Nayef, Nibal; Liu, Cheng-Lin; Ogier, Jean-Marc; Patel, Yash; Busta, Michal; Chowdhury, Pinaki Nath; Karatzas, Dimosthenis; Khlif, Wafa; Matas, Jiri; Pal, Umapada; Burie, Jean-Christophe. Affiliations: L3i Laboratory, University of La Rochelle, France; Computer Vision Center, Universitat Autonoma de Barcelona, Spain; CVPR Unit, Indian Statistical Institute, India; Robotics Institute, Carnegie Mellon University, Pittsburgh, United States; Center for Machine Perception, Department of Cybernetics, Czech Technical University, Prague, Czech Republic; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China.

ISBN: (print) 9781728128610

With the growing cosmopolitan culture of modern cities, the need for robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been more immense. With the goal to systematically benchmark and push the state-of-the-art forward, the proposed competition builds on top of the RRC-MLT-2017 with an additional end-to-end task, an additional language in the real images dataset, a large scale multi-lingual synthetic dataset to assist the training, and a baseline End-to-End recognition method. The real dataset consists of 20,000 images containing text from 10 languages. The challenge has 4 tasks covering various aspects of multi-lingual scene text: (a) text detection, (b) cropped word script classification, (c) joint text detection and script classification and (d) end-to-end detection and recognition. In total, the competition received 60 submissions from the research and industrial communities. This paper presents the dataset, the tasks and the findings of the presented RRC-MLT-2019 challenge. © 2019 IEEE.

Keywords: Competition

SigNet: Convolutional siamese network for writer independent offline signature verification

arXiv, 2017

Authors: Dey, Sounak; Dutta, Anjan; Ignacio Toledo, J.; Ghosh, Suman K.; Lladós, Josep; Pal, Umapada. Affiliations: Computer Vision Center, Computer Science Dept., Universitat Autònoma de Barcelona, Edifici O, Campus UAB, Bellaterra 08193, Spain; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata 700108, India.

Offline signature verification is one of the most challenging tasks in biometrics and document forensics. Unlike other verification problems, it needs to model minute but critical details between genuine and forged signatures, because a skilled falsification might only differ from a real signature by some specific kinds of deformation. This verification task is even harder in writer independent scenarios which is undeniably fiscal for realistic cases. In this paper, we model an offline writer independent signature verification task with a convolutional Siamese network. Siamese networks are twin networks with shared weights, which can be trained to learn a feature space where similar observations are placed in proximity. This is achieved by exposing the network to a pair of similar and dissimilar observations and minimizing the Euclidean distance between similar pairs while simultaneously maximizing it between dissimilar pairs. Experiments conducted on cross-domain datasets emphasize the capability of our network to handle forgery in different languages (scripts) and handwriting styles. Moreover, our designed Siamese network, named SigNet, provided better results than the state-of-the-art results on most of the benchmark signature datasets. Copyright © 2017, The Authors. All rights reserved.

Keywords: Convolution
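The training objective described in the SigNet abstract (small Euclidean distance for similar pairs, large distance for dissimilar pairs) is commonly written as a margin-based contrastive loss. The following is a minimal sketch of that general idea, not the authors' implementation: the function names, the default `margin`, and the squared-distance form are assumptions chosen for illustration, and the embeddings are plain Python lists standing in for network outputs.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, is_similar, margin=1.0):
    """Margin-based contrastive loss for one pair (illustrative form).

    Similar pairs are penalised by their squared distance, so training
    pulls them together. Dissimilar pairs are penalised only while they
    are closer than `margin` (an assumed hyperparameter), so training
    pushes them apart until the margin is reached.
    """
    d = euclidean_distance(emb_a, emb_b)
    if is_similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings of a genuine signature and a forgery.
genuine = [0.0, 0.0]
forgery = [0.0, 0.5]
loss = contrastive_loss(genuine, forgery, is_similar=False)
```

With these toy vectors the dissimilar pair sits inside the margin, so the loss is positive; once the pair is at least `margin` apart the loss drops to zero.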

More Realistic and Efficient Face-Based Mobile Authentication using CNNs

International Joint Conference on Neural Networks

Authors: Abhijit Das; Abira Sengupta; Muhammad Saqib; Umapada Pal; Michael Blumenstein. Affiliations: Center for Artificial Intelligence, School of Software, University of Technology Sydney, Australia; Department of Computer Science, Kalyani Government Engineering College, Kalyani, India; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India.

In this work, we propose a more realistic and efficient face-based mobile authentication technique using CNNs. This paper discusses and explores an inevitable problem of using face images for mobile authentication, taken from varying distances with a front/selfie camera of the mobile phone. Incidentally, once an individual comes towards a certain distance from the camera, the face images get large and appear over-sized. Simultaneously sharp features of some portions of the face, such as forehead, cheek, and chin are changed completely. As a result, the face features change and the impact increases exponentially once the individual crosses a certain distance and gradually approaches towards the front camera. This work proposes a solution (achieving better accuracy and facial features, whereby face images were cropped and aligned around its close bounding box) to mitigate the aforementioned identified gap. The work investigated different frontier face detection and recognition techniques to justify the proposed solution. Among all the employed methods evaluated, CNNs worked best. For a quantitative comparison of the proposed method, manually cropped face images/annotations of the face images along with their close boundary were prepared. In turn, we have developed a database considering the above-mentioned scenario for 40 individuals, which will be publicly available for academic research purposes. The experimental results achieved indicate a successful implementation of the proposed method and the performance of the proposed technique is also found to be superior in comparison to the existing state-of-the-art.

Keywords: Face; Cameras; Feature extraction; Authentication; Face recognition; Optical distortion; Face detection

Building and registering parameterized 3D models of vessel trees for visualization during intervention

International Conference on Pattern Recognition

Authors: G. Langs; P. Radeva; D. Rotger; F. Carreras. Affiliations: Institute for Computer Graphics and Vision, Graz University of Technology, Graz, Austria; Pattern Recognition and Image Processing Group, Vienna University of Technology, Vienna, Austria; Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra, Spain; Hospital St. Pau, Barcelona, Spain.

In this paper, we address the problem of multimodal registration of coronary vessels by developing a 3D parametrical model of vessel trees from computer tomography data and registering it to angiography images during intervention. Thus, the interventionist benefits from 3D data otherwise only available before the intervention. This facilitates orientation in ambiguous radiographs, interactive visualization of all vessel structures to estimate their mutual position and navigation within the vessel system, and ultimately reduces the radiation the patient and the physicians are exposed to. The model is built by exploring the branching vessel tree starting from a single position and successively expanding through the vessels guided by a local deformable surface. The result is a tree of cylindrical segments, each adapted to the vessel walls, that is registered to angiography images in a fast and robust way. Validation on 8 patients confirms the robustness of our method.

Keywords: X-ray imaging; Computed tomography; Robustness; Angiography; Image segmentation; Image reconstruction; Distortion measurement; Tree graphs; Data visualization; Radio navigation

ICDAR 2013 Handwriting Segmentation Contest

International Conference on Document Analysis and Recognition

Authors: Nikolaos Stamatopoulos; Basilis Gatos; Georgios Louloudis; Umapada Pal; Alireza Alaei. Affiliations: Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center for Scientific Research Demokritos, Athens, Greece; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India; Computer Science Laboratory, Universite Francois Rabelais, Tours, France.

ISBN: (print) 9781479901937

This paper presents the results of the Handwriting Segmentation Contest that was organized in the context of the ICDAR2013. The general objective of the contest was to use well established evaluation practices and procedures to record recent advances in off-line handwriting segmentation. Two benchmarking datasets, one for text line and one for word segmentation, were created in order to test and compare all submitted algorithms as well as some state-of-the-art methods for handwritten document image segmentation in realistic circumstances. Handwritten document images were produced by many writers in two Latin based languages (English and Greek) and in one Indian language (Bangla, the second most popular language in India). These images were manually annotated in order to produce the ground truth which corresponds to the correct text line and word segmentation results. The datasets of previously organized contests (ICDAR2007, ICDAR2009 and ICFHR2010 Handwriting Segmentation Contests) along with a dataset of Bangla document images were used as the training dataset. Eleven methods were submitted in this competition. A brief description of the submitted algorithms, the evaluation criteria and the segmentation results obtained from the submitted methods are also provided in this manuscript.

Keywords: Image segmentation; Educational institutions; Handwriting recognition; Benchmark testing; Measurement; Matched filters; Text analysis

Med-DANet V2: A Flexible Dynamic Architecture for Efficient Medical Volumetric Segmentation

arXiv, 2023

Authors: Shen, Haoran; Zhang, Yifu; Wang, Wenxuan; Chen, Chen; Liu, Jing; Song, Shanshan; Li, Jiangyun. Affiliations: School of Automation and Electrical Engineering, University of Science and Technology Beijing, China; Center for Research in Computer Vision, University of Central Florida, United States; National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China.

Recent works have shown that the computational efficiency of 3D medical image (e.g. CT and MRI) segmentation can be impressively improved by dynamic inference based on slice-wise complexity. As a pioneering work, a dynamic architecture network for medical volumetric segmentation (i.e. Med-DANet [44]) has achieved a favorable accuracy and efficiency trade-off by dynamically selecting a suitable 2D candidate model from the pre-defined model bank for different slices. However, the issues of incomplete data analysis, high training costs, and the two-stage pipeline in Med-DANet require further improvement. To this end, this paper further explores a unified formulation of the dynamic inference framework from the perspective of both the data itself and the model structure. For each slice of the input volume, our proposed method dynamically selects an important foreground region for segmentation based on the policy generated by our Decision Network and Crop Position Network. Besides, we propose to insert a stage-wise quantization selector to the employed segmentation model (e.g. U-Net) for dynamic architecture adapting. Extensive experiments on BraTS 2019 and 2020 show that our method achieves comparable or better performance than previous state-of-the-art methods with much less model complexity. Compared with previous methods Med-DANet and TransBTS with dynamic and static architecture respectively, our framework improves the model efficiency by up to nearly 4.1 and 17.3 times with comparable segmentation results on BraTS 2019. Copyright © 2023, The Authors. All rights reserved.

Keywords: Computational efficiency

Efficient Window Block Retrieval in Quadtree-Based Spatial Databases

GeoInformatica, 1997, Vol. 1, No. 1, pp. 59-91

Authors: Aref, Walid G.; Samet, Hanan. Affiliations: Computer Science Department, Center for Automation Research, University of Maryland, College Park, MD 20742, United States; University of Alexandria, Egypt; University of Maryland, College Park, United States; Matsushita Info. Technol. Laboratory, Princeton, United States; IBM Research, Almaden, CA, United States; University of Maryland Inst. for Advanced Computer Studies, College Park, MD, United States; ACM, IEEE, United States; Department of Computer Science, University of Maryland, United States; Computer Vision Laboratory, Stanford University, United States; ACM, IEEE, Intl. Assoc. of Pattern Recognition.

An algorithm is presented to answer window queries in a quadtree-based spatial database environment by retrieving all of the quadtree blocks in the underlying spatial database that cover the quadtree blocks that comprise the window. It works by decomposing the window operation into sub-operations over smaller window partitions. These partitions are the quadtree blocks corresponding to the window. Although a block b in the underlying spatial database may cover several of the smaller window partitions, b is only retrieved once rather than multiple times. This is achieved by using an auxiliary main memory data structure called the active border which requires O(n) additional storage for a window query of size n × n. As a result, the algorithm generates an optimal number of disk I/O requests to answer a window query (i.e., one request per covering quadtree block). A proof of correctness and an analysis of the algorithm's execution time and space requirements are given, as are some experimental results.

Keywords: Active border; Clipping; Data structures; Databases; Design of algorithms; Quadtree space decomposition; Range query; Spatial databases; Window block retrieval
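The abstract above describes splitting a window into the quadtree blocks that comprise it. A minimal sketch of that decomposition step follows, assuming integer coordinates, a half-open window, and a square space whose side is a power of two. `decompose_window` is a hypothetical name, and the active border structure and the disk retrieval described in the paper are not reproduced here.

```python
def decompose_window(x0, y0, x1, y1, size):
    """Split the half-open window [x0, x1) x [y0, y1) into maximal
    quadtree blocks of a size x size space (size is a power of two).

    Returns a list of (block_x, block_y, block_size) tuples.
    """
    blocks = []

    def visit(bx, by, bs):
        # Block lies entirely outside the window: nothing to report.
        if bx >= x1 or by >= y1 or bx + bs <= x0 or by + bs <= y0:
            return
        # Block lies entirely inside the window: report it whole.
        if x0 <= bx and y0 <= by and bx + bs <= x1 and by + bs <= y1:
            blocks.append((bx, by, bs))
            return
        # Partial overlap: recurse into the four quadrants.
        # With integer bounds a 1 x 1 block is never partial,
        # so the recursion stops before the block size reaches zero.
        half = bs // 2
        for dy in (0, half):
            for dx in (0, half):
                visit(bx + dx, by + dy, half)

    visit(0, 0, size)
    return blocks


# A 4 x 2 window aligned with the top half of a 4 x 4 space.
top_half = decompose_window(0, 0, 4, 2, 4)
```

An aligned window collapses into a few large blocks, while a misaligned one (for example `decompose_window(1, 1, 3, 3, 4)`) breaks down into unit cells, which is why the number of window partitions depends on alignment as well as on window size.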

A fast matching algorithm for graph-based handwriting recognition

9th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition, GbRPR 2013

Authors: Fischer, Andreas; Suen, Ching Y.; Frinken, Volkmar; Riesen, Kaspar; Bunke, Horst. Affiliations: Centre for Pattern Recognition and Machine Intelligence, Concordia University, 1455 de Maisonneuve Blvd West, Montreal, QC H3G 1M8, Canada; Computer Vision Center, Dept. of Computer Science, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain; Institute for Information Systems, University of Applied Sciences and Arts Northwestern Switzerland, Riggenbachstrasse 16, 4600 Olten, Switzerland; Institute of Computer Science and Applied Mathematics, University of Bern, Neubrückstrasse 10, 3012 Bern, Switzerland.

ISBN: (print) 9783642382208

The recognition of unconstrained handwriting images is usually based on vectorial representation and statistical classification. Despite their high representational power, graphs are rarely used in this field due to a lack of efficient graph-based recognition methods. Recently, graph similarity features have been proposed to bridge the gap between structural representation and statistical classification by means of vector space embedding. This approach has shown a high performance in terms of accuracy but had shortcomings in terms of computational speed. The time complexity of the Hungarian algorithm that is used to approximate the edit distance between two handwriting graphs is demanding for a real-world scenario. In this paper, we propose a faster graph matching algorithm which is derived from the Hausdorff distance. On the historical Parzival database it is demonstrated that the proposed method achieves a speedup factor of 12.9 without significant loss in recognition accuracy. © 2013 Springer-Verlag.

Keywords: Vector spaces
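The abstract above states that the proposed matching algorithm is derived from the Hausdorff distance. As a reference point, here is a minimal sketch of the classical Hausdorff distance between two finite point sets. It is not the paper's graph matching method: the function names are hypothetical, the inputs are plain coordinate tuples rather than handwriting graphs, and no node or edge costs are modelled.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Two toy point sets on a line.
set_a = [(0.0, 0.0), (1.0, 0.0)]
set_b = [(0.0, 0.0), (4.0, 0.0)]
distance = hausdorff(set_a, set_b)
```

Each point is compared only with its nearest neighbour in the other set, so this brute-force version needs a number of distance evaluations quadratic in the set sizes, whereas the Hungarian algorithm mentioned in the abstract has cubic time complexity.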

A Probabilistic Framework for Multitarget Tracking with Mutual Occlusions

IEEE Conference on Computer Vision and Pattern Recognition

Authors: Menglong Yang; Yiguang Liu; Longyin Wen; Zhisheng You; Stan Z. Li. Affiliations: Key Laboratory of Fundamental Synthetic Vision Graphics and Image for National Defense, School of Aeronautics and Astronautics & Computer Science, Sichuan University; Center for Biometrics and Security Research & National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences.

ISBN: (print) 9781479951192

Mutual occlusions among targets can cause track loss or target position deviation, because the observation likelihood of an occluded target may vanish even when we have the estimated location of the target. This paper presents a novel probability framework for multitarget tracking with mutual occlusions. The primary contribution of this work is the introduction of a vectorial occlusion variable as part of the solution. The occlusion variable describes occlusion states of the targets. This forms the basis of the proposed probability framework, with the following further contributions: 1) Likelihood: A new observation likelihood model is presented, in which the likelihood of an occluded target is computed by referring to both of the occluded and occluding targets. 2) Priori: Markov random field (MRF) is used to model the occlusion priori such that less likely "circular" or "cascading" types of occlusions have lower priori probabilities. Both the occlusion priori and the motion priori take into consideration the state of occlusion. 3) Optimization: A realtime RJMCMC-based algorithm with a new move type called "occlusion state update" is presented. Experimental results show that the proposed framework can handle occlusions well, even including long-duration full occlusions, which may cause tracking failures in the traditional methods.

Keywords: Target tracking; Probabilistic logic; Approximation algorithms; Cameras; Proposals; Computational modeling
