Search results - Inner Mongolia University Library

arXiv, 2025

Authors: Zheng, Guanhua; Sang, Jitao; Xu, Changsheng. The University of Science and Technology of China, Hefei 230026, China; The School of Computer and Information Technology, The Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China; The National Lab of Pattern Recognition, Institute of Automation, CAS, Beijing 100190, China; The University of Chinese Academy of Sciences, China

Attributions aim to identify input pixels that are relevant to the decision-making process. A popular approach involves using modified backpropagation (BP) rules to reverse decisions, which improves interpretability compared to the original gradients. However, these methods lack a solid theoretical foundation and exhibit perplexing behaviors, such as reduced sensitivity to parameter randomization, raising concerns about their reliability and highlighting the need for theoretical justification. In this work, we present a unified theoretical framework for methods like GBP, RectGrad, LRP, and DTD, demonstrating that they achieve input alignment by combining the weights of activated neurons. This alignment improves the visualization quality and reduces sensitivity to weight randomization. Our contributions include: (1) Providing a unified explanation for multiple behaviors, rather than focusing on just one. (2) Accurately predicting novel behaviors. (3) Offering insights into decision-making processes, including layer-wise information changes and the relationship between attributions and model decisions. © 2025, CC BY.
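
The contrast between the original gradient and a modified BP rule can be illustrated on a toy network. The following is a minimal sketch of the guided backpropagation (GBP) rule at a ReLU, assuming a hypothetical two-layer model f(x) = w2 . relu(W1 @ x); the weights, input, and function names are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def relu_backward_plain(grad_out, pre_act):
    """Ordinary ReLU gradient: pass the signal where the forward input was positive."""
    return grad_out * (pre_act > 0)

def relu_backward_guided(grad_out, pre_act):
    """Guided backpropagation rule: additionally drop negative backward signals."""
    return grad_out * (pre_act > 0) * (grad_out > 0)

def attribute(x, W1, w2, relu_backward):
    """Input attribution for the toy model f(x) = w2 . relu(W1 @ x)."""
    z = W1 @ x                        # hidden pre-activations
    grad_z = relu_backward(w2, z)     # backward signal after the ReLU rule
    return W1.T @ grad_z              # weighted sum of rows of W1

# Hypothetical weights and input, chosen so the two rules differ.
x = np.array([1.0, 2.0])
W1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])
w2 = np.array([-1.0, 1.0, 2.0])

plain = attribute(x, W1, w2, relu_backward_plain)    # [-1, 1]
guided = attribute(x, W1, w2, relu_backward_guided)  # [0, 1]
```

In this toy case the guided result is a non-negative combination of the weight rows of activated neurons, which is the kind of quantity the abstract refers to when it speaks of combining the weights of activated neurons.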

Segment Based Camera Calibration

Journal of Computer Science & Technology, 1993, Vol. 8, No. 1, pp. 11-16

Authors: 马颂德, 魏国庆, 黄金风. National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080

The basic idea of calibrating a camera system in previous approaches is to determine camera parameters by using a set of known 3D points as calibration *** this paper, we present a method of camera calibration in which camera parameters are determined by a set of 3D lines. A set of constraints is derived on camera parameters in terms of perspective line *** these constraints, the same perspective transformation matrix as that for point mapping can be computed *** minimum number of calibration lines is *** result generalizes that of Lin, Huang and Faugeras for camera location determination in which at least 8 line correspondences are required for linear computation of camera *** line segments in an image can be located easily and more accurately than points, the use of lines as calibration reference tends to ease the computation in image preprocessing and to improve calibration *** results on the calibration along with stereo reconstruction are reported.

Keywords: camera calibration; line correspondences; perspective transformation matrix; 3D reconstruction
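
The idea of recovering the perspective transformation matrix from lines rather than points can be sketched with a generic linear (DLT-style) formulation. This is not the paper's derivation: it assumes each 3D calibration line is given by two known points on it, that the matching image line is a homogeneous 3-vector l, and it uses the standard constraint l^T P X = 0 for every 3D point X on the line. The function name and the synthetic camera are assumptions for illustration.

```python
import numpy as np

def estimate_projection_from_lines(image_lines, points_a, points_b):
    """Estimate a 3x4 projection matrix P (up to scale) from line correspondences.

    Each 3D line is represented by two points on it; each image line is a
    homogeneous 3-vector. Every 3D point X on a line must project onto the
    image line l, i.e. l^T P X = 0, which is linear in the entries of P.
    """
    rows = []
    for line, Xa, Xb in zip(image_lines, points_a, points_b):
        line = line / np.linalg.norm(line)
        for X in (Xa, Xb):
            Xh = np.append(X, 1.0)
            # l^T P Xh == kron(l, Xh) . vec(P), with vec taken row by row
            rows.append(np.kron(line, Xh))
    _, _, vt = np.linalg.svd(np.asarray(rows))
    P = vt[-1].reshape(3, 4)          # right singular vector of the smallest singular value
    return P / np.linalg.norm(P)

# Synthetic check with a hypothetical camera (rotation about the y axis plus translation).
rng = np.random.default_rng(0)
angle = 0.3
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([0.1, -0.2, 5.0])
P_true = np.hstack([R, t[:, None]])

points_a = rng.uniform(-1.0, 1.0, size=(10, 3))   # first point on each 3D line
points_b = rng.uniform(-1.0, 1.0, size=(10, 3))   # second point on each 3D line

def project(P, X):
    """Project a 3D point to homogeneous image coordinates."""
    return P @ np.append(X, 1.0)

# The image line through two projected points is their cross product.
image_lines = [np.cross(project(P_true, Xa), project(P_true, Xb))
               for Xa, Xb in zip(points_a, points_b)]

P_est = estimate_projection_from_lines(image_lines, points_a, points_b)
P_ref = P_true / np.linalg.norm(P_true)
P_est = P_est * np.sign(np.sum(P_est * P_ref))    # resolve the sign ambiguity
```

With noise-free synthetic lines the estimate should agree with the reference matrix up to numerical precision; with real, noisy line detections a normalization and nonlinear refinement step would normally follow.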

3D Motion Estimation and Motion Fusion by Affine Region Matching

Journal of Computer Science & Technology, 1993, Vol. 8, No. 1, pp. 17-25

Authors: 魏国庆, 马颂德. National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080

In this paper, a new method is presented for 3D motion estimation by image region correspondences using stereo *** the weak perspectivity assumption, we first employ the moment tensor theory (Cyganski and Orr) to compute the monocular affine transformations relating images taken by the same camera at different time instants and the binocular affine transformations relating images taken by different cameras at the same time *** then show that 3D motion can be recovered from these 2D transformations. A space-time fusion strategy is proposed to aim at robust *** knowledge of point correspondences is required in the above processes and the computations involved are *** find corresponding image regions, new affine invariants, which show stronger invariance, are derived in terms of tensor contraction *** on real motion images are conducted to verify the proposed method.

Keywords: moment tensors; affine transformation; weak perspectivity; region correspondences; monocular motion; binocular motion; motion fusion; affine invariants

Sub-sequence Factorization-an Effective Approach for Projective Reconstruction under Occlusion

Chinese Journal of Electronics (English edition), 2001, Vol. 10, No. 2, pp. 189-195

Authors: LU Le; HU Zhanyi. National Lab. of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China

Projective reconstruction is a key step for 3D metric reconstruction from a sequence of images captured by an uncalibrated camera. Due to its inherent robustness, the factorization method is widely used in the literature. However, the main shortcoming of the standard factorization method is that it requires the corresponding points to appear across ALL the images. When large changes of view angle occur in the image sequence, or the scene is prone to self-occlusion, it is very difficult to have corresponding points visible across all the images. But it is much easier for only several images to have enough correspondences required by the factorization method. These images are called a subsequence of the whole image set. In this paper, a sub-sequence factorization method is proposed to cope with the problem. The basic principle of the sub-sequence factorization method is to divide a long image sequence into several short subsequences according to its spatial continuity, and the standard factorization method is employed for each ***, a novel alignment process is invoked to transform different reconstructions under different subsequences into one reference coordinate system. It is more practical as it only demands that there are enough correspondences in each subsequence, not the whole sequence. The proposed sub-sequence factorization method preserves the robustness embedded in the standard factorization method. The experiments on synthetic and real images validate the proposed method.

Keywords: stratified reconstruction; projective reconstruction; transformation matrix

Kernel-Based Nonlinear Discriminant Analysis for Face Recognition

Journal of Computer Science & Technology, 2003, Vol. 18, No. 6, pp. 788-795

Authors: 刘青山, 黄锐, 卢汉清, 马颂德. National Lab of Pattern Recognition, Institute of Automation, The Chinese Academy of Sciences, Beijing 100080, P.R. China

Linear subspace analysis methods have been successfully applied to extract features for face recognition. But they are inadequate to represent the complex and nonlinear variations of real face images, such as illumination, facial expression and pose variations, because of their linear properties. In this paper, a nonlinear subspace analysis method, Kernel-based Nonlinear Discriminant Analysis (KNDA), is presented for face recognition, which combines the nonlinear kernel trick with the linear subspace analysis method - Fisher Linear Discriminant Analysis (FLDA). First, the kernel trick is used to project the input data into an implicit feature space, then FLDA is performed in this feature space. Thus nonlinear discriminant features of the input data are obtained. In addition, in order to reduce the computational complexity, a geometry-based feature vector selection scheme is adopted. Another similar nonlinear subspace analysis is Kernel-based Principal Component Analysis (KPCA), which combines the kernel trick with linear Principal Component Analysis (PCA). Experiments are performed with the polynomial kernel, and KNDA is compared with KPCA and FLDA. Extensive experimental results show that KNDA can give a higher recognition rate than KPCA and FLDA.

Keywords: linear subspace analysis; kernel-based nonlinear discriminant analysis; kernel-based principal component analysis; face recognition
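
The combination of the kernel trick with Fisher discriminant analysis described in the abstract can be sketched in a simplified two-class form. This is a generic kernel Fisher discriminant with a polynomial kernel, not the paper's KNDA: the multi-class formulation and the geometry-based feature vector selection are omitted, and the function names, regularization constant, and XOR-style toy data are assumptions.

```python
import numpy as np

def poly_kernel(A, B, degree=2, coef0=1.0):
    """Polynomial kernel matrix between the rows of A and the rows of B."""
    return (A @ B.T + coef0) ** degree

def fit_kfd(X, y, degree=2, reg=1e-3):
    """Two-class kernel Fisher discriminant; returns expansion weights and a threshold."""
    K = poly_kernel(X, X, degree)                  # (n, n) kernel matrix
    n = len(y)
    means = []
    N = np.zeros((n, n))                           # within-class scatter in kernel form
    for c in (0, 1):
        Kc = K[:, y == c]                          # kernel columns for class c
        n_c = Kc.shape[1]
        means.append(Kc.mean(axis=1))
        center = np.eye(n_c) - np.full((n_c, n_c), 1.0 / n_c)
        N += Kc @ center @ Kc.T
    # Regularized solve, since N is rank deficient.
    alpha = np.linalg.solve(N + reg * np.eye(n), means[1] - means[0])
    # Threshold halfway between the two projected class means.
    threshold = 0.5 * (alpha @ means[0] + alpha @ means[1])
    return alpha, threshold

def predict_kfd(X_train, alpha, threshold, X_new, degree=2):
    """Project new samples onto the discriminant and threshold the score."""
    scores = poly_kernel(X_new, X_train, degree) @ alpha
    return (scores > threshold).astype(int)

# XOR-like toy data: not linearly separable, but separable with a degree-2 kernel.
rng = np.random.default_rng(0)
centers = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
cluster_labels = np.array([0, 0, 1, 1])

def sample(n_per_cluster):
    X = np.vstack([c + 0.1 * rng.standard_normal((n_per_cluster, 2)) for c in centers])
    y = np.repeat(cluster_labels, n_per_cluster)
    return X, y

X_train, y_train = sample(25)
X_test, y_test = sample(25)
alpha, threshold = fit_kfd(X_train, y_train)
accuracy = np.mean(predict_kfd(X_train, alpha, threshold, X_test) == y_test)
```

The toy data is chosen because a linear discriminant cannot separate it, whereas the implicit degree-2 feature space contains the product of the two coordinates, which can.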

Incorporating HMM-state sequence confusion for rapid MLLR adaptation to new speakers

6th International Conference on Spoken Language Processing, ICSLP 2000

Authors: Zhao, Bing; Xu, Bo. National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China

ISBN: (print) 7801501144

In this paper, we introduce the HMM-state sequence confusion characteristics as prior knowledge into the framework of MLLR to relax the transformation and reduce the risks of over-training when adaptation data size is small. There are two issues to be addressed: first, how to estimate such confusion information reliably; second, how to use the information in refining the estimation of MLLR adaptation. Pronunciation modeling technology was used to build the state sequence confusion table. Then the correlation of states is calculated according to the confusion table. The proposed algorithm then relaxes the MLLR adaptation process when the adaptation data is very small. Our experiment on a Mandarin state-tying triphone toneless LVCSR system showed an error rate reduction of 9.5% over standard MLLR with about 10 utterances (less than 30 seconds) of adaptation data.

Keywords: Metadata
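
For orientation, the standard MLLR mean transform that the abstract builds on can be sketched as follows. This is a heavily simplified baseline under stated assumptions: one global transform, identity covariances, and a hard frame-to-state alignment, in which case the maximum-likelihood estimate reduces to ordinary least squares. The state-sequence confusion prior proposed in the paper is not modeled, and all names and numbers are illustrative.

```python
import numpy as np

def extend(means):
    """Build extended mean vectors [1, mu] used by the affine MLLR transform."""
    return np.hstack([np.ones((len(means), 1)), means])

def estimate_mllr_transform(observations, state_ids, means):
    """Least-squares estimate of W in mu_adapted = W @ [1, mu].

    observations: (T, d) adaptation frames
    state_ids:    (T,) index of the Gaussian aligned to each frame
    means:        (S, d) speaker-independent Gaussian means
    """
    xi = extend(means)[state_ids]                              # (T, d + 1)
    W_t, *_ = np.linalg.lstsq(xi, observations, rcond=None)    # (d + 1, d)
    return W_t.T                                               # (d, d + 1)

def adapt_means(W, means):
    """Apply the affine transform to every Gaussian mean."""
    return extend(means) @ W.T

# Hypothetical 2-D example: four Gaussian means and a known transform.
means = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
W_true = np.array([[0.5, 1.1, 0.0],
                   [-0.3, 0.2, 0.9]])
state_ids = np.array([0, 1, 2, 3, 1, 2])
observations = adapt_means(W_true, means)[state_ids]           # noise-free frames

W_est = estimate_mllr_transform(observations, state_ids, means)
```

With only a handful of frames the least-squares problem is easily under-determined or over-fitted, which is the small-data regime the abstract targets with its prior knowledge.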

An effective algorithm for fingerprint matching

2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering

Authors: Hao, Ying; Tan, Tieniu; Wang, Yunhong. National Lab of Pattern Recognition, CAS Institute of Automation, Beijing 100080, China

Fingerprint matching is one of the most important stages in automatic fingerprint identification systems (AFIS). Traditional methods treat this problem as point pattern matching, which is essentially an intractable problem due to the various nonlinear deformations commonly observed in fingerprint images. In this article, we propose an effective fingerprint matching algorithm based on error propagation. First, ridge information and the Hough transform are used to find several pairs of matching minutiae, the initial correspondences, which are used to estimate the common region of two fingerprints and the alignment parameters. Then a MatchedSet, which includes the correspondence and its surrounding matched minutiae pairs, is established. The subsequent matching process is guided by the concept of error propagation: the matching error of each unmatched minutia is estimated according to those of its most relevant neighbor minutiae. In order to prevent the process from being misguided by mismatched minutiae pairs, we adopt a flexible propagation scheme. Experimental results demonstrate the robustness of our algorithm to nonlinear deformation.

Keywords: Image processing

Online text-independent writer identification based on temporal sequence and shape codes

ICDAR2009 - 10th International Conference on Document Analysis and Recognition

Authors: Li, Bangy; Tan, Tieniu. National Lab. of Pattern Recognition, Institute of Automation, Chinese Academy of Science, China

ISBN: (print) 9780769537252

In this paper we present a novel method for online text-independent writer identification. Most of the existing writer identification techniques require the data to be from a specific text, which is not applicable to cases where such text is not available, such as in criminal justice systems when text documents with different content need to be compared. Text-independent approaches often require a large amount of data to be confident of good results. We propose temporal sequence and shape codes to encode online handwriting. Temporal sequence codes (TSC) characterize the trajectory in terms of speed and pressure change in writing, and shape codes (SC) characterize the direction of the trajectory in handwriting. For TSC, we use two different codes to encode speed and pressure into a code book: stroke temporal sequence codes (STSC) and neighbor temporal sequence codes (NTSC). At the identification stage, we implement a decision and fusion strategy to identify the writer. Experimental results show that our proposed method can improve the identification accuracy with a small number of characters. Moreover, we find that the proposed method is even effective for cross-language (English & Chinese) writer identification. © 2009 IEEE.

Keywords: Encoding (symbols)

Data decomposition and spatial mixture modeling for part based model

11th Asian Conference on Computer Vision, ACCV 2012

Authors: Zhang, Junge; Huang, Yongzhen; Huang, Kaiqi; Wu, Zifeng; Tan, Tieniu. National Lab. of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China

ISBN: (print) 9783642373305

This paper presents a system of data decomposition and spatial mixture modeling for part based models. Recently, many enhanced part based models (with e.g., multiple features, more components or parts) have been proposed. Nevertheless, those enhanced models bring high computation cost together with the risk of over-fitting. To tackle this problem, we propose a data decomposition method for part based models which not only accelerates training and testing process but also improves the performance on average. Besides, the original part based model uses a strict rigid structural model to describe the distribution of each part location. It is not "deformable" enough, especially for those instances with different viewpoints or poses in the same aspect ratio. To address this problem, we present a novel spatial mixture modeling method. The spatial mixture embedded model is then integrated into the proposed data decomposition framework. We evaluate our system on the challenging PASCAL VOC2007 and PASCAL VOC2010 datasets, demonstrating the state-of-the-art performance compared with other related methods in terms of accuracy and efficiency. © 2013 Springer-Verlag.

Keywords: Aspect ratio

Multi-modal supervised latent dirichlet allocation for event classification in social media

6th International Conference on Internet Multimedia Computing and Service, ICIMCS 2014

Authors: Qian, Shengsheng; Zhang, Tianzhu; Xu, Changsheng. National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China

ISBN: (print) 9781450328104

In social media, many existing websites (e.g., Flickr, YouTube, and Facebook) let users share their own interests and opinions on many popular events, and successfully facilitate event generation, sharing and propagation. As a result, there are substantial amounts of user-contributed media data (e.g., images, videos, and textual content) for a wide variety of real-world events of different types and scales. The aim of this paper is to automatically identify interesting events from massive social media data, which helps users or governments browse, search and monitor social events. To achieve this goal, we propose a novel multi-modal supervised latent dirichlet allocation (mm-SLDA) for social event classification. Our proposed mm-SLDA has a number of advantages. (1) It can effectively exploit the multi-modality and the multi-class property of social events jointly. (2) It makes use of the supervised social event category label information and is able to classify multi-class social events directly. We evaluate our proposed mm-SLDA on a real-world dataset and show extensive experimental results, which demonstrate that our model outperforms state-of-the-art methods. Copyright 2014 ACM.

Keywords: Classification (of information)
