Search results - Inner Mongolia University Library

Polynomial eigenvalue decomposition for eigenvalues with unmajorised ground truth – Reconstructing analytic dinosaurs

Science Talks, 2025, vol. 14

Authors: Schlecht, Sebastian J.; Weiss, Stephan. Affiliations: Dept. of Signal Processing and Acoustics, Aalto University, Espoo, Finland; Dept. of Art and Media, Aalto University, Espoo, Finland; Multimedia Comms & Signal Processing, Friedrich-Alexander Universität Erlangen-Nürnberg, Germany; Dept. of Electronic & Electrical Eng, University of Strathclyde, Glasgow, Scotland, United Kingdom

When space-time covariance matrices are estimated from finite data, any intersections of ground truth eigenvalues will be obscured, and the exact eigenvalues become spectrally majorised with probability one. In this paper, we propose a novel method for accurately extracting the ground truth analytic eigenvalues from such estimated space-time covariance matrices. The approach operates in the discrete Fourier transform (DFT) domain and groups sufficiently isolated eigenvalues over a frequency interval into segments that belong to analytic functions and then solves a permutation problem to align these segments. Utilising an inverse partial DFT and a linear assignment algorithm, the proposed EigenBone method retrieves analytic eigenvalues efficiently and accurately. Experimental results demonstrate the effectiveness of this approach in reconstructing eigenvalues from noisy estimates. Overall, the proposed method offers a robust solution for approximating analytic eigenvalues in scenarios where state-of-the-art methods may fail. © 2025 The Authors

Keywords: Analytic eigenvalue decomposition; Hungarian algorithm; Partial reconstruction; Space-time covariance estimation; Spectral majorisation
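The permutation step mentioned in the abstract can be illustrated with a toy example. This is only a sketch under stated assumptions, not the EigenBone method: the two eigenvalue functions are hypothetical, segments are continued by plain linear extrapolation instead of an inverse partial DFT, and the assignment is found by brute force over permutations rather than by the Hungarian algorithm.

```python
from itertools import permutations
import math

def best_permutation(cost):
    """Solve a small linear assignment problem by brute force (not Hungarian)."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))

def lam(omega):
    """Two hypothetical analytic eigenvalues that intersect at omega = pi/2."""
    return [1.0 + math.cos(omega), 1.0 - math.cos(omega)]

# Sorted (spectrally majorised) samples, as a bin-wise EVD would return them.
omegas = [k * math.pi / 16 for k in range(17)]  # bins 0..16 cover 0..pi
sorted_vals = [sorted(lam(w), reverse=True) for w in omegas]

# Left segment ends at bin 6, right segment starts at bin 10;
# the crossing at bin 8 lies in the gap between them.
left_prev, left_last, right_first = sorted_vals[5], sorted_vals[6], sorted_vals[10]
gap = 10 - 6

# Continue each left-segment track linearly across the gap.
predicted = [last + gap * (last - prev)
             for prev, last in zip(left_prev, left_last)]

# Cost of matching left track i to right track j.
cost = [[abs(p - r) for r in right_first] for p in predicted]
perm = best_permutation(cost)  # perm[i] = right-segment index continuing left track i
```

Here `perm` comes out as `(1, 0)`: the larger eigenvalue before the crossing continues as the smaller one after it, which is the ordering that sorting alone hides.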

Speaker verification from coded telephone speech using stochastic feature transformation and handset identification

3rd IEEE Pacific Rim Conference on Multimedia, PCM 2002

Authors: Yu, Eric W.M.; Mak, Man-Wai; Kung, Sun-Yuan. Affiliation: Center for Multimedia Signal Processing, Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong

ISBN: (print) 3540002626

A handset compensation technique for speaker verification from coded telephone speech is proposed. The proposed technique combines handset selectors with stochastic feature transformation to reduce the acoustic mismatch between different handsets and different speech coders. Coder-dependent GMM-based handset selectors are trained to identify the most likely handset used by the claimants. Stochastic feature transformations are then applied to remove the acoustic distortion introduced by the coder and the handset. Experimental results show that the proposed technique outperforms the CMS approach and significantly reduces the error rates under six different coders with bit rates ranging from 2.4 kb/s to 64 kb/s. Strong correlation between speech quality and verification performance is also observed. © Springer-Verlag Berlin Heidelberg 2002.

Keywords: Acoustic distortion

A GMM-Based handset selector for channel mismatch compensation with applications to speaker identification

2nd IEEE Pacific-Rim Conference on Multimedia, IEEE-PCM 2001

Authors: Yiu, K.K.; Mak, M.W.; Kung, S.Y. Affiliation: Center for Multimedia Signal Processing, Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong

ISBN: (print) 3540426809

In telephone-based speaker identification, variation in handset characteristics can introduce severe speech variability even for speech uttered by the same speaker. This paper proposes a method to compensate for the variation in handset characteristics. In the method, a number of Gaussian mixture models are independently trained to identify the most likely handset given a test utterance. The identified handset is used to select a compensation vector from a set of pre-computed vectors, where the pre-computed vectors are the average frame-by-frame differences between the clean and distorted utterances. The clean features are then recovered by subtracting the selected compensation vector from the distorted vectors. Experimental results based on 138 speakers of the YOHO and telephone YOHO corpora show that the proposed approach is computationally efficient and is able to increase the accuracy from 17% (without compensation) to 85% (with compensation). © Springer-Verlag Berlin Heidelberg 2001.

Keywords: Vectors
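The select-then-subtract idea in this abstract can be sketched in a few lines. The sketch rests on several assumptions that are not in the record: each handset model is a single diagonal Gaussian (a one-component stand-in for a GMM), the compensation vector is taken as distorted minus clean so that subtracting it recovers the clean features, and the handset names and feature values are made up.

```python
import math

def mean_vector(frames):
    """Per-dimension mean of a list of feature frames."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]

def compensation_vector(clean, distorted):
    """Average frame-by-frame difference (distorted minus clean; sign assumed)."""
    diffs = [[d - c for c, d in zip(cf, df)] for cf, df in zip(clean, distorted)]
    return mean_vector(diffs)

def log_likelihood(frames, mean, var):
    """Log-likelihood of frames under one diagonal Gaussian (simplified GMM)."""
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total

def compensate(frames, handset_models, comp_vectors):
    """Pick the most likely handset, then subtract its compensation vector."""
    handset = max(handset_models,
                  key=lambda h: log_likelihood(frames, *handset_models[h]))
    vec = comp_vectors[handset]
    return handset, [[x - b for x, b in zip(f, vec)] for f in frames]

# Hypothetical training data: clean frames and two handset-specific offsets.
clean = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
offsets = {"carbon": [0.5, -0.25], "electret": [-1.0, 2.0]}
distorted = {h: [[x + o for x, o in zip(f, off)] for f in clean]
             for h, off in offsets.items()}

comp_vectors = {h: compensation_vector(clean, distorted[h]) for h in offsets}
handset_models = {h: (mean_vector(distorted[h]), [1.0, 1.0]) for h in offsets}

handset, recovered = compensate(distorted["electret"], handset_models, comp_vectors)
```

In this toy setting the distortion is a pure additive offset, so the recovered frames match the clean ones; real handset distortion would only be approximated by a single vector per handset.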

Channel robust speaker verification via bayesian blind stochastic feature transformation

9th European Conference on Speech Communication and Technology

Authors: Yiu, Kwok-Kwong; Mak, Man-Wai; Kung, Sun-Yuan. Affiliations: Center for Multimedia Signal Processing, Dept. of Electronic and Information Engineering, Hong Kong Polytechnic University; Dept. of Electrical Engineering, Princeton University, United States

In telephone-based speaker verification, the channel conditions can vary significantly from session to session. Therefore, it is desirable to estimate the channel conditions online and compensate for the acoustic distortion without prior knowledge of the channel characteristics. Because no a priori knowledge is used, the estimation accuracy depends greatly on the length of the verification utterances. This paper extends the Blind Stochastic Feature Transformation (BSFT) algorithm that we recently proposed to handle the short-utterance scenario. The idea is to estimate a set of prior transformation parameters from a development set in which a wide variety of channel conditions exists in the verification utterances. The prior transformations are then incorporated into the online estimation of the BSFT parameters in a Bayesian (maximum a posteriori) fashion. The resulting transformation parameters are therefore dependent on both the prior transformations and the verification utterances. For short (long) utterances, the prior transformations play a more (less) important role. We refer to the extended algorithm as Bayesian BSFT (BBSFT) and applied it to the 2001 NIST SRE task. Results show that Bayesian BSFT outperforms BSFT for utterances shorter than or equal to 4 seconds.

Keywords: Speech recognition

Kalman filtering approach to multispectral/hyperspectral image classification

IEEE Transactions on Aerospace and Electronic Systems, 1999, vol. 35, no. 1, pp. 319-330

Authors: Chang, CI; Brumbley, C. Affiliation: IEEE Remote Sensing Signal and Image Processing Laboratory, Dept. of Computer Science and Electrical Engineering, University of Maryland Baltimore County

Linear unmixing is a widely used remote sensing image processing technique for subpixel classification and detection where a scene pixel is generally modeled by a linear mixture of spectral signatures of materials present within the pixel. An approach, called linear unmixing Kalman filtering (LUKF), is presented which incorporates the concept of linear unmixing into Kalman filtering so as to achieve signature abundance estimation, subpixel detection and classification for remotely sensed images. In this case, the linear mixture model used in linear unmixing is implemented as the measurement equation in Kalman filtering. The state equation which is required for Kalman filtering but absent in linear unmixing is then used to model the signature abundance. By utilizing these two equations the proposed LUKF not only can detect abrupt change in various signature abundances within pixels, but also can detect and classify desired target signatures. The effectiveness and robustness of the LUKF are demonstrated through simulated data and real scene images: Satellite Pour l'Observation de la Terre (SPOT) and Hyperspectral Digital Imagery Collection (HYDICE) data.

Keywords: abundance; image classification; equation; image processing; unmixing; Kalman filtering; subpixel; spectral signatures; linear mixture model; HYDICE; LUKF
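How a linear mixture model can serve as the measurement equation of a Kalman filter is easy to show on a toy pixel. The following is a minimal sketch, not the authors' LUKF: it assumes a random-walk state equation for the abundances, uncorrelated band noise (so each band can be processed as a scalar measurement), and a made-up signature matrix; the function name and the `q` and `r` parameters are hypothetical.

```python
def kalman_unmix_step(alpha, P, pixel, M, q=1e-2, r=1e-6):
    """One predict/update cycle for abundance vector alpha with covariance P.

    Measurement equation (linear mixture model): pixel[l] = M[l] . alpha + noise.
    State equation (assumed here): alpha_k = alpha_{k-1} + process noise.
    """
    p = len(alpha)
    # Predict: a random walk leaves alpha unchanged and inflates the covariance.
    P = [[P[i][j] + (q if i == j else 0.0) for j in range(p)] for i in range(p)]
    alpha = list(alpha)
    # Update: process each spectral band as one scalar measurement.
    for m, y in zip(M, pixel):
        Pm = [sum(P[i][j] * m[j] for j in range(p)) for i in range(p)]
        s = sum(m[i] * Pm[i] for i in range(p)) + r   # innovation variance
        gain = [v / s for v in Pm]                    # Kalman gain
        innov = y - sum(m[i] * alpha[i] for i in range(p))
        alpha = [alpha[i] + gain[i] * innov for i in range(p)]
        P = [[P[i][j] - gain[i] * Pm[j] for j in range(p)] for i in range(p)]
    return alpha, P

# Hypothetical signature matrix: 4 bands (rows) by 2 materials (columns).
M = [[1.0, 0.2], [0.8, 0.4], [0.3, 0.9], [0.1, 1.0]]
true_alpha = [0.7, 0.3]
pixel = [sum(mi * a for mi, a in zip(row, true_alpha)) for row in M]

alpha, P = kalman_unmix_step([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]], pixel, M)
```

With a noise-free pixel and a small measurement variance, the estimate moves from the prior `[0.5, 0.5]` to values close to `true_alpha` after a single pixel; calling the function again on the next pixel carries `alpha` and `P` forward, which is where the state equation comes into play.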

Probabilistic feature transformation for channel robust speaker verification

2006 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing, MLSP 2006

Authors: Mak, Man-Wai; Yiu, Kwok-Kwong. Affiliation: Center for Multimedia Signal Processing, Dept. of Electronic and Information Engineering, Hong Kong Polytechnic University, Hong Kong

ISBN: (print) 1424406560

Feature transformation plays an important role in robust speaker verification over telephone networks. This paper compares several feature transformation techniques and evaluates their verification performance and computation time under the 2000 NIST speaker recognition evaluation protocol. Techniques compared include feature mapping (FM), stochastic feature transformation (SFT), and blind stochastic feature transformation (BSFT). The paper proposes a probabilistic feature mapping (PFM) in which the mapped features depend not only on the top-1 decoded Gaussian but also on the posterior probabilities of other Gaussians in the root model. The paper also proposes speeding up the computation of PFM and BSFT parameters by considering the top few Gaussians only. Results show that PFM performs slightly better than FM and that the fast approach can reduce computation time substantially. Among the approaches investigated, the fast BSFT is found to have the highest potential for robust speaker verification over telephone networks because it can achieve good performance without any a priori knowledge of the communication channel. It was also found that fusion of the scores derived from systems using BSFT and PFM can reduce the error rate further. © 2006 IEEE.

Keywords: Feature extraction

Object assisted video coding for video conferencing system

3rd IEEE Pacific Rim Conference on Multimedia, PCM 2002

Authors: Lai, K.C.; Wong, S.C.; Lun, Daniel. Affiliation: Centre for Multimedia Signal Processing, Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong

ISBN: (print) 3540002626

An object-based video coding scheme for video conferencing systems is proposed. There are two main processes: a segmentation process and a face detection process. The segmentation process is used to segment each frame of a video sequence into two non-overlapping regions, namely foreground and background. A novel face detection technique based on chrominance and the contour of the segmented region is applied to the foreground region. A smaller quantization step is used for the facial region to improve the viewer's perception, while a larger quantization step is used for the background to compensate for the coding efficiency. The remaining regions are kept in normal coding quality to prevent degradation of important information other than the facial region. © Springer-Verlag Berlin Heidelberg 2002.

Keywords: Face recognition

Articulatory feature-based conditional pronunciation modeling for speaker verification

8th International Conference on Spoken Language Processing, ICSLP 2004

Authors: Leung, Ka-Yee; Mak, Man-Wai; Kung, Sun-Yuan. Affiliations: Center for Multimedia Signal Processing, Dept. of Electronic and Information Engineering, Hong Kong Polytechnic University, Hong Kong; Dept. of Electrical Engineering, Princeton University, United States

Because of differences in educational background, accent, etc., different people have their own unique ways of pronunciation. This paper exploits the pronunciation characteristics of speakers and proposes a new conditional pronunciation modeling (CPM) technique for speaker verification. The proposed technique aims to establish a link between articulatory properties (e.g., manners and places of articulation) and phoneme sequences produced by a speaker. This is achieved by aligning two articulatory feature (AF) streams with a phoneme sequence determined by a phoneme recognizer, and formulating the probabilities of articulatory classes conditioned on the phonemes as speaker-dependent probabilistic models. The scores obtained from the AF-based pronunciation models are then fused with those obtained from a spectral-based speaker verification system, with the frame-by-frame fused scores weighted by the confidence of the pronunciation models. Evaluations based on the SPIDRE corpus demonstrate that AF-based CPM systems can recognize speakers even with short utterances and are readily combined with spectral-based systems to further enhance the reliability of speaker verification.

Keywords: Speech recognition

Adaptive decision fusion for multi-sample speaker verification over GSM networks

8th European Conference on Speech Communication and Technology, EUROSPEECH 2003

Authors: Cheung, Ming-Cheung; Mak, Man-Wai; Kung, Sun-Yuan. Affiliations: Center for Multimedia Signal Processing, Dept. of Electronic and Information Engineering, Hong Kong Polytechnic University, Hong Kong; Dept. of Electrical Engineering, Princeton University, United States

In speaker verification, a claimant may produce two or more utterances. In our previous study [1], we proposed to compute the optimal weights for fusing the scores of these utterances based on their score distribution and our prior knowledge about the score statistics estimated from the mean scores of the corresponding client speaker and some pseudo-impostors during enrollment. As the fusion weights depend on the prior scores, in this paper, we propose to adapt the prior scores during verification based on the likelihood of the claimant being an impostor. To this end, a pseudo-impostor GMM score model is created for each speaker. During verification, the claimant's scores are fed to the score model to obtain a likelihood for adapting the prior score. Experimental results based on the GSM-transcoded speech of 150 speakers from the HTIMIT corpus demonstrate that the proposed prior score adaptation approach provides a relative error reduction of 15% when compared with our previous approach where the prior scores are non-adaptive.

Keywords: Speech recognition

Eukaryotic protein subcellular localization based on local pairwise profile alignment SVM

2006 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing, MLSP 2006

Authors: Guo, Jian; Mak, Man-Wai; Kung, Sun-Yuan. Affiliations: Center for Multimedia Signal Processing, Dept. of Electronic and Information Engineering, Hong Kong Polytechnic University, Hong Kong; Dept. of Electrical Engineering, Princeton University, United States

ISBN: (print) 1424406560

This paper studies the use of profile alignment and support vector machines for subcellular localization. In the training phase, the profiles of all protein sequences in the training set are constructed by PSI-BLAST and the pairwise profile-alignment scores are used to form feature vectors for training a support vector machine (SVM) classifier. During testing, the profile of a query protein sequence is computed and aligned with all the profiles constructed during training to obtain a feature vector for classification by the SVM classifier. Tests on Reinhardt and Hubbard's eukaryotic protein dataset show that the total accuracy can reach 99.4%, which is significantly higher than those obtained by methods based on sequence alignments and amino acid composition. It was also found that the proposed method can still achieve a prediction accuracy of 96% even if none of the sequence pairs in the dataset contains more than 5% identity. This paper also demonstrates that the performance of the SVM is proportional to the degree to which its kernel matrix meets Mercer's condition. © 2006 IEEE.

Keywords: Proteins
