Search results - Inner Mongolia University Library

International Conference on Signals and Electronic Systems (ICSES 2008)

Authors: Maka, Tomasz; Bonikowski, Lukasz. Tech Univ Szczecin, PL-71210 Szczecin, Poland

ISBN: (print) 9788388309472

This paper presents an analysis of three sets of feature vectors used in speaker identification (ID) systems, applied to speech signals that have passed through encoding and decoding with the AMR, SPEEX and MELP coders. The feature sets were analyzed at various speech coding bit rates using an SVM-based speaker ID system. The results were compared with the identification accuracy obtained when fundamental frequency was added to the vectors as a feature. The experiments show that, in most cases, this feature improves identification accuracy more for coded speech than for uncoded speech.

Keywords: speech coding speaker identification F0 feature

SPEECH CODING BASED ON PITCH SYNCHRONY AND TWO-STAGE TRANSFORMATION

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Li, Xiao-ming; Bao, Chang-chun; Kleijn, W. Bastiaan. Beijing Univ Technol, Speech & Audio Signal Proc Lab, Sch Elect Informat & Control Engn, Beijing, Peoples R China

ISBN: (print) 9781479903566

In this paper, an effective speech coder is proposed that is based on a sparse representation of speech and exploits the strong dependencies between adjacent pitch cycles. In the proposed coder, pitch-synchronous processing that consists of pitch warping and a two-stage transformation is used to achieve a compact representation of the voiced speech. Power spectral density preserving quantization (PSD-PQ) is adopted for quantizing the transform coefficients. The result is a coder that is efficient over a wide range of bit rates: it approaches perfect reconstruction with increasing rate, and has a parametric signal representation at low rates. Both objective PESQ results and subjective A/B listening tests show that the proposed coder outperforms the ITU-T G.722.1 codec.

Keywords: speech coding pitch-synchronous compact representation quantization

Linear inter-frame dependencies for very low bit-rate speech coding

SPEECH COMMUNICATION, 2001, Vol. 34, No. 4, pp. 333-349

Authors: López-Soler, JM; Sánchez, V; de la Torre, A; Rubio-Ayuso, AJ. Univ Granada, Fac Ciencias, Dept Elect & Tecnol Comp, E-18071 Granada, Spain

We have studied experimentally the operational rate-distortion performance for very low bit-rate speech coding using linear inter-frame dependencies. We propose an algorithm that efficiently combines quantization and linear interpolation procedures. With a maximum delay of 200 ms, for the spectral envelope information and using line spectrum pair (LSP) parameters as input space, the proposed algorithm performs best at rates between 200 and 300 b/s. For comparison's sake, several other procedures such as the multi-frame encoder (Kemp D., Collura J., Tremain T., Multi-Frame coding of LPC Parameters at 600-800 bps. In: IEEE ICASSP-91, 1991, pp. 609-612) and matrix quantizer (Tsao C., Gray R., Matrix quantizer design for LPC speech using the generalized Lloyd algorithm. IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-33, 1985, 537-545) are simulated. Furthermore, a mono-dimensional version of the proposed procedure is shown experimentally to provide the best operational rate-distortion trade-off when coding a parametric representation (pitch, gain and voicing information) of the excitation signal. (C) 2001 Elsevier Science B.V. All rights reserved.

Keywords: speech coding quantization interpolation inter-frame dependencies

The practical use of noise to improve speech coding by analogue cochlear implants

CHAOS SOLITONS & FRACTALS, 2000, Vol. 11, No. 12, pp. 1885-1894

Authors: Morse, RP; Meyer, GF. Univ Keele, MacKay Inst Commun & Neurosci, Sch Life Sci, Keele ST5 5BG, Staffs, England

The addition of noise to speech signals coded by an analogue multichannel cochlear implant has previously been shown in modelling studies to enhance the representation of speech cues by the fine time structure of evoked nerve discharges. The enhancement, however, occurred only for a range of noise levels, and this range was stimulus dependent. Theoretically, fine optimization of the noise levels would be unnecessary if each implant channel stimulated a group of cochlear nerve fibres such that each fibre in the group received an independent noise waveform in addition to the same information-bearing signal. We present results from computer simulations that suggest that current spread in the cochlea may be exploited to obtain a high degree of independence between the noise waveforms that stimulate adjacent fibres. The model simulated monopolar stimulation of a cochlear nerve by 11, 21 or 41 electrodes in the scala tympani. The correlation between the effective stimuli for pairs of nerve fibres and the correlation between the corresponding evoked discharges were calculated for two noise strategies. In one strategy, an independent noise current was applied to each electrode. Less correlation between effective stimuli was obtained with the alternate strategy that used inhibition between the noise sources. (C) 2000 Elsevier Science Ltd. All rights reserved.

Keywords: speech coding

ADAPTIVE DENSITY PULSE EXCITATION FOR LOW BIT-RATE SPEECH CODING

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 1995, Vol. E78A, No. 2, pp. 199-207

Authors: AKAMINE, M; MISEKI, K. Toshiba Corp, Kawasaki-shi, Japan

An excitation signal for a synthesis filter plays an important role in producing high quality speech at a low bit rate. This paper presents a new efficient excitation model, Adaptive Density Pulse (ADP), for low bit-rate speech coding. The ADP is a pulse train whose density (spacing interval) is constant within a subframe but can be varied subframe by subframe. First, the ADP excitation signal is defined. A procedure for finding the optimal ADP excitation is presented. Some results on investigating the effects of the ADP parameters on the synthesized speech quality are discussed. ADP excitation is introduced to the CELP (Code Excited Linear Prediction) coding method to improve speech quality at bit rates around 4 kbps. A CELP coder with an ADP (ADP-CELP) is described. ADP excitation makes it possible for the CELP coder to follow transient portions of speech signals. ADP excitation can also reduce the computational complexity of selecting the best excitation from a codebook, which has been the primary drawback of CELP. The number of multiplications can be reduced to the order of 1/D^2 by utilizing the sparseness of ADP excitation, where D is the pulse interval. The authors evaluated the speech quality of a 4 kbps ADP-CELP coder by computer simulation. ADP excitation improved the performance of conventional CELP in segmental SNR.

Keywords: speech coding CELP EXCITATION ADAPTIVE DENSITY PULSE
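
The pulse-train structure described in the abstract above (a spacing interval D that is fixed within a subframe but may change from one subframe to the next) can be illustrated with a minimal Python sketch. The function name, the zero starting offset of the first pulse, and the way amplitudes are supplied are all assumptions made for illustration; the paper's procedure for finding the optimal ADP excitation is not reproduced here.

```python
def adp_excitation(subframe_len, intervals, amplitudes):
    """Build a pulse train with one fixed pulse interval per subframe.

    intervals[i] is the pulse spacing D for subframe i (hypothetical input);
    amplitudes[i] lists one amplitude per pulse in subframe i.
    Assumption: the first pulse of every subframe sits at sample 0.
    """
    excitation = []
    for interval, amps in zip(intervals, amplitudes):
        positions = range(0, subframe_len, interval)
        if len(amps) != len(positions):
            raise ValueError("need exactly one amplitude per pulse position")
        subframe = [0.0] * subframe_len
        for pos, amp in zip(positions, amps):
            subframe[pos] = amp
        excitation.extend(subframe)
    return excitation
```

With a subframe length of 4, intervals 2 and 4 give two pulses in the first subframe and one in the second, so most samples are zero. That sparseness is what the abstract credits for the reduced multiplication count.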

Rate adaptive speech coding for universal multimedia access

IEEE SIGNAL PROCESSING MAGAZINE, 2003, Vol. 20, No. 2, pp. 30-39

Author: Homayounfar, K. Genista Corporation, Tokyo, Japan

This article reviews the state of the art in transport adaptation techniques for mobile networks. It discusses the mechanisms for rate adaptation to combat quality degradations of speech caused by the radio links. It begins with a review of dynamic schemes for adaptation of speech encoders in cellular networks, where we observe two distinct approaches to rate adaptation: network controlled and source controlled. The issues associated with adaptive voice over IP (VoIP) mechanisms are considered next. Here, the encoder detects some form of network congestion to judge how to behave for the good of the network. It is noted that this altruistic behavior will only benefit coordinated IP networks such as private intranets, and that its application to the public Internet is improbable.

Keywords: speech coding Transcoding Bit rate Scalability Codecs Automated highways IP networks Streaming media Quality control Monitoring

ROBUST SPEECH CODING FOR THE INDOOR WIRELESS CHANNEL

AT&T TECHNICAL JOURNAL, 1993, Vol. 72, No. 4, pp. 64-73

Authors: GOULD, KW; COX, RV; JAYANT, NS; MELCHNER, MJ. AT&T BELL LABS, SPEECH CODING RES DEPT, MURRAY HILL, NJ 07974

Alternative methods for digitally transcoding speech for radio transmission in an indoor environment have been investigated and compared to the CCITT standard, adaptive differential pulse code modulation (ADPCM). These alternative coders are designed to minimize the effects of transmission errors on the quality of the transcoded speech. The coders compared are CCITT standard G.721 ADPCM, adaptive sub-band coding, and two other non-standard versions of ADPCM. In general, when packets of data are lost, the adaptive sub-band coder performs extremely well in terms of maintaining speech quality, as the sub-band synthesis filters fill out the gaps in speech. However, the sub-band coder requires the greatest levels of complexity and delay. The other ADPCM systems offer lower complexity and delay at the expense of lower speech quality.

Keywords: speech coding

Distortion measures for speech coding

IETE TECHNICAL REVIEW, 1998, Vol. 15, No. 4, pp. 251-258

Author: De, A. NORTEL, Bell No Res, Montreal, PQ H3E 1H6, Canada

In this article, we review some of the existing subjective and objective measures used in the area of speech coding. The mean opinion score and the diagnostic acceptability measure are two of the widely used subjective measures. The most popular class of time-domain measures is the signal-to-noise ratio (SNR) with its variants, such as the segmental SNR, the granular segmental SNR, etc. Among the spectral distortion measures, the log likelihood ratio measure, the log area ratio measure, the log spectral distortion measure, the cepstral distance and the Itakura-Saito distortion measure are quite well known. Some of the more recently proposed objective measures place emphasis on the perceptually significant aspects. Three such classes of psychoacoustically motivated measures are the information index, the Bark spectral distortion measure and the neural distance measure (e.g., the cochlear discrimination information, the cochlear hidden Markovian measures). The merit of considering important perceptual events is evident in the success of these measures.

Keywords: speech coding
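
As an illustration of the time-domain measures named in the abstract above, the following minimal Python sketch computes a segmental SNR in its common textbook form: the average of per-frame SNR values in dB. The frame length of 160 samples, the function name, and the choice to skip frames with zero signal or zero error energy are assumptions made for this sketch, not details taken from the article.

```python
import math


def segmental_snr(reference, decoded, frame_len=160):
    """Mean per-frame SNR in dB between a reference signal and its decoded version."""
    if len(reference) != len(decoded):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    # Only complete frames are evaluated; a trailing partial frame is ignored.
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal_energy == 0.0 or noise_energy == 0.0:
            continue  # assumption: skip frames whose SNR would be undefined
        frame_snrs.append(10.0 * math.log10(signal_energy / noise_energy))
    if not frame_snrs:
        raise ValueError("no frame with a finite SNR")
    return sum(frame_snrs) / len(frame_snrs)
```

Averaging in the dB domain gives quiet frames the same weight as loud ones, which is the usual motivation for preferring the segmental form over a single whole-signal SNR.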

ALGEBRAIC SPEECH CODING - A TERNARY CELP

ANNALES DES TELECOMMUNICATIONS-ANNALS OF TELECOMMUNICATIONS, 1992, Vol. 47, No. 5-6, pp. 214-226

Author: DIFRANCESCO, R. France Telecom DTRE, 100 rue du Faubourg Saint-Antoine, F-75012 Paris, France

This article presents new speech coding methods for real-time applications (telephone, videophone) or off-line applications (storage). Speech quality is in the classical telephone range, with a 4 kHz bandwidth and sampling at 8 kHz. An elementary approach leads to a 16 kbit/s codec and a 24 kbit/s codec, using integer codebooks and fast computations. The speech quality of the two codecs has been measured in comparison with more complex ones and in realistic conditions, with noisy telecommunication channels. The elementary approach is completed by a synthetic model, with a systematic generalization of the algorithms (e.g. for a generalized VSELP). Some methods for channel protection, which are already known to speech coding researchers, are summed up in the Appendix. A change of representation for low density codes (less than 1 bit/sample) is proposed.

Keywords: speech coding ALGEBRAIC METHOD LINEAR PREDICTION CODEC

A psychoacoustic "NofM"-type speech coding strategy for cochlear implants

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2005, Vol. 2005, No. 18, pp. 3044-3059

Authors: Nogueira, W; Büchner, A; Lenarz, T; Edler, B. Leibniz Univ Hannover, Informat Technol Lab, D-30167 Hannover, Germany; Med Univ Hanover, Dept Otolaryngol, D-30625 Hannover, Germany

We describe a new signal processing technique for cochlear implants using a psychoacoustic-masking model. The technique is based on the principle of a so-called "NofM" strategy. These strategies stimulate fewer channels (N) per cycle than active electrodes (NofM; N < M). In "NofM" strategies such as ACE or SPEAK, only the N channels with higher amplitudes are stimulated. The new strategy is based on the ACE strategy but uses a psychoacoustic-masking model in order to determine the essential components of any given audio signal. This new strategy was tested on device users in an acute study, with either 4 or 8 channels stimulated per cycle. For the first condition (4 channels), the mean improvement over the ACE strategy was 17%. For the second condition (8 channels), no significant difference was found between the two strategies.

Keywords: cochlear implant NofM ACE speech coding psychoacoustic model masking
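
The baseline "NofM" selection that the abstract above attributes to strategies such as ACE or SPEAK (stimulating only the N channels with the highest amplitudes out of M) can be sketched in a few lines of Python. The function name and the representation of one stimulation cycle as a list of per-channel envelope amplitudes are assumptions for illustration. The sketch shows plain amplitude-based selection only, not the psychoacoustic-masking selection that the article proposes.

```python
def select_n_of_m(envelopes, n):
    """Return the sorted indices of the n channels with the largest amplitudes.

    envelopes holds one amplitude per channel for a single stimulation cycle
    (hypothetical input format); len(envelopes) is M.
    """
    if not 0 < n <= len(envelopes):
        raise ValueError("n must satisfy 0 < n <= number of channels")
    # Rank channel indices by amplitude, largest first, then keep the top n.
    ranked = sorted(range(len(envelopes)), key=lambda ch: envelopes[ch], reverse=True)
    return sorted(ranked[:n])
```

For six channels with amplitudes 0.1, 0.9, 0.3, 0.7, 0.2 and 0.5, selecting N = 4 keeps channels 1, 2, 3 and 5 and leaves the two weakest channels unstimulated in that cycle.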
