The authors present PTEAR_VLSNR (Pitch Tracking based on Evolutionary Algorithm with Regularization at Very Low SNR), a pitch tracking algorithm for speech in strong noise. The algorithm builds a pitch enhancement and extraction model that enhances the pitch with a matched filter. To cope further with strong noise, an optimal factor is proposed, which can be optimised globally by evolutionary computing. In particular, a regularisation constraint is applied to the fitness function to improve generalisation ability. Temporal dynamics constraints are used to improve the tracking rate, and the voicing decision can likewise be optimised by evolutionary computing. In addition, the balance between optimisation accuracy and time cost is considered. In the experiments, a genetic algorithm and particle swarm optimisation, each with a two-norm term, serve as the evolutionary algorithms with regularisation. Finally, the authors compare the performance of the algorithm with that of other representative algorithms. The experimental results show that the proposed algorithm performs well at both high and low signal-to-noise ratios (SNRs).
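The idea of a fitness function with a two-norm regularisation term can be illustrated with a small sketch. It is not the authors' implementation: the names `regularised_fitness` and `evolve`, the weight `lam`, the use of the squared two-norm, and the simple elitist mutation loop (standing in for a genetic algorithm or particle swarm optimisation) are all assumptions made for illustration.

```python
import random


def regularised_fitness(params, error_fn, lam=0.1):
    """Data error plus a squared two-norm penalty on the parameters.

    `error_fn` and `lam` are hypothetical; the abstract does not give
    the exact form of the fitness function.
    """
    penalty = sum(p * p for p in params)
    return error_fn(params) + lam * penalty


def evolve(error_fn, init, lam=0.1, generations=100, offspring=20,
           sigma=0.3, seed=0):
    """Minimal elitist evolutionary search (lower fitness is better)."""
    rng = random.Random(seed)
    best = list(init)
    best_fit = regularised_fitness(best, error_fn, lam)
    for _ in range(generations):
        for _ in range(offspring):
            # Mutate every parameter of the current best candidate.
            child = [p + rng.gauss(0.0, sigma) for p in best]
            fit = regularised_fitness(child, error_fn, lam)
            # Elitism: keep the child only if it improves the fitness.
            if fit < best_fit:
                best, best_fit = child, fit
    return best, best_fit


if __name__ == "__main__":
    # Toy error with its unregularised minimum at p = 2.
    toy_error = lambda p: (p[0] - 2.0) ** 2
    print(evolve(toy_error, [0.0], lam=0.1))
```

With the toy error above, the penalty pulls the optimum from 2 towards 2 / (1 + lam), which is the usual trade-off a regulariser introduces: slightly worse fit on the tuning data in exchange for smaller, more conservative parameters.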
Pitch tracking of music has been researched for several decades. Several improvements are possible by applying a t-distribution within the instantaneous robust algorithm for pitch tracking (IRAPT) framework to detect pitch more accurately. This article shows how to detect the pitch of music with an improved detection method that applies a statistical method; the approach uses a pitch track, that is, a sequence of frequency bin numbers. This sequence is used to create an index that offers useful features for comparing similar songs. The pitch frequency spectrum is extracted using a modified IRAPT as a base, combined with the statistical method. The pitch detection algorithm was implemented, and the percentage of matches on Thai classical music was assessed to test the accuracy of the algorithm. The longest common subsequence was used to compare the similarity of pitch sequence alignments in the music. The experimental results show that the retrieval accuracy for Thai classical music using the t-distribution variant of IRAPT (t-IRAPT) is 99.01% for top-five ranking, with the shortest query sample being five seconds long.
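Comparing two pitch tracks with the longest common subsequence can be sketched as follows. The function names and the normalisation by query length are assumptions for illustration; the abstract does not say how the LCS length is turned into a matching percentage.

```python
from typing import Sequence


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest common subsequence of two bin sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(query: Sequence[int], reference: Sequence[int]) -> float:
    """Fraction of the query matched in order (hypothetical score)."""
    if not query:
        return 0.0
    return lcs_length(query, reference) / len(query)


if __name__ == "__main__":
    query = [60, 62, 64]                  # short query pitch track
    song = [59, 60, 61, 62, 63, 64]       # indexed reference track
    print(similarity(query, song))        # 1.0
```

Because a subsequence does not have to be contiguous, the score tolerates extra or spurious bins in the reference, which suits noisy pitch tracks better than an exact substring match.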
ISBN: 9781479951000 (print)
Pitch tracking algorithms have been proposed throughout the digital speech processing literature. Practical uses of pitch tracking include improved recognition, improved speech synthesis, and semantic disambiguation. A similar problem, when the input is a music signal, is note tracking, i.e. detecting all the notes in the perceived music. The general problem of music recognition seems to be beyond the techniques developed through advances in digital speech processing. A "real music" signal is composed of multiple sounds from several instruments, and digitally separating the mix into individual channels or tracks is a hard problem. The algorithm described in this paper assumes that the input signal is produced by a single source, and it focuses on monophonic sound, as opposed to polyphonic sound, where two or more notes are played at the same time. The algorithm has been implemented on an Android device using the appropriate building blocks (Activity and Service), in compliance with the Android design guidelines, to achieve the best performance. In addition to the standard Android libraries from the latest Android SDK, the application relies on a third-party library for digital signal processing routines. The Android implementation has been tested with input from the human voice and from musical instruments, and the paper reports the experimental results for these input sources.
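Monophonic note tracking can be illustrated with a minimal sketch that estimates one pitch per frame and maps it to a note name. The abstract does not state which pitch detection method the paper uses, and its implementation targets Android; the plain autocorrelation estimator below, the function names, and the search range of 80 to 1000 Hz are assumptions chosen for a short, dependency-free Python example.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]


def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Pick the lag with the largest autocorrelation; return Hz (0 if none)."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag]
                    for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


def note_name(freq_hz):
    """Map a frequency to the nearest equal-tempered note (A4 = 440 Hz)."""
    if freq_hz <= 0:
        return None
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return NOTE_NAMES[midi % 12] + str(midi // 12 - 1)


if __name__ == "__main__":
    rate = 8000
    # Synthetic single-source input: a 440 Hz sine, 512 samples long.
    frame = [math.sin(2 * math.pi * 440.0 * n / rate) for n in range(512)]
    print(note_name(estimate_pitch(frame, rate)))
```

The estimate is quantised to integer lags, so the returned frequency is only approximate; rounding to the nearest semitone absorbs that error for a clean single-source tone, which is the monophonic case the abstract restricts itself to.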