ISBN (digital): 9798350380347
ISBN (print): 9798350380354
The Mossformer model excels in speech separation but has not been effectively applied to music source separation. Music sources have complex characteristics and higher sampling rates, which makes separation more challenging. We address the rarely explored task of separating piano concerto recordings into individual piano and orchestral tracks. The piano and orchestra interact closely in such recordings, producing audio signals that are highly complex in both the time and frequency domains. Our main contributions are: (1) adapting the speech separation model to the novel task of piano concerto source separation, including constructing and processing a specialized dataset; and (2) introducing channel attention in the separation module to dynamically adjust feature focus based on instrument characteristics, enhancing key features. Experiments on the Piano Concerto Dataset (PCD) show improved separation performance, with an average Signal-to-Distortion Ratio (SDR) increase of 0.22 dB over the baseline model.
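To make the reported metric concrete, the following is a minimal sketch of a basic SDR computation, assuming the plain energy-ratio definition SDR = 10·log10(‖s‖² / ‖s − ŝ‖²). The abstract does not say which SDR variant or toolkit was used, so the function name `sdr_db`, the `eps` guard, and the toy signals are illustrative assumptions, not the authors' evaluation code.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Basic signal-to-distortion ratio in dB (assumed energy-ratio form).

    reference: clean source samples (e.g. the isolated piano track)
    estimate:  separated output for the same source, same length
    eps:       small constant to avoid division by zero (an assumption)
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy signals (hypothetical): a sine wave and a copy scaled by 0.9,
# so the residual is 10% of the reference amplitude.
reference = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
estimate = [0.9 * r for r in reference]

print(f"SDR: {sdr_db(reference, estimate):.2f} dB")
```

Under this definition a higher value means less residual energy relative to the true source, so an improvement is reported as a positive difference in dB against the baseline.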
The oral storytelling culture of ethnic folklore is an important component of traditional Chinese culture, so it is important to study timbre evaluation methods for this type of audio after digital capture. This paper uses subjective timbre evaluation experiments to explore the subjective perceptual characteristics of oral story audio signals, extracts objective acoustic parameters of audio timbre based on the principles of human vocal production, auditory perceptual characteristics, and speech time-domain features, and investigates objective timbre evaluation methods based on support vector regression (SVR), random forest regression (RFR), convolutional neural networks (CNN), and long short-term memory networks (LSTM). The results show that feature extraction combined with non-linear algorithms performs well in evaluating oral story timbre.
ISBN (digital): 9781728160573
ISBN (print): 9781728160580
Auto-tracking technology has attracted increasing attention in the field of stage lighting. Most existing automatic light-tracking methods use indoor positioning technology to acquire the location parameters of the tracked object, and then convert these parameters into control data that drive a moving light to follow the object. Two main problems arise: positional errors for special light positions such as front light and fixed-point light, and mistracking of moving targets. In this paper, the existing tracking model is improved according to the projection requirements of front light and fixed-point light, and the error between the light spot and the actual position is analyzed. Test results show that accuracy is relatively high in the upper-left and upper-right directions, and that the error grows as the curvature amplitude increases. The test data also provide theoretical support for setting the tracking light position and designing the performance route.
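The conversion from a located position to moving-light control data can be illustrated with a simple geometric sketch. The abstract does not give the tracking model, so everything below is an assumption: a Cartesian stage frame with z pointing up, pan measured about the vertical axis from the x axis, tilt measured from straight down, and the hypothetical helper names `pan_tilt_degrees`, `floor_spot`, and `spot_error`.

```python
import math


def pan_tilt_degrees(fixture, target):
    """Pan/tilt angles (degrees) that aim a fixture at a target point.

    fixture, target: (x, y, z) tuples in the same stage frame (assumed).
    Pan is the angle in the horizontal plane from the x axis;
    tilt is the angle away from pointing straight down.
    """
    dx = target[0] - fixture[0]
    dy = target[1] - fixture[1]
    drop = fixture[2] - target[2]  # vertical distance from fixture to target
    horizontal = math.hypot(dx, dy)
    pan = math.degrees(math.atan2(dy, dx))
    tilt = math.degrees(math.atan2(horizontal, drop))
    return pan, tilt


def floor_spot(fixture, pan, tilt, floor_z=0.0):
    """Centre of the light spot on a horizontal floor for given angles."""
    drop = fixture[2] - floor_z
    radius = drop * math.tan(math.radians(tilt))
    x = fixture[0] + radius * math.cos(math.radians(pan))
    y = fixture[1] + radius * math.sin(math.radians(pan))
    return x, y


def spot_error(spot, actual):
    """Horizontal distance between the light spot and the actual position."""
    return math.hypot(spot[0] - actual[0], spot[1] - actual[1])


# Hypothetical setup: fixture 5 m above the stage origin, performer on the floor.
fixture = (0.0, 0.0, 5.0)
performer = (3.0, 4.0, 0.0)
pan, tilt = pan_tilt_degrees(fixture, performer)
spot = floor_spot(fixture, pan, tilt)
print(f"pan={pan:.2f} deg, tilt={tilt:.2f} deg, error={spot_error(spot, performer):.6f} m")
```

In this idealized geometry the spot lands on the target, so any measured error between spot and actual position comes from positioning noise, fixture placement, or motion, which is the kind of error the paper analyzes.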
ISBN (print): 9781510812055
Visual comfort is an important index for evaluating image quality and viewing experience when watching stereoscopic video and images. This paper investigates the effect of the main part size and the disparity distribution type of stereoscopic images on visual comfort through two subjective evaluation experiments. The experimental results confirm that main part size and disparity spatial distribution type are two important factors affecting visual comfort, and they show how the influence of these factors varies under different disparity conditions. These experiments provide guidance for establishing and improving objective evaluation models and for the acquisition and display technology of stereoscopic images.
ISBN (digital): 9781728155869
ISBN (print): 9781728155876
Research on emotion classification of film and television (TV) scene images aims to enable a computer to simulate the audience's emotional perception and judge whether an image carries a positive or negative emotional tendency. This paper proposes a method for emotion classification of film and TV scene images from the perspective of emotional semantics. First, a film and TV scene image dataset was established and assessed through a subjective evaluation experiment. Then, theories from psychology and photographic art were used to extract emotional features from the images. Finally, a particle swarm optimization (PSO)-support vector machine (SVM) algorithm with dimension reduction based on feature importance was used to build the classifier. The experimental results show that the image dataset established in this paper can support subsequent studies on emotion classification of film and TV scene images, and that the positive/negative emotion classifier reaches an accuracy of 0.87 on this dataset, which is consistent with the audience's emotional perception of such images.
To protect the acoustic environment of auditoriums in performance venues, this paper focuses on methods for measuring the noise emitted by stage machinery. For equipment that is noisy and frequently used during performances, the measurement equipment, environment and conditions, detection positions, evaluation parameters, tested objects, and running requirements are studied and discussed. In particular, suitable test points and load requirements are analyzed through simulation in the EASE software and noise measurements in a laboratory environment. This research aims to lay the groundwork for a draft standard on stage machinery noise, which has application value for improving audio-visual effects and controlling noise pollution in modern theatres.
ISBN (digital): 9798350379228
ISBN (print): 9798350390780
This paper presents a learning-based high-speed trajectory tracking control strategy for quadrotors, which achieves efficient learning and strong reliability through the collaboration of deep reinforcement learning (RL) and a self-tuning mechanism. Unlike existing methods, the proposed strategy explores optimal control performance by combining a model-based self-tuning mechanism with deep RL. Specifically, a self-tuning guided deep RL scheme is put forward for quadrotors, offering superior learning efficiency and strong adaptability. First, a novel self-tuning mechanism is constructed, and auxiliary variables are introduced to enhance tracking performance. Then, based on the model-driven self-tuning design, deep RL is used to achieve model-guided learning: the tuning actions are adopted in the evaluation process during training, with a carefully designed parallel evaluation intended to remove bad explorations. Finally, convergence is analyzed within the proposed learning framework, which indicates efficient cooperation between exploration and the self-tuning mechanism. To verify the effectiveness of the proposed controller, guided training and hardware experiments are carried out, showing efficient cooperation and satisfactory high-speed trajectory tracking control.