Most multi-channel speaker extraction schemes use the target speaker’s location information as a reference, which must be known in advance or derived from visual cues. In addition, memory and computation costs are enormous when the model deals with the fused input. In this paper, we propose the Speaker-extraction-and-filter Network (SeafNet), a low-complexity multi-channel speaker extraction network that uses only speech cues. Specifically, SeafNet separates the mixture by exploiting the correlation between an estimate of the target speaker on the reference channel and the mixed input on the remaining channels. Experimental results show that, compared with the baseline, SeafNet achieves a 6.4% relative SISNRi improvement on the fixed-geometry array and an 8.9% average relative SISNRi improvement on the ad-hoc array. Meanwhile, SeafNet achieves a 60% relative reduction in the number of parameters and a 42% relative reduction in computational cost.
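For readers unfamiliar with the SISNRi metric quoted above, the following is a minimal sketch of how scale-invariant SNR and its improvement over the unprocessed mixture are commonly computed. The function names `si_snr` and `si_snr_improvement` are illustrative, and the abstract does not state the exact evaluation code the authors used.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    n = len(target)
    # Remove the mean of each signal so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target to obtain the scaled target component.
    dot = sum(e * t for e, t in zip(est, tgt))
    scale = dot / (sum(t * t for t in tgt) + eps)
    proj = [scale * t for t in tgt]
    # Whatever is left after the projection is treated as distortion.
    noise = [e - p for e, p in zip(est, proj)]
    proj_energy = sum(p * p for p in proj)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10((proj_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, target):
    """SI-SNRi: gain of the estimate over the unprocessed mixture, in dB."""
    return si_snr(estimate, target) - si_snr(mixture, target)


# Toy signals (hypothetical values, for illustration only).
target = [1.0, -1.0, 1.0, -1.0]
mixture = [1.5, -0.5, 0.5, -1.5]
estimate = [0.9, -1.1, 1.2, -0.8]
print(si_snr_improvement(estimate, mixture, target))
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score unchanged, which is why the metric is called scale-invariant.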
The development of industrial robots, as a carrier of artificial intelligence, has played an important role in promoting the popularisation of artificial intelligence super automation technology. The paper introduces ...
Imitation learning has emerged as a promising approach for addressing sequential decision-making problems, under the assumption that expert demonstrations are optimal. However, in real-world scenarios, most demonstrations are imperfect, which limits the effectiveness of imitation learning. While existing research has focused on optimizing with imperfect demonstrations, the training typically requires a certain proportion of optimal demonstrations to guarantee performance. To tackle these problems, we propose to first purify the potential noise in imperfect demonstrations, and then conduct imitation learning from the purified demonstrations. Motivated by the success of diffusion models, we introduce a two-step purification via a diffusion process. In the first step, we apply a forward diffusion process to smooth potential noise in imperfect demonstrations by introducing additional noise. Subsequently, a reverse generative process is used to recover the optimal demonstrations from the diffused ones. We provide theoretical evidence supporting our approach, demonstrating that the distance between the purified and optimal demonstrations can be bounded. Empirical results on MuJoCo and RoboSuite demonstrate the effectiveness of our method from different aspects.
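The first purification step described above can be illustrated with a minimal sketch of a forward diffusion step. This assumes a standard DDPM-style formulation with a linear noise schedule, which the abstract does not confirm; the names `alpha_bar` and `forward_diffuse` and the schedule values are hypothetical. The reverse generative step needs a trained denoising model and is not shown.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 1..t under an assumed linear schedule."""
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def forward_diffuse(x0, t, rng, num_steps=1000):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps, with eps ~ N(0, 1)."""
    a = alpha_bar(t, num_steps)
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for x in x0]


# A toy "demonstration" vector (hypothetical values, for illustration only).
rng = random.Random(0)
demo = [0.5, -0.2, 0.1]
diffused = forward_diffuse(demo, t=200, rng=rng)
print(diffused)
```

Larger values of `t` shrink the original signal and add more Gaussian noise, so the choice of `t` trades off how much of the demonstration's own noise is smoothed against how much of its content is preserved.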
In this work, we focus on the challenging problem of Label Enhancement (LE), which aims to exactly recover label distributions from logical labels, and present a novel Label Information Bottleneck (LIB) method for LE. During the recovery of label distributions, label-irrelevant information contained in the dataset may lead to unsatisfactory recovery performance. To address this limitation, we extract the essential label-relevant information to improve recovery performance. Our method formulates the LE problem as two joint processes: 1) learning a representation with the essential label-relevant information, and 2) recovering label distributions based on the learned representation. The label-relevant information can be extracted through the “bottleneck” formed by the learned representation. Significantly, our method can explore both the label-relevant information about the label assignments and the label-relevant information about the label gaps. Evaluation experiments conducted on several benchmark label distribution learning datasets verify the effectiveness and competitiveness of LIB. Our source code is available at https://***/qinghai-zheng/LIBLE.
Self-supervised representation learning (SSRL) has gained increasing attention in point cloud understanding, in addressing the challenges posed by 3D data scarcity and high annotation costs. This paper presents PCExpe...
In order to solve the problem of poor learning effect caused by data heterogeneity among different participants in the existing federated learning methods, this paper proposes a federated data augmentation algorithm b...
The problem of visual object tracking has traditionally been handled by variant tracking paradigms, either learning a model of the object's appearance exclusively online or matching the object with the target in a...
ISBN (digital): 9781728195582
ISBN (print): 9781728195599
Effectively monitoring ships and discovering abnormal ship trajectories in time is necessary for marine traffic supervision. The basis for discovering a ship's abnormal trajectory is to predict the ship's navigation dynamically. Previous work on ship trajectory prediction mainly focuses on single-source data, for example, AIS data. These methods ignore the relations between different sources, which may improve the performance of ship trajectory prediction. We propose a neural sequence model based on heterogeneous multi-source fusion for ship trajectory completion and prediction. Our method makes better use of AIS, GPS and ARPA radar information to predict ship trajectories precisely. We construct a dataset containing about 8 million ship trajectory samples, and the experiments demonstrate that our multi-source fusion model achieves promising results.
ISBN (print): 9781509052530
Multi-core systems and heterogeneous architectures increasingly provide parallel computing power for High Performance Computing (HPC) and scientific computing. Although many algorithms have been proposed and implemented sequentially, parallel alternatives can offer more suitable, higher-performance solutions to the same problems. In this paper, three parallelization strategies are proposed and implemented for a dynamic-programming-based cloud smoothing application, using both shared memory and non-shared memory approaches. The experiments are performed on NVIDIA GeForce GT750m and Tesla K20m, two GPU accelerators of the Kepler architecture. A detailed performance analysis is presented on partition granularity at the block and thread levels, memory access efficiency, and computational complexity. The evaluations show that the parallel implementations achieve closely approximating results with high efficiency, and these strategies can be adopted in similar data analysis and processing applications.
Countless sensors embedded in IoT devices produce an ocean of data. The quality of IoT services depends on this information; hence, its accuracy is critical. Unfortunately, noise, collision, unreliable network connecti...