Room acoustic parameters that characterize acoustic environments can help to improve signal enhancement algorithms, such as dereverberation, or automatic speech recognition by adapting models to the current parameter set. The reverberation time (RT) and the early-to-late reverberation ratio (ELR) are two key parameters. In this paper, we propose a blind ROom Parameter Estimator (ROPE) based on an artificial neural network that learns the mapping from single-microphone speech signals to discrete ranges of the RT and the ELR. Auditory-inspired acoustic features are used as neural network input; they are generated by a temporal modulation filter bank applied to the speech time-frequency representation. ROPE performance is analyzed in various reverberant environments, in both clean and noisy conditions, for fullband and subband RT and ELR estimation. The importance of specific temporal modulation frequencies is analyzed by evaluating the contribution of individual filters to the ROPE performance. Experimental results show that ROPE is robust against variations caused by room impulse responses (measured versus simulated), mismatched noise levels, and speech variability reflected through different corpora. Compared to state-of-the-art algorithms tested in the acoustic characterisation of environments (ACE) challenge, the ROPE model is the only one that is among the best for all individual tasks (RT and ELR estimation from fullband and subband signals). ROPE even improves fullband estimates by integrating speech-related frequency subbands. Furthermore, the model requires the least computational resources, with a real-time factor at least two times faster than competing algorithms. Results are achieved with an average observation window of 3 s, which is important for real-time applications.
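To make the feature pipeline concrete, the following is a minimal sketch of how a temporal modulation filter bank could be applied to a speech time-frequency representation. The filter order, modulation band edges, STFT settings, and the downstream classifier are illustrative assumptions, not the authors' implementation.

```python
# Sketch: auditory-inspired temporal modulation features for blind RT/ELR estimation.
import numpy as np
from scipy.signal import stft, butter, lfilter

def modulation_features(x, fs=16000, mod_bands=((1, 4), (4, 8), (8, 16))):
    """Band-pass filter the temporal envelope of each frequency channel."""
    _, _, S = stft(x, fs=fs, nperseg=512, noverlap=384)
    env = np.log(np.abs(S) + 1e-8)                 # log time-frequency envelope
    frame_rate = fs / (512 - 384)                  # envelope sampling rate (Hz)
    feats = []
    for lo, hi in mod_bands:                       # assumed modulation bands (Hz)
        b, a = butter(2, [lo / (frame_rate / 2), hi / (frame_rate / 2)], "band")
        band = lfilter(b, a, env, axis=1)          # filter along the time axis
        feats.append(band.std(axis=1))             # per-channel modulation energy
    return np.concatenate(feats)                   # one feature vector per signal

# ~3 s observation window, as used for the reported results; the resulting
# vectors would then feed a small neural network over discrete RT/ELR classes.
feats = modulation_features(np.random.randn(3 * 16000))
```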
ISBN (print): 9783903176140
The Internet of Things (IoT) is a Distributed System of cooperating Microservices (µSs). IoT services manage devices that monitor and control their environments. The interaction of the IoT with the physical environment creates strong security, privacy, and safety implications, which makes providing adequate security for IoT µSs essential. However, the complexity of IoT services makes detecting anomalous behavior difficult. We present a machine-learning-based approach for modeling IoT service behavior by observing only inter-service communication. Our algorithm continuously learns µS models on distributed IoT nodes within an IoT site. Combining the learned models within and between IoT sites converges our µS models within a short time. Sharing the resulting stable models among compute nodes enables good anomaly detection. As one application, firewalling IoT µSs becomes possible. Combining our autonomous µS modeling with firewalling enables retrofitting security, in particular access control, to existing non-secure IoT installations. Our proposed approach is resource efficient, matching the requirements of the IoT. To evaluate the quality of our proposed algorithm, we show its behavior for a set of common IoT attacks. We evaluate how domain knowledge enables us to decorrelate events on a node, and how adding context features improves the detection rate.
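As a rough illustration of the idea of learning µS behavior from inter-service communication alone, the sketch below models each service as a categorical distribution over observed (peer, port) events, merges counts learned on different nodes, and flags rare events. The model class, threshold, and event features are simplifying assumptions; the paper's actual algorithm is not reproduced here.

```python
# Sketch: per-microservice communication model with count merging and anomaly flagging.
from collections import Counter

class ServiceModel:
    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def observe(self, peer, port):
        self.counts[(peer, port)] += 1
        self.total += 1

    def merge(self, other):
        """Combine models learned on different IoT nodes or sites."""
        self.counts.update(other.counts)
        self.total += other.total

    def is_anomalous(self, peer, port, threshold=1e-3):
        p = self.counts.get((peer, port), 0) / max(self.total, 1)
        return p < threshold            # unseen or rare communication -> anomaly

# Example: normal MQTT traffic is learned, telnet to an unknown peer is flagged.
m = ServiceModel()
for _ in range(1000):
    m.observe("sensor-gw", 1883)
print(m.is_anomalous("unknown-host", 23))   # True
```

A stable model of this kind could also be translated into firewall rules (allow only event tuples above the probability threshold), which is the retrofitting application described above.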
ISBN (print): 9781450361132
Modern supercomputers often use Graphics Processing Units (GPUs) to meet the ever-growing demands of high-performance computing. GPUs typically have a complex memory architecture with various types of memories and caches, such as global memory, shared memory, constant memory, and texture memory. The placement of data in these memories has a tremendous impact on the performance of HPC applications, and identifying the optimal placement is non-trivial. In this paper, we propose a machine-learning-based approach to build a classifier that determines the class of GPU memory that minimizes kernel execution time. The approach uses a set of performance counters obtained from profiling runs, along with hardware features, to train the model. We evaluate our approach on several generations of NVIDIA GPUs, including Kepler, Maxwell, Pascal, and Volta, on a set of benchmarks. The results show that the trained model achieves a prediction accuracy of over 90% and that, given a global-memory version of a kernel, the classifier can accurately determine which data placement variant would yield the best performance.
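The shape of such a classifier can be sketched as follows: profiling counters plus hardware features on one side, the observed best memory class on the other. The feature names, counter values, and the random-forest choice here are placeholders, not the paper's exact model or dataset.

```python
# Sketch: classify the best GPU memory class from profiling counters + HW features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

MEMORY_CLASSES = ["global", "shared", "constant", "texture"]

# One row per profiled (kernel, data structure) pair:
# [gld_transactions, l2_hit_rate, occupancy, reuse_factor, sm_count]  (assumed features)
X = np.array([
    [1.2e6, 0.35, 0.60, 8.0, 80],
    [4.0e4, 0.90, 0.85, 1.1, 56],
])
y = np.array([1, 0])   # index into MEMORY_CLASSES that was fastest in measurements

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Predict the placement for a new kernel's counters (illustrative values):
new_kernel = [[2.5e5, 0.55, 0.70, 3.0, 80]]
print(MEMORY_CLASSES[clf.predict(new_kernel)[0]])
```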
ISBN (print): 9781538693858
The growing popularity of video streaming, the bandwidth improvements that come with 5G networks, and the larger amounts of data transmitted by advanced video formats such as Ultra High Definition (UHD) at 4K and 8K resolutions lead users to demand high perceptual quality for the content they consume. It is therefore necessary to create advanced models that predict video quality from the video features and the encoding settings in environments where no reference is available. Machine learning (ML) techniques that analyze patterns extracted from the features of audio-visual content improve the generation of models that accurately predict quality. This paper presents a novel model for assessing video quality based on the analysis of the encoding settings of the transmitted content and intrinsic image characteristics, objectively estimating the Mean Opinion Score (MOS) in correlation with subjective results. Combining a collection of parameters associated with the transmitted video through data mining techniques improves on traditional quality evaluation, as demonstrated with the database analyzed for this purpose.
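A no-reference MOS predictor of this kind can be sketched as a regression over encoding settings and content features, evaluated by its correlation with subjective scores. The feature set, the tiny data sample, and the gradient-boosting regressor below are illustrative assumptions rather than the model proposed in the paper.

```python
# Sketch: predict MOS from encoding settings and intrinsic content features.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from scipy.stats import pearsonr

# Each row: [bitrate_kbps, qp, height, spatial_info, temporal_info]  (assumed features)
X = np.array([
    [ 3000, 32, 1080, 55.0, 20.0],
    [15000, 22, 2160, 70.0, 35.0],
    [ 8000, 27, 2160, 40.0, 12.0],
    [ 1500, 38, 1080, 62.0, 28.0],
])
mos = np.array([3.1, 4.6, 4.2, 2.4])        # subjective Mean Opinion Scores

model = GradientBoostingRegressor().fit(X, mos)
pred = model.predict(X)
print("Pearson correlation with MOS:", pearsonr(pred, mos)[0])
```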
ISBN (print): 9781450357043
The combination of growth in compute capabilities and availability of large datasets has led to a re-birth of deep learning. Deep Neural Networks (DNNs) have become state-of-the-art in a variety of machine learning tasks spanning domains across vision, speech, and machine translation. Deep learning (DL) achieves high accuracy in these tasks at the expense of hundreds of ExaOps of computation, posing significant challenges to efficient large-scale deployment in both resource-constrained environments and data centers. One of the key enablers for improving the operational efficiency of DNNs is the observation that, when extracting deep insight from vast quantities of structured and unstructured data, the exactness imposed by traditional computing is not required. Relaxing the "exactness" constraint enables exploiting opportunities for approximate computing across all layers of the system stack. In this talk we present a multi-TOPS AI core [3] for acceleration of deep learning training and inference in systems from edge devices to data centers. We demonstrate that deriving high sustained utilization and energy efficiency from the AI core requires ground-up re-thinking to exploit approximate computing across the stack, including algorithms, architecture, programmability, and hardware. Accuracy is the fundamental measure of deep learning quality. The compute-engine precision in our AI core is carefully calibrated to realize a significant reduction in area and power while not compromising numerical accuracy. Our research at the DL algorithms/applications level [2] shows that it is possible to carefully tune the precision of both weights and activations to as low as 2 bits for inference; this guided the choices of compute precision supported in the architecture and hardware for both training and inference. Similarly, the scalability of distributed DL training is impacted by the communication overhead of exchanging gradients and weights after each mini-batch. Our research on gradient
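To give a feel for what tuning weights and activations down to 2-bit precision means, the following is a generic symmetric uniform quantization sketch. The scaling rule is a common textbook choice and is not necessarily the scheme used in the referenced work.

```python
# Sketch: symmetric uniform quantization of weights/activations to very low precision.
import numpy as np

def quantize(x, bits=2):
    levels = 2 ** (bits - 1) - 1          # 2-bit signed -> levels {-1, 0, +1}
    scale = np.max(np.abs(x)) / max(levels, 1)
    q = np.clip(np.round(x / scale), -levels, levels)
    return q * scale                       # dequantized values used in compute

w = np.random.randn(4, 4).astype(np.float32)
print(np.abs(w - quantize(w, bits=2)).mean())   # mean quantization error
```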
Convolutional Neural Networks (CNNs), a special subcategory of Deep Neural Networks (DNNs), have become increasingly popular in industry and academia for their powerful capabilities in pattern classification, image processing, and speech recognition. Recently, they have been widely adopted in High Performance Computing (HPC) environments for solving complex problems related to modeling, runtime prediction, and big data analysis. Current state-of-the-art designs for DNNs on modern multi- and many-core CPU architectures, such as variants of Caffe, have reported promising performance in speedup and scalability, comparable with GPU implementations. However, modern CPU architectures employ Non-Uniform Memory Access (NUMA) to integrate multiple sockets, which creates unique challenges for designing highly efficient CNN frameworks. Without a careful design, DNN frameworks can easily suffer from long memory latency due to a large number of memory accesses to remote NUMA domains, resulting in poor scalability. To address this challenge, we propose a NUMA-aware, multi-solver CNN design, named NUMA-Caffe, for accelerating deep learning neural networks on multi- and many-core CPU architectures. NUMA-Caffe is independent of DNN topology, does not impact network convergence rates, and provides superior scalability to the existing Caffe variants. Through a thorough empirical study on four contemporary NUMA-based multi- and many-core architectures, our experimental results demonstrate that NUMA-Caffe significantly outperforms the state-of-the-art Caffe designs in terms of both throughput and scalability.
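The multi-solver idea of keeping each solver's working set on its own NUMA node can be sketched, very roughly, as one worker process per node pinned to that node's cores with first-touch-local data. This is a conceptual sketch, not NUMA-Caffe: the CPU ranges per node are assumptions (query them with lscpu or libnuma on a real machine), and os.sched_setaffinity is Linux-only.

```python
# Sketch: one solver process per NUMA node, pinned to local cores, local data shard.
import os
import multiprocessing as mp

NUMA_NODES = {0: range(0, 16), 1: range(16, 32)}   # node -> CPU ids (assumed layout)

def solver(node, cpus, shard):
    os.sched_setaffinity(0, cpus)          # pin this solver to one NUMA node (Linux)
    local = list(shard)                    # first-touch: data pages allocate locally
    # ... forward/backward passes over `local`; gradients exchanged infrequently ...
    print(f"solver on node {node} processing {len(local)} samples")

if __name__ == "__main__":
    data = list(range(1024))
    shards = [data[i::len(NUMA_NODES)] for i in range(len(NUMA_NODES))]
    procs = [mp.Process(target=solver, args=(n, c, s))
             for (n, c), s in zip(NUMA_NODES.items(), shards)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```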
While a large number of deep learning networks that produce outstanding results on natural image datasets have been studied and published, these datasets make up only a fraction of those to which deep learning can be ...