Identifying bird species from audio recordings is a challenging task due to the presence of multiple species in the same recording, background noise, and long recording durations. Moreover, choosing a p...
Parallel Split Learning (SL) allows resource-constrained devices that cannot participate in Federated Learning (FL) to train deep neural networks (NNs) by splitting the NN model into parts. In particular, such devices (clients) may offload the processing task of the largest model part to a computationally powerful helper, and multiple helpers may be employed and work in parallel. In hybrid federated and split learning (HFSL), on the other hand, devices can participate in the training process through either of the two protocols (SL and FL), depending on the system's characteristics. This can considerably reduce the maximum training time over all clients (makespan), especially in highly heterogeneous scenarios. In this paper, we study the joint problem of training protocol selection, client-helper assignment, and scheduling decisions to minimize the training makespan. We prove this problem is NP-hard and propose two solution methods: one based on decomposing the problem by leveraging its inherent symmetry, and a second, fully scalable one. Through numerical evaluations using our testbed's measurements, we build a solution strategy combining these methods. This strategy finds near-optimal solutions and achieves a makespan up to 71% shorter than the baseline schemes.
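As a rough illustration of the assignment problem studied above (not the paper's decomposition method or its scalable algorithm), the sketch below greedily decides, per client, between local FL training and SL offloading to the least-loaded helper. The function name, the timing inputs, and the assumption that each helper processes its offloaded parts sequentially are all simplifications introduced here for illustration:

```python
# Hypothetical greedy heuristic for HFSL protocol selection and
# client-helper assignment; all timings are placeholder inputs.
def greedy_hfsl_assignment(fl_time, sl_client_time, sl_helper_time, num_helpers):
    """fl_time[i]: client i's time to train the full model locally (FL).
    sl_client_time[i]: client i's time for its small SL model part.
    sl_helper_time[i]: helper-side time for client i's offloaded part."""
    helper_load = [0.0] * num_helpers
    decisions = []
    # Consider the slowest FL clients first: they benefit most from offloading.
    for i in sorted(range(len(fl_time)), key=lambda i: -fl_time[i]):
        h = min(range(num_helpers), key=lambda h: helper_load[h])  # least-loaded helper
        sl_finish = max(sl_client_time[i], helper_load[h] + sl_helper_time[i])
        if sl_finish < fl_time[i]:      # offloading shortens this client's finish time
            helper_load[h] += sl_helper_time[i]
            decisions.append((i, "SL", h, sl_finish))
        else:
            decisions.append((i, "FL", None, fl_time[i]))
    makespan = max(finish for _, _, _, finish in decisions)
    return decisions, makespan
```

Such a heuristic captures the trade-off the paper formalizes: offloading helps slow clients but congests helpers, so the protocol choice, assignment, and implied schedule must be optimized jointly.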
Background: Pneumonia is a respiratory disease caused by bacteria; it affects many people, particularly in impoverished countries where pollution, unclean living standards, overpopulation, and insufficient medical infrastructure are prevalent. To guarantee curative therapy and boost survival chances, it is vital to detect pneumonia early. Chest X-ray imaging is the most common way of detecting pneumonia; however, analyzing chest X-rays is a complex process vulnerable to subjective variation. Moreover, the available data is growing exponentially, and training a model to predict pneumonia can take hours or days, whereas timely prediction is essential to guarantee better cure and treatment. Existing work by different authors needs more precision, and its computation time for predicting pneumonia is also much longer. There is therefore a requirement for early forecasting: using X-ray image samples, the system must provide continuous, unsupervised learning for early diagnosis. Methods: In this article, the training time of the model is accelerated using a distributed data-parallel approach and the computational power of high-performance computing devices. This research aims to diagnose pneumonia from X-ray images with more precision, greater speed, and fewer processing resources. Distributed deep learning techniques are gaining popularity owing to the rising need for computational resources for deep learning models with many parameters. In contrast to conventional training methods, data-parallel training enables several compute nodes to train massive deep-learning models concurrently, improving training efficiency. Deploying the model in Spark addresses scalability and acceleration: Spark's distributed-processing capability reads data from multiple nodes, and the results demonstrate that training time can be drastically reduced by utilizing these techniques, which is a significant necessity when dealing with large datasets.
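The article deploys its model on Spark; the following is only a minimal sketch of the same distributed data-parallel idea, written with PyTorch DistributedDataParallel. The dataset path, the ResNet-18 backbone, and the hyperparameters are assumptions for illustration, and the launcher (e.g., torchrun) is assumed to set the rendezvous environment variables:

```python
# Minimal data-parallel training sketch (illustrative, not the paper's Spark setup).
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler
from torchvision import datasets, transforms, models

def train(rank, world_size, data_dir="chest_xray/train"):  # hypothetical path
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    tfm = transforms.Compose([transforms.Grayscale(3),
                              transforms.Resize((224, 224)),
                              transforms.ToTensor()])
    ds = datasets.ImageFolder(data_dir, transform=tfm)  # NORMAL vs PNEUMONIA folders
    sampler = DistributedSampler(ds, num_replicas=world_size, rank=rank)
    loader = DataLoader(ds, batch_size=32, sampler=sampler, num_workers=4)

    model = DDP(models.resnet18(num_classes=2).cuda(rank), device_ids=[rank])
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    for epoch in range(5):
        sampler.set_epoch(epoch)            # reshuffle data shards each epoch
        for x, y in loader:
            x, y = x.cuda(rank), y.cuda(rank)
            opt.zero_grad()
            loss_fn(model(x), y).backward()  # gradients are allreduced across nodes
            opt.step()
    dist.destroy_process_group()
```

Each worker trains on its own shard of the X-ray images while gradients are synchronized automatically, which is the mechanism behind the training-time reduction the abstract reports.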
Based on the theory of speech recognition and the characteristics of music recognition, this study recognizes and processes musical sounds. An artificial neural network (ANN) is a distributed parallel information ...
With machine learning workloads now at very large scales, models are distributed across large compute systems, where their performance is limited by the bandwidth of chip-to-chip communication. To relieve this bottleneck, spiking neural networks (SNNs) can be utilized to reduce inter-chip communication traffic by exploiting their inherent sparsity. However, in comparison to traditional artificial neural networks (ANNs), SNNs can suffer significant performance degradation as network scale increases. This research proposes a hybrid neural-network accelerator that uses the best of both spiking and non-spiking layers by allocating the majority of resources to non-spiking layers on the interior of the chip, while bandwidth-limited areas (e.g., I/O pads or chip separation boundaries) employ spike-based data traffic. By limiting the overall use of spiking layers within the network, we realize the energy savings of SNNs without the accuracy degradation that comes with large spike-based networks. We present a scalable chiplet architecture and show how hybrid data is managed with both spiking and non-spiking data communication. We also demonstrate how the asynchronous spike-based model is integrated efficiently with synchronous artificial-based deep learning workloads. Our hybrid architecture offers significant improvements in performance, accuracy, and energy consumption in comparison to SNNs and ANNs. With up to a 1.34× increase in energy efficiency and a 1.56× decrease in single-inference latency, the versatility of the architecture is demonstrated by validation across multiple datasets, encompassing both language processing and computer vision tasks.
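To make the boundary idea concrete, here is a toy sketch of one common way dense activations can be turned into sparse 1-bit spike events before crossing a bandwidth-limited link and accumulated back into rates on the other side. Bernoulli rate coding, the timestep count, and the clipping are assumptions chosen for illustration; this is not the paper's accelerator design:

```python
# Toy spike-coding sketch for a bandwidth-limited chip boundary (illustrative only).
import numpy as np

def rate_encode(activations, timesteps=8):
    """Bernoulli rate coding: each unit fires with probability proportional
    to its clipped activation, yielding 1-bit events per timestep."""
    p = np.clip(activations, 0.0, 1.0)
    return np.random.rand(timesteps, *activations.shape) < p  # bool spike trains

def rate_decode(spikes):
    """Recover an approximate activation as the mean firing rate."""
    return spikes.mean(axis=0)

acts = np.random.rand(1024)        # dense activations at the chip edge
spikes = rate_encode(acts)         # only 1 bit per unit per timestep crosses the link
recovered = rate_decode(spikes)
print("mean abs error:", np.abs(acts - recovered).mean())
```

The accuracy/traffic trade-off is visible even in this toy: more timesteps mean a better rate estimate but more events on the link, which is why the paper confines spiking layers to the bandwidth-limited regions.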
It is important to scale out deep neural network (DNN) training to reduce model training time. High communication overhead is one of the major performance bottlenecks for distributed DNN training across multiple GPUs. Our investigations have shown that popular open-source DNN systems achieve a speedup of only 2.5× on 64 GPUs connected by a 56 Gbps network. To address this problem, we propose a communication backend named GradientFlow for distributed DNN training and employ a set of network optimization techniques. First, we integrate ring-based allreduce, mixed-precision training, and computation/communication overlap into GradientFlow. Second, we propose lazy allreduce, which improves network throughput by fusing multiple communication operations into a single one, and design coarse-grained sparse communication, which reduces network traffic by transmitting only important gradient chunks. When training AlexNet and ResNet-50 on the ImageNet dataset using 512 GPUs, our approach achieves speedups of 410.2× and 434.1×, respectively.
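The fusion idea behind lazy allreduce can be sketched in a few lines: rather than launching one allreduce per gradient tensor, the gradients are flattened into a single contiguous buffer so the network sees one large message. This is a minimal illustration of the general technique, not the GradientFlow implementation, and it assumes torch.distributed has already been initialized:

```python
# Sketch of fusing many small gradient allreduces into one (illustrative).
import torch
import torch.distributed as dist

def fused_allreduce(grads):
    """Average a list of gradient tensors across workers with one collective."""
    numels = [g.numel() for g in grads]
    flat = torch.cat([g.reshape(-1) for g in grads])  # one contiguous buffer
    dist.all_reduce(flat, op=dist.ReduceOp.SUM)       # single network operation
    flat /= dist.get_world_size()                     # sum -> mean
    for g, chunk in zip(grads, flat.split(numels)):   # scatter results back
        g.copy_(chunk.view_as(g))
```

Fewer, larger messages amortize per-operation latency, which is why fusion raises throughput on bandwidth-constrained clusters; the paper's coarse-grained sparse communication further drops unimportant chunks from the fused buffer.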
Distributed clustering algorithms are employed in wireless sensor networks (WSNs) to improve local data analysis. This process is carried out collaboratively with the help of nearby neighbours without a central cont...
Distributed Acoustic Sensing (DAS) technology leverages optical fibers to detect acoustic signals over long distances, offering high-resolution data critical for applications such as seismic monitoring, structural health monitoring, and security. A significant challenge in DAS systems is the accurate classification of detected events, which is crucial for their reliability. Traditional signal processing methods often struggle with the high-dimensional, noisy data produced by DAS systems, making advanced machine learning techniques essential for improved event classification. However, the lack of large, high-quality datasets has hindered progress. In this study, we present a comprehensive labeled dataset of DAS measurements collected around a university campus, featuring events such as walking, running, and vehicular movement, as well as potential security threats. This dataset provides a valuable resource for developing and validating machine learning models, enabling more accurate and automated event classification. The quality of the dataset is demonstrated through the successful training of a convolutional neural network (CNN).
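A minimal sketch of the kind of CNN such a dataset can validate is shown below, treating each DAS measurement as a 2-D patch (fiber channels × time samples). The architecture, input size, and class count are assumptions for illustration, not the model trained in the study:

```python
# Illustrative CNN for DAS event classification on spatial-temporal patches.
import torch
import torch.nn as nn

class DASEventCNN(nn.Module):
    def __init__(self, num_classes=4):  # e.g. walking, running, vehicle, background
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),     # tolerate variable patch sizes
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

patch = torch.randn(8, 1, 128, 512)   # batch of 8 patches: 128 channels x 512 samples
logits = DASEventCNN()(patch)         # -> shape (8, num_classes)
```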
ISBN (print): 9783031488023; 9783031488030
In this paper we demonstrate that it is possible to obtain considerable improvements in performance- and energy-aware metrics for training deep neural networks on a modern parallel multi-GPU system by enforcing selected, non-default power caps on the GPUs. We measure the power and energy consumption of the whole node using a professional, certified hardware power meter. For a high-performance workstation with 8 GPUs, we were able to find non-default GPU power cap settings within the range of 160-200 W that improve the difference between percentage energy gain and performance loss by over 15.0%, EDP (abbreviations and terms used are described in the main text) by over 17.3%, EDS with k = 1.5 by over 2.2%, EDS with k = 2.0 by over 7.5%, and pure energy by over 25%, compared to the default power cap setting of 260 W per GPU. These findings demonstrate the potential of today's CPU+GPU systems for configuration tuning in the context of performance-energy consumption metrics.
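For context, applying a non-default power cap and sampling board power can be done through NVML. The sketch below picks 180 W as one point inside the paper's 160-200 W range and approximates energy with a simple rectangle rule; it is not the paper's measurement setup, which uses a certified hardware power meter, and the set call requires root privileges:

```python
# Sketch: set a GPU power cap and integrate sampled power into energy (NVML).
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
pynvml.nvmlDeviceSetPowerManagementLimit(handle, 180_000)  # 180 W, in milliwatts

energy_j, dt = 0.0, 0.1
t_end = time.time() + 60.0                  # sample for one minute of the workload
while time.time() < t_end:
    power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # mW -> W
    energy_j += power_w * dt                # rectangle-rule energy accumulation
    time.sleep(dt)
print(f"GPU energy over window: {energy_j:.0f} J")
pynvml.nvmlShutdown()
```

Sweeping the cap value and recording runtime alongside energy would yield the energy/performance trade-off curves from which metrics such as EDP and EDS are computed.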
This paper addresses the challenge of joint active user detection (AUD) and channel estimation (CE) for grant-free random access within massive machine-type communications (mMTC)-enabled distributed antenna systems (D...