Search Results - Inner Mongolia University Library

29th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2023

Authors: Zhang, Gaowei; Song, Junping; Hu, Yahui; Fan, Pengfei; Li, Chong; Zhou, Xu. Affiliations: Chinese Academy of Sciences, Computer Network Information Center, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; China University of Mining and Technology, Beijing, China

ISBN: (print) 9798350330717

Vehicle Edge Computing (VEC) has emerged as an effective paradigm that supports real-time, computation-intensive vehicular applications. However, because the topology of computing nodes is highly dynamic, existing scheduling algorithms need to capture the characteristics of fine-grained task topologies and network topologies more effectively. Moreover, they incur significant communication overhead and training costs, making them inadequate for fine-grained task scheduling in vehicular networks. In response, our research explores fine-grained task scheduling in VEC scenarios and proposes a scheduling algorithm based on Graph Neural Networks and Federated Learning (FL-GNN). This algorithm maintains a global scheduling model that periodically aggregates local scheduling models deployed on Roadside Units (RSUs) and high-performance vehicles. Furthermore, to enhance the model's ability to perceive topology and to speed up convergence, we incorporate a graph neural network layer in each local model to preprocess the raw state of the VEC environment. Lastly, we construct a simulation platform and implement multiple competitive solutions, demonstrating the superiority of the FL-GNN algorithm in reducing the average task delay, balancing the load, and improving the task scheduling success rate. © 2023 IEEE.
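
The periodic aggregation of local scheduling models into a global model can be pictured with a minimal sketch. It assumes plain sample-weighted parameter averaging (FedAvg-style); the abstract does not state which aggregation rule FL-GNN uses, and the function name `aggregate`, the flat parameter lists, and the sample counts are all hypothetical.

```python
from typing import List, Sequence


def aggregate(local_params: Sequence[Sequence[float]],
              sample_counts: Sequence[int]) -> List[float]:
    """Sample-weighted average of local parameter vectors (assumed FedAvg-style rule)."""
    if not local_params or len(local_params) != len(sample_counts):
        raise ValueError("need one sample count per local model")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("sample counts must sum to a positive number")
    dim = len(local_params[0])
    global_params = [0.0] * dim
    for params, n in zip(local_params, sample_counts):
        if len(params) != dim:
            raise ValueError("all local models must have the same shape")
        for i, p in enumerate(params):
            # Each local model contributes in proportion to its sample count.
            global_params[i] += p * n / total
    return global_params


# Hypothetical local models held by two RSUs and one high-performance vehicle.
rsu_a = [1.0, 2.0]
rsu_b = [3.0, 4.0]
vehicle = [5.0, 6.0]
global_model = aggregate([rsu_a, rsu_b, vehicle], [1, 1, 2])
```

In this sketch the vehicle's model counts twice as much as each RSU's because it is assumed to have trained on twice as many samples.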

Keywords: deep reinforcement learning; federated learning; fine-grained task scheduling; graph neural networks; vehicle edge computing

Recommender Systems based on Parallel and Distributed Deep Learning

27th Pan-Hellenic Conference on Progress in Computing and Informatics with International Participation, PCI 2023

Authors: Stergiopoulos, Vaios; Tousidou, Eleni; Corral, Antonio. Affiliations: Dept. of Electrical and Computer Eng., University of Thessaly, Volos, Greece; Department of Informatics, University of Almeria, Almeria, Spain

ISBN: (print) 9798400716263

As individuals have become overloaded with information, Recommender Systems (RS) were created to provide machine-generated recommendations. Significant advancements in RS have been made thanks to Machine Learning methods; Deep Learning (DL) in particular has become extremely popular. Although Deep Neural Networks (DNNs) notably improve the performance of RS, they also make them larger and more memory-intensive. One solution is to add (data or model) parallel and distributed algorithms to DL-based RS. In this paper, we present our large-scale, multi-staged, hybrid RS that processes a million-scale dataset, as well as the most noteworthy parallel and/or distributed DL systems. Finally, we outline directions for the future evolution of our RS by adding features and ideas from such systems. © 2023 Owner/Author.

Keywords: Recommender systems

AshPipe: Asynchronous Hybrid Pipeline Parallel for DNN Training

7th International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia)

Authors: Hosoki, Ryubu; Endo, Toshio; Hirofuchi, Takahiro; Ikegami, Tsutomu. Affiliations: Tokyo Inst Technol, Yokohama, Kanagawa, Japan; Natl Inst Adv Ind Sci & Technol, Tokyo, Japan

ISBN: (print) 9798400708893

Deep Neural Networks (DNNs) have become increasingly computationally intensive and have more parameters, requiring efficient parallelization or distribution across multiple accelerators. Pipeline parallelism has been proposed as an effective way to distribute models and improve hardware utilization. However, pipeline parallelism involves a trade-off between speedup and accuracy: synchronous approaches do not provide sufficient speedup, while asynchronous approaches suffer from accuracy degradation due to a scheme that differs from that of a single worker. In this paper, we propose AshPipe, a hybrid parallel framework that combines data parallelism and asynchronous pipeline parallelism to achieve efficient speedup for training. The proposed runtime uses the 1F1B schedule and data parallelism, with non-uniform numbers of workers and identical global batch sizes across stages. A Switch Parallelism (SP) mechanism is also proposed as an option to mitigate accuracy degradation; it switches from data parallelism to hybrid parallelism in the course of training. Experimental results show that AshPipe achieves 1.844x the throughput of data parallelism for ViT-H/14, which has 632M parameters. With the SP mechanism, AshPipe achieved a 30.2% reduction in training time with comparable accuracy compared to data parallelism when training on the CIFAR100 dataset.

Keywords: Distributed deep learning; Data parallelism; Pipeline parallelism; Hybrid parallelism

Energy Efficient Cluster-Based Routing Protocol for WSN Using Nature Inspired Algorithm

WIRELESS PERSONAL COMMUNICATIONS, 2023, vol. 130, no. 4, pp. 2407-2440

Authors: Mishra, Rashmi; Yadav, Rajesh K. K. Affiliation: Delhi Technol Univ, Delhi, India

WSNs consist of tiny sensors that are distributed over the entire network and are capable of sensing data, processing it, and conveying it from one node to another. The purpose of the study is to minimize the power utilization per round and extend the network lifespan. Nature-inspired mechanisms are used to minimize the power utilization of the network. In the proposed study, the Butterfly Optimization Algorithm (BOA) is used to choose the optimal number of cluster heads from the dense nodes (available nodes). The parameters considered for the choice of the cluster head are: the remaining power of the node; distance from the other nodes in the network; distance from the base station; node centrality; and node degree. Particle swarm optimization (PSO) is used to form the clusters by choosing certain parameters, such as distance from the cluster head and the BS. The path is chosen by means of the Ant Colony Optimization (ACO) mechanism. The route is optimized by the distance, node degree, and the remaining power. The overall performance of the proposed protocol is measured in terms of stability period, number of active nodes, data received by the base station, and overall power utilization of the network. The results of the proposed methodology are compared with existing mechanisms such as LEACH, DEEC, DDEEC, and EDEEC (Khan et al. in World Appl Sci J, 2013; Arora and Singh in Soft Comput 23:715-734, 2019; Saini and Sharma in 2010 First International Conference on Parallel, Distributed and Grid Computing (PDGC 2010), 2010; Elbhiri et al. in 2010 5th international symposium on I/V communications and mobile network, 2010) and with swarm mechanisms such as CRHS, BERA, FUCHAR, ALOC, CPSO, and FLION. This review will help investigators discover the applications in this arena.

Keywords: Wireless sensor networks; Energy efficiency; Throughput; Delay; Nature inspired algorithm

A GAN-based Approach to Detect AI-Generated Images

26th ACIS International Winter Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD-Winter 2023

Authors: Monkam, Galamo; Xu, Weifeng; Yan, Jie. Affiliations: Bowie State University, Department of Computer Science, Bowie, United States; School of Criminal Justice, University of Baltimore, Baltimore, United States

ISBN: (print) 9798350345865

The ease with which deep learning can generate fake images has created a pressing need for a robust platform to distinguish between real and fake imagery. However, existing methods in image forensics rely on complex deep learning architectures that are expensive to train and have limited usability due to their large model size. This study examines the difficulty of detecting state-of-the-art image manipulations, both manually and automatically. We introduce G-JOB GAN, a machine learning model based on Generative Adversarial Networks (GAN), which generates highly realistic images and achieves a 95.7% accuracy in detecting realistic generated images. The same G-JOB GAN architecture can also detect fake images with a similar probability. To verify the results, we compare them with several similar GAN architectures, including Style GAN, Pro GAN, and the Original GAN. Our model outperforms other GAN models in terms of detection accuracy. © 2023 IEEE.

Keywords: Generative adversarial networks

Graph-based Multi-view Clustering for Web services

21st IEEE International Symposium on Parallel and Distributed Processing with Applications, 13th IEEE International Conference on Big Data and Cloud Computing, 16th IEEE International Conference on Social Computing and Networking and 13th International Conference on Sustainable Computing and Communications, ISPA/BDCloud/SocialCom/SustainCom 2023

Authors: Yang, Wang; Zhenzhen, Yuan; Guosheng, Kang; Buqing, Cao; Jianxun, Liu; Yong, Xiao. Affiliations: Hnust, Xiangtan, China; Hunan University of Science and Technology, Hunan Provincial Key Lab. for Services Computing and Novel Software Technology, Xiangtan, China; School of Computer Science and Engineering, China

ISBN: (print) 9798350329223

The number of Web services on the Internet has been steadily increasing in recent years due to their growing popularity. In a big data environment, managing Web services effectively is important for service discovery, service recommendation, and similar tasks. Web service clustering has been widely studied as an effective approach to service management. However, most current Web service clustering methods extract information from only one view, such as Web service content descriptions or the networks in which Web services participate. Extracting information from a single view cannot provide a multi-dimensional and comprehensive description of Web services, which may diminish the effect of Web service clustering. In addition, some Web service resources are wasted if other information about the Web services is not used at the same time. We find that multi-view clustering can consider multiple kinds of information about a data item simultaneously, and that these kinds of information can complement and enhance each other according to the characteristics of multi-view clustering. Therefore, in this paper, we apply graph-based multi-view clustering to Web services to improve the performance of Web service clustering by simultaneously considering multiple kinds of feature information about Web services and assigning different weights to different information in the clustering process. © 2023 IEEE.

Keywords: Web services

iNUMAlloc: Towards Intelligent Memory Allocation for AI Accelerators with NUMA

Authors: Xu, Yuanchao; Qian, Ruyi; Wang, Yida; Huo, Qirun. Affiliations: Capital Normal University, College of Information Engineering, Beijing, China; Skl of Computer Architecture, Institute of Computing Technology, Cas, Beijing, China

ISBN: (print) 9798350329223

The remarkable success of deep neural networks benefits from the rise of big data. As deep learning models grow larger than ever before, their requirements for memory bandwidth are growing at a tremendous pace. Some AI accelerators adopt a non-uniform memory access (NUMA) architecture to mitigate this issue, which complicates device memory allocation. Although extensive studies have been conducted on how to mitigate resource contention and reduce latency, almost all of them target CPU-oriented NUMA systems rather than AI accelerators, where memory allocation precedes task scheduling. The current memory allocator generally adopts an interleaved memory allocation strategy, which is very easy to implement but far from *** tackle this issue, this paper proposes iNUMAlloc, an intelligent memory allocator specialized for AI accelerators with NUMA architecture that combines program behavior and predictable hardware resources. Preliminary evaluation shows that it can help to improve the accuracy and efficiency of memory allocation, thereby achieving stable execution time. © 2023 IEEE.
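
The interleaved strategy that the abstract names as the common baseline can be pictured with a small sketch. It assumes round-robin placement at page granularity across NUMA nodes; the 4 KiB page size, the node count, and the function name `interleave_pages` are illustrative assumptions rather than details from the paper, and the sketch shows the baseline only, not iNUMAlloc.

```python
from typing import List

PAGE_SIZE = 4096  # assumed page size in bytes


def interleave_pages(alloc_bytes: int, num_nodes: int, start_node: int = 0) -> List[int]:
    """Return the NUMA node chosen for each page of one allocation, round-robin."""
    if alloc_bytes < 0:
        raise ValueError("allocation size must be non-negative")
    if num_nodes <= 0:
        raise ValueError("need at least one NUMA node")
    # Round the request up to whole pages.
    num_pages = (alloc_bytes + PAGE_SIZE - 1) // PAGE_SIZE
    # Consecutive pages go to consecutive nodes, wrapping around.
    return [(start_node + page) % num_nodes for page in range(num_pages)]


# A request slightly larger than five pages on a hypothetical 4-node device.
placement = interleave_pages(5 * PAGE_SIZE + 1, num_nodes=4)
```

The placement ignores which compute unit will touch each page, which is one way to read the abstract's remark that interleaving is easy to implement but leaves room for improvement.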

Keywords: Deep neural networks

Embedding (K9-C9)n into Certain Necklace Graphs

7th International Conference on Big Data and Cloud Computing Challenges, ICBCC 2022

Authors: Afiya, Syeda; Rajesh, M. Affiliations: School of Advanced Sciences, Vellore Institute of Technology, Tamilnadu, Chennai, India; School of Computer Science and Engineering, Vellore Institute of Technology, Tamilnadu, Chennai, India

ISBN: (print) 9789819910502

In parallel and distributed computing, there are practically two networks: linear networks (also called paths) and rings (also called loops). Many efficient algorithms, such as those for signal and image processing, were first discovered by solving algebraic problems, graphical problems and parallel implementations involving linear networks and rings. As a result, a network having both good path and cycle embedding is crucial. In this paper, using an embedding method, we simulate the Cartesian product of the (K9-C9)n graph into certain necklace graphs. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Keywords: Linear networks

Auto-HPCnet: An Automatic Framework to Build Neural Network-based Surrogate for High-Performance Computing Applications

32nd International Symposium on High-Performance Parallel and Distributed Computing (HPDC), part of the ACM Federated Computing Research Conference (FCRC)

Authors: Dong, Wenqian; Kestor, Gokcen; Li, Dong. Affiliations: Florida Int Univ, Miami, FL 33199, USA; Univ Calif Merced, Miami, FL 95343, USA; Pacific Northwest Natl Lab, Richland, WA 99352, USA

ISBN: (print) 9798400701559

High-performance computing communities are increasingly adopting Neural Networks (NN) as surrogate models in their applications to generate scientific insights. Replacing an execution phase in the application with NN models can bring significant performance improvement. However, there is a lack of tools that can help domain scientists automatically apply NN-based surrogate models to HPC applications. We introduce a framework, named Auto-HPCnet, to democratize the usage of NN-based surrogates. Auto-HPCnet is the first end-to-end framework that makes past proposals for the NN-based surrogate model practical and disciplined. Auto-HPCnet introduces a workflow to address unique challenges when applying the approximation, such as feature acquisition and meeting the application-specific constraint on the quality of the final computation outcome. We show that Auto-HPCnet can leverage NN for a set of HPC applications and achieve a 5.50x speedup on average (up to 16.8x, with data preparation cost included) while meeting the application-specific constraint on the final computation quality.

Keywords: Scientific Machine Learning; Neural Architecture Search; Surrogate Model Construction; Bayesian Optimization

BreathPass: Ultrasounic Authentication by Chest and Abdomen Movement while Breathing

30th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2024

Authors: Li, Lingkun; Dang, Fan; Liu, Duo; Cao, Zhichao. Affiliations: Beijing Jiaotong University, China; Tsinghua University, China; Michigan State University, United States

ISBN: (print) 9798331515966

In this study, we propose BreathPass, a non-invasive authentication system that characterizes the chest/abdomen movement caused by human breathing to enable unlocking smart devices while the user wears various types of face covers or clothing, in different postures, and in dynamic states such as walking or running. To capture the breathing pattern, BreathPass uses speakers to emit ultrasound signals. The signals are reflected off the chest wall and abdomen and then back to the microphone, which records the reflected signals. The system then extracts the breathing pattern from the reflected signals, further extracts fingerprints from the breathing pattern, and uses these fingerprints to perform authentication. We carefully design a Deep Neural Network model and explore its capacity for feature abstraction in order to address the challenges associated with tiny position changes resulting in different breathing patterns and the extremely narrow bandwidth of breathing. We implement a prototype and conduct extensive experiments. BreathPass achieves an overall accuracy of 83%, a true positive rate of 73%, and a false positive rate of 5%, according to performance evaluation results. © 2024 IEEE.

Keywords: Deep neural networks
