ISBN (print): 9798350364613; 9798350364606
AMD AI Engines (AIEs) extend the design space and open up new options for coarse-grained processing in reconfigurable accelerators. Pure FPGA designs for machine learning often struggle to compete with the high clock frequencies of GPUs for data-intensive workloads with only limited control flow. Having AIEs available on-chip with an FPGA fabric allows for low-latency co-processing and permits parts of an application to be placed on the most suitable kind of processing unit. Many data-heavy workloads, particularly in the AI domain, benefit from data streaming. With TaPaSCo-AIE, we present a framework for heterogeneous systems centered around data streams. Our framework focuses on AMD Versal devices and incorporates AI Engines and 100G networking. We demonstrate the efficient use of TaPaSCo-AIE in a real-world evaluation based on a neural network, achieving significant performance improvements over CPUs and even exceeding the performance of an AMD GPU.
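The abstract does not show TaPaSCo-AIE's host API, so the following is a purely illustrative Python sketch of the stream-centric composition idea: two pipeline stages, standing in for a PL (FPGA-fabric) kernel and an AIE kernel, connected by queues that model hardware streams. All names and the toy computation are hypothetical.

```python
# Illustrative only: models stream-centric PL/AIE co-processing in plain
# Python. Hardware streams become queues; the PL and AIE kernels become
# pipeline stages. None of these names come from the TaPaSCo-AIE API.
import queue
import threading

def pl_preprocess(inp: queue.Queue, out: queue.Queue) -> None:
    """Stand-in for an FPGA-fabric (PL) kernel: normalizes each chunk."""
    while (chunk := inp.get()) is not None:
        out.put([x / 255.0 for x in chunk])
    out.put(None)  # propagate end-of-stream downstream

def aie_infer(inp: queue.Queue, out: queue.Queue) -> None:
    """Stand-in for an AI Engine kernel: a toy 'inference' reduction."""
    while (chunk := inp.get()) is not None:
        out.put(sum(chunk) / len(chunk))
    out.put(None)

if __name__ == "__main__":
    s0, s1, s2 = queue.Queue(), queue.Queue(), queue.Queue()
    threading.Thread(target=pl_preprocess, args=(s0, s1)).start()
    threading.Thread(target=aie_infer, args=(s1, s2)).start()
    for chunk in ([0, 128, 255], [64, 64, 64]):  # host streams data in
        s0.put(chunk)
    s0.put(None)
    while (result := s2.get()) is not None:
        print(result)
```

The point of the pattern is that neither stage waits for the whole dataset: data flows through both processing units concurrently, which is what makes low-latency co-processing between the fabric and the AIE array attractive.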
ISBN (print): 9783031396977; 9783031396984
Near-storage data processing and computational storage have recently received considerable attention from industry as energy- and cost-efficient ways to improve system performance. This paper introduces a computational-storage solution to enhance the performance and energy efficiency of an AI training system, especially for training a deep learning model with large datasets or high-dimensional data. Our system leverages dimensionality reduction effectively by offloading its operations to computational storage in a systematic manner. Our experimental results show that it can reduce the training time of a deep learning model by over 40.3% while lowering energy consumption by 38.2%.
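A minimal sketch of the offloading idea: reduce dimensionality before the data ever reaches the training host, so only low-dimensional projections cross the I/O path. The abstract does not name a specific reduction method, so PCA is my stand-in, and the function merely simulates what would run inside the storage device.

```python
# Schematic of near-storage dimensionality reduction. In the paper's
# system this step runs inside computational storage; here the function
# call simply stands in for that device-side work. PCA is an assumed
# stand-in method, not necessarily what the paper uses.
import numpy as np

def reduce_on_storage(X: np.ndarray, k: int) -> np.ndarray:
    """PCA via SVD -- imagine this executing in the storage device,
    so only k-dimensional projections travel to the training host."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 512))   # high-dimensional raw samples
Z = reduce_on_storage(X, k=32)       # what the host actually receives
print(X.nbytes / Z.nbytes)           # ~16x less data moved to the host
```

The bandwidth saving is the whole story: the host trains on Z instead of X, so both I/O time and the energy spent moving data shrink roughly with the reduction ratio.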
The growing need to perform neural network inference with low latency is giving rise to a broad spectrum of heterogeneous devices with deep learning capabilities. Therefore, obtaining the best performance from each d...
ISBN (print): 9798350387117; 9798350387124
With the wide adoption of deep neural network (DNN) models for various applications, enterprises and cloud providers have built deep learning clusters and increasingly deploy specialized accelerators, such as GPUs and TPUs, for DNN training jobs. To arbitrate cluster resources among multi-user jobs, existing schedulers fall short, either lacking fine-grained heterogeneity awareness or being hardly generalizable to various scheduling policies. To fill this gap, we propose a novel design of a task-level heterogeneity-aware scheduler, Hadar, based on an online optimization framework that can express other scheduling algorithms. Hadar leverages the performance traits of DNN jobs on a heterogeneous cluster, characterizes the task-level performance heterogeneity in the optimization problem, and makes scheduling decisions across both spatial and temporal dimensions. The primal-dual framework is employed, with our design of a dual subroutine, to solve the optimization problem and guide the scheduling design. Extensive trace-driven simulations with representative DNN models demonstrate that Hadar improves the average job completion time (JCT) by 3x over an Apache YARN-based resource manager used in production. Moreover, Hadar outperforms Gavel [1], the state-of-the-art heterogeneity-aware scheduler, by 2.5x in average JCT, shortens the queuing delay by 13%, and improves finish-time fairness (FTF) by 1.5%.
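The abstract names an online primal-dual framework with a custom dual subroutine but gives no details. The sketch below shows the generic primal-dual pattern for heterogeneity-aware allocation, not Hadar's actual subroutine: each accelerator type carries a dual "price", a task goes where its throughput minus the price is largest, and prices rise as capacity fills. The GPU names, throughput table, and price update are made up for illustration.

```python
# Generic online primal-dual allocation sketch (not Hadar's algorithm).
capacity = {"V100": 4, "P100": 4}          # GPUs per type (assumed)
used = {g: 0 for g in capacity}
price = {g: 0.0 for g in capacity}         # dual variables per type

# Task-level throughput of each job on each GPU type (made-up numbers
# standing in for the heterogeneity profile Hadar characterizes).
throughput = {"resnet": {"V100": 1.0, "P100": 0.6},
              "bert":   {"V100": 1.0, "P100": 0.3}}

def schedule(job):
    # Primal step: pick the type with the largest price-adjusted gain.
    best = max(capacity, key=lambda g: throughput[job][g] - price[g])
    if used[best] >= capacity[best] or throughput[job][best] <= price[best]:
        return None                        # not worth running anywhere now
    used[best] += 1
    # Dual step: raise the price as the type fills (multiplicative update).
    price[best] = 0.1 * 2 ** used[best]
    return best

for job in ["bert", "resnet", "resnet", "bert", "bert"]:
    print(job, "->", schedule(job))
```

Running it shows the prices steering later jobs away from the fast-but-crowded type, which is the mechanism a primal-dual scheduler uses to trade per-job speed against cluster-wide efficiency.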
A distributed denial-of-service (DDoS) attack involves overwhelming a network with a large amount of traffic that aims to disrupt its normal functioning. DDoS attacks can cause a variety of problems, such...
The introduction of the Internet of Things (IoT) has driven the growth of complex and intelligent solutions, which has in turn expanded the variety of innovative services and the processing capacity that a...
This article presents a graphics processing unit (GPU) scheduling scheme that maximizes the exploitation of data locality in deep neural networks (DNNs). Convolution is one of the fundamental operations used in DNNs and accounts for more than 90% of the total execution time. To leverage massive thread-level parallelism (TLP) in a GPU, deeply nested convolution loops are lowered (or unrolled) into large matrix multiplication, which trades memory capacity and bandwidth for TLP augmentation. A large workspace matrix is split into tiles of general matrix multiplication (GEMM) and concurrently executed by many thread blocks. Notably, the workspace is filled with duplicated data originating from the same sources in the input feature map during the lowering process. However, conventional GPU scheduling is oblivious to data duplication patterns in the workspace, and thread blocks are assigned to streaming multiprocessors (SMs) irrespective of data similarity between GEMM tiles. Such scheduling misses a significant opportunity to exploit data locality manifested in the DNN convolution. This article proposes a GPU scheduling technique called Locality-Aware Scheduling (LAS) that i) identifies which thread blocks share the largest amount of identical data based on the lowered patterns of a DNN convolution and ii) allocates such thread blocks showing the greatest data similarity to the same SM. In this way, small caches in SMs can efficiently utilize the data locality of the DNN convolution. Experimental results show that LAS with tensor cores achieves 20.1% performance improvements on average with 14.8% increases in L1 cache hit rates.
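The duplication LAS exploits is easy to make concrete. Lowering (commonly called im2col) copies each overlapping receptive field into its own workspace row, so rows produced for adjacent output pixels share most of their entries. A toy single-channel example:

```python
# Demonstrates the data duplication created by lowering a convolution
# into a GEMM workspace (im2col), which is the locality LAS exploits.
import numpy as np

def im2col(x: np.ndarray, k: int) -> np.ndarray:
    h, w = x.shape
    rows = [x[i:i + k, j:j + k].ravel()           # one k*k patch per row
            for i in range(h - k + 1) for j in range(w - k + 1)]
    return np.stack(rows)                          # one row per output pixel

x = np.arange(16).reshape(4, 4)                    # 4x4 input feature map
ws = im2col(x, k=3)                                # 4 rows x 9 cols workspace
print(ws)
print("unique source values:", np.unique(ws).size, "of", ws.size)
# -> 16 of 36: over half the workspace slots repeat input data, and rows
#    for horizontally adjacent output pixels overlap in 6 of 9 entries.
```

Thread blocks computing GEMM tiles over overlapping rows therefore read largely identical data; co-locating them on the same SM lets the small L1 cache serve those repeated reads, which is exactly the scheduling decision LAS makes.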
ISBN (print): 9781713871088
We consider a distributed online convex optimization problem where streaming data are distributed among computing agents over a connected communication network. Since the data are high-dimensional or the network is large-scale, communication load can be a bottleneck for the efficiency of distributed algorithms. To tackle this bottleneck, we apply a state-of-the-art data compression scheme to the fundamental GD-based distributed online algorithms. Three algorithms with difference-compressed communication are proposed for full-information feedback (DC-DOGD), one-point bandit feedback (DC-DOBD), and two-point bandit feedback (DC-DO2BD), respectively. We obtain regret bounds explicitly in terms of time horizon, compression ratio, decision dimension, agent number, and network parameters. Our algorithms are proved to be no-regret and match the same regret bounds, w.r.t. time horizon, as their uncompressed versions for both convex and strongly convex losses. Numerical experiments are given to validate the theoretical findings and illustrate that the proposed algorithms can effectively reduce the total transmitted bits for distributed online training compared with the uncompressed baseline.
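The difference-compression mechanism can be shown in isolation: instead of transmitting its full iterate, an agent sends a compressed difference against a reference state that its neighbors track identically, so both sides stay consistent while only a fraction of the entries cross the network. Top-k is my stand-in compressor; the abstract does not specify which compression scheme the paper uses.

```python
# Difference-compressed communication in isolation (one agent, one step).
# Top-k sparsification is an assumed stand-in for the paper's compressor.
import numpy as np

def top_k(v: np.ndarray, k: int) -> np.ndarray:
    """Keep only the k largest-magnitude entries of v."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

d = 1000
x = np.random.default_rng(1).normal(size=d)   # current local iterate
x_ref = np.zeros(d)                           # reference both sides track

delta = top_k(x - x_ref, k=50)                # only 50 of 1000 entries sent
x_ref += delta                                # receiver applies same update
print("relative error:", np.linalg.norm(x - x_ref) / np.linalg.norm(x))
```

Over repeated rounds the reference chases the iterate, so the compression error stays bounded; that bounded error is what lets the compressed algorithms retain the regret order of their uncompressed counterparts.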
ISBN (print): 9781713871088
This paper considers the problem of recovering the policies of multiple interacting experts by estimating their reward functions and constraints, where the demonstration data of the experts are distributed to a group of learners. We formulate this problem as a distributed bi-level optimization problem and propose a novel bi-level "distributed inverse constrained reinforcement learning" (D-ICRL) algorithm that allows the learners to collaboratively estimate the constraints in the outer loop and learn the corresponding policies and reward functions in the inner loop from the distributed demonstrations through intermittent communications. We formally guarantee that the distributed learners asymptotically achieve consensus on a point belonging to the set of stationary points of the bi-level optimization problem. Simulations are conducted to validate the proposed algorithm.
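A schematic of the bi-level structure only, under heavy assumptions: each learner runs a local inner-loop update against its own demonstrations, then intermittently averages its outer-loop constraint estimate with its neighbors (a consensus step). The loss, update rule, and mixing matrix below are placeholders, not the paper's D-ICRL objectives.

```python
# Skeleton of bi-level learning with intermittent consensus.
# The inner update is a dummy stand-in for policy/reward learning.
import numpy as np

rng = np.random.default_rng(0)
n_learners, dim = 4, 3
theta = rng.normal(size=(n_learners, dim))            # constraint estimates
W = np.full((n_learners, n_learners), 1 / n_learners)  # doubly stochastic mixing

def inner_update(th):
    # Placeholder for learning against local demonstrations:
    # a gradient step pulling toward a dummy local optimum at 1.0.
    return th - 0.1 * (th - 1.0)

for t in range(200):
    theta = np.array([inner_update(th) for th in theta])
    if t % 5 == 0:                  # intermittent communication rounds
        theta = W @ theta           # outer-loop consensus step
print(theta.round(3))               # all rows agree: consensus reached
```

The interplay shown here, local descent interleaved with occasional averaging, is the generic mechanism behind the paper's consensus guarantee, though the actual inner and outer objectives are far richer.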
This paper investigates the formation tracking control problem for autonomous surface vehicles (ASVs) with dynamic uncertainties and external disturbances under secure and privacy-preserving interaction. An innovative hierarchical information security control (HISC) framework is proposed to solve the estimation problem in a secure and privacy-preserving way and the formation tracking problem for ASVs. The information processing layer of the HISC framework focuses on the distributed secure and privacy-preserving estimator (DSPE) algorithm under sampled-data interaction, while the local control layer provides a robust neuro-adaptive controller, requiring no model information, for the formation of networked ASVs under communication delay. Through systematic analysis, sufficient conditions are given for guaranteeing the stability and convergence of the studied closed-loop system. Finally, simulation results are presented to verify the effectiveness of the proposed control scheme.
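For intuition only: one standard way to make a distributed consensus estimator privacy-preserving is to have each agent perturb what it shares with geometrically decaying noise, hiding its raw state from neighbors while the network still converges. This is a generic construction, not the paper's DSPE algorithm, and the weights and decay rate below are assumptions.

```python
# Generic privacy-preserving consensus sketch (not the paper's DSPE).
# Each vehicle shares a noise-masked estimate; the noise decays, so
# consensus is still reached, but raw states are never exposed.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=5)            # each ASV's local estimate
A = np.full((5, 5), 0.2)          # doubly stochastic averaging weights

for t in range(100):
    noise = 0.5 ** t * rng.normal(size=5)  # geometrically decaying masks
    shared = x + noise                     # neighbors never see raw x
    x = A @ shared                         # sampled-data consensus update
print(x.round(4))                 # all agents agree on a (slightly noisy) average
```

The decaying mask is the key design choice: early rounds, when states are most revealing, get the strongest protection, while the shrinking noise leaves the asymptotic consensus value essentially intact.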