With the widespread use of deep neural networks (DNNs) in intelligent systems, DNN accelerators with high performance and energy efficiency are in great demand. As one of the feasible processing-in-memory (PIM) architectures, the 3-D stacked-DRAM-based PIM (DRAM-PIM) architecture enables large-capacity memory and low-cost memory access, making it a promising basis for DNN accelerators with better performance and energy efficiency. However, the low access cost of stacked DRAM and the distributed manner of memory access and data storage require rebalancing the hardware design and the DNN mapping. In this article, we propose NicePIM to efficiently explore the design space of hardware architecture and DNN mapping for DRAM-PIM-based DNN inference accelerators. NicePIM consists of three key components: 1) PIM-Tuner; 2) PIM-Mapper; and 3) the data-scheduler. PIM-Tuner optimizes the hardware configuration, using a DNN model to classify area-compliant PIM-node designs and a deep kernel learning model to identify better hardware parameters. PIM-Mapper explores a variety of DNN mapping configurations, including parallelism between DNN branches, DNN layer partitioning, DRAM capacity allocation, and the data layout pattern in DRAM, to generate high-hardware-utilization DNN mapping schemes for various hardware configurations. The data-scheduler employs an integer-linear-programming-based data scheduling algorithm to alleviate the inter-PIM-node communication overhead of the data sharing introduced by DNN layer partitioning. Experimental results demonstrate that NicePIM can optimize hardware configurations for DRAM-PIM systems effectively and can generate high-quality DNN mapping schemes, reducing latency and energy cost by 37% and 28% on average, respectively, compared to the baseline method.
The Zero Trust model enhances the security of wireless network environments and is considered well suited to connected and automated vehicles (CAVs). However, given the abundance of real-time data in CAVs and the delay introduced by Zero Trust data validation, processing real-time data may incur significant delay. By caching popular content in advance on edge servers, edge caching can significantly reduce the response delay of real-time data in CAVs. However, achieving low-delay service responses requires ultra-dense deployments of edge servers, which increases the complexity of the wireless network. Therefore, it is challenging to achieve efficient cooperative caching between edge servers in Zero Trust-enabled CAVs. In this article, a distributed Edge Caching method with Multi-Agent reinforcement learning for Zero Trust-enabled CAVs, named D-ECMA, is proposed. Specifically, a collaboration graph construction method is designed to obtain efficient collaborative relationships. Then, a service demand prediction method based on Spatial-Temporal Fusion Graph Neural Networks (STFGNN) is proposed to help edge servers adjust their caching policies. Next, a distributed edge caching method based on Multi-Agent Deep Deterministic Policy Gradient (MADDPG) for Zero Trust-enabled CAVs is designed. Finally, the effectiveness of D-ECMA is demonstrated through comparative experiments.
To deal with different beamforming network (BFN) design tasks, a generalized auto-design methodology is developed in three steps. First, structure reconfiguration is performed using a binary design matrix assisted by different reconfiguration functions. Then, to speed up the component design, the homotopy continuation (HC) method is exploited to form a partial database. A mapping technique allows the partial database to be expanded into a complete database that covers the whole feature space. An inverse model is trained on the complete database to generate the physical parameters of multiple couplers. Finally, the optimal distributed phase compensation is determined by minimizing the total phase shift of the phase shifters. For demonstration, two different waveguide BFNs are designed. In addition, this work discusses the potential of using the proposed methodology to design BFNs with a more complex configuration or BFNs in other materials. The results show that the proposed methodology allows automatic and simultaneous generation of multiple physical structures, which can improve the BFN's electrical performance or meet different physical specifications (PS). Considering the many possible problems in a BFN design, the proposed methodology is generalized to different classes of BFNs. Moreover, the component design can be finished in a shorter time than with existing algorithms.
ISBN: 9798350311990 (print)
Inter-chiplet communication is a fundamental bottleneck in scale-out Homogeneous Multi-Chip-Module-based Hardware Accelerators (HMCMHAs). This paper focuses on the many-to-many communication traffic generated when dispatching output feature map tiles among chiplets. Such traffic has a strong impact on the latency and energy metrics of HMCMHAs because it exposes the limitations of the existing wire-based Network-on-Package (NoP). This paper investigates augmenting the existing NoP with emerging wireless in-package communication links. The intrinsically single-hop, broadcast-capable technology is exploited to tackle the many-to-many communication traffic in question. We show that the proposed wireless-enabled NoP can significantly improve the latency and energy of deep neural network (DNN) inference on HMCMHAs.
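To make the many-to-many argument concrete, the following is a minimal sketch, not the paper's model. It assumes a hypothetical 2-D mesh NoP in which a tile travels one wired link per hop along a shortest (Manhattan) path, and it assumes a wireless channel in which one broadcast reaches every chiplet. The function names and the mesh topology are illustrative assumptions; the sketch counts transmissions only and says nothing about actual latency or energy.

```python
def wired_mesh_hops(rows, cols):
    """Total link traversals when every chiplet sends one tile to every other
    chiplet over a 2-D mesh, assuming shortest-path (Manhattan) routing."""
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    return sum(
        abs(r1 - r2) + abs(c1 - c2)  # hops for one source/destination pair
        for (r1, c1) in nodes
        for (r2, c2) in nodes
    )


def wireless_broadcasts(rows, cols):
    """Transmissions needed if each chiplet broadcasts its tile once and the
    single-hop broadcast is assumed to reach all other chiplets."""
    return rows * cols


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, wired_mesh_hops(n, n), wireless_broadcasts(n, n))
```

Under these assumptions the wired count grows much faster with the number of chiplets than the broadcast count, which is the intuition behind using a single-hop broadcast medium for this traffic pattern.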
Decoding human electroencephalogram (EEG) signals and identifying the brain's response to external stimuli is a challenging task, but it is crucial for understanding the brain's information processing mechanisms and developing brain-like intelligent computers. Previous studies have used neural networks to analyze the spatiotemporal features of various EEG regions for EEG emotion recognition, but few studies have characterized the different frequency bands of EEG signals, and little attention has been paid to the issue of fuzzy emotion labeling in continuous emotion models. To address these two issues, this study proposes a Graph Convolutional Network method (FCLGCN) that collects time and frequency band information from EEG and addresses the problem of fuzzy emotional boundaries through contrastive learning. FCLGCN achieved high recognition accuracy on the DEAP and SEED datasets. According to the experimental results, the connection between the frontal and temporal lobes on both sides of the brain becomes tighter during emotional changes.
ISBN: 9789464593617; 9798331519773 (print)
In distributed learning, a network of agents cooperates to solve a common task, such as training a particular neural network. The devices usually adopt an iterative procedure with two steps: first, they perform local optimization, using, e.g., stochastic gradient descent; then, they exchange information with one another in order to reach consensus on the final solution. In the current literature, the proposed distributed algorithms achieve consensus by applying averaging rules to the information received at each agent. This paper departs from that paradigm and focuses on "single agent" cooperation strategies, in which each agent selects a particular neighbor at each iteration and uses only that neighbor's information during the local optimization step. Three selection rules are designed, and it is shown that they inherit the convergence properties of commonly used averaging rules. Moreover, their effectiveness relative to other algorithms is demonstrated experimentally in classification tasks on well-known datasets for a wide range of scenarios, capturing factors such as non-IID datasets, network size, and AI model size.
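The two-step procedure described above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the paper's algorithm: each agent holds a scalar parameter and a quadratic local loss, `averaging_rule` is a plain neighborhood average, and `single_neighbor_rule` uses uniform random selection and averages the agent's own iterate with the chosen neighbor's. The abstract does not specify its three selection rules, so the selection rule, the combination step, and all names here are hypothetical.

```python
import random


def local_step(x, target, lr=0.1):
    """One gradient step on the local loss 0.5 * (x - target) ** 2."""
    return x - lr * (x - target)


def averaging_rule(i, half, nbrs, rng):
    """Baseline: average the agent's own iterate with all neighbors' iterates."""
    vals = [half[i]] + [half[j] for j in nbrs]
    return sum(vals) / len(vals)


def single_neighbor_rule(i, half, nbrs, rng):
    """Assumed rule: pick one neighbor uniformly at random and combine with
    that neighbor only (a stand-in for the paper's selection rules)."""
    j = rng.choice(nbrs)
    return 0.5 * (half[i] + half[j])


def run(neighbors, targets, rule, iters=200, lr=0.1, seed=0):
    """Alternate local optimization and information exchange for all agents."""
    rng = random.Random(seed)
    x = [0.0] * len(targets)
    for _ in range(iters):
        half = [local_step(xi, t, lr) for xi, t in zip(x, targets)]  # step 1
        x = [rule(i, half, neighbors[i], rng) for i in range(len(x))]  # step 2
    return x


if __name__ == "__main__":
    nbrs = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # fully connected, 3 agents
    targets = [1.0, 2.0, 3.0]
    print(run(nbrs, targets, averaging_rule))
    print(run(nbrs, targets, single_neighbor_rule))
```

In this sketch the single-neighbor variant reads one neighbor's iterate per agent per iteration, whereas the averaging baseline reads every neighbor's.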
Inspired by the concept of content-addressable retrieval from cognitive science, we propose a novel fragment-based Chinese named entity recognition (NER) model augmented with a lexicon-based memory in which both character-level and word-level features are combined to generate better feature representations for possible entity *** that the boundary information of entity names is particularly useful to locate and classify them into pre-defined categories, position-dependent features, such as prefix and suffix, are introduced and taken into account for NER tasks in the form of distributed *** lexicon-based memory is built to help generate such position-dependent features and deal with the problem of out-of-vocabulary *** results show that the proposed model, called LEMON, achieved state-of-the-art performance with an increase in the F1-score of up to 3.2% over the state-of-the-art models on four different widely used NER datasets.
ISBN: 9781728167435 (print)
Distributed fiber optic sensors are a promising technique for measuring strain, temperature, and vibration over tens of kilometres by utilizing backscattered Rayleigh, Raman, and Brillouin signals. Recently, artificial neural networks (ANNs) have been adopted in distributed fiber sensors for advanced data analytics, fast data processing, high sensing accuracy, and event classification. In this paper, recent developments in ANN-based distributed fiber sensors and their operating principles are reviewed. Moreover, the performance of ANNs is compared with that of conventional signal processing algorithms. Future perspectives for further research and development are also discussed.
Internet of Things (IoT) devices usually offer limited resources such as processing, memory, and network capacity, which exposes the environment to more security threats. Distributed denial of service (DDoS) signal attacks are among the most serious of these threats. Software-defined networking (SDN) is a promising paradigm that could offer a scalable security solution optimised for the IoT ecosystem. However, finding a robust security solution remains one of the most challenging problems facing SDN-based smart home environments. In this paper, we introduce a multi-locality deep learning model for the detection of DDoS signals in an SDN-based smart home. It employs convolutional neural networks (CNNs) that learn different levels of local information from the data. Specifically, a framework built on an ensemble of two CNNs is proposed to detect malicious traffic flows with low computation overhead. Experimental results demonstrate the robustness, effectiveness, and efficiency of our solution in detecting DDoS attacks in an SDN smart home.
With the development of edge devices, smart sockets are now capable of handling various power load data, which provides a new solution for edge computing and real-time load forecasting. In this paper, a distributed system that can centrally process load data is proposed, and a KGLSTM network based on graph neural networks and long short-term memory (LSTM) networks is introduced. In the proposed system, smart sockets are used to collect and process load data. The proposed KGLSTM network has a dual-branch structure: one branch extracts medium- and short-term load features using LSTM, and the other extracts medium- and long-term load features using graph neural networks. To guide model training, a dynamic weighted loss function is proposed, which improves load prediction performance in peak load intervals. Finally, comparison with advanced models validates the effectiveness of KGLSTM and weighted KGLSTM for load prediction, and both proposed models also perform well in peak forecasting.
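The abstract does not give the form of its dynamic weighted loss, so the following is only a minimal sketch of one way such a loss could emphasize peak intervals. It assumes a weighted mean squared error in which the peak threshold is recomputed from each batch's own targets. The function name and the parameters `peak_quantile` and `peak_weight` are hypothetical.

```python
def peak_weighted_mse(y_true, y_pred, peak_quantile=0.9, peak_weight=3.0):
    """Weighted MSE that up-weights samples whose true load is at or above a
    per-batch quantile threshold (an assumed notion of 'peak interval')."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    ranked = sorted(y_true)
    # Threshold is recomputed per batch, so the weights adapt to the data.
    idx = min(len(ranked) - 1, int(peak_quantile * len(ranked)))
    threshold = ranked[idx]
    weights = [peak_weight if y >= threshold else 1.0 for y in y_true]
    weighted_sq_err = [
        w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred)
    ]
    return sum(weighted_sq_err) / sum(weights)


if __name__ == "__main__":
    load = [1.0, 1.0, 1.0, 10.0]         # last sample is the peak
    miss_peak = [1.0, 1.0, 1.0, 8.0]     # error of 2 on the peak sample
    miss_offpeak = [3.0, 1.0, 1.0, 10.0]  # error of 2 on an off-peak sample
    print(peak_weighted_mse(load, miss_peak))
    print(peak_weighted_mse(load, miss_offpeak))
```

With these weights, an error of the same size costs more on a peak sample than on an off-peak one, which is the general effect the abstract attributes to its loss.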