The attention mechanism has become a pivotal component in artificial intelligence, significantly enhancing the performance of deep learning applications. However, its quadratic computational complexity and intricate computations lead to substantial inefficiencies when processing long sequences. To address these challenges, we introduce Attar, a resistive random access memory (RRAM)-based in-memory accelerator designed to optimize attention mechanisms through software-hardware co-optimization. Attar leverages efficient Top-k pruning and quantization strategies to exploit the sparsity and redundancy of attention matrices, and incorporates an RRAM-based in-memory softmax engine by harnessing the versatility of the RRAM crossbar. Comprehensive evaluations demonstrate that Attar achieves a performance improvement of up to 4.88× and an energy saving of 55.38% over previous computing-in-memory (CIM)-based accelerators across various models and datasets while maintaining comparable accuracy. This work underscores the potential of in-memory computing to enhance the efficiency of attention-based models without compromising their effectiveness.
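The Top-k pruning idea mentioned in the abstract can be illustrated with a minimal software sketch. This is not Attar's implementation, and it models neither the quantization step nor the RRAM softmax engine; it only shows the generic technique of keeping the k largest attention scores for one query and applying softmax over those alone. The function name `topk_attention` and its parameters are hypothetical.

```python
import math


def topk_attention(q, keys, values, k):
    """Single-query attention that keeps only the k largest scores.

    Illustrative sketch only; names and structure are assumptions,
    not taken from the Attar paper.
    """
    d = len(q)
    # Scaled dot-product score between the query and every key.
    scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
              for key in keys]
    # Indices of the k highest scores; every other position is pruned.
    kept = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Numerically stable softmax computed over the kept scores only.
    m = max(scores[i] for i in kept)
    exps = {i: math.exp(scores[i] - m) for i in kept}
    total = sum(exps.values())
    weights = [exps.get(i, 0.0) / total for i in range(len(scores))]
    # Weighted sum of value vectors; pruned positions contribute nothing.
    dim = len(values[0])
    out = [sum(weights[i] * values[i][j] for i in kept) for j in range(dim)]
    return out, weights


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    out, weights = topk_attention([1.0, 0.0], keys, values, k=2)
    print(out, weights)
```

With k smaller than the sequence length, the softmax and the weighted sum touch only k positions per query instead of all of them, which is the sparsity the abstract refers to. Note that the score computation in this sketch is still dense.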
Wireless communication-enabled Cooperative Adaptive Cruise Control (CACC) is expected to improve the safety and traffic capacity of vehicle *** CACC considers a conventional communication delay with fixed Vehicular Communication Network (VCN) ***, when the network is under attack, the communication delay may be much higher, and the stability of the system may not be *** paper proposes a novel communication Delay Aware CACC with Dynamic Network Topologies (DADNT). The main idea is that for various communication delays, in order to maximize the traffic capacity while guaranteeing stability and minimizing the following error, the CACC should dynamically adjust the VCN network topology to achieve the minimum inter-vehicle *** this end, a multi-objective optimization problem is formulated, and a 3-step Divide-And-Conquer sub-optimal solution (3DAC) is *** results show that with 3DAC, the proposed DADNT with CACC can reduce the inter-vehicle spacing by 5%, 10%, and 14%, respectively, compared with the traditional CACC with fixed one-vehicle, two-vehicle, and three-vehicle look-ahead network topologies, thereby improving the traffic efficiency.
Building energy planning is significantly challenged by climate change, particularly the increasing frequency of heat waves impacting heating and cooling demands. Current planning methodologies neglect the impacts of ...
A novel quantum search algorithm tailored for continuous optimization and spectral problems was proposed recently by a research team from the University of Electronic Science and Technology of China to broaden quantum computation frontiers and enrich its application *** computing has traditionally excelled at tackling discrete search challenges, but many important applications, from large-scale optimization to advanced physics simulations, necessitate searching through continuous domains.
Integrating processing-in-memory (PIM) with GPUs accelerates large language model (LLM) inference, but existing GPU-PIM systems encounter several challenges. While GPUs excel in large general matrix-matrix multiplicat...
Large Language Models (LLMs) have emerged as the cornerstone of content generation applications due to their ability to capture relations between newly generated tokens and the full preceding context. However, this abi...
The Number Theoretic Transform (NTT) significantly impacts the execution time of Fully Homomorphic Encryption (FHE) in practical applications, driving research into accelerated NTT methods. Computing-in-Memory (CIM) o...
The extensive utilization of large language models (LLMs) underscores the crucial necessity for precise and contemporary knowledge embedded within their intrinsic parameters. Existing research on knowledge editing pri...
Computing-in-memory (CIM) architectures demonstrate superior performance over traditional architectures. To unleash the potential of CIM accelerators, many compilation methods have been proposed, focusing on applicati...