Obesity is a major public health issue that affects both industrialized and developing countries. Obesity is a varied and complex issue that necessitates diagnosis and treatment. Various research projects have attempt...
This paper concerns ultimately bounded output-feedback control problems for networked systems with unknown nonlinear dynamics. Sensor-to-observer signal transmission is facilitated over networks that have communication *** transmissions are carried out over an unreliable communication channel. To improve the utilization rate of measurement data, a buffer-aided strategy is employed to store historical measurements when communication networks are inaccessible. Using the neural network technique, a novel observer-based controller is introduced to address the effects of signal transmission behaviors and unknown nonlinear *** the application of stochastic analysis and Lyapunov stability, a joint framework is constructed for analyzing the resultant system performance under the introduced controller. Subsequently, existence conditions for the desired output-feedback controller are delineated. The required parameters for the observer-based controller are then determined by solving specific matrix inequalities. Finally, a simulation example is presented to confirm the efficacy of the method.
Efficiently serving large language models (LLMs) requires batching many requests together to reduce the cost per request. Yet, the key-value (KV) cache, which stores attention keys and values to avoid re-computations, significantly increases memory demands and becomes the new bottleneck in speed and memory usage. This memory demand increases with larger batch sizes and longer context lengths. Additionally, the inference speed is limited by the size of the KV cache, as the GPU's SRAM must load the entire KV cache from the main GPU memory for each token generated, causing the computational core to be idle during this process. A straightforward and effective solution to reduce KV cache size is quantization, which decreases the total bytes taken by the KV cache. However, there is a lack of in-depth studies that explore the element distribution of the KV cache to understand the difficulty and limitations of KV cache quantization. To fill the gap, we conducted a comprehensive study on the element distribution in the KV cache of popular LLMs. Our findings indicate that the key cache should be quantized per-channel, i.e., group elements along the channel dimension and quantize them together. In contrast, the value cache should be quantized per-token. From this analysis, we developed a tuning-free 2-bit KV cache quantization algorithm, named KIVI. With the hardware-friendly implementation, KIVI can enable Llama (Llama-2), Falcon, and Mistral models to maintain almost the same quality while using 2.6× less peak memory (including the model weight). This reduction in memory usage enables up to 4× larger batch size, bringing 2.35× to 3.47× throughput on real LLM inference workload. The source code is available at https://***/jy-yuan/KIVI. Copyright 2024 by the author(s)
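To make the per-channel versus per-token distinction concrete, the following is a minimal NumPy sketch, not the KIVI implementation. It assumes a plain asymmetric min-max quantizer, a cache matrix laid out as (tokens, channels), and a synthetic key matrix with one channel of consistently large magnitude; the function name `quantize_dequantize` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def quantize_dequantize(x, axis, bits=2):
    """Asymmetric min-max quantization followed by dequantization.

    Min/max statistics are reduced over `axis`, so with x shaped
    (tokens, channels):
      axis=0 -> one scale/zero-point per channel (per-channel)
      axis=1 -> one scale/zero-point per token   (per-token)
    """
    levels = 2 ** bits - 1                      # 3 steps for 2 bits
    lo = x.min(axis=axis, keepdims=True)
    hi = x.max(axis=axis, keepdims=True)
    scale = (hi - lo) / levels
    scale = np.where(scale == 0, 1.0, scale)    # guard constant groups
    q = np.clip(np.round((x - lo) / scale), 0, levels)
    return q * scale + lo                       # reconstructed values

# Toy key cache (assumption): 64 tokens, 8 channels, and channel 3
# carries a large, consistent offset across all tokens.
rng = np.random.default_rng(0)
keys = rng.normal(size=(64, 8))
keys[:, 3] += 50.0

err_per_channel = np.mean((keys - quantize_dequantize(keys, axis=0)) ** 2)
err_per_token = np.mean((keys - quantize_dequantize(keys, axis=1)) ** 2)
print(f"per-channel MSE: {err_per_channel:.4f}")
print(f"per-token   MSE: {err_per_token:.4f}")
```

On this toy data, per-channel grouping confines the large-magnitude channel to its own quantization range, whereas per-token grouping lets that channel stretch the range of every token. Real caches would also need bit packing and grouping, which this sketch omits.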
Coordinate network or implicit neural representation (INR) is a fast-emerging method for encoding natural signals (such as images and videos) with the benefits of a compact neural representation. While numerous method...
Blockchain is one of the biggest breakthroughs of the 21st century; it has the power to revolutionize how the financial world works by shifting the power from the hands of a central authority to the commoners. Not only...
Beyond-5G (B5G) aims to meet the growing demands of mobile traffic and expand the communication *** that intelligent applications to B5G wireless communications will involve security issues regarding user data and operational data, this paper analyzes the maximum capacity of the multi-watermarking method for multimedia signal hiding as a means of alleviating the information security problem of *** multi-watermarking process employs spread transform dither *** the watermarking procedure, Gram-Schmidt orthogonalization is used to obtain the multiple spreading ***, multiple watermarks can be simultaneously embedded into the same position of a multimedia ***, the multiple watermarks can be extracted without affecting one another during the extraction *** analyze the effect of the size of the spreading vector on the unit maximum capacity, and consequently derive the theoretical relationship between the size of the spreading vector and the unit maximum capacity. A number of experiments are conducted to determine the optimal parameter values for maximum robustness on the premise of high capacity and good imperceptibility.
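The idea of embedding several watermarks at the same position through orthogonal spreading vectors can be sketched as follows. This is an illustrative toy, not the paper's algorithm: it assumes one bit per watermark, a textbook dithered scalar quantizer with step `delta`, random initial spreading vectors, and a noiseless channel. All function names and parameter values are hypothetical.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize the rows of `vectors` (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = np.asarray(v, dtype=float).copy()
        for b in basis:
            w -= np.dot(w, b) * b               # remove component along b
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("spreading vectors are linearly dependent")
        basis.append(w / norm)
    return np.array(basis)

def dither_quantize(p, bit, delta):
    """Quantize p onto the lattice for `bit` (offset 0 or delta/2)."""
    d = bit * delta / 2.0
    return delta * np.round((p - d) / delta) + d

def embed(host, bits, basis, delta):
    """Embed one bit per spreading vector into the same host segment."""
    y = np.asarray(host, dtype=float).copy()
    for bit, u in zip(bits, basis):
        p = np.dot(y, u)                        # projection onto u
        y += (dither_quantize(p, bit, delta) - p) * u
    return y

def extract(signal, basis, delta):
    """Recover each bit by choosing the nearest dithered lattice."""
    bits = []
    for u in basis:
        p = np.dot(signal, u)
        errs = [abs(p - dither_quantize(p, b, delta)) for b in (0, 1)]
        bits.append(int(np.argmin(errs)))
    return bits

rng = np.random.default_rng(1)
host = rng.normal(size=16)                      # toy host segment
basis = gram_schmidt(rng.normal(size=(4, 16)))  # 4 orthonormal vectors
bits = [1, 0, 1, 1]
marked = embed(host, bits, basis, delta=1.0)
recovered = extract(marked, basis, delta=1.0)
print(recovered == bits)
```

Because the spreading vectors are orthonormal, moving the signal along one vector leaves its projections onto the others unchanged, which is why the bits do not interfere with one another in this sketch.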
The electronics panels and controllers used in electric vehicle (EV) charging stations need to function securely and optimally in countries such as India, where the external temperature remains high for a significant ...
Package design has become increasingly complex with the evolution of technology nodes and heterogeneous integration. To optimize timing performance and signal integrity, it is essential to separate different pairs of ...
To assist human fact-checkers, researchers have developed automated approaches for visual misinformation detection. These methods assign veracity scores by identifying inconsistencies between the image and its caption...
The reconfigurable intelligent surface (RIS) steering reflective beam directions toward a target mobile user equipment (UE) has been a promising technology for coverage enhancement and physical-layer (PHY) security to...