Owing to population growth and industrialization, the demand for electricity is at an all-time high, which places stress on the grid due to continuous power consumption. To produce energy, natural resources are utilized...
Recently, Large Language Models (LLMs) have become a phenomenal trend in the field of artificial intelligence. However, training and fine-tuning them can be challenging because of privacy concerns and limited computing resource...
ISBN: (Print) 9783031396977; 9783031396984
Service discovery is a vital process that enables low latency provisioning of Internet of Things (IoT) applications across the computing continuum. Unfortunately, it becomes increasingly difficult to identify a proper service within strict time constraints due to the high heterogeneity of the computing continuum. Moreover, the plethora of network technologies and protocols commonly used by IoT applications further hinders service discovery. To address these issues, we introduce a novel Mobile Edge Service Discovery using the DNS (MESDD) algorithm, which uses a so-called Intermediate Discovery Code to identify suitable service instances. MESDD uses geofences for fine-grained service segmentation based on a naming scheme that identifies users' locations across the computing continuum. We deployed a real-life distributed computing continuum testbed and compared MESDD with three related methods, which it outperformed by 60% after eight update iterations.
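The core idea of geofence-based segmentation can be sketched as a lookup from a user's coordinates to a location-aware DNS name. This is a minimal sketch assuming rectangular geofences; the class and function names, the region labels, and the `edge.example.org` domain are illustrative and not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class Geofence:
    """A rectangular region whose label segments the computing continuum
    (illustrative; real geofences may be arbitrary polygons)."""
    label: str
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.lat_min <= lat <= self.lat_max
                and self.lon_min <= lon <= self.lon_max)

def service_name(fences, lat, lon, service="video", domain="edge.example.org"):
    """Map a user's location to a DNS-style name so the resolver can
    return a nearby service instance."""
    for fence in fences:
        if fence.contains(lat, lon):
            return f"{service}.{fence.label}.{domain}"
    return f"{service}.{domain}"  # fall back to a global instance

fences = [Geofence("vienna-center", 48.19, 48.22, 16.35, 16.40),
          Geofence("vienna-west", 48.18, 48.22, 16.28, 16.35)]
print(service_name(fences, 48.20, 16.37))  # video.vienna-center.edge.example.org
```

Encoding the region label into the hostname lets an ordinary DNS resolver perform the location-aware routing, which is the property the naming scheme relies on.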
ISBN: (Print) 9798350377217; 9798350377200
The demand for computing performance combined with higher energy efficiency is ever-increasing. Analog computing paradigms have recently risen in popularity because they are potentially faster than digital implementations at a fraction of their energy consumption. For example, so-called oscillator-based Ising machines can solve combinatorial optimization problems very quickly and energy efficiently. They exploit the interaction between hundreds of identical oscillator nodes, forming a large network occupying the usable chip area. Spiking neural networks are another promising analog computing approach, enabling machine learning with low power consumption. Machine learning is currently one of the most significant factors in the global energy demand of digital computers. Precise biasing currents are crucial for the analog computing principle because they accurately set mathematical coefficients and references that influence timing behavior. The bias current therefore needs to be distributed to hundreds of nodes spread across the whole die. We analyze two approaches for precise generation and distribution of biasing currents, along with two calibration methods for comparing and adjusting currents on a local scale and on chip scale, in the form of a case study on an oscillator-based Ising machine implemented in 28 nm CMOS technology, targeting a maximum error of 1.5% while keeping the occupied area low.
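The effect of per-node trimming on the 1.5% error target can be illustrated with a simple numerical model: each node's bias current deviates from the target due to mismatch, and a small trim DAC nudges it back in fixed steps. This is an illustrative behavioral model under assumed numbers (100 nA target, ±4% mismatch, 0.5 nA DAC step), not the circuit or calibration method from the paper.

```python
import random

def calibrate(currents_nA, target_nA, dac_step_nA=0.5, dac_bits=4):
    """Trim each node's bias current toward the target with a small
    per-node calibration DAC (signed code, saturating at its range)."""
    max_code = 2 ** (dac_bits - 1)  # +/- range of the trim DAC
    trimmed = []
    for current in currents_nA:
        code = round((target_nA - current) / dac_step_nA)
        code = max(-max_code, min(max_code, code))  # saturate
        trimmed.append(current + code * dac_step_nA)
    return trimmed

def max_rel_error(currents_nA, target_nA):
    return max(abs(i - target_nA) / target_nA for i in currents_nA)

random.seed(42)
target = 100.0  # nA, assumed for illustration
# ~±4 % mismatch across 256 nodes spread over the die
raw = [target + random.uniform(-4.0, 4.0) for _ in range(256)]
trimmed = calibrate(raw, target)
print(f"max error before: {max_rel_error(raw, target):.2%}, "
      f"after: {max_rel_error(trimmed, target):.2%}")
```

With a 0.5 nA step the residual quantization error is at most 0.25 nA (0.25% of 100 nA), comfortably inside the stated 1.5% target as long as the raw mismatch stays within the DAC's trim range.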
In vehicle-pile-grid collaborative operations, achieving data sharing and resource optimization is crucial for enhancing system efficiency and user experience. However, privacy protection and data security issues seve...
ISBN: (Print) 9798350307924
Mixture of Experts (MoE) has received increasing attention for scaling DNN models to extra-large sizes with negligible increases in computation. The MoE model has achieved the highest accuracy in several domains. However, severe load imbalance occurs across devices during the training of a MoE model, substantially reducing throughput. Previous works on load balancing either harm model convergence or suffer from high execution overhead. To address these issues, we present Prophet: a fine-grained load balancing method for parallel training of large-scale MoE models, which consists of a planner and a scheduler. The Prophet planner first employs a fine-grained resource allocation method to determine the possible scenarios for expert placement, and then efficiently searches for a well-balanced expert placement that balances the load without introducing additional overhead. The Prophet scheduler exploits the locality of the token distribution to schedule the resource allocation operations using a layer-wise fine-grained schedule strategy to hide their overhead. We conduct extensive experiments on four clusters and five representative models. The results indicate that Prophet gains up to 2.3x speedup compared to state-of-the-art MoE frameworks including DeepSpeed-MoE and FasterMoE. Additionally, Prophet achieves a load balancing enhancement of up to 12.06x when compared to FasterMoE.
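The load-imbalance problem and the benefit of load-aware expert placement can be seen with a toy example: token routing in MoE layers is typically skewed toward a few "hot" experts, so placing experts by load rather than by index evens out per-device work. The greedy heaviest-first strategy below is a simple sketch of balanced placement, not Prophet's actual planner, and the load numbers are invented for illustration.

```python
def greedy_placement(expert_loads, num_devices):
    """Place experts on devices, heaviest first, always onto the currently
    least-loaded device (a classic greedy balancing sketch)."""
    device_load = [0] * num_devices
    placement = {}
    for expert in sorted(range(len(expert_loads)),
                         key=lambda e: -expert_loads[e]):
        dev = min(range(num_devices), key=device_load.__getitem__)
        placement[expert] = dev
        device_load[dev] += expert_loads[expert]
    return placement, device_load

# Skewed token counts per expert, typical for MoE routing (invented numbers).
loads = [900, 500, 300, 200, 60, 30, 8, 2]
placement, per_device = greedy_placement(loads, 4)
imbalance = max(per_device) / (sum(per_device) / len(per_device))
print(per_device, f"imbalance={imbalance:.2f}")  # [900, 500, 300, 300] imbalance=1.80
```

A naive placement of two consecutive experts per device would put the two hottest experts together (1400 tokens on one device, imbalance 2.8); the load-aware placement caps the maximum at the single hottest expert, which is the intuition behind balancing expert placement rather than token routing.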
In recent times, distributed generation from renewable energy sources is particularly required for day-to-day use, as natural resources are nearing depletion. These renewable energy sources are a generally...
The increasing popularity of electric vehicles is set to transform the dynamics of energy consumption. Demand forecasting of electric vehicle charging stations guarantees effective distribution of the available power ...
Balancing robustness and computational efficiency in machine learning models is challenging, especially in settings with limited resources like mobile and IoT devices. This study introduces Adaptive and Localized Adve...
ISBN: (Print) 9798400701559
Remote direct memory access (RDMA) supports zero-copy networking by transferring data from clients directly to host memory, eliminating the need to copy data between clients' memory and the data buffers in the hosting server. However, the hosting server must design efficient memory management schemes to handle incoming clients' data. In this paper, we propose a high-performance host memory management scheme called HM2 for RDMA-enabled distributed systems. We present a new buffer structure for incoming data from clients. In addition, we propose efficient data processing methods to reduce network transfers between clients and servers. We conducted a preliminary experiment to evaluate HM2, and the results showed that HM2 achieved higher throughput than existing schemes, including L5 and FaRM.
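The host-side management problem the abstract describes, keeping pre-registered buffers ready for incoming one-sided writes, can be sketched as a fixed-size buffer pool. This is an illustrative model only; HM2's actual buffer structure is more elaborate, and the class and method names here are invented.

```python
class BufferPool:
    """Pool of fixed-size host buffers for incoming RDMA writes.
    In a real system each slot would be registered with the NIC;
    here plain bytearrays stand in for registered memory."""

    def __init__(self, num_buffers: int, buf_size: int):
        self.buf_size = buf_size
        self.free = list(range(num_buffers))  # indices of free slots
        self.slots = [bytearray(buf_size) for _ in range(num_buffers)]

    def acquire(self) -> int:
        """Hand a buffer to the NIC for the next incoming write."""
        if not self.free:
            raise MemoryError("no free RDMA buffers; apply backpressure")
        return self.free.pop()

    def write(self, idx: int, data: bytes) -> None:
        """Model the NIC depositing a client's payload into the slot."""
        assert len(data) <= self.buf_size, "payload exceeds registered buffer"
        self.slots[idx][:len(data)] = data

    def release(self, idx: int) -> None:
        """Return the buffer once the host has consumed the data."""
        self.free.append(idx)

pool = BufferPool(num_buffers=4, buf_size=4096)
idx = pool.acquire()
pool.write(idx, b"client payload")
print(bytes(pool.slots[idx][:14]))  # b'client payload'
pool.release(idx)
```

Because the pool is exhausted-checked rather than grown, a server using a structure like this must signal backpressure to clients when all registered buffers are in flight, which is one of the trade-offs any host memory management scheme for RDMA has to make.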