ISBN: (Print) 9798400704802
Both the training and use of Large Language Models (LLMs) require large amounts of energy. Their increasing popularity therefore raises critical concerns regarding the energy efficiency and sustainability of the data centers that host them. This paper addresses the challenge of reducing energy consumption in data centers running LLMs. We propose a hybrid data center model that uses a cost-based scheduling framework to dynamically allocate LLM tasks across hardware accelerators that differ in their energy efficiency and computational capabilities. Specifically, our workload-aware strategy determines whether tasks are processed on energy-efficient processors or high-performance GPUs based on the number of input and output tokens in a query. Our analysis of a representative LLM dataset finds that this hybrid strategy can reduce CPU+GPU energy consumption by 7.5% compared to a workload-unaware baseline.
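The token-based routing decision described in this abstract can be sketched as follows; the threshold value and device names are illustrative assumptions, not figures from the paper.

```python
# Hypothetical sketch of workload-aware task routing: the threshold and
# the cost model are assumptions for illustration only.

def route_task(input_tokens: int, output_tokens: int,
               token_threshold: int = 512) -> str:
    """Assign a query to an energy-efficient processor or a GPU.

    Short queries run cheaply on the efficient device; long ones
    justify the GPU's higher power draw with faster completion.
    """
    total = input_tokens + output_tokens
    return "efficient_cpu" if total <= token_threshold else "gpu"

# A short chat turn stays on the efficient processor; a long-context
# query goes to the GPU.
print(route_task(120, 80))      # efficient_cpu
print(route_task(4000, 1024))   # gpu
```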
ISBN: (Print) 9798350355543
With the rapid acceleration of ML/AI research in the last couple of years, the energy consumption of the Information and Communication Technology (ICT) domain has rapidly increased. As a major part of this energy consumption is due to users' workloads, it is evident that users need to be aware of the energy footprint of their applications. The Compute Energy & Emissions Monitoring Stack (CEEMS) has been designed to address this issue. CEEMS can report the energy consumption and equivalent emissions of user workloads in real time for HPC and cloud platforms alike. Besides CPU energy usage, it supports reporting the energy usage of workloads on NVIDIA and AMD GPU accelerators. CEEMS is built around prominent open-source tools in the observability ecosystem, such as Prometheus and Grafana. It is designed to be extensible, allowing Data Center (DC) operators to easily define energy estimation rules for user workloads based on the underlying hardware. This paper presents an architectural overview of CEEMS, the data sources used to measure energy usage and estimate equivalent emissions, and potential use cases of CEEMS from both operator and user perspectives. Finally, the paper describes how the CEEMS deployment on the Jean-Zay supercomputing platform monitors more than 1,400 nodes with a daily job churn rate of around 20k jobs.
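The core accounting a monitoring stack of this kind performs can be sketched as below; the sampling interval and grid emission factor are assumed values, and this is not the actual CEEMS estimation rule.

```python
# Illustrative sketch: integrate sampled power readings into energy,
# then convert to CO2-equivalent with a grid emission factor. The
# interval and factor are assumptions, not CEEMS defaults.

def energy_joules(power_samples_w: list[float], interval_s: float) -> float:
    """Integrate power samples (watts) over time into energy (joules)."""
    return sum(power_samples_w) * interval_s

def emissions_gco2e(energy_j: float, factor_g_per_kwh: float = 50.0) -> float:
    """Convert energy to grams CO2-equivalent for a given grid factor."""
    kwh = energy_j / 3_600_000  # 1 kWh = 3.6e6 J
    return kwh * factor_g_per_kwh

samples = [250.0, 260.0, 255.0, 240.0]  # e.g. GPU power draw, 10 s apart
e = energy_joules(samples, interval_s=10.0)
print(e, emissions_gco2e(e))
```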
With a massive upsurge in data, combining deduplication with distributed storage continuously suffers from a low deduplication ratio when providing the corresponding throughput. It is because distributed storage requi...
This paper studies how to improve the performance of main memory multicore OLTP systems for executing transactions with conflicts. A promising approach is to partition transaction workloads into mutually conflict-free clusters and distribute the clusters to different cores for concurrent execution. We show that if transactions in each cluster are properly scheduled, transactions that are traditionally considered conflicting can be executed without conflicts at runtime. In light of this, we propose to schedule transactions to reduce runtime conflicts, instead of partitioning based on the conventional notion of conflicts. We formulate the transaction scheduling problem of minimizing runtime conflicts and show that it is NP-complete. Nevertheless, we develop an efficient scheduling algorithm to improve parallelism. Moreover, for transactions that are not packed in batches, we show that runtime conflict analysis also helps reduce conflict penalties, by proposing a proactive deferring method. Using standard and enhanced benchmarks, we show that on average our scheduling and proactive deferring methods improve the throughput of existing partitioners and concurrency control protocols by 131% and 109%, respectively, and by up to 294% and 152%.
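Since the abstract notes the scheduling problem is NP-complete, a practical scheduler must be heuristic. The following is a minimal greedy sketch of conflict-aware ordering under an assumed model where each transaction is just its read/write key set; it is not the paper's algorithm.

```python
# Greedy sketch of conflict-aware transaction ordering (illustrative
# assumption: a transaction is modelled as its set of accessed keys).

def greedy_schedule(txns: dict[str, set[str]]) -> list[str]:
    """Order transactions so consecutive ones share few keys,
    delaying runtime conflicts between adjacent executions."""
    remaining = dict(txns)
    current, keys = remaining.popitem()  # start from an arbitrary txn
    order = [current]
    while remaining:
        # Pick the transaction overlapping least with the one just scheduled.
        nxt = min(remaining, key=lambda t: len(remaining[t] & keys))
        keys = remaining.pop(nxt)
        order.append(nxt)
    return order

print(greedy_schedule({"t1": {"a", "b"}, "t2": {"a"}, "t3": {"c"}}))
```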
ISBN: (Print) 9798350329964
A fault in large online service systems often triggers numerous alerts due to the complex business and component dependencies among services, a phenomenon known as an "alert storm". In a short time, an online service system may generate a huge amount of alert data. This makes it challenging for on-call engineers to identify the alerts associated with a system failure for root cause analysis. In this paper, we propose DyAlert, a dynamic graph neural network-based approach for linking alerts that might be triggered by the same fault, to reduce the burden on on-call engineers during fault analysis. Our insight is that alerts are often triggered by alert propagation when a system failure occurs, e.g., alert a leads to the occurrence of alert b. Whether two alerts should be linked depends on whether one alert is triggered by the propagation of the other. Leveraging this insight, we design a dynamic graph (the Alert-Metric Dynamic Graph) that describes the propagation process of alerts. Based on this dynamic graph, we train a neural network-based model to predict alert links. We evaluate DyAlert with real-world data collected from an online service system running 85 business units and about 30,000 different services in a large enterprise. The results show that DyAlert is effective in predicting alert links and outperforms state-of-the-art approaches with an average increase of 0.259 in F1-score.
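The propagation intuition behind the linking task can be sketched with a simple rule-based filter; the time window and service-dependency map are assumptions, and DyAlert itself replaces such heuristics with a learned dynamic-graph model.

```python
# Illustrative sketch of candidate alert links via propagation: alert a
# may have triggered alert b if a fired shortly before b on an upstream
# service. Window and dependency map are hypothetical.

def candidate_links(alerts, depends_on, window_s=300):
    """alerts: list of (alert_id, service, timestamp_s).
    depends_on: service -> set of upstream services it calls.
    Returns (a, b) pairs where a may have propagated to b."""
    links = []
    for a_id, a_svc, a_ts in alerts:
        for b_id, b_svc, b_ts in alerts:
            if (a_id != b_id
                    and 0 <= b_ts - a_ts <= window_s
                    and a_svc in depends_on.get(b_svc, set())):
                links.append((a_id, b_id))
    return links

alerts = [("a1", "db", 0), ("a2", "api", 60), ("a3", "web", 900)]
deps = {"api": {"db"}, "web": {"api"}}
print(candidate_links(alerts, deps))  # a1 on "db" plausibly triggered a2
```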
ISBN: (Print) 9798400702341
Federated Learning (FL) is a promising solution for collaborative machine learning while respecting data privacy and locality. FL has been used in Low Earth Orbit (LEO) satellite constellations for different space applications, including earth observation, navigation, and positioning. Orbital Edge Computing (OEC) refers to the deployment of edge computing resources and data processing capabilities in space-based systems, enabling real-time data analysis and decision-making for remote and space-based applications. While existing research explores the integration of federated learning in OEC, the influence of diverse factors such as space conditions, communication constraints, and machine learning models remains uncertain. This paper addresses this gap and presents a comprehensive performance analysis of FL methods in the unique and challenging setting of OEC. We consider model accuracy, training time, and power consumption as performance metrics under different working conditions, including IID and non-IID data distributions, to analyse the performance of centralised and decentralised FL approaches. The experimental results demonstrate that although the asynchronous centralised FL method shows high fluctuations in its accuracy curve, it is suitable for space applications in which power consumption and training time are the two main factors. In addition, the number of sampled satellites is an important parameter for decentralised FL methods under non-IID data distributions. Moreover, increasing altitude can reduce the training time but increase the power consumption. This study highlights a number of performance challenges in OEC for further investigation.
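The centralised FL methods compared here build on weighted model averaging in the FedAvg style; a minimal sketch follows, with plain-list "weights" and satellite clients as illustrative assumptions.

```python
# Minimal FedAvg-style aggregation sketch: the ground station (or a
# coordinating satellite) averages client updates weighted by local
# sample counts. Weights as flat lists are a simplifying assumption.

def fed_avg(client_weights: list[list[float]],
            client_sizes: list[int]) -> list[float]:
    """Weighted average of client model parameters by sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]

# Three satellites with unequal local datasets (non-IID in practice):
# the satellite with 20 samples pulls the average toward its model.
w = fed_avg([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [10, 10, 20])
print(w)  # [3.5, 4.5]
```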
As the COVID-19 pandemic began to unfold, reports of the impact on lives, public and private industry, and education systems began to emerge. Almost simultaneously, the long-standing pandemic of racism in the U.S. was...
ISBN: (Print) 9781665439749
The proceedings contain 71 papers. The topics discussed include: efficient user inspection algorithm based on dual bloom filters oriented for blockchain data management systems; secure outsourcing of fuzzy linear regression in cloud computing; design of the afterburning chamber mock-up casing; researching the relationship between information efficiency and complexity of digital information processing devices; soft measurement of process improvement potential; neuro-fuzzy model of gas balance control; cryptography professional rival (CPR): a game designing model to learn cryptography; assessing the degree of the social media user's openness using an expert model based on the Bayesian network; comprehensive analysis of cyber-physical systems data; measurement virtualization technologies for intelligent information and measurement systems; and fuzzy assessment of the competitiveness of cloud software products.
Background: Primary hepatocellular carcinoma (HCC) is a complex malignant tumor with high mortality. To explore the pathogenesis of HCC, based on the GEO database, bioinformatics analysis was carried out based on t...
ISBN: (Print) 9781665467735
The proceedings contain 191 papers. The topics discussed include: virtual inertia control strategy for PV using dc capacitive and electrochemical energy storage; liquid drip detection in power plant based on machine vision; intelligent recognition algorithm of specific pattern content information based on mobile terminal equipment; research on power communication defect diagnosis technology based on unsupervised learning; key technologies of high speed modulator for satellite data transmission; research on privacy fraud detection of logistic regression based on homomorphic encryption; application research on x-ray imaging technology for online detection of gas insulated switchgear; power transformer state identification method based on operational deflection shapes and visual measurement technology; research on identification method of potential abnormal station for miniature circuit breaker; trajectory tracking error optimization based on iterative learning; research and design of the intelligent bulk transfer laboratory interconnection; energy management cloud platform in softwarized energy internet; simulation and application of a new power stealing method based on the comparison of two types of transformers; lane line detection based on machine vision; an analysis method of GIS equipment fault causes based on online monitoring and joint diagnosis; and virtual prototype design of flexible exoskeleton based on pattern recognition technology.