Gradient compression is a promising approach to alleviating the communication bottleneck in data parallel deep neural network (DNN) training by significantly reducing the data volume of gradients for synchronization. While gradient compression is being actively adopted by the industry (e.g., Facebook and AWS), our study reveals that there are two critical but often overlooked challenges: 1) inefficient coordination between compression and communication during gradient synchronization incurs substantial overheads, and 2) developing, optimizing, and integrating gradient compression algorithms into DNN systems imposes heavy burdens on DNN practitioners, and ad-hoc compression implementations often yield surprisingly poor system performance. In this paper, we propose a compression-aware gradient synchronization architecture, CaSync, which relies on flexible composition of basic computing and communication primitives. It is general and compatible with any gradient compression algorithms and gradient synchronization strategies and enables high-performance computation-communication pipelining. We further introduce a gradient compression toolkit, CompLL, to enable efficient development and automated integration of on-GPU compression algorithms into DNN systems with little programming burden. Lastly, we build a compression-aware DNN training framework HiPress with CaSync and CompLL. HiPress is open-sourced and runs on mainstream DNN systems such as MXNet, TensorFlow, and PyTorch. Evaluation via a 16-node cluster with 128 NVIDIA V100 GPUs and a 100 Gbps network shows that HiPress improves the training speed over current compression-enabled systems (e.g., BytePS-onebit, Ring-DGC and PyTorch-PowerSGD) by 9.8%-69.5% across six popular DNN models. IEEE
Digital signal processors are extensively used to execute mathematical operations and advanced computational tasks on digital ***, they suffer from several inherent limitations, including low speed, high energy consumption, and large memory requirements, because of the hardware bottleneck and the imperative conversion between digital and analogue signals.
This paper proposes a new cluster method combined with Dynamic Mode Decomposition with Control (DMDc), and the Proper Orthogonal Decomposition (POD) to construct more accurate reduced order models. DMDc and POD are po...
In the era of advancement in technology and modern agriculture, early disease detection of potato leaves will improve crop yield. Various researchers have focussed on disease due to different types of microbial infect...
We have witnessed the emergence of superhuman intelligence thanks to the fast development of large language models (LLMs) and multimodal language models. As the application of such superhuman models becomes increasingly popular, a critical question arises: how can we ensure they remain safe, reliable, and well aligned with human values encompassing moral values, Schwartz's Values, ethics, and many more? In this position paper, we discuss the concept of superalignment from a learning perspective to answer this question by outlining the learning paradigm shift from large-scale pretraining and supervised fine-tuning to alignment training. We define superalignment as designing effective and efficient alignment algorithms to learn from noisy-labeled data (point-wise samples or pair-wise preference data) in a scalable way when the task is very complex for human experts to annotate and when the model is stronger than human experts. We highlight some key research problems in superalignment, namely, weak-to-strong generalization, scalable oversight, and evaluation. We then present a conceptual framework for superalignment, which comprises three modules: an attacker, which generates adversarial queries that try to expose the weaknesses of a learner model; a learner, which refines itself by learning from scalable feedback generated by a critic model with minimal involvement of human experts; and a critic, which generates critiques or explanations for a given query-response pair, with the goal of improving the learner by criticizing it. We discuss some important research problems in each component of this framework and highlight some interesting research ideas that are closely related to our proposed framework, for instance, self-alignment, self-play, self-refinement, and more. Last, we highlight some future research directions for superalignment, including the identification of new emergent risks and multi-dimensional alignment.
Vehicular consumer electronics, such as autonomous vehicles (AVs), need to collect large amounts of private user information, which is exposed to the risk of privacy leakage. To protect the privacy of consumers, researchers ha...
Building Automation Systems (BASs) are seeing increased usage in modern society due to the plethora of benefits they provide, such as automation for climate control, HVAC systems, entry systems, and lighting *** BASs in use are outdated and suffer from numerous vulnerabilities that stem from the design of the underlying BAS *** this paper, we provide a comprehensive, up-to-date survey on BASs and attacks against seven BAS protocols including BACnet, EnOcean, KNX, LonWorks, Modbus, ZigBee, and *** studies of secure BAS protocols are also presented, covering BACnet Secure Connect, KNX Data Secure, KNX/IP Secure, ModBus/TCP Security, EnOcean High Security and Z-Wave *** and ZigBee do not have security *** point out how these security protocols improve the security of the BAS and what issues remain. A case study is provided which describes a real-world BAS and showcases its vulnerabilities as well as recommendations for improving the security of *** seek to raise awareness to those in academia and industry as well as highlight open problems within BAS security.
This paper improves the ill-condition of bone-conducted (BC) speech signal by reducing the eigenvalue expansion. BC speech commonly contains a large spectral dynamic range that causes ill-condition for the classical l...
Plasma therapy is an extensively used treatment for critically unwell *** this procedure, a legitimate plasma donor who can continue to supply plasma after healing is ***, significant dangers are associated with supply management, such as the ambiguous provenance of plasma and the spread of infected or subpar blood into medicinal ***, from an ideological standpoint, less powerful people may be exploited throughout the contribution ***, there is a danger to the logistics system because there are now just some plasma *** research intends to investigate the blockchain-based solution for blood plasma to facilitate authentic plasma *** parameters, including electronic identification, chain code, and certified ledgers, have the potential to exert a substantial, profound influence on the distribution and implementation process of blood *** understand the practical ramifications of blockchain, the current study provides a proof of concept approach that aims to simulate the procedural code of modern plasma distribution ecosystems using a blockchain-based *** agent-based modeling used in the testing and evaluation mimics the supply chain to assess the blockchain’s feasibility, advantages, and constraints for the plasma.
In this article, we propose a novel volumetric video caching and rendering approach for an edge-assisted extended reality (XR) system to enhance user Quality of Experience (QoE). Particularly, user QoE consists of vis...