We investigate the problem of finding a spanning tree of a set of n moving points in R^d that minimizes the maximum total weight (under any convex distance function) or the maximum bottleneck throughout the motion. T...
AI and reinforcement learning (RL) have attracted great attention in the study of multiplayer systems over the past decade. Despite the advances, most of the studies are focused on synchronized decision-making to atta...
Vehicular edge computing (VEC) allows vehicles to process part of the tasks locally at the network edge while offloading the rest of the tasks to a centralized cloud server for processing. A massive volume of tasks ge...
Resonant operation, exploiting high quality-factor planar inductors, has recently enabled gigahertz (GHz) applications for large-area electronics (LAE), providing a new technology platform for large-scale and flexible...
Six-phase motors are becoming more popular because of their advantages such as lower torque ripple, better power distribution per phase, higher efficiency, and fault-tolerant capability compared to the three-phase one...
Gradient compression is a promising approach to alleviating the communication bottleneck in data-parallel deep neural network (DNN) training by significantly reducing the data volume of gradients for synchronization. While gradient compression is being actively adopted by industry (e.g., Facebook and AWS), our study reveals two critical but often overlooked challenges: 1) inefficient coordination between compression and communication during gradient synchronization incurs substantial overheads, and 2) developing, optimizing, and integrating gradient compression algorithms into DNN systems imposes a heavy burden on DNN practitioners, and ad-hoc compression implementations often yield surprisingly poor system performance. In this paper, we propose a compression-aware gradient synchronization architecture, CaSync, which relies on a flexible composition of basic computing and communication primitives. It is general and compatible with any gradient compression algorithm and gradient synchronization strategy, and enables high-performance computation-communication pipelining. We further introduce a gradient compression toolkit, CompLL, to enable efficient development and automated integration of on-GPU compression algorithms into DNN systems with little programming burden. Lastly, we build a compression-aware DNN training framework, HiPress, with CaSync and CompLL. HiPress is open-sourced and runs on mainstream DNN systems such as MXNet, TensorFlow, and PyTorch. Evaluation on a 16-node cluster with 128 NVIDIA V100 GPUs and a 100 Gbps network shows that HiPress improves training speed over current compression-enabled systems (e.g., BytePS-onebit, Ring-DGC, and PyTorch-PowerSGD) by 9.8%-69.5% across six popular DNN models.
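To make the kind of compression primitive discussed above concrete, here is a minimal pure-Python sketch of top-k gradient sparsification, one widely used compression algorithm. This is an illustrative example only, not the actual CaSync/CompLL implementation; real systems operate on GPU tensors rather than Python lists.

```python
import heapq

def topk_compress(grad, k):
    """Keep only the k largest-magnitude gradient entries.

    Returns (indices, values) -- the sparse representation that
    would be sent over the network instead of the dense gradient.
    """
    idx = heapq.nlargest(k, range(len(grad)), key=lambda i: abs(grad[i]))
    return idx, [grad[i] for i in idx]

def topk_decompress(idx, vals, n):
    """Scatter the retained entries back into a dense zero gradient."""
    dense = [0.0] * n
    for i, v in zip(idx, vals):
        dense[i] = v
    return dense
```

For a gradient of n entries, only k (index, value) pairs are synchronized, so the communication volume shrinks by roughly n/k; the dropped mass is what error-feedback schemes in production systems compensate for.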
Predictability is an essential challenge for autonomous vehicles (AVs)' ***. Deep neural networks have been widely deployed in the AV's perception ***. However, it is still an open question how to guarantee perception predictability for AVs, because there are millions of deep neural network (DNN) model combinations and system configurations when deploying DNNs in AVs. This paper proposes the configurable predictability testbed (CPT), a configurable testbed for quantifying the predictability of an AV's perception ***. CPT provides flexible configurations of the perception pipeline on data, DNN models, fusion policy, scheduling policies, and predictability ***. On top of CPT, researchers can profile and optimize predictability issues caused by different application and system ***. CPT has been open-sourced at: https://***/Torreskai0722/CPT.
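To make the configuration surface concrete, a hypothetical pipeline configuration covering the dimensions the abstract lists (data, DNN models, fusion policy, scheduling policies, and predictability metrics) might look like the sketch below. All field names and values are illustrative assumptions, not CPT's actual API.

```python
# Hypothetical configuration covering the dimensions CPT exposes.
# Field names and values are illustrative, not CPT's actual API.
perception_config = {
    "data": {"dataset": "nuScenes", "sensors": ["camera", "lidar"]},
    "models": {"detector_2d": "yolov5s", "detector_3d": "pointpillars"},
    "fusion_policy": "late",          # e.g., late vs. early sensor fusion
    "scheduling_policy": "fifo",      # per-frame task scheduling
    "predictability_metrics": ["latency_variance", "deadline_miss_rate"],
}

def validate(cfg):
    """Check that every configurable dimension is specified."""
    required = {"data", "models", "fusion_policy",
                "scheduling_policy", "predictability_metrics"}
    missing = required - cfg.keys()
    if missing:
        raise ValueError(f"missing config dimensions: {sorted(missing)}")
    return True
```

Sweeping such a configuration space (models x fusion x scheduling) is what lets a testbed attribute predictability variation to individual application or system factors.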
A Network Intrusion Detection System (NIDS) serves as an essential component in data protection by monitoring computer networks for threats that can bypass conventional defenses, such as malware and hackers. Deep learning...
Semi-supervised learning (SSL) aims to reduce reliance on labeled data. Achieving high performance often requires more complex algorithms; therefore, generic SSL algorithms are less effective when it comes to image cl...
The rapid development of deep learning provides great convenience for production and ***. However, the massive labels required for training models limit further ***. Few-shot learning, which can obtain a high-performance model by learning from few samples in new tasks, provides a solution for many scenarios that lack ***. This paper summarizes few-shot learning algorithms of recent years and proposes a ***. First, we introduce the few-shot learning task and its ***. Then, according to different implementation strategies, few-shot learning methods of recent years are divided into five categories, including data augmentation-based methods, metric learning-based methods, parameter optimization-based methods, external memory-based methods, and other ***. Next, we investigate the applications of few-shot learning methods and summarize them from three directions, including computer vision, human-machine language interaction, and robot ***. Finally, we analyze the existing few-shot learning methods by comparing evaluation results on miniImageNet, and summarize the whole paper.
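As a concrete illustration of the metric learning-based category mentioned above, here is a minimal nearest-prototype classifier sketch in the spirit of Prototypical Networks: class prototypes are the mean embeddings of the few support samples, and queries are assigned to the nearest prototype. Function names, the plain-list representation, and the use of raw features instead of learned embeddings are simplifying assumptions for illustration.

```python
def prototypes(support, labels):
    """Compute one mean embedding (the class 'prototype') per class.

    support: list of embedding vectors (lists of floats)
    labels:  class label for each support vector
    """
    classes = sorted(set(labels))
    protos = []
    for c in classes:
        members = [x for x, y in zip(support, labels) if y == c]
        dim = len(members[0])
        protos.append([sum(v[d] for v in members) / len(members)
                       for d in range(dim)])
    return classes, protos

def classify(queries, classes, protos):
    """Assign each query to the class of its nearest prototype
    (squared Euclidean distance)."""
    preds = []
    for q in queries:
        dists = [sum((a - b) ** 2 for a, b in zip(q, p)) for p in protos]
        preds.append(classes[dists.index(min(dists))])
    return preds
```

With, say, two support samples per class (2-way 2-shot), this classifies new queries without any gradient update, which is what makes metric-based methods attractive when labels are scarce.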