Authors: Li, Jiaxuan; Zheng, Qilong
School of Computer Science, Hefei, China
National High Performance Computing Center, Hefei, China
Loops often dominate the execution time in high-performance computing, so effective loop optimization is critical for overall performance. We propose a reinforcement learning-based framework that automatically discovers ...
Authors: Yan, Jiapeng; Zheng, Qilong
School of Computer Science, Hefei, China
National High Performance Computing Center, Hefei, China
AI operators refer to reusable code programs encapsulated in AI chip frameworks that implement specific functions. To achieve high performance, it is necessary to combine hardware characteristics efficiently when prog...
Training an object detection model often requires numerous annotated images on a centralized host, which may violate user privacy and data confidentiality. Federated learning (FL) resolves this issue by allowing multi...
Mobile/multi-access edge computing (MEC) is developed to support the upcoming AI-aware mobile services, which require low latency and intensive computation resources at the edge of the network. One of the most challen...
This work proposes a framework for generating datasets that allows users to adjust the APT attack techniques within it. The framework utilizes the MITRE ATT&CK framework to label the attack traffic based on the Ta...
Big data streams with diversity are generally processed by parallel computing environments with multiple computational nodes. Before processing, the big data streams need to be partitioned into sub-streams and cached ...
ISBN (digital): 9798331529482
ISBN (print): 9798331529499
Loops often dominate execution time in high-performance computing, so effective loop optimization is critical for overall performance. We propose a reinforcement learning-based framework that automatically discovers and composes loop transformations (tiling, fusion, interchange, and unrolling) and evaluate it on a subset of the PolyBench benchmarks. Compared with the Polly compiler baseline, our approach achieves an average speedup of 2.46×, peaking at 7× on the jacobi-1d kernel, and consistently outperforms a global greedy scheduling algorithm. By adaptively combining multiple transformations, the RL-based method exploits deeper synergies among them with minimal overhead once trained, reducing the repeated manual tuning and hardware-specific adjustment that conventional techniques require.
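To make the transformations named in this abstract concrete, here is a minimal Python sketch of two of them, tiling and interchange, applied by hand to a matrix-multiply loop nest. It is an illustration only and not the authors' framework: the function names, the tile size, and the choice of matrix multiplication as the kernel are all assumptions. The point is that a transformed loop nest must compute exactly the same result as the original; a tuner, whether RL-based or greedy, only chooses which legal transformations to apply and in what order.

```python
def matmul_naive(a, b, n):
    """Reference i-j-k loop nest: c[i][j] += a[i][k] * b[k][j]."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, n, tile=2):
    """Same computation with all three loops tiled and j/k interchanged.

    `tile` is a hypothetical tuning parameter; a real tuner would pick it
    per kernel and per target machine.
    """
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):          # tile loops walk over blocks
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                for i in range(ii, min(ii + tile, n)):   # point loops stay in a block
                    for k in range(kk, min(kk + tile, n)):
                        aik = a[i][k]     # hoisted: invariant in the j loop
                        for j in range(jj, min(jj + tile, n)):
                            c[i][j] += aik * b[k][j]
    return c
```

In pure Python the tiled version is not expected to run faster; the benefit of tiling comes from cache reuse in compiled code, which this sketch does not measure.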
The secure sum protocol is an important secure multiparty computation protocol with various applications in privacy-preserving distributed computation. However, most existing secure sum protocols rarely consider how to resist collusion among participants, which is a significant practical problem. Urabe et al. proposed a collusion-resistant secure sum protocol, but its high communication and computation costs make it inefficient. In this paper, we propose security definitions that measure a secure multiparty computation protocol's ability to resist potential collusion, and we use them to analyze the collusion resistance of several previous secure sum protocols. Then, considering both the practical need to resist collusion and the need for efficiency, we present a novel collusion-resistant secure sum protocol. Theoretical analysis and experimental results confirm that our protocol is efficient and strongly resistant to potential collusion, making it superior to previous ones. Both its communication overhead and its computational complexity are linear in the number of participants. In addition, the protocol's collusion resistance is adjustable to different security needs.
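For readers unfamiliar with secure sum, the sketch below simulates a generic textbook variant based on additive secret sharing. It is not the protocol proposed in this paper, nor that of Urabe et al.; the modulus, the function names, and the single-process simulation of all parties are assumptions made for illustration. In this variant every party sends one share to every other party, so the number of messages grows quadratically with the number of participants, unlike the linear cost the abstract reports for the authors' scheme.

```python
import secrets

# Assumed public modulus; it must exceed any possible total of the inputs.
MODULUS = 2**61 - 1


def make_shares(value, n_parties):
    """Split `value` into n_parties random additive shares modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to `value`.
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Simulate every party locally and return the sum of their inputs."""
    n = len(private_values)
    # Row p holds the shares party p hands out; column q is what party q receives.
    sent = [make_shares(v, n) for v in private_values]
    # Each party publishes only the sum of the shares it received.
    partial_sums = [sum(sent[p][q] for p in range(n)) % MODULUS for q in range(n)]
    # Adding the published partial sums recovers the total and nothing else.
    return sum(partial_sums) % MODULUS
```

Calling `secure_sum([3, 10, 25, 4])` returns 42. In this share-based variant, an honest party's input stays hidden from semi-honest participants unless all of the other parties pool what they received.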
Artificial Intelligence (AI) has gained popularity for the containment of COVID-19 pandemic *** AI techniques provide efficient mechanisms for handling pandemic *** methods, protocols, data sets, and various validation mechanisms empower the users towards proper decision-making and procedures to handle the *** so many tools, there still exist conditions in which AI must go a long *** increase the adaptability and potential of these techniques, a combination of AI and big data is currently gaining *** paper surveys and analyzes the methods within the various computational paradigms used by different researchers and national governments, such as China and South Korea, to fight against this *** process of vaccine development requires multiple medical *** process requires analyzing datasets from different parts of the *** learning and the Internet of Things (IoT) revolutionized the field of disease diagnosis and disease *** accurate observations from different datasets across the world empowered the process of drug development and drug *** overcome the issues generated by the pandemic, using sophisticated computing paradigms such as AI, Machine Learning (ML), deep learning, robotics, and big data is essential.
ISBN (digital): 9798350316537
ISBN (print): 9798350316544
AI operators are reusable code programs, encapsulated in AI chip frameworks, that implement specific functions. Achieving high performance requires programming that makes efficient use of hardware characteristics. However, for operators with multiple computational steps, developing good scheduling strategies is challenging. To address this issue, this paper proposes a graph-based method for AI chip operator optimization. First, a bidirectional transformation is established between operators and their corresponding computation graphs. Then, DeepWalk and word2vec are used to convert the operators' computation graphs into embedding representations, and a graph neural network annotates the nodes to optimize the corresponding operators. Simple operator fusion can also be achieved by merging the graphs of multiple operators and optimizing the fused operator. Using an operator dataset created for this purpose and experiments within the Cambricon community framework, the method demonstrates better optimization and fusion of element-wise operators than other simple tuning methods.
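The embedding step mentioned in this abstract can be pictured with a small sketch of the random-walk stage of DeepWalk, which turns a graph into node sequences that a word2vec-style model can then consume as "sentences". This is a simplified illustration rather than the authors' pipeline: the toy computation graph, its node names, the undirected adjacency, and the parameters are all hypothetical, and the word2vec training step is left out.

```python
import random

# Hypothetical computation graph of a small element-wise operator,
# stored as an undirected adjacency list.
GRAPH = {
    "load_a": ["add"],
    "load_b": ["add"],
    "add": ["load_a", "load_b", "relu"],
    "relu": ["add", "store"],
    "store": ["relu"],
}


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate truncated uniform random walks starting from every node."""
    rng = random.Random(seed)
    nodes = list(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)                 # visit start nodes in random order
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:          # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks
```

For example, `random_walks(GRAPH, walk_length=6, walks_per_node=3)` yields 15 walks of six node names each; nodes that appear in similar walk contexts would end up with similar embeddings once a skip-gram model is trained on them.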