Benchmarking, as a yardstick for system design and evaluation, has developed over a long period and plays a pivotal role in many domains, such as database systems and high-performance computing. Through prolonged and unremit...
The current popular systems Hadoop and Spark cannot achieve satisfactory performance when running iterative big data applications because of inefficient overlapping of computation and communication. The pipeline of computing, data movement, and data management plays a key role in current distributed data computing systems. In this paper, we first analyze the overhead of the shuffle operation in Hadoop and Spark when running the PageRank workload, and then propose an event-driven pipeline and in-memory shuffle design with better overlapping of computation and communication, implemented as DataMPI-Iteration, an MPI-based library for iterative big data computing. Our performance evaluation shows that DataMPI-Iteration achieves a 9X-21X speedup over Apache Hadoop and a 2X-3X speedup over Apache Spark for PageRank and K-means.
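The overlap of computation and communication that this abstract describes can be sketched with a bounded producer/consumer pipeline. This is a minimal illustration, not DataMPI-Iteration's actual design: compute_partition and the in-thread "send" list are hypothetical stand-ins for per-iteration computation and asynchronous network transfer.

```python
import queue
import threading

def compute_partition(i):
    # Hypothetical stand-in for one block of per-iteration computation.
    return [x * x for x in range(i, i + 4)]

def run_pipelined(num_partitions):
    """Overlap 'computation' and 'communication' via a bounded queue,
    so the next partition is computed while the previous one is sent."""
    sent = []
    q = queue.Queue(maxsize=2)  # small buffer, akin to in-memory staging

    def communicator():
        while True:
            item = q.get()
            if item is None:  # sentinel: no more partitions
                break
            sent.append(item)  # stand-in for an asynchronous network send

    t = threading.Thread(target=communicator)
    t.start()
    for i in range(num_partitions):
        q.put(compute_partition(i))  # blocks only when the buffer is full
    q.put(None)
    t.join()
    return sent

print(len(run_pipelined(8)))  # 8 partitions delivered in order
```

Because the queue is bounded, the compute loop stalls only when the sender falls behind, which is the overlapping behavior the event-driven pipeline aims for.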
The subset sum problem is a combinatorial optimization problem whose complexity belongs to the nondeterministic polynomial time complete (NP-Complete) class. The problem is widely used in encryption, planning or scheduling, and integer partitioning. No exact search algorithm with polynomial time complexity has been found, which makes the problem challenging to solve on classical computers. To solve it effectively, we translate it into the quantum Ising model and solve it with a variational quantum optimization method based on conditional values at risk. The proposed model needs only n qubits to encode a 2^n-dimensional search space, which effectively saves encoding qubits. The model inherits the advantages of variational quantum algorithms: it obtains good performance at shallow circuit depths while being robust to noise, and it is convenient to deploy on Noisy Intermediate-Scale Quantum devices. We investigate the effects of scalability, the variational ansatz type, the variational depth, and noise on the model. Moreover, we discuss the performance of the model under different conditional values at risk. In computer simulation, the scale can reach more than nine qubits. By selecting the noise type, we construct simulators with different quantum volumes (QVs) and study the performance of the model on them. In addition, we deploy the model on a superconducting quantum computer of the Origin Quantum Technology Company and successfully solve the subset sum problem. Our model provides a new perspective for solving the subset sum problem.
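For contrast with the quantum encoding above (n qubits spanning a 2^n-dimensional search space of subsets), a classical exponential-space baseline makes the problem statement concrete. This sketch is a standard dynamic-programming enumeration of reachable sums, not the paper's method:

```python
def subset_sum(weights, target):
    """Return a subset of weights summing to target, or None.
    Classical baseline: enumerate reachable sums (up to 2^n subsets)."""
    reachable = {0: []}  # reachable sum -> indices of items used
    for i, w in enumerate(weights):
        # Snapshot the items so each weight is used at most once.
        for s, idx in list(reachable.items()):
            if s + w not in reachable:
                reachable[s + w] = idx + [i]
    if target in reachable:
        return [weights[i] for i in reachable[target]]
    return None

print(subset_sum([3, 5, 7, 11], 12))  # [5, 7]
```

The dictionary of reachable sums can grow exponentially with n, which is exactly the search space the variational quantum model compresses into n qubits.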
In distributed quantum computing (DQC), quantum hardware design mainly focuses on providing as many high-quality inter-chip connections as possible, while quantum software tries its best to reduce the required number of remote quantum gates between chips. However, this "hardware first, software follows" methodology may not fully exploit the potential of DQC. Inspired by classical software-hardware co-design, this paper explores the design space of application-specific DQC architectures. More specifically, we propose AutoArch, an automated quantum chip network (QCN) structure design tool. With qubit grouping followed by a customized QCN design, AutoArch can generate a near-optimal DQC architecture suitable for target quantum algorithms. Experimental results show that the DQC architecture generated by AutoArch can outperform other general QCN architectures when executing target quantum algorithms.
With the advent of virtualization techniques and software-defined networking (SDN), network function virtualization (NFV) shifts network functions (NFs) from hardware implementations to software appliances, between which there exists a performance gap. How to narrow this gap is an essential issue for current NFV research. However, the cumbersomeness of deployment, the water-pipe effect of virtual network function (VNF) chains, and the complexity of the system software stack together make it tough to figure out the cause of low performance in an NFV system. To pinpoint NFV system performance, we propose NfvInsight, a framework for automatically deploying and benchmarking VNF chains. The framework tackles the challenges of NFV performance analysis. Its components include chain graph generation, automatic deployment, and fine-granularity measurement, and the design and implementation of each component has its own merits. To the best of our knowledge, we make the first attempt to collect rules forming a knowledge base for generating reasonable chain graphs. NfvInsight deploys the generated chain graphs automatically, which frees network operators from executing at least 391 lines of bash commands for a single deployment. To diagnose performance bottlenecks, NfvInsight collects metrics from multiple layers of the software stack; in particular, we collect the network stack latency distribution ingeniously, introducing less than 2.2% overhead. We showcase the convenience and usability of NfvInsight in finding bottlenecks for both VNF chains and the underlying system. Using our framework, we find several design flaws of the network stack that are unsuitable for packet forwarding inside a single server under the NFV scenario. Our optimization of these flaws gains at most a 3x performance improvement.
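The rule-based chain graph generation described above can be sketched as filtering VNF orderings against a small knowledge base of precedence rules. The rule set and VNF names here are hypothetical examples, not NfvInsight's actual knowledge base:

```python
from itertools import permutations

# Hypothetical mini knowledge base: (x, y) means x must precede y.
RULES = {("firewall", "nat"), ("nat", "load_balancer")}

def valid_chains(vnfs, rules=RULES):
    """Enumerate VNF chain orderings consistent with the rule base."""
    def ok(chain):
        pos = {v: i for i, v in enumerate(chain)}
        return all(pos[x] < pos[y] for x, y in rules
                   if x in pos and y in pos)
    return [c for c in permutations(vnfs) if ok(c)]

print(valid_chains(["nat", "firewall", "load_balancer"]))
# only ('firewall', 'nat', 'load_balancer') survives the rules
```

A deployment component would then translate each surviving ordering into the concrete setup commands that the framework otherwise saves operators from writing by hand.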
Dataflow architecture has shown its advantages in many high-performance computing cases. In dataflow computing, large amounts of data are frequently transferred among processing elements through the network-on-chip (NoC). Thus, the router design has a significant impact on the performance of dataflow architecture. Common routers are designed for control-flow multi-core architectures, and we find they are not suitable for dataflow architecture. In this work, we analyze and extract the features of data transfers in NoCs of dataflow architecture: multiple destinations, high injection rate, and performance sensitivity to delay. Based on these three features, we propose a novel and efficient NoC router for dataflow architecture. The proposed router supports multi-destination transfers; thus it can deliver data to multiple destinations in a single transfer. Moreover, the router adopts output buffering to maximize throughput and non-flit packets to minimize transfer delay. Experimental results show that the proposed router can improve the performance of dataflow architecture by 3.6x over a state-of-the-art router.
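The benefit of multi-destination support can be quantified with an idealized injection count at the source network interface: a unicast-only router injects one packet per destination, while a multi-destination router injects each payload once. This is a back-of-the-envelope model, not the router's microarchitecture:

```python
def unicast_transfers(packets):
    """Injections needed when each destination gets its own copy."""
    return sum(len(dests) for _, dests in packets)

def multidest_transfers(packets):
    """Injections with multi-destination support: one per payload,
    with replication handled inside the network (idealized)."""
    return len(packets)

# Three payloads, each fanned out to several processing elements.
packets = [("a", [1, 2, 3]), ("b", [4, 5]), ("c", [1, 6, 7, 8])]
print(unicast_transfers(packets), multidest_transfers(packets))  # 9 3
```

At the high injection rates the abstract identifies, cutting injections from 9 to 3 directly relieves the bottleneck that motivates the multi-destination design.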
Analytics based on big data computing can benefit today's banking and financial organizations in many aspects, and provide much valuable information for organizations to achieve more intelligent trading, which can...
As semiconductor technology advances, there will be billions of transistors on a single chip. Chip many-core processors are emerging to take advantage of these greater transistor densities to deliver greater performance. Effective fault tolerance techniques are essential to improve the yield of such complex chips. In this paper, a core-level redundancy scheme called N+M is proposed to improve N-core processors' yield by providing M spare cores. In such an architecture, topology is an important factor because it greatly affects the processors' performance. The concepts of logical topology and a topology reconfiguration problem are introduced, which make it possible to transparently provide the target topology with the lowest performance degradation in the presence of faulty cores on-chip. A row rippling and column stealing (RRCS) algorithm is also proposed. Results show that RRCS can give solutions with an average of 13.8% degradation in negligible computing time.
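The reconfiguration idea can be illustrated with a toy per-row version of row rippling: each row carries a spare core at the end, and a fault makes the row's logical cores ripple over to steal the spare, preserving the logical grid. This is a heavily simplified sketch under stated assumptions (one spare per row, no cross-row stealing); the actual RRCS algorithm is more general.

```python
def rrcs_map(rows, cols, faulty, spares_per_row=1):
    """Map a logical rows x cols grid onto physical cores, skipping
    faulty ones within each row by rippling toward the row's spare(s)."""
    phys_cols = cols + spares_per_row
    mapping = {}  # logical (r, c) -> physical (r, c)
    for r in range(rows):
        healthy = [c for c in range(phys_cols) if (r, c) not in faulty]
        if len(healthy) < cols:
            return None  # more faults in this row than spares can absorb
        for c in range(cols):
            mapping[(r, c)] = (r, healthy[c])
    return mapping

m = rrcs_map(2, 3, faulty={(0, 1)})
# Row 0 ripples past the faulty core into its spare column:
# logical (0,1) -> physical (0,2), logical (0,2) -> physical (0,3).
print(m[(0, 1)], m[(0, 2)])
```

Performance degradation then comes from rippled cores sitting one hop away from their logical position, which is what the reconfiguration problem minimizes.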
In propositional normal default logic, given a default theory (?, D) and a well-defined ordering of D, there is a method to construct an extension of (?, D) without any injury. To construct a strong extension of (?, D) given a well-defined ordering of D, there may be finitely many injuries for a default δ ∈ D. With the approximation deduction ?s in propositional logic, we show that to construct an extension of (?, D) under a given well-defined ordering of D, there may be infinitely many injuries for some default δ ∈ D.
Genomic sequence comparison algorithms represent the basic toolbox for processing large volumes of DNA or protein sequences. They are involved both in the systematic scan of databases, mostly for detecting similarities...