Search Results - Inner Mongolia University Library

Klotski v2: Improved DNN Model Orchestration Framework for Dataflow Architecture Accelerators

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2025, Vol. 44, No. 3, pp. 1045-1058

Authors: Bai, Chen; Wei, Xuechao; Zhuo, Youwei; Cai, Yi; Zheng, Hongzhong; Yu, Bei; Xie, Yuan. Affiliations: The Chinese University of Hong Kong, Department of Computer Science and Engineering, Hong Kong, Hong Kong; Alibaba DAMO Academy, Computing Technology Laboratory, Hangzhou 311121, China; Peking University, School of Integrated Circuits, Beijing 100871, China; The Hong Kong University of Science and Technology, Department of Electronic and Computer Engineering, Hong Kong, Hong Kong.

Dataflow architecture accelerators are a new kind of scalable DNN accelerators. For an instruction, the availability of input operands solely determines the beginning of executions. DNN model orchestration determines how to partition, schedule, and map the computation to the underlying hardware. In this article, we propose the Klotski v2 framework to solve DNN model orchestration for dataflow architecture accelerators. First, a Bayesian optimization-based entropy-directed partition algorithm is proposed to transform a DNN model into μ ops. Second, a unified formal formulation for μ ops scheduling and mapping is presented. Third, a two-stage methodology is proposed to decouple the scheduling and mapping. Fourth, a Hilbert curve-based mapping heuristic is proposed to enhance problem-solving efficiency, improving the tradeoff between solution quality and algorithm runtime. Extensive results show that Klotski v2 can achieve an average of 21.57% higher execution performance improvement than the previous methodologies. With the Hilbert curve-based mapping heuristic, we improve the algorithm efficiency by an average of 63.50% across different DNN workloads. © 1982-2012 IEEE.
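The abstract names a Hilbert curve-based mapping heuristic but does not describe it in detail. As a minimal sketch of the underlying idea only, the code below converts a position along a Hilbert curve into grid coordinates and places a chain of operations on consecutive cells, so that neighbours in the sequence land on physically adjacent cells. The function names `hilbert_d2xy` and `place_ops`, and the assumption of a square grid of processing elements, are hypothetical and not taken from the paper.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order x 2**order grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so the sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def place_ops(ops, order):
    """Assign ops, in their given order, to consecutive cells of the curve (illustrative)."""
    if len(ops) > (1 << order) ** 2:
        raise ValueError("more ops than grid cells")
    return {op: hilbert_d2xy(order, i) for i, op in enumerate(ops)}


if __name__ == "__main__":
    # Hypothetical op names, placed on a 4 x 4 grid.
    print(place_ops(["conv0", "relu0", "conv1", "relu1", "pool"], order=2))
```

The useful property here is locality: consecutive curve indices always map to cells one step apart, which is why a space-filling curve is a natural starting point for placing communicating operations.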

Keywords: Scheduling algorithms

On the Design of Novel Attention Mechanism for Enhanced Efficiency of Transformers

61st ACM/IEEE Design Automation Conference, DAC 2024

Authors: Jha, Sumit Kumar; Jha, Susmit; Ewetz, Rickard; Velasquez, Alvaro. Affiliations: Computer Science Department, Florida International University, Miami, FL, United States; Computer Science Laboratory, SRI International, Menlo Park, CA, United States; Electrical and Computer Engineering, University of Central Florida, Orlando, FL, United States; Department of Computer Science, University of Colorado Boulder, Boulder, CO, United States.

ISBN: (print) 9798400706011

We present a new xor-based attention function for efficient hardware implementation of transformers. While the standard attention mechanism relies on matrix multiplication between the key and the transpose of the query, we propose replacing the computation of this attention function with bitwise xor operations. We mathematically analyze the information-theoretic properties of the standard multiplication-based attention, demonstrating that it preserves input entropy, and then computationally show that the xor-based attention approximately preserves the entropy of its input despite small variations in correlations between the inputs. Across various admittedly simple tasks, including arithmetic, sorting, and text generation, we show comparable performance to baseline methods using scaled GPT models. The xor-based computation of the attention function shows substantial improvement in power consumption, latency, and circuit area compared to the corresponding multiplication-based attention function. This hardware efficiency makes xor-based attention more compelling for the deployment of transformers under tight resource constraints, opening new application domains in sustainable energy-efficient computing. Additional optimizations to the xor-based attention function can further improve efficiency of transformers. © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.
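The abstract does not say how the xor outputs are converted into attention weights, so the following is only an illustrative sketch under stated assumptions: queries and keys are bit vectors packed into Python integers, similarity is the number of agreeing bits (vector width minus the popcount of the xor), and weights come from an ordinary softmax. The names `xor_scores` and `softmax` are hypothetical and are not the paper's function.

```python
import math


def xor_scores(query_bits, key_bits_list, width):
    """Score each key by the number of bit positions where it agrees with the query.

    Assumption: xor marks disagreeing bits, so agreement = width - popcount(q ^ k).
    """
    return [width - bin(query_bits ^ k).count("1") for k in key_bits_list]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    query = 0b1010
    keys = [0b1010, 0b0101, 0b1000]
    scores = xor_scores(query, keys, width=4)
    print(scores, softmax(scores))
```

The point of the sketch is that the score computation uses only xor and bit counting, with no multiplications, which is the kind of operation the abstract credits for the savings in power, latency, and circuit area.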

Keywords: Distribution transformers

ControlVideo: conditional control for one-shot text-driven video editing and beyond

Science China (Information Sciences), 2025, Vol. 68, No. 3, pp. 150-162

Authors: Min ZHAO; Rongzhen WANG; Fan BAO; Chongxuan LI; Jun ZHU. Affiliations: Department of Computer Science and Technology, Institute for AI, Tsinghua-Bosch Joint ML Center, Tsinghua Laboratory of Brain and Intelligence Lab, Tsinghua University; ShengShu Technology; Gaoling School of Artificial Intelligence, Renmin University of China; Beijing Key Laboratory of Big Data Management and Analysis Methods; Pazhou Laboratory (Huangpu).

This paper presents ControlVideo for text-driven video editing: generating a video that aligns with a given text while preserving the structure of the source video. Building on a pre-trained text-to-image diffusion model, ControlVideo enhances the fidelity and temporal consistency by incorporating additional conditions (such as edge maps), and fine-tuning the key-frame and temporal attention on the source video-text pair via an in-depth exploration of the design space. Extensive experimental results demonstrate that ControlVideo outperforms various competitive baselines by delivering videos that exhibit high fidelity w.r.t. the source content, and temporal consistency, all while aligning with the text. By incorporating low-rank adaptation layers into the model before training, ControlVideo is further empowered to generate videos that align seamlessly with reference images. More importantly, ControlVideo can be readily extended to the more challenging task of long video editing (e.g., with hundreds of frames), where maintaining long-range temporal consistency is crucial. To achieve this, we propose to construct a fused ControlVideo by applying basic ControlVideo to overlapping short video segments and key frame videos and then merging them by pre-defined weight functions. Empirical results validate its capability to create videos across 140 frames, which is approximately 5.83 to 17.5 times more than what previous studies achieved. The code is available at https://***/thu-ml/controlvideo.
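The merging of overlapping short segments by pre-defined weight functions can be pictured with a small sketch. The abstract does not give the segment length, the overlap, or the weight function, so everything below is an assumption for illustration: frames are reduced to single floats, the weight is a simple triangular ramp, and the helper names `split_overlapping`, `triangular_weight`, and `fuse` are hypothetical.

```python
def split_overlapping(num_frames, seg_len, overlap):
    """Return (start, end) spans of overlapping segments that cover all frames."""
    if not 0 <= overlap < seg_len:
        raise ValueError("overlap must be in [0, seg_len)")
    stride = seg_len - overlap
    starts = list(range(0, max(num_frames - seg_len, 0) + 1, stride))
    if starts[-1] + seg_len < num_frames:
        starts.append(num_frames - seg_len)  # extra segment to cover the tail
    return [(s, min(s + seg_len, num_frames)) for s in starts]


def triangular_weight(i, length):
    """Assumed weight function: largest mid-segment, smallest (but positive) at the edges."""
    return min(i + 1, length - i)


def fuse(segments, spans, num_frames):
    """Blend per-segment outputs into one sequence by weighted averaging."""
    total = [0.0] * num_frames
    weight = [0.0] * num_frames
    for values, (start, end) in zip(segments, spans):
        for i in range(end - start):
            w = triangular_weight(i, end - start)
            total[start + i] += w * values[i]
            weight[start + i] += w
    return [t / w for t, w in zip(total, weight)]


if __name__ == "__main__":
    spans = split_overlapping(num_frames=10, seg_len=4, overlap=2)
    # Stand-in for per-segment editing results: each segment outputs its own index.
    segments = [[float(k)] * (e - s) for k, (s, e) in enumerate(spans)]
    print(spans)
    print(fuse(segments, spans, 10))
```

Because each frame's value is a weighted average of the segments covering it, transitions between neighbouring segments are gradual rather than abrupt, which is the purpose of blending in the overlap regions.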

Keywords: diffusion models; controllable generation; text-driven editing; video editing; long video editing

Federated Multiarmed Bandits Under Byzantine Attacks

IEEE Transactions on Artificial Intelligence, 2025, Vol. 6, No. 6, pp. 1488-1501

Authors: Saday, Artun; Demirel, Ilker; Yildirim, Yigit; Tekin, Cem. Affiliations: Bilkent University, Department of Electrical and Electronics Engineering, Ankara 06800, Turkey; Massachusetts Institute of Technology, Computer Science & Artificial Intelligence Laboratory, Cambridge, MA 02139, United States.

Multiarmed bandits (MAB) is a sequential decision-making model in which the learner controls the trade-off between exploration and exploitation to maximize its cumulative reward. Federated multiarmed bandits (FMAB) is an emerging framework where a cohort of learners with heterogeneous local models play an MAB game and communicate their aggregated feedback to a server to learn a globally optimal arm. Two key hurdles in FMAB are communication-efficient learning and resilience to adversarial attacks. To address these issues, we study the FMAB problem in the presence of Byzantine clients who can send false model updates threatening the learning process. We analyze the sample complexity and the regret of β-optimal arm identification. We borrow tools from robust statistics and propose a median-of-means (MoM)-based online algorithm, Fed-MoM-UCB, to cope with Byzantine clients. In particular, we show that if the Byzantine clients constitute less than half of the cohort, the cumulative regret with respect to β-optimal arms is bounded over time with high probability, showcasing both communication efficiency and Byzantine resilience. We analyze the interplay between the algorithm parameters, a discernibility margin, regret, communication cost, and the arms’ suboptimality gaps. We demonstrate Fed-MoM-UCB's effectiveness against the baselines in the presence of Byzantine attacks via experiments. © 2020 IEEE.
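The median-of-means estimator that the abstract borrows from robust statistics can be shown in a few lines. This is a generic sketch of the estimator, not of Fed-MoM-UCB itself: how the paper groups client reports, and how the estimate feeds into the confidence bounds, is not stated in the abstract. The function name `median_of_means` and the example numbers are illustrative assumptions.

```python
import statistics


def median_of_means(values, num_groups):
    """Split values into equal-sized groups, average each group, return the median of the averages.

    Leftover values that do not fill a whole group are ignored in this sketch.
    """
    if not 1 <= num_groups <= len(values):
        raise ValueError("num_groups must be between 1 and len(values)")
    size = len(values) // num_groups
    means = [
        statistics.fmean(values[g * size:(g + 1) * size])
        for g in range(num_groups)
    ]
    return statistics.median(means)


if __name__ == "__main__":
    # Hypothetical reward reports for one arm; one report is wildly corrupted.
    reports = [0.5, 0.4, 0.6, 0.5, 0.5, 0.5, 100.0, 0.4, 0.6]
    print("plain mean:     ", statistics.fmean(reports))
    print("median of means:", median_of_means(reports, num_groups=3))
```

A single corrupted report can move the plain mean arbitrarily far, but it can spoil only the group it falls in. The median stays close to the honest value as long as fewer than half of the groups are contaminated.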

Keywords: Federated learning

Should supervised discretisation always be trusted unreservedly? On combining characteristics of supervised and unsupervised discretisation algorithms in two-step processing

27th International Conference on Knowledge Based and Intelligent Information and Engineering Systems, KES 2023

Authors: Stanczyk, Urszula; Baron, Grzegorz. Affiliation: Department of Graphics, Computer Vision and Digital Systems, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 2A, Gliwice 44-100, Poland.

The paper describes a research methodology for a two-step discretisation process applied to numeric input data, combining the characteristics of selected supervised and unsupervised algorithms, which leads to extended processing of some attributes in train and test sets. The methodology was illustrated with investigations carried out in the domain of stylometric analysis of texts, for two datasets prepared for the task of binary authorship attribution. The several variants of transformed input data obtained were explored using two selected machine learning methods capable of inducing knowledge from both continuous and categorical forms, namely the PART and J48 classifiers. The results from the experiments indicate that, as can be expected, supervised transformations of data work well enough; however, they do not always return the best outcome. The two-step processing of some attributes shows sufficient promise to warrant a closer study, as opposed to always unconditionally relying only on supervised algorithms as outperforming all other approaches. © 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://***/licenses/by-nc-nd/4.0)

Keywords: Authorship attribution; Data representation; Pattern recognition; Stylometry; Supervised discretisation; Unsupervised discretisation

Migrant Resettlement by Evolutionary Multiobjective Optimization

IEEE Transactions on Artificial Intelligence, 2025, Vol. 6, No. 1, pp. 51-65

Authors: Liu, Dan-Xuan; Gu, Yu-Ran; Qian, Chao; Mu, Xin; Tang, Ke. Affiliations: Nanjing University, National Key Laboratory for Novel Software Technology, School of Artificial Intelligence, Nanjing 210023, China; Peng Cheng Laboratory, Shenzhen 518000, China; Southern University of Science and Technology, Department of Computer Science and Engineering, Shenzhen 518055, China.

Migration has been a universal phenomenon, which brings opportunities as well as challenges for global development. As the number of migrants (e.g., refugees) increases rapidly, a key challenge faced by each country is the problem of migrant resettlement. This problem has attracted scientific research attention, from the perspective of maximizing the employment rate. Previous works mainly formulated migrant resettlement as an approximately submodular optimization problem subject to multiple matroid constraints and employed the greedy algorithm, whose performance, however, may be limited due to its greedy nature. In this article, we propose a new framework called migrant resettlement by evolutionary multiobjective optimization (MR-EMO), which reformulates migrant resettlement as a biobjective optimization problem that maximizes the expected number of employed migrants and minimizes the number of dispatched migrants simultaneously, and employs a multiobjective evolutionary algorithm (MOEA) to solve the biobjective problem. We implement MR-EMO using three MOEAs: the popular nondominated sorting genetic algorithm II (NSGA-II), MOEA based on decomposition (MOEA/D) as well as the theoretically grounded global simple evolutionary multiobjective optimizer (GSEMO). To further improve the performance of MR-EMO, we propose a specific MOEA, called GSEMO using matrix-swap mutation and repair mechanism (GSEMO-SR), which has a better ability to search for feasible solutions. We prove that MR-EMO using either GSEMO or GSEMO-SR can achieve better theoretical guarantees than the previous greedy algorithm. Experimental results under the interview and coordination migration models clearly show the superiority of MR-EMO (with either NSGA-II, MOEA/D, GSEMO or GSEMO-SR) over previous algorithms, and that using GSEMO-SR leads to the best performance of MR-EMO. © 2024 IEEE.
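To make the biobjective reformulation concrete, here is a minimal sketch of the plain GSEMO loop on a toy stand-in problem. It is not MR-EMO: the matroid constraints, the interview and coordination migration models, and the matrix-swap mutation and repair of GSEMO-SR are all omitted. The toy objective (sum of hypothetical per-migrant employment probabilities versus number of migrants dispatched) and the names `make_objectives`, `weakly_dominates`, and `gsemo` are assumptions for illustration.

```python
import random


def make_objectives(probs):
    """Toy objectives, both maximised: expected employed, and minus the number dispatched."""
    def evaluate(x):
        return (sum(p for p, bit in zip(probs, x) if bit), -sum(x))
    return evaluate


def weakly_dominates(a, b):
    """True if objective vector a is at least as good as b in every objective."""
    return all(x >= y for x, y in zip(a, b))


def gsemo(evaluate, n, iterations, rng):
    """Keep a set of mutually non-dominated bit strings, grown by bit-wise mutation."""
    start = (0,) * n
    population = {start: evaluate(start)}
    for _ in range(iterations):
        parent = rng.choice(list(population))
        # Flip each bit independently with probability 1/n.
        child = tuple(b ^ 1 if rng.random() < 1.0 / n else b for b in parent)
        fc = evaluate(child)
        # Discard the child if some current solution strictly dominates it.
        if any(weakly_dominates(f, fc) and f != fc for f in population.values()):
            continue
        # Otherwise drop everything the child weakly dominates, then add the child.
        population = {s: f for s, f in population.items()
                      if not weakly_dominates(fc, f)}
        population[child] = fc
    return population


if __name__ == "__main__":
    probs = [0.9, 0.2, 0.6, 0.4]  # hypothetical employment probabilities
    front = gsemo(make_objectives(probs), n=len(probs), iterations=2000,
                  rng=random.Random(0))
    for solution, objectives in sorted(front.items(), key=lambda kv: kv[1]):
        print(solution, objectives)
```

Unlike a greedy pass, the loop keeps one candidate per trade-off between the two objectives and can revisit earlier choices through mutation. That is the intuition behind replacing the greedy algorithm with a multiobjective evolutionary one.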

Keywords: Multiobjective optimization

Low-temperature metal–oxide thin-film transistor technologies for implementing flexible electronic circuits and systems

Journal of Semiconductors, 2023, Vol. 44, No. 9, pp. 3-10

Authors: Runxiao Shi; Tengteng Lei; Zhihe Xia; Man Wong. Affiliation: State Key Laboratory of Advanced Displays and Optoelectronics and Technologies, Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.

Here we review two 300 ℃ metal–oxide (MO) thin-film transistor (TFT) technologies for the implementation of flexible electronic circuits and ***-enhanced TFTs for suppressing the variation and shift of turn-on voltage (VON), and dual-gate TFTs for acquiring sensor signals and modulating VON have been deployed to improve the robustness and performance of the systems in which they are *** circuit building blocks based on fluorinated TFTs have been designed, fabricated, and characterized, which demonstrate the utility of the proposed low-temperature TFT technologies for implementing flexible electronic *** construction and characterization of an analog front-end system for the acquisition of bio-potential signals and an active-matrix sensor array for the acquisition of tactile images have been reported recently.

Keywords: flexible electronics; metal-oxide semiconductor; thin-film transistor; dual gate; fluorination; analog front-end system; sensors

Enhancing V2X QoS: dynamic scheduling scheme over 5G networks and byon

International Journal of Information Technology (Singapore), 2024, Vol. 16, No. 7, pp. 4427-4433

Authors: Mansouri, Wahida Ali Mohammed Elmourssi, Doaa Elyass, Wiam Almalih. Affiliations: Department of Computer Science and Information Technology, Faculty of Sciences and Arts, Northern Border University; LETI Laboratory, University of Sfax.

Over the years, various research teams have worked to create scheduling algorithms that are both effective and efficient, with the goal of enhancing the quality of service in Vehicle-to-Everything (V2X) communication over 5G (fifth-generation) networks. The proliferation of connected vehicles and the advent of 5G technologies present an unprecedented opportunity to optimize communication reliability, latency, and throughput in V2X scenarios. This paper introduces a novel scheduling scheme that adapts to the dynamic nature of vehicular environments, ensuring seamless and efficient communication. The proposed scheduling scheme takes traffic prioritization into consideration to effectively manage Quality of Service (QoS) for diverse traffic types. This approach considers different metrics such as channel quality, remaining payload, and delay. The results of our simulation show how well our Scheduling V2X Communications (SVC) algorithm performs in minimizing latencies. For a sample of 100 users, the average latency in our suggested scheme was less than 0.001 ms. © Bharati Vidyapeeth's Institute of Computer Applications and Management 2024.

Keywords: 5G networks; Prioritization; QoS; Scheduling; V2X

Revisiting Context Aggregation for Image Matting

41st International Conference on Machine Learning, ICML 2024

Authors: Liu, Qinglin; Lv, Xiaoqian; Meng, Quanling; Li, Zonglin; Lan, Xiangyuan; Yang, Shuo; Zhang, Shengping; Nie, Liqiang. Affiliations: School of Computer Science and Technology, Harbin Institute of Technology, Weihai, China; Peng Cheng Laboratory, Shenzhen, China; Department of Computer Science, The University of Hong Kong, Hong Kong; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China.

Traditional studies emphasize the significance of context information in improving matting performance. Consequently, deep learning-based matting methods delve into designing pooling or affinity-based context aggregation modules to achieve superior results. However, these modules cannot well handle the context scale shift caused by the difference in image size during training and inference, resulting in matting performance degradation. In this paper, we revisit the context aggregation mechanisms of matting networks and find that a basic encoder-decoder network without any context aggregation modules can actually learn more universal context aggregation, thereby achieving higher matting performance compared to existing methods. Building on this insight, we present AEMatter, a matting network that is straightforward yet very effective. AEMatter adopts a Hybrid-Transformer backbone with appearance-enhanced axis-wise learning (AEAL) blocks to build a basic network with strong context aggregation learning capability. Furthermore, AEMatter leverages a large image training strategy to assist the network in learning context aggregation from data. Extensive experiments on five popular matting datasets demonstrate that the proposed AEMatter outperforms state-of-the-art matting methods by a large margin. The source code is available at https://***/aipixel/AEMatter. Copyright 2024 by the author(s)

Keywords: HTTP

ARCH-COMP23 Category Report: Hybrid Systems Theorem Proving

10th International Workshop on Applied Verification of Continuous and Hybrid Systems, ARCH 2023

Authors: Mitsch, Stefan; Sheng, Huanhuan; Zhan, Bohua; Wang, Shuling; Foster, Simon; Munive, Jonathan Julián Huerta Y. Affiliations: Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, United States; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; Department of Computer Science, University of York, United Kingdom; Department of Computer Science, University of Copenhagen, Denmark.

This paper reports on the Hybrid Systems Theorem Proving (HSTP) category in the ARCH-COMP Friendly Competition 2023. The characteristic features of the HSTP category remain as in the previous edition [MZS+22]: HSTP focuses on flexibility of programming languages as structuring principles for hybrid systems, unambiguity and precision of program semantics, and mathematical rigor of logical reasoning principles. The benchmark set includes nonlinear and parametric continuous and hybrid systems and hybrid games, each in three modes: fully automatic verification, semi-automatic verification from proof hints, proof checking from scripted tactics. This instance of the competition focuses on presenting the differences between the provers on a subset of the benchmark examples. © 2023, EasyChair. All rights reserved.

Keywords: Hybrid systems
