Out-of-distribution (OOD) detection is crucial for developing trustworthy and reliable machine learning systems. Recent advances in training with auxiliary OOD data demonstrate efficacy in enhancing detection capabili...
Existing FPGA-based graph accelerators, typically designed for static graphs, rarely handle dynamic graphs that often involve substantial graph updates (e.g., edge/node insertion and deletion) over time. In this paper...
Federated continual learning (FCL) has received increasing attention due to its potential in handling real-world streaming data, characterized by evolving data distributions and varying client classes over time. The c...
The hybrid pull-push computational model can provide compelling results over either single model for processing real-world graphs. The data and pipeline parallelism of FPGAs make them a promising platform for processing different stages of graph computation. However, considering the limited on-chip resources and the streamlined pipeline computation, the efficiency of the hybrid model on FPGAs often suffers from the well-known random-access behavior of graph processing. In this paper, we present a hybrid graph processing system on FPGAs that can achieve the best of both worlds. Our approach on FPGAs is unique and novel as follows. First, we propose to use edge blocks (consisting of edges with the same destination vertex set), which allow edges to be accessed sequentially at block granularity for locality while still preserving parallelism. Due to the independence of blocks, in the sense that all edges in an inactive block are associated with inactive vertices, this also enables invalid blocks to be skipped to reduce redundant computation. Second, we consider a large number of vertices and their associated edge blocks to maintain a predictable execution order. We also present a technique to switch models in advance with few stalls using their state information. The evaluation on a wide variety of graph algorithms for many real-world graphs shows that our approach achieves up to a 3.69x speedup over state-of-the-art FPGA-based graph processing systems.
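The edge-block idea above can be illustrated with a small software sketch (plain Python, not the authors' FPGA design): edges are grouped into blocks by destination vertex, the pull phase streams blocks sequentially for locality and skips blocks whose destinations are already settled, and the engine switches between push and pull based on frontier size. The BFS workload, block size, and switching threshold are illustrative assumptions rather than details taken from the paper.

```python
from collections import defaultdict

BLOCK_SIZE = 4  # destination vertices covered by one edge block (illustrative granularity)

def build_edge_blocks(edges):
    """Group edges into blocks keyed by the block of their destination vertex."""
    blocks = defaultdict(list)
    for src, dst in edges:
        blocks[dst // BLOCK_SIZE].append((src, dst))
    return blocks

def bfs_hybrid(edges, num_vertices, root, switch_ratio=0.05):
    """BFS with a frontier-size-based switch between push and pull phases."""
    out_adj = defaultdict(list)
    for src, dst in edges:
        out_adj[src].append(dst)
    blocks = build_edge_blocks(edges)

    dist = [None] * num_vertices
    dist[root] = 0
    frontier, level = {root}, 0
    while frontier:
        next_frontier = set()
        # Stand-in for switching models in advance using state information:
        # push while the frontier is small, pull once it grows large.
        if len(frontier) < switch_ratio * num_vertices:
            # Push: scatter updates from the few active vertices.
            for src in frontier:
                for dst in out_adj[src]:
                    if dist[dst] is None:
                        dist[dst] = level + 1
                        next_frontier.add(dst)
        else:
            # Pull: stream edge blocks sequentially for locality, skipping
            # blocks whose destination vertices are all already settled.
            for block_edges in blocks.values():
                if all(dist[dst] is not None for _, dst in block_edges):
                    continue  # inactive block: no useful work inside
                for src, dst in block_edges:
                    if dist[dst] is None and src in frontier:
                        dist[dst] = level + 1
                        next_frontier.add(dst)
        frontier, level = next_frontier, level + 1
    return dist
```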
Authors:
Xiaofan Bai, Chaoxiang He, Xiaojing Ma, Bin Benjamin Zhu, Hai Jin
School of Cyber Science and Engineering, Huazhong University of Science and Technology; National Engineering Research Center for Big Data Technology and System; Services Computing Technology and System Lab; Hubei Engineering Research Center on Big Data Security; Hubei Key Laboratory of Distributed System Security; Microsoft; School of Computer Science and Technology, Huazhong University of Science and Technology; Cluster and Grid Computing Lab
Cloud-based AI services offer numerous benefits but also introduce vulnerabilities, allowing for tampering with deployed DNN models, ranging from injecting malicious behaviors to reducing computing resources. Fingerprint samples are generated to query models to detect such tampering. In this paper, we present Intersecting-Boundary-Sensitive Fingerprinting (IBSF), a novel method for black-box integrity verification of DNN models using only top-1 labels. Recognizing that tampering with a model alters its decision boundary, IBSF crafts fingerprint samples from normal samples by maximizing the partial Shannon entropy of a selected subset of categories to position the fingerprint samples near decision boundaries where the categories in the subset intersect. These fingerprint samples are almost indistinguishable from their source samples. We theoretically establish and confirm experimentally that these fingerprint samples' expected sensitivity to tampering increases with the cardinality of the subset. Extensive evaluation demonstrates that IBSF surpasses existing state-of-the-art fingerprinting methods, particularly with larger subset cardinality, establishing its state-of-the-art performance in black-box tampering detection using only top-1 labels. The IBSF code is available at: https://***/CGCL-codes/IBSF.
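As a rough illustration of the crafting step described above, the sketch below performs gradient ascent on one reasonable reading of the partial Shannon entropy: the entropy of the softmax probabilities restricted to the chosen category subset and renormalized, with the perturbation kept small so the fingerprint stays close to its source sample. It is not the released IBSF code; `model`, `subset`, and all hyperparameters are assumptions.

```python
import torch
import torch.nn.functional as F

def craft_fingerprint(model, x, subset, steps=200, lr=0.01, eps=8 / 255):
    """Craft a fingerprint sample from a normal sample x (shape 1xCxHxW)."""
    model.eval()
    delta = torch.zeros_like(x, requires_grad=True)    # small additive perturbation
    opt = torch.optim.Adam([delta], lr=lr)
    idx = torch.tensor(subset)
    for _ in range(steps):
        logits = model(torch.clamp(x + delta, 0.0, 1.0))
        probs = F.softmax(logits, dim=1)[0, idx]
        probs = probs / probs.sum()                    # renormalize over the subset (assumption)
        partial_entropy = -(probs * probs.clamp_min(1e-12).log()).sum()
        loss = -partial_entropy                        # ascend the partial entropy to push the
        opt.zero_grad()                                # sample toward the intersecting boundaries
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                    # keep the fingerprint near its source
    return torch.clamp(x + delta.detach(), 0.0, 1.0)
```

A tampered model that shifts these intersecting boundaries is then likely to change the top-1 label of such a sample, which is how the query-only check detects tampering.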
Existing privacy-preserving approaches are generally designed to provide privacy guarantee for individual data in a database, which reduces the utility of the database for data analysis. In this paper, we propose a no...
ISBN (digital): 9798350350579
ISBN (print): 9798350350586
Hypergraph Neural Network (HyperGNN) has emerged as a potent methodology for dissecting intricate multilateral connections among various entities. Current software/hardware solutions leverage a sequential execution model that relies on hyperedge and vertex indices for conducting standard matrix operations for HyperGNN inference. Yet, they are impeded by the dual challenges of redundant computation and irregular memory access overheads, primarily due to the frequent and repetitive access and updating of feature vectors corresponding to the same hyperedges and vertices. To address these challenges, we propose the first redundancy-aware accelerator, RAHP, which enables high-performance execution of HyperGNN inference. Specifically, we integrate a redundancy-aware asynchronous execution approach into the accelerator design to reduce redundant computations and off-chip memory accesses. To unveil data-reuse opportunities and unlock parallelism that existing HyperGNN solutions fail to capture, it prioritizes vertices with the highest degree as roots, prefetches other vertices along the hypergraph structure to capture the common vertices shared among multiple hyperedges, and synchronizes the computations of hyperedges and vertices in real time. In this way, relevant hyperedge and vertex computations of the common vertices can be processed concurrently along the hypergraph topology, reducing redundant-computation overhead. Furthermore, by efficiently caching intermediate results of the common vertices, it curtails memory traffic and off-chip communication. To fully harness the performance potential of our proposed approach in the accelerator, RAHP incorporates a topology-driven data loading mechanism to minimize off-chip memory accesses on the fly. It is also endowed with an adaptive data synchronization scheme to mitigate the effects of conflicting updates to both hyperedges and vertices. Moreover, RAHP emplo...
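A minimal software sketch of the data-reuse idea described above (an assumption-laden illustration, not the RAHP hardware): vertices shared by many hyperedges have their projected features computed once and cached, highest-degree vertices first, so each hyperedge aggregation reuses them instead of repeating the same work. Function names, shapes, and the mean aggregation are illustrative.

```python
import numpy as np

def hypergnn_edge_aggregate(features, weight, hyperedges):
    """One HyperGNN step: aggregate projected vertex features into hyperedge features.

    features:   (num_vertices, in_dim) vertex feature matrix
    weight:     (in_dim, out_dim) projection matrix
    hyperedges: list of vertex-id lists, one per hyperedge
    """
    # Vertex degrees tell us which vertices are shared by many hyperedges.
    degree = np.zeros(features.shape[0], dtype=int)
    for he in hyperedges:
        degree[list(he)] += 1

    cache = {}

    def projected(v):
        # Compute each vertex's projection at most once; reuse it afterwards.
        if v not in cache:
            cache[v] = features[v] @ weight
        return cache[v]

    # Warm the cache for shared (high-degree) vertices first, mimicking the
    # root-prioritized traversal described in the abstract.
    for v in np.argsort(-degree):
        if degree[v] > 1:
            projected(int(v))

    # Hyperedge aggregation now reuses cached results for common vertices.
    return np.stack([
        np.mean([projected(int(v)) for v in he], axis=0) for he in hyperedges
    ])
```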
ISBN (print): 9781939133458
Approximate nearest neighbor search (ANNS) has emerged as a crucial component of database and AI infrastructure. Ever-increasing vector datasets pose significant challenges in terms of performance, cost, and accuracy for ANNS services. No modern ANNS system can address these issues simultaneously. In this paper, we present FusionANNS, a high-throughput, low-latency, cost-efficient, and high-accuracy ANNS system for billion-scale datasets using SSDs and only one entry-level GPU. The key idea of FusionANNS lies in CPU/GPU collaborative filtering and re-ranking mechanisms, which significantly reduce I/O operations across CPUs, GPU, and SSDs to break through the I/O performance bottleneck. Specifically, we propose three novel designs: (1) multi-tiered indexing to avoid data swapping between CPUs and GPU, (2) heuristic re-ranking to eliminate unnecessary I/Os and computations while guaranteeing high accuracy, and (3) redundant-aware I/O deduplication to further improve I/O efficiency. We implement FusionANNS and compare it with the state-of-the-art SSD-based ANNS system SPANN and the GPU-accelerated in-memory ANNS system RUMMY. Experimental results show that FusionANNS achieves (1) 9.4-13.1× higher queries per second (QPS) and 5.7-8.8× higher cost efficiency than SPANN, and (2) 2-4.9× higher QPS and 2.3-6.8× higher cost efficiency than RUMMY, while guaranteeing low latency and high accuracy.
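To make the re-ranking and I/O-deduplication ideas above concrete, here is a hedged, CPU-only sketch (not the FusionANNS implementation): candidate vector ids returned by a coarse filter are grouped by storage page so each page is read once, then re-ranked with exact distances in batches, stopping early once the top-k result stops changing. `VECTORS_PER_PAGE`, `read_page`, and the batch/termination parameters are illustrative assumptions.

```python
import heapq
import numpy as np

VECTORS_PER_PAGE = 64  # illustrative: vectors stored per SSD page

def rerank(query, candidate_ids, read_page, k=10, batch=256, stable_rounds=2):
    """Re-rank candidate ids by exact L2 distance with deduplicated page reads."""
    # Redundant-aware deduplication: group candidates by page so each page is fetched once.
    pages = {}
    for cid in candidate_ids:
        pages.setdefault(cid // VECTORS_PER_PAGE, []).append(cid)

    heap, prev_topk, unchanged = [], None, 0
    page_items = list(pages.items())
    for start in range(0, len(page_items), batch):
        for page_id, cids in page_items[start:start + batch]:
            vectors = read_page(page_id)   # one I/O per page, however many candidates it holds
            for cid in cids:
                vec = vectors[cid % VECTORS_PER_PAGE]
                dist = float(np.sum((vec - query) ** 2))
                heapq.heappush(heap, (dist, cid))
        topk = [cid for _, cid in heapq.nsmallest(k, heap)]
        # Heuristic early termination: stop once the top-k is stable across batches.
        unchanged = unchanged + 1 if topk == prev_topk else 0
        prev_topk = topk
        if unchanged >= stable_rounds:
            break
    return prev_topk or []
```

In a full system the coarse candidate list would come from the GPU-side filtering stage, and `read_page` would wrap a single SSD read.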
Evaluating and enhancing the general capabilities of large language models (LLMs) has been an important research topic. Graph is a common data structure in the real world, and understanding graph data is a crucial par...