By treating users’ interactions as a user-item graph, graph learning models have been widely deployed in Collaborative Filtering (CF) based recommendation. Recently, researchers have introduced Graph Contrastive Lear...
With the development of architectures and the growth of AIoT application requirements, data processing on the edge has become popular. Neural network inference is widely employed for data analytics on edge devices. This paper extensively explores neural network inference on integrated edge devices and proposes EdgeNN, the first neural network inference solution on CPU-GPU integrated edge devices. EdgeNN has three novel characteristics. First, EdgeNN can adaptively utilize the unified physical memory and conduct the zero-copy optimization. Second, EdgeNN involves a novel inference-targeted inter- and intra-kernel CPU-GPU hybrid execution approach, which co-runs the CPU with the GPU to fully utilize the edge device’s computing resources. Third, EdgeNN adopts a fine-grained adaptive inference tuning approach, which can divide the complicated inference structure into sub-tasks mapped to the CPU and the GPU. Experiments show that on six popular neural network inference tasks, EdgeNN brings average speedups of 3.97×, 3.12×, and 8.80× over inference on the CPU of the integrated device, inference on a mobile phone CPU, and inference on an edge CPU device, respectively. Additionally, it achieves a 22.02% time benefit over the direct execution of the original programs. Specifically, 9.93% comes from better utilization of unified memory, and 10.76% comes from CPU-GPU hybrid execution. Besides, EdgeNN can deliver 29.14× and 5.70× higher energy efficiency than the edge CPU and the discrete GPU, respectively. We have made EdgeNN available at https://***/ChenyangZhang-cs/EdgeNN.
The causality relation modeling remains a challenging task for group activity recognition. The causality relations describe the influence on the centric actor (effect actor) from its correlative actors (cause actors). Most existing graph models focus on learning the actor relation with synchronous temporal features, which is insufficient to deal with the causality relation with asynchronous temporal features. In this paper, we propose an Actor-Centric Causality Graph Model, which learns the asynchronous temporal causality relation with three modules, i.e., an asynchronous temporal causality relation detection module, a causality feature fusion module, and a causality relation graph inference module. First, given a centric actor and its correlative actor, we analyze their influences to detect causality relation. We estimate the self influence of the centric actor with self regression. We estimate the correlative influence from the correlative actor to the centric actor with correlative regression, which uses asynchronous features at different timestamps. Second, we synchronize the two action features by estimating the temporal delay between the cause action and the effect action. The synchronized features are used to enhance the feature of the effect action with a channel-wise fusion. Third, we describe the nodes (actors) with causality features and learn the edges by fusing the causality relation with the appearance relation and distance relation. The causality relation graph inference provides crucial features of effect action, which are complementary to the base model using synchronous relation inference. Experiments show that our method achieves state-of-the-art performance on the Volleyball dataset and Collective Activity dataset.
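As a rough illustration of the delay-estimation and synchronization step described in the abstract above, the sketch below searches for the frame lag at which a cause actor's feature sequence best matches the effect actor's sequence, then shifts the cause sequence by that lag. This is not the paper's method: the abstract describes regression-based modules, whereas this sketch swaps in a simple mean-squared-error lag search over scalar features. The names `estimate_delay` and `synchronize`, and the toy sequences, are hypothetical.

```python
def estimate_delay(cause, effect, max_lag):
    """Return the lag (in frames) at which `cause` best matches `effect`.

    Hypothetical helper: each candidate lag is scored by the mean squared
    error between effect[t] and cause[t - lag]; the smallest error wins.
    """
    best_lag, best_err = 0, float("inf")
    for lag in range(max_lag + 1):
        pairs = [(cause[t - lag], effect[t]) for t in range(lag, len(effect))]
        if not pairs:
            break  # lag exceeds the sequence length
        err = sum((c - e) ** 2 for c, e in pairs) / len(pairs)
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag


def synchronize(cause, lag):
    """Shift `cause` forward by `lag` frames, padding the start with its first value."""
    if lag == 0:
        return list(cause)
    return [cause[0]] * lag + list(cause[:len(cause) - lag])


# Toy scalar features: the effect repeats the cause pattern two frames later.
cause = [0.0, 1.0, 3.0, 2.0, 0.5, 0.0, 0.0, 0.0]
effect = [0.0, 0.0, 0.0, 1.0, 3.0, 2.0, 0.5, 0.0]

lag = estimate_delay(cause, effect, max_lag=4)
aligned = synchronize(cause, lag)
```

Once the two sequences are aligned, a fusion step such as the channel-wise fusion mentioned in the abstract could combine `aligned` with `effect`; that step is left out here because the abstract does not specify it.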
Configuration tuning is essential to optimize the performance of systems (e.g., databases, key-value stores). High performance usually indicates high throughput and low *** present, most of the tuning tasks of systems are performed manually (e.g., by database administrators), but it is hard for them to achieve high performance through tuning in various types of systems and in various *** recent years, there have been some studies on tuning traditional database systems, but all these methods have some *** this article, we put forward a tuning system based on attention-based deep reinforcement learning named WATuning, which can adapt to the changes of workload characteristics and optimize the system performance efficiently and ***, we design the core algorithm named ATT-Tune for WATuning to achieve the tuning task of *** algorithm uses workload characteristics to generate a weight matrix and acts on the internal metrics of systems, and then ATT-Tune uses the internal metrics with weight values assigned to select the appropriate ***, WATuning can generate multiple instance models according to the change of the workload so that it can complete targeted recommendation services for different types of ***, WATuning can also dynamically fine-tune itself according to the constantly changing workload in practical applications so that it can better fit the actual environment to make *** experimental results show that, compared with CDBTune, an existing optimal tuning method, WATuning improves throughput by 52.6% and decreases latency by 31%.
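The idea of using workload characteristics to weight internal metrics can be sketched as follows. This is a minimal illustration under stated assumptions, not the ATT-Tune algorithm: it reduces the abstract's weight matrix to a single softmax-normalized weight per metric, and the function `weight_metrics`, the `projection` matrix, and all numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_metrics(workload, projection, metrics):
    """Weight each internal metric by its relevance to the current workload.

    workload:   workload feature vector (assumed here: read and write ratios)
    projection: one row per metric; row i scores how strongly each workload
                feature points at metric i (hypothetical, would be learned)
    metrics:    raw internal metric values, one per row of `projection`
    """
    scores = [sum(p * w for p, w in zip(row, workload)) for row in projection]
    weights = softmax(scores)
    weighted = [w * m for w, m in zip(weights, metrics)]
    return weighted, weights


workload = [0.9, 0.1]                 # read-heavy workload (assumed features)
projection = [[2.0, 0.0],             # metric 0 matters for reads
              [0.0, 2.0],             # metric 1 matters for writes
              [1.0, 1.0]]             # metric 2 matters for both
metrics = [120.0, 30.0, 55.0]         # made-up internal metric values

weighted, weights = weight_metrics(workload, projection, metrics)
```

In a reinforcement learning tuner, a vector like `weighted` would serve as the state passed to the policy that selects a configuration; that policy is outside the scope of this sketch.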
Data-free quantization (DFQ) recovers the performance of a quantized network (Q) without the original data by generating fake samples via a generator (G) that learns from the full-precision network (P). This generation, however, is totally independent of Q and overlooks the adaptability of the knowledge in the generated samples, i.e., whether it is informative to the learning process of Q, which results in an overflow of generalization error. This raises several critical questions: how to measure the sample adaptability to Q under varied bit-width scenarios? Is the largest adaptability the best? How to generate samples with adaptive adaptability to improve Q's generalization? To answer these questions, we propose an Adaptive Data-Free Quantization (AdaDFQ) method, which revisits DFQ from a zero-sum game perspective on the sample adaptability between two players, a generator and a quantized network. Following this viewpoint, we further define the disagreement and agreement samples to form two boundaries, where the margin between the two boundaries is optimized to adaptively regulate the adaptability of generated samples to Q, so as to address the over- and under-fitting issues. AdaDFQ reveals that 1) the largest adaptability is NOT the best for sample generation to benefit Q's generalization; 2) the knowledge of the generated sample should not only be informative to Q, but also be related to the category and distribution information of the training data for P. Theoretical and empirical analyses validate the advantages of AdaDFQ over the state of the art. Our code is available at https://***/hfutqian/AdaDFQ.
We introduce TABLELLM, a robust large language model (LLM) with 8 billion parameters, purpose-built for proficiently handling tabular data manipulation tasks, whether they are embedded within documents or spreadsheets...
Congenital heart disease (CHD) is one of the most common causes of major birth defects, with a prevalence of 1%. Although an increasing number of studies have reported the etiology of CHD, the findings scattered throughout the literature are difficult to retrieve and utilize in research and clinical *** therefore developed CHDbase, an evidence-based knowledgebase of CHD-related genes and clinical manifestations manually curated from 1114 publications, linking 1124 susceptibility genes and 3591 variations to more than 300 CHD types and related *** such as the information of each publication and the selected population and samples, the strategy of studies, and the major findings of studies were integrated with each item of the research *** also integrated functional annotations through parsing 50 databases/tools to facilitate the interpretation of these genes and variations in disease *** further prioritized the significance of these CHD-related genes with a gene interaction network approach and extracted a core CHD sub-network with 163 *** clear genetic landscape of CHD enables the phenotype classification based on the shared genetic ***, CHDbase provides a comprehensive and freely available resource to study CHD susceptibilities, supporting a wide range of users in the scientific and medical *** is accessible at http://***.
Existing low-rank adaptation (LoRA) methods face challenges on sparse large language models (LLMs) due to the inability to maintain sparsity. Recent works introduced methods that maintain sparsity by augmenting LoRA t...
Major depressive disorder (MDD) is one of the most common and severe mental illnesses, posing a huge burden on society and families. Recently, some multimodal methods have been proposed to learn a multimodal embedding for MDD detection and achieved promising performance. However, these methods ignore the heterogeneity/homogeneity among various modalities. Besides, earlier attempts ignore interclass separability and intraclass compactness. Inspired by the above observations, we propose a graph neural network (GNN)-based multimodal fusion strategy named modal-shared modal-specific GNN, which investigates the heterogeneity/homogeneity among various psychophysiological modalities as well as explores the potential relationship between subjects. Specifically, we develop a modal-shared and modal-specific GNN architecture to extract the inter/intramodal characteristics. Furthermore, a reconstruction network is employed to ensure fidelity within the individual modality. Moreover, we impose an attention mechanism on various embeddings to obtain a multimodal compact representation for the subsequent MDD detection task. We conduct extensive experiments on two public depression datasets and the favorable results demonstrate the effectiveness of the proposed algorithm.
Blockchain technology makes it possible to design robust decentralized federated learning (FL). Minimizing the communication cost and storage consumption incurred is one of the essential challenges. In addition, maint...