ISBN: 9798350337488 (print)
With the exponential growth of biomedical knowledge in unstructured text repositories such as PubMed, it is imperative to establish a knowledge-graph-style, efficiently searchable, and targeted database that supports the information retrieval needs of researchers and clinicians. To mine knowledge from graph databases, most previous methods view a triple in a graph (see Fig. 1) as the basic processing unit and embed the triple elements (i.e., drugs/chemicals, proteins/genes, and their interactions) as separate embedding matrices, which cannot capture the semantic correlation among triple elements. To remedy the loss of semantic correlation caused by disjoint embeddings, we propose a novel approach to learn triple embeddings by combining entities and interactions into a unified representation. Furthermore, traditional methods usually learn triple embeddings from scratch and therefore cannot take advantage of the rich domain knowledge embedded in pre-trained models; this is another significant reason why they cannot distinguish the different meanings that the same entity takes on in multi-interaction triples. In this paper, we propose a novel fine-tuning-based approach to learn better triple embeddings by creating weakly supervised signals from pre-trained knowledge graph embeddings. The method automatically samples triples from knowledge graphs and estimates their pairwise similarity from pre-trained embedding models. The triples are then fed pairwise into a Siamese-like neural architecture, where the triple representation is fine-tuned in a manner bootstrapped by the triple similarity scores. Finally, we demonstrate that triple embeddings learned with our method can be readily applied to several downstream applications (e.g., triple classification and triple clustering). We evaluated the proposed method on two open-source drug-protein knowledge graphs constructed from PubMed abstracts, as provided by BioCreative. Our method achieves consistent improvement in both t
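The weak-supervision idea in the abstract above (pairwise similarity from pre-trained embeddings used as a target for a Siamese-style encoder) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the entity names, the concatenation used as the unified triple vector, the one-layer `tanh` encoder, and the squared-error loss are not taken from the paper.

```python
import math
import random

def triple_vector(head, relation, tail):
    """Unified triple representation; plain concatenation is an assumption."""
    return head + relation + tail

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def encode(weights, x):
    """Shared (Siamese) encoder: one linear layer followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def siamese_loss(weights, triple_a, triple_b, target_similarity):
    """Squared error between the encoded-pair similarity and the weak label."""
    predicted = cosine(encode(weights, triple_a), encode(weights, triple_b))
    return (predicted - target_similarity) ** 2

random.seed(0)
dim = 4
# Hypothetical stand-ins for pre-trained knowledge graph embeddings.
names = ["drugA", "drugB", "geneX", "inhibits", "activates"]
pretrained = {n: [random.gauss(0.0, 1.0) for _ in range(dim)] for n in names}

t1 = triple_vector(pretrained["drugA"], pretrained["inhibits"], pretrained["geneX"])
t2 = triple_vector(pretrained["drugB"], pretrained["activates"], pretrained["geneX"])

# Weak label: similarity estimated from the pre-trained embeddings.
weak_label = cosine(t1, t2)

# Randomly initialised encoder weights (8 output units, 3 * dim inputs).
W = [[random.gauss(0.0, 1.0) for _ in range(3 * dim)] for _ in range(8)]
loss = siamese_loss(W, t1, t2, weak_label)
```

In a real setup the encoder would be a fine-tuned pre-trained model and `loss` would be minimised by gradient descent over many sampled triple pairs; the sketch only shows how the target and the loss fit together.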
This article designs and implements a runtime library for general dataflow programming, DFCPP (Luo Q, Huang J, Li J, Du Z. Proceedings of the 52nd International Conference on Parallel Processing Workshops. ACM;2023:145-152.), and builds upon it to design and implement a multi-machine C++ dataflow library, M-DFCPP. In comparison to existing dataflow programming environments, DFCPP features a user-friendly interface and richer expressive capabilities (Luo Q, Huang J, Li J, Du Z. Proceedings of the 52nd International Conference on Parallel Processing Workshops. ACM;2023:145-152.), enabling the representation of various types of dataflow actor tasks (static, dynamic, and conditional tasks). In addition, DFCPP addresses memory management and task scheduling for non-uniform memory access architectures, issues that other dataflow libraries largely neglect. M-DFCPP extends the capability of current dataflow runtime libraries (DFCPP, taskflow, openstream, etc.) to multi-machine computing while keeping its API compatible with DFCPP. M-DFCPP adopts the concepts of master and follower (Dean J, Ghemawat S. Commun ACM. 2008;51(1):107-113;Ghemawat S, Gobioff H, Leung ST. ACM SIGOPS Operating Systems Review. ACM;2003:29-43.), which form a work-sharing framework, as in many multi-machine systems. To shift to the M-DFCPP framework, a filtering layer is inserted into the original DFCPP, transforming it into followers that can cooperate with each other. The master consists of modules for scheduling, data processing, graph partitioning, state management, and so forth. In benchmark tests on workloads whose directed acyclic graph topologies are binary trees and random graphs, DFCPP demonstrated performance improvements of 20% and 8%, respectively, compared to the second fastest library. M-DFCPP consistently exhibits outstanding performance across varying levels of concurrency and task workloads, achieving a maximum speedup of more than 20 over DFCPP, when the task parallelism e
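To make the master/follower dataflow idea above concrete, the following is a minimal sketch in Python rather than C++. It is not the DFCPP or M-DFCPP API: the function `run_dataflow`, the round-robin assignment of ready tasks to followers, and the in-process sequential execution are all assumptions chosen for illustration. It uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) to release tasks once their inputs are available.

```python
from graphlib import TopologicalSorter

def run_dataflow(tasks, deps, num_followers=2):
    """Execute a task DAG in dependency order.

    tasks: task name -> function taking a dict of predecessor results.
    deps:  task name -> set of predecessor task names.
    Returns (results, assignment), where assignment maps each task to the
    hypothetical follower index the "master" would have sent it to.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    results, assignment = {}, {}
    next_follower = 0
    while sorter.is_active():
        # All tasks returned here have their inputs ready and could run in parallel.
        for name in sorter.get_ready():
            assignment[name] = next_follower
            next_follower = (next_follower + 1) % num_followers
            inputs = {d: results[d] for d in deps.get(name, ())}
            results[name] = tasks[name](inputs)  # executed locally in this sketch
            sorter.done(name)
    return results, assignment

# A small binary-tree-shaped workload: four leaves, two inner nodes, one root.
tasks = {
    "a": lambda _: 1,
    "b": lambda _: 2,
    "c": lambda _: 3,
    "d": lambda _: 4,
    "ab": lambda x: x["a"] + x["b"],
    "cd": lambda x: x["c"] + x["d"],
    "root": lambda x: x["ab"] * x["cd"],
}
deps = {
    "a": set(), "b": set(), "c": set(), "d": set(),
    "ab": {"a", "b"}, "cd": {"c", "d"}, "root": {"ab", "cd"},
}
results, assignment = run_dataflow(tasks, deps)
```

A real multi-machine runtime would dispatch the ready tasks to remote followers and move data between them; here the assignment is only recorded so the scheduling order can be inspected.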
With the development of cyberspace security attack and defense, the malware detection model based on machine learning is also facing the threat of adversarial examples. An important way to defend against such threats ...
The interplay between superconductivity and the Kondo effect has stimulated significant interest in condensed matter *** compete when their critical temperatures are close and can give rise to a quantum phase transition that can mimic Majorana zero ***, we have fabricated and measured Al-InSb nanowire quantum dot-Al *** the Kondo regime, a supercurrent-induced zero-bias conductance peak *** zero-bias peak shows an anomalous negative magnetoresistance (NMR) at weak magnetic *** attribute this anomalous NMR to quasiparticle trapping at vortices in the superconductor leads as a weak magnetic field is *** trapping effect lowers the quasiparticle-caused dissipation and thus enhances the Josephson *** work connects the vortex physics and the supercurrent tunneling in Kondo regimes and can help further understand the physics of Josephson quantum dot systems.
Timely plant disease diagnosis can inhibit the spread of the disease and prevent a large-scale drop in production, which benefits food *** detection-based plant disease diagnosis methods have attracted widespread attention due to their accuracy in classifying and locating ***, existing methods are still limited to single crop disease *** importantly, the existing model has a large number of parameters, which is not conducive to deploying it to agricultural mobile ***, reducing the number of model parameters tends to cause a decrease in model *** solve these problems, we propose a plant disease detection method based on knowledge distillation to achieve a lightweight and efficient diagnosis of multiple diseases across multiple *** detail, we design 2 strategies to build 4 different lightweight models as student models: the YOLOR-Light-v1, YOLOR-Light-v2, Mobile-YOLOR-v1, and Mobile-YOLOR-v2 models, and adopt the YOLOR model as the teacher *** develop a multistage knowledge distillation method to improve lightweight model performance, achieving 60.4% mAP@.5 on the PlantDoc dataset with small model parameters, outperforming existing ***, the multistage knowledge distillation technique can make the model lighter while maintaining high *** only that, the technique can be extended to other tasks, such as image classification and image segmentation, to obtain automated plant disease diagnostic models with a wider range of lightweight applicability in smart *** code is available at https://***/QDH/MSKD.
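As background for the teacher/student setup described above, here is a minimal sketch of a generic logit-based distillation loss (temperature-softened KL divergence). This is the classic formulation, swapped in for illustration; it is not the paper's multistage detection-specific method, and the function names, logits, and temperature value are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl

# Hypothetical class logits for one detected region.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
loss_mismatch = distillation_loss(student, teacher)
loss_match = distillation_loss(teacher, teacher)
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge; in practice it is usually combined with the ordinary supervised detection loss.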
Optimizing the morphologies and the controllers that adapt to various tasks is a critical issue in the field of robot design, a.k.a. embodied intelligence. Previous works typically model it as a joint optimization probl...
The relation is a semantic expression relevant to two named entities in a *** a sentence usually contains several named entities, it is essential to learn a structured sentence representation that encodes dependency information specific to the two named *** related work, graph convolutional neural networks are widely adopted to learn semantic dependencies, where a dependency tree initializes the adjacency ***, this approach has two main ***, parsing a sentence heavily relies on external toolkits, which can be ***, the dependency tree only encodes the syntactical structure of a sentence, which may not align with the relational semantic *** this paper, we propose an automatic graph learning method to autonomously learn a sentence’s structural *** of using a fixed adjacency matrix initialized by a dependency tree, we introduce an Adaptive Adjacency Matrix to encode the semantic dependency between *** elements of this matrix are dynamically learned during the training process and optimized by task-relevant learning objectives, enabling the construction of task-relevant semantic dependencies within a *** model demonstrates superior performance on the TACRED and SemEval 2010 datasets, surpassing previous works by 1.3% and 0.8%, *** experimental results show that our model excels in the relation extraction task, outperforming prior models.
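The contrast drawn above, a learned adjacency matrix instead of one fixed by a dependency parse, can be sketched as a single graph-convolution forward pass. The row-wise softmax normalisation, the ReLU activation, and all names and values below are assumptions for illustration; the abstract does not specify the exact parameterisation.

```python
import math

def row_softmax(matrix):
    """Normalise each row of free parameters into non-negative weights summing to 1."""
    out = []
    for row in matrix:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adaptive_graph_layer(adj_logits, features, weight):
    """One layer: ReLU(softmax(A) @ H @ W), where A is a trainable parameter."""
    adjacency = row_softmax(adj_logits)
    aggregated = matmul(adjacency, features)
    projected = matmul(aggregated, weight)
    activated = [[max(0.0, v) for v in row] for row in projected]
    return activated, adjacency

# Three tokens with 2-dimensional features (hypothetical values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
# Adjacency parameters start at zero (uniform attention); training would update them.
adj_logits = [[0.0, 0.0, 0.0] for _ in range(3)]
output, adjacency = adaptive_graph_layer(adj_logits, features, weight)
```

Because `adj_logits` is an ordinary parameter, gradients from the relation-classification loss can reshape the token-to-token dependencies directly, with no external parser involved.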
Modern C++, a programming language characterized by its extensive use of object-oriented programming (OOP) features, is widely used for system programming. However, C++ compilers often struggle to correctly handle these sophisticated OOP features, resulting in numerous high-profile compiler bugs that can lead to crashes or miscompilation. Despite the significance of OOP-related bugs, existing studies largely overlook OOP features, hindering their ability to discover such bugs. To assist both compiler fuzzer designers and compiler developers, we conduct a comprehensive study of the compiler bugs caused by incorrectly handling C++ OOP-related features. First, we systematically extract 788 OOP-related C++ compiler bugs from GCC and LLVM. Second, derived from the core concepts of OOP and C++, we manually identified a two-level taxonomy of the OOP-related features leading to compiler bugs, which consists of 6 primary categories (e.g., Abstraction & Encapsulation, Inheritance, and Runtime Polymorphism), along with 17 secondary categories (e.g., Constructors & Destructors and Multiple Inheritance). Third, we systematically analyze the root causes, symptoms, fixes, options, and C++ standard versions of these bugs. Our analysis yields 13 key findings, highlighting that features related to the construction and destruction of objects lead to the highest number of bugs, crashes are the most frequent symptom, and while the average time from bug introduction to discovery is 1856 days, fixing the bug once discovered takes only 174 days on average. Additionally, more than half of the bugs can be triggered without any compiler options. These findings offer valuable insights not only for developing new compiler testing approaches but also for improving language design and compiler engineering. Inspired by these findings, we developed a proof-of-concept compiler fuzzer OOPFuzz, specifically targeting OOP-related bugs in C++ compilers. We applied it against the newest release versions
Multiple object tracking (MOT) methods based on single object tracking are of great interest because of their ability to balance efficiency and performance on the strength of the localization capability of single-targ...
ISBN: 9798350346558 (print)
Link prediction is one of the essential issues in network science; it aims to find unknown links or estimate future links in networks. Existing methods are mainly based on the assumption that the network data is completely available and has a stable distribution before analysis. In practice, however, complex networks evolve lifelong with massive data. The data in those networks is associated with previous data, and the distribution is non-stationary. Compared with conventional link prediction methods, online link prediction in a large-scale dynamic network has three main challenges: i) how to analyze massive data with acceptable expenditure; ii) how to predict future links with less topological information; iii) how to predict links stably in a dynamically evolving network. In this paper, we propose a streaming link prediction model based on lifelong learning and graph neural networks (GNNs), which converts the link prediction problem to graph classification. Our main idea is to design a new topology, the Feature-Inverse-Graph, which turns node pairs into independent graphs and takes the features of the node pair as new vertices. Additionally, we apply a two-phase sampling sketch to deal with the massive data so that the complexity of the model within lifelong evolving networks remains acceptable. The link prediction tasks in the regular graph are then converted into a series of individual Feature-Inverse-Graph classifications. In this case, the computational cost of our model will not increase dramatically with the increase of network data, which is further verified by analyzing the computational complexity. The experimental results demonstrate the efficiency and effectiveness of our model by continuously predicting future links of classical datasets. In our experiments, several topological link prediction measures are chosen as features of node pairs. For future work, the FIG-LP model can be used as the basic research of intelligent systems su
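The abstract above mentions using topological link prediction measures as node-pair features. A minimal sketch of three widely used measures (common neighbors, Jaccard coefficient, Adamic-Adar) follows. Which measures the paper actually uses is not stated in the text, so this selection, the function name `pair_features`, and the toy graph are assumptions.

```python
import math

def pair_features(adj, u, v):
    """Topological features for the candidate link (u, v).

    adj maps each node to the set of its neighbours (undirected graph).
    """
    common = adj[u] & adj[v]
    union = adj[u] | adj[v]
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Adamic-Adar: shared neighbours with low degree count for more.
        "adamic_adar": sum(1.0 / math.log(len(adj[w]))
                           for w in common if len(adj[w]) > 1),
    }

# Toy graph with edges a-b, a-c, b-c, b-d, c-d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
features_ad = pair_features(adj, "a", "d")
```

In a Feature-Inverse-Graph-style pipeline, values like these would become the vertices of a small per-pair graph that a GNN then classifies as "link" or "no link"; the sketch stops at feature extraction.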