Distributed computing has attracted significant recent attention for speeding up large-scale computations by disseminating computational jobs from a central master node across several worker nodes/servers. However, wo...
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Pre-training has emerged as a dominant paradigm in graph representation learning to address data scarcity and generalization challenges. The majority of existing methods primarily focus on refining fine-tuning and prompting techniques to extract information from pre-trained models. However, the effectiveness of these approaches is contingent upon the quality of the pre-trained knowledge (i.e., latent representations). Inspired by the recent success in topological representation learning, we propose a novel pre-training strategy to capture and learn topological information of graphs. The key to the success of our strategy is to pre-train expressive Graph Neural Networks (GNNs) at the level of individual nodes while accounting for the key topological characteristics of a graph, so that GNNs become sufficiently powerful to effectively encode input graph information. The proposed model is designed to be seamlessly integrated with various downstream graph representation learning tasks. Data and code are available at https://***/pyliang-graph/Topology-Informed-Pre-training-of-Graph-Neural-Networks.
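The abstract does not specify which topological characteristics are used, so the following is only an illustrative sketch of the kind of node-level topological descriptor such a pre-training target could be built from: the local clustering coefficient, computed here from a plain adjacency-list dictionary (the graph and all names are made up for illustration).

```python
# Hedged sketch: one node-level topological descriptor (local clustering
# coefficient) that a topology-informed pre-training objective might use
# as a per-node target. Not the paper's actual method.

def clustering_coefficient(adj, v):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i, u in enumerate(nbrs)
                for w in nbrs[i + 1:] if w in adj[u])
    return 2.0 * links / (k * (k - 1))

# Toy graph: a triangle {0, 1, 2} with a pendant node 3 attached to 0.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
print(clustering_coefficient(adj, 1))  # 1.0: both neighbours of 1 are linked
print(clustering_coefficient(adj, 3))  # 0.0: a degree-1 node has no pairs
```

In a pre-training setup, such descriptors would serve as auxiliary regression targets for per-node GNN outputs, encouraging the encoder to capture local topology.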
Inferring causal relationships as directed acyclic graphs (DAGs) is an important but challenging problem. Differentiable Causal Discovery (DCD) is a promising approach to this problem, framing the search as a continuous optimization. But existing DCD methods are numerically unstable, with poor performance beyond tens of variables. In this paper, we propose Stable Differentiable Causal Discovery (SDCD), a new method that improves previous DCD methods in two ways: (1) It employs an alternative constraint for acyclicity; this constraint is more stable, both theoretically and empirically, and fast to compute. (2) It uses a training procedure tailored for sparse causal graphs, which are common in real-world scenarios. We first derive SDCD and prove its stability and correctness. We then evaluate it with both observational and interventional data and in both small-scale and large-scale settings. We find that SDCD outperforms existing methods in convergence speed and accuracy, and can scale to thousands of variables.
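SDCD's actual acyclicity constraint is not reproduced here; as a hedged sketch of the general DCD ingredient the abstract refers to, the snippet below implements a truncated-series acyclicity penalty of the NOTEARS/DAGMA family, h(W) = tr((I + W∘W/d)^d) − d, which is zero exactly when the weighted adjacency matrix W encodes no directed cycle. All matrices are toy examples.

```python
# Hedged sketch: a differentiable acyclicity penalty in the spirit of
# differentiable causal discovery. SDCD uses a different (more stable)
# constraint; this polynomial variant only illustrates the idea that
# acyclicity can be expressed as a smooth scalar penalty.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def acyclicity_penalty(W):
    d = len(W)
    A = [[W[i][j] ** 2 for j in range(d)] for i in range(d)]      # A = W ∘ W
    M = [[(1.0 if i == j else 0.0) + A[i][j] / d for j in range(d)]
         for i in range(d)]                                        # I + A/d
    P = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for _ in range(d):                                             # (I + A/d)^d
        P = matmul(P, M)
    return sum(P[i][i] for i in range(d)) - d                      # trace - d

dag    = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # 1 -> 2 -> 3: acyclic
cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # 1 -> 2 -> 3 -> 1: a cycle
print(acyclicity_penalty(dag))     # 0.0 for any DAG
print(acyclicity_penalty(cyclic))  # strictly positive
```

In a DCD method this scalar is added to the data-fit loss (e.g., via an augmented Lagrangian), so gradient descent is pushed toward acyclic solutions.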
Solving partial differential equations (PDEs) is a common task in numerical mathematics and scientific computing. Typical discretization schemes, for example, finite element (FE), finite volume (FV), or finite differe...
This paper studies the big data ecosystem and regulatory approaches to the technological issues arising in Kazakhstan. Key points are considered in terms of the importance and necessity of managing data at various level...
ISBN (digital): 9798350382846
ISBN (print): 9798350382853
In this paper we consider recovering combinatorial objects from many noisy observations. The first part of the paper concerns reconstructing trees from traces in the tree edit distance model. Previous work focused on reconstructing various classes of labelled trees, while our work gives reductions from the classic string reconstruction setting to unlabelled trees. In the second part of the paper we discuss combinatorial identities of the binary deletion channel on finite and infinite strings. We link probabilities of observing bits in a trace to derivatives of certain generating functions. We also give identities for the deletion channel, conditioned on traces having fixed length.
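The paper's generating-function identities concern exact trace probabilities; as a hedged illustration of the channel model itself, the sketch below simulates the binary deletion channel: each bit of the source string is deleted independently with probability q, and a "trace" is the concatenation of the surviving bits. The source string and deletion rate are made up for illustration.

```python
import random

# Hedged sketch: sampling traces from the binary deletion channel.
# This toy only produces empirical samples; it does not compute the
# exact probabilities the paper's identities describe.

def deletion_channel(s, q, rng):
    """Delete each bit of s independently with probability q."""
    return "".join(b for b in s if rng.random() >= q)

rng = random.Random(0)
source = "1011001"
traces = [deletion_channel(source, 0.3, rng) for _ in range(5)]
print(traces)  # each trace is a subsequence of the source
```

Averaging indicator statistics over many such traces is the empirical counterpart of the probabilities the paper ties to derivatives of generating functions.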
ISBN (digital): 9798331525439
ISBN (print): 9798331525446
The stock market has emerged as a revolutionary force, transforming global economies by democratizing wealth creation. With the use of Artificial Intelligence and Machine Learning, investors can navigate complicated financial terrain. Yet even as these tools improve predictive accuracy, the underlying complexities and varied market opinions can make effective decision-making difficult. This study emphasizes trend analysis and prediction of future values, enabling users to make smart investment decisions with little effort. Using a hybrid machine learning model, the method brings together several top-performing algorithms to improve predictive power and trend observation. The most important components of the project are Market Data Retrieval, Time Series Forecasting, Sentiment Analysis, and Data Visualization, all of which assist in creating thorough reports that aid strategic investment decisions.
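The abstract does not detail the hybrid model's components, so the following only sketches the simplest trend signal such a time-series pipeline might include: a moving-average crossover over closing prices. All price values are made up for illustration.

```python
# Hedged sketch: a moving-average crossover trend signal, one of the
# simplest building blocks a stock trend-analysis pipeline could use.
# Not the paper's hybrid model.

def moving_average(prices, window):
    """Simple moving average; returns one value per full window."""
    return [sum(prices[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(prices))]

closes = [10, 11, 12, 11, 13, 14, 15, 14, 16, 17]
short = moving_average(closes, 2)
long_ = moving_average(closes, 4)
# An upward trend is signalled when the short average sits above the long one.
print(short[-1], long_[-1])  # 16.5 15.5
```

A hybrid system would combine such signals with forecasts and sentiment scores before reporting a recommendation.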
We study distributed goodness-of-fit testing for discrete distributions under bandwidth and differential privacy constraints. Information-constrained distributed goodness-of-fit testing is a problem that has received co...
ISBN (digital): 9798350390971
ISBN (print): 9798350390988
This survey explores the synergistic potential of Large Language Models (LLMs) and Vector databases (VecDBs), a burgeoning but rapidly evolving research area. With the proliferation of LLMs comes a host of challenges, including hallucinations, outdated knowledge, prohibitive commercial application costs, and memory issues. VecDBs emerge as a compelling solution to these issues by offering an efficient means to store, retrieve, and manage the high-dimensional vector representations intrinsic to LLM operations. Through this nuanced review, we delineate the foundational principles of LLMs and VecDBs and critically analyze their integration’s impact on enhancing LLM functionalities. This discourse extends into a discussion on the speculative future developments in this domain, aiming to catalyze further research into optimizing the confluence of LLMs and VecDBs for advanced data handling and knowledge extraction capabilities.
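The retrieval primitive a vector database offers an LLM pipeline can be sketched in a few lines: nearest-neighbour search over embeddings by cosine similarity. Real VecDBs use approximate indexes (e.g., HNSW or IVF) rather than the brute-force scan below, and the document IDs and embedding vectors here are made up for illustration.

```python
import math

# Hedged sketch: brute-force cosine-similarity retrieval, illustrating
# the store/retrieve interface a vector database provides to an LLM
# pipeline. Production systems replace the linear scan with an ANN index.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def top_k(query, store, k=1):
    """store: list of (doc_id, vector); returns ids ranked by similarity."""
    ranked = sorted(store, key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

store = [("doc_a", [1.0, 0.0, 0.0]),
         ("doc_b", [0.0, 1.0, 0.0]),
         ("doc_c", [0.7, 0.7, 0.0])]
print(top_k([0.9, 0.1, 0.0], store, k=2))  # ['doc_a', 'doc_c']
```

In a retrieval-augmented setup, the returned documents are injected into the LLM's prompt, which is how VecDBs mitigate hallucination and stale knowledge.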
Even though dropout is a popular regularization technique, its theoretical properties are not fully understood. In this paper we study dropout regularization in extended generalized linear models based on double expon...