ISBN: (Print) 9798331314385
Hierarchical clustering has usually been addressed by discrete optimization using heuristics or by continuous optimization of relaxed scores for hierarchies. In this work, we propose to optimize expected scores under a probabilistic model over hierarchies. (1) We show theoretically that the globally optimal values of the expected Dasgupta cost and Tree-Sampling Divergence (TSD), two unsupervised metrics for hierarchical clustering, are equal to the optimal values of their discrete counterparts, in contrast to some relaxed scores. (2) We propose Expected Probabilistic Hierarchies (EPH), a probabilistic model that learns hierarchies in data by optimizing expected scores. EPH uses differentiable hierarchy sampling, enabling end-to-end gradient-based optimization, and an unbiased subgraph-sampling approach to scale to large datasets. (3) We evaluate EPH on synthetic and real-world datasets, including vector and graph datasets. EPH outperforms all other approaches quantitatively and provides meaningful hierarchies in qualitative evaluations.
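The abstract centers on computing expected scores under a distribution over hierarchies. As a rough illustration of the expected Dasgupta cost alone (a hedged sketch, not the authors' EPH code, and without the differentiable hierarchy sampling or subgraph sampling they describe), one can Monte Carlo average the discrete cost over sampled trees; the helper names, the toy two-tree sampler, and the similarity values below are all illustrative assumptions.

```python
# Hedged sketch, not the authors' EPH implementation: Monte Carlo estimate of
# the expected Dasgupta cost under a toy distribution over hierarchies.
import random

def dasgupta_cost(parent, sim):
    """Dasgupta cost of one hierarchy.

    parent: dict node -> parent (root maps to None); leaves appear in `sim`.
    sim:    dict {(i, j): similarity} over unordered leaf pairs.
    """
    leaves = {x for pair in sim for x in pair}
    below = {}                      # node -> set of leaves in its subtree
    for leaf in leaves:
        node = leaf
        while node is not None:
            below.setdefault(node, set()).add(leaf)
            node = parent[node]
    cost = 0.0
    for (i, j), w in sim.items():
        # the LCA subtree is the smallest subtree containing both leaves
        cost += w * min(len(s) for s in below.values() if i in s and j in s)
    return cost

def expected_cost(sample_tree, sim, n_samples=1000):
    """Unbiased Monte Carlo estimate of the expected Dasgupta cost."""
    return sum(dasgupta_cost(sample_tree(), sim) for _ in range(n_samples)) / n_samples

# toy model: a coin flip between two hierarchies over leaves a, b, c, d
T1 = {'a': 'u', 'b': 'u', 'c': 'v', 'd': 'v', 'u': 'r', 'v': 'r', 'r': None}
T2 = {'a': 'u', 'c': 'u', 'b': 'v', 'd': 'v', 'u': 'r', 'v': 'r', 'r': None}
sim = {('a', 'b'): 1.0, ('c', 'd'): 1.0, ('a', 'c'): 0.1,
       ('a', 'd'): 0.1, ('b', 'c'): 0.1, ('b', 'd'): 0.1}
print(expected_cost(lambda: random.choice([T1, T2]), sim))
```

In EPH the expectation is optimized by gradient descent rather than merely estimated, so the sampler above stands in for the paper's differentiable hierarchy sampling.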
This paper presents a method for retinal disease classification using optical coherence tomography (OCT) scans, specifically addressing the challenge of variable B-scan density across dataset volumes. Deep learning me...
In autonomous driving, understanding the surroundings is crucial for safety. Since most object detection systems are designed to identify known objects, they may miss unknown or novel objects, which can be dangerous. ...
For the solutions Φ(z) of functional equations Φ(z) = P(z) + Φ(Q(z)), we derive complete asymptotics of the power series coefficients. As an application, we significantly improve an asymptotic of the number of 2,3-tre...
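For orientation, iterating the functional equation formally unrolls Φ into a sum over compositions of Q; the convergence condition stated here (iterates of Q tending to 0 with P(0) = 0, Φ(0) = 0) is my assumption for the sketch, not a statement taken from the paper.

```latex
% Formal unrolling of Phi(z) = P(z) + Phi(Q(z)), assuming Q^{∘k}(z) -> 0,
% P(0) = 0 and Phi(0) = 0 so that the series converges.
\Phi(z) = P(z) + \Phi\bigl(Q(z)\bigr)
        = \sum_{k=0}^{n-1} P\bigl(Q^{\circ k}(z)\bigr) + \Phi\bigl(Q^{\circ n}(z)\bigr)
        \;\longrightarrow\; \sum_{k \ge 0} P\bigl(Q^{\circ k}(z)\bigr),
\qquad Q^{\circ 0}(z) = z,\; Q^{\circ k} = Q \circ Q^{\circ (k-1)}.
```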
Road infrastructure safety and maintenance have received more attention recently due to the significant influence they have on traffic flow and road user safety. Potholes are one common kind of road defect that seri...
It is known that the left tail asymptotic for supercritical branching processes in the Schröder case satisfies a power law multiplied by some multiplicatively periodic function. We provide an explicit expression ...
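Written schematically, the tail behaviour described in this abstract has the following shape; the notation (limit variable W of Z_n/m^n, offspring mean m > 1, exponent α, and the multiplicative period) is my assumption for illustration and is not taken from the paper.

```latex
% Schematic left-tail form: a power law times a multiplicatively periodic factor.
% Assumed notation: W = lim Z_n / m^n, m > 1 the offspring mean, V(mx) = V(x).
\mathbb{P}(W \le x) \;\sim\; x^{\alpha}\, V(x), \qquad x \to 0^{+},
\qquad V(mx) = V(x).
```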
Temporal difference (TD) learning algorithms with neural network function parameterization have well-established empirical success in many practical large-scale reinforcement learning tasks. However, theoretical understanding of these algorithms remains challenging due to the nonlinearity of the action-value approximation. In this paper, we develop an improved non-asymptotic analysis of the neural TD method with a general L-layer neural network. New proof techniques are developed and an improved new Õ(ϵ⁻¹) sample complexity is derived. To the best of our knowledge, this is the first finite-time analysis of neural TD that achieves an Õ(ϵ⁻¹) complexity under Markovian sampling, as opposed to the best known Õ(ϵ⁻²) complexity in the existing literature. Copyright 2024 by the author(s)
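For readers unfamiliar with the algorithm being analyzed, a semi-gradient neural TD(0) update on a Markovian stream of transitions looks roughly as follows. This is a hedged sketch only: the paper analyzes a general L-layer network in finite time, whereas the architecture, toy environment, and hyperparameters below are arbitrary assumptions.

```python
# Hedged sketch of semi-gradient neural TD(0) for action values.
# Network size, environment dynamics, and hyperparameters are illustrative.
import torch
import torch.nn as nn

state_dim, n_actions, gamma, lr = 4, 2, 0.99, 1e-3

q_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
opt = torch.optim.SGD(q_net.parameters(), lr=lr)

def td_step(s, a, r, s_next, a_next):
    """One TD(0) update of Q(s, a) toward r + gamma * Q(s', a')."""
    q_sa = q_net(s)[a]
    with torch.no_grad():                     # semi-gradient: no grad through target
        target = r + gamma * q_net(s_next)[a_next]
    loss = 0.5 * (q_sa - target) ** 2
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# toy Markovian stream: successive states are correlated, not i.i.d.
s = torch.randn(state_dim)
for t in range(1000):
    a = torch.randint(n_actions, ()).item()
    s_next = 0.9 * s + 0.1 * torch.randn(state_dim)
    r = s.sum().item()
    a_next = torch.randint(n_actions, ()).item()
    td_step(s, a, r, s_next, a_next)
    s = s_next
```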
Bayesian Neural Networks (BNNs) offer probability distributions for model parameters, enabling uncertainty quantification in predictions. However, they often underperform compared to deterministic neural networks. Uti...
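The uncertainty quantification mentioned in the visible part of this abstract is typically obtained by averaging predictions over sampled parameters. Written out generically (this is a standard formulation, not this paper's specific method; q(θ) as an approximate posterior and the sample count S are my assumptions):

```latex
% Monte Carlo predictive distribution of a BNN with S parameter samples.
p(y \mid x, \mathcal{D}) \;\approx\; \frac{1}{S} \sum_{s=1}^{S} p\bigl(y \mid x, \theta_s\bigr),
\qquad \theta_s \sim q(\theta) \approx p(\theta \mid \mathcal{D}).
```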
The Neural Tangent Kernel (NTK) viewpoint is widely employed to analyze the training dynamics of overparameterized Physics-Informed Neural Networks (PINNs). However, unlike the case of linear Partial Differential Equa...
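As background on the NTK viewpoint this abstract refers to (a standard linearization argument, not this paper's contribution; the squared-loss training setup and notation here are my assumptions), gradient flow drives the network outputs according to the tangent kernel:

```latex
% NTK and linearized output dynamics under gradient flow on
% L(theta) = (1/2) * sum_j (f(x_j; theta) - y_j)^2  (assumed loss).
\Theta_{\theta}(x, x') = \nabla_\theta f(x;\theta)^{\top} \nabla_\theta f(x';\theta),
\qquad
\frac{\mathrm{d}}{\mathrm{d}t} f(x_i;\theta_t)
  = -\sum_{j} \Theta_{\theta_t}(x_i, x_j)\,\bigl(f(x_j;\theta_t) - y_j\bigr).
```

For PINNs the outputs f are replaced by PDE residuals and boundary terms, which is what makes the nonlinear-PDE case studied here harder than the linear one.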