Diffusion models are powerful generative models, and this capability can also be applied to discrimination. The inner activations of a pre-trained diffusion model can serve as features for discriminative tasks, namely...
Graph learning is a prevalent field that operates on ubiquitous graph data. Effective graph learning methods can extract valuable information from graphs. However, these methods are non-robust and affected by missing ...
Diffusion models are initially designed for image generation. Recent research shows that the internal signals within their backbones, named activations, can also serve as dense features for various discriminative task...
Introduction: Point clouds obtained from capture devices or 3D reconstruction techniques are often noisy and interfere with downstream ***: The paper aims to recover the underlying surface of noisy point ***: We desig...
Data parallelism is widely used to train on large datasets in distributed deep learning clusters, but it suffers from costly global parameter updates at batch barriers. Performance imbalance among worker instances, introduced by uneven workload partitioning or biased resource allocation, can produce straggling workers, which severely affect both training speed and result accuracy. This paper studies the issue, focusing on the tradeoff between training speed and result accuracy. We propose Cooperate Grouping Parallel (CGP), a hybrid parameter update solution that allows the flexibility of both synchronous and asynchronous update schemes. We introduce a novel Cooperate Worker Grouping Problem (CWGP) that seeks a task grouping configuration that maximizes model accuracy while holding customized training speed guarantees. We propose an evolution-based Pareto local search algorithm to compute efficient worker grouping configurations. Comprehensive evaluation results demonstrate the effectiveness of CGP under both persistent and fluctuating imbalances. The proposed method alleviates the impact of imbalance without introducing extra adjustment overheads.
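The hybrid update idea in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that workers inside one group average their gradients synchronously while groups push updates to the shared parameters independently of each other; the abstract does not specify this split. The group names, gradient values, and helper functions are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def group_average(gradients: List[List[float]]) -> List[float]:
    """Synchronous step: average the gradients of all workers in one group."""
    n = len(gradients)
    dim = len(gradients[0])
    return [sum(g[i] for g in gradients) / n for i in range(dim)]


def apply_update(params: List[float], grad: List[float], lr: float) -> List[float]:
    """Asynchronous step: a group applies its averaged gradient when it is ready."""
    return [p - lr * g for p, g in zip(params, grad)]


# Hypothetical grouping: workers 0 and 1 are fast, worker 2 is a straggler.
groups: Dict[str, List[int]] = {"fast": [0, 1], "slow": [2]}
worker_grads: Dict[int, List[float]] = {
    0: [1.0, 2.0],
    1: [3.0, 4.0],
    2: [10.0, 10.0],
}

params = [0.0, 0.0]
# The fast group finishes first and updates without waiting for the slow group;
# the arrival order is simulated here by the order of this list.
for name in ["fast", "slow"]:
    avg = group_average([worker_grads[w] for w in groups[name]])
    params = apply_update(params, avg, lr=0.1)
```

In this toy setup the straggler only delays its own group, so the barrier cost is paid per group rather than globally. How groups are actually chosen is the subject of the CWGP formulation and is not modeled here.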
In online insurance, one of the central challenges is the cold-starting of new insurance products, for which there are no previous samples to refer to. Previous studies have mainly focused on improving the prediction accuracy of new items, but they have failed to consider the revenue generated by existing items. As a result, the total revenue may suffer a loss even if new items get conversions. To address this issue, we propose RACRec, a Revenue-Aware Cold-start Recommendation framework for online insurance. Unlike previous works, RACRec uses a revenue-based objective function to ensure profitability when cold-starting new items. With this dedicated objective, RACRec orchestrates the cold-start and warm-start models by predicting their expected revenue from each user, thereby preserving the total profit. To predict revenue accurately, a double-ranking scheme is designed for the warm-start model to mitigate position bias, while an item embedding alignment algorithm is proposed for the cold-start model to learn the revenue of new items from similar old items. Furthermore, a reinforced orchestrate update scheme is designed to eliminate the impact of sparse conversions and continuously update revenue estimation. Extensive offline and online experiments have been conducted to validate the effectiveness of RACRec.
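The orchestration step described above, choosing between the cold-start and warm-start models by expected revenue, can be sketched as follows. This is only an illustrative reading of the abstract: it assumes expected revenue is the product of a predicted conversion probability and a per-conversion revenue. The `Candidate` class, the `orchestrate` function, and all numbers are hypothetical and are not RACRec's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    item_id: str
    conversion_prob: float  # predicted probability that the user converts
    revenue: float          # revenue earned if the conversion happens

    @property
    def expected_revenue(self) -> float:
        # Assumed definition: probability of conversion times revenue.
        return self.conversion_prob * self.revenue


def orchestrate(cold: Candidate, warm: Candidate) -> Candidate:
    """Return the candidate with the higher expected revenue (ties go to warm)."""
    return cold if cold.expected_revenue > warm.expected_revenue else warm


# Hypothetical candidates proposed for one user by the two models.
cold = Candidate("new_policy", conversion_prob=0.02, revenue=500.0)
warm = Candidate("old_policy", conversion_prob=0.05, revenue=120.0)
chosen = orchestrate(cold, warm)
```

Here the new item wins despite its lower conversion probability, because its expected revenue (about 10) exceeds that of the existing item (about 6). A rule that looked at conversion probability alone would have picked the existing item.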
As an important post-translational modification, glutarylation plays a crucial role in a variety of cellular functions. Recently, diverse computational methods for glutarylation site identification have been proposed. However, the class imbalance problem due to data noise and uncertainty of non-glutarylation sites remains a great challenge. In this article, we propose a novel semi-supervised learning algorithm, called WGAN-GP_Glu, for identifying reliable non-glutarylation lysine sites from those without glutarylation annotation. WGAN-GP_Glu is a multi-module framework that mainly includes a reliable negative sample selection module, a deep feature extraction module, and a glutarylation site prediction module. In the reliable negative sample selection module, we design an improved variant of Wasserstein GAN with Gradient Penalty (WGAN-GP), named ReliableWGAN-GP, which consists of three parts: two generators, G1 and G2, and a discriminator D. It can select reliable non-glutarylation samples from a large number of unlabeled samples. Generator G1 is used to generate noise data from unlabeled samples. For generator G2, both the positive samples and the noise data are used as inputs to improve the discriminant capability of discriminator D. Then, a convolutional neural network and a bidirectional long short-term memory network combined with an attention mechanism are used to extract deep features for glutarylation samples and reliable non-glutarylation samples. Finally, a glutarylation site prediction module based on three fully connected layers is designed to make class predictions for samples. The sensitivity, specificity, accuracy and Matthews correlation coefficient of WGAN-GP_Glu on the independent test data set reach 90.58%, 95.82%, 94.44% and 0.8645, respectively, which surpass the existing methods for glutarylation site prediction. Therefore, WGAN-GP_Glu can serve as a powerful tool in identifying glutarylation sites and the ReliableWGAN-GP algo...
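For readers unfamiliar with the four evaluation metrics reported in the abstract above, the following sketch shows how sensitivity, specificity, accuracy, and the Matthews correlation coefficient are computed from a binary confusion matrix. The counts are made up for illustration and are not the paper's data; the function name `binary_metrics` is likewise hypothetical.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Illustrative counts only (not from the paper).
m = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix symmetrically, which is why it is often reported for class-imbalanced problems such as site prediction.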
A federated recommendation system usually trains a global model on the server without direct access to users' private data on their own devices. However, this separation of the recommendation model and users' pr...
Deep learning often requires large amounts of data from different institutions. Federated learning, as a distributed training framework, enables multiple participants to collaboratively train models without collecting...
This paper focuses on an elastic dislocation problem that is motivated by applications in the geophysical and seismological communities. In our model, the displacement satisfies the Lamé system in a bounded domai...