Diffusion models are powerful generative models, and this capability can also be applied to discrimination. The inner activations of a pre-trained diffusion model can serve as features for discriminative tasks, namely...
ISBN:
(纸本)9798331314385
Diffusion models are powerful generative models, and this capability can also be applied to discrimination. The inner activations of a pre-trained diffusion model can serve as features for discriminative tasks, namely, diffusion feature. We discover that diffusion feature has been hindered by a hidden yet universal phenomenon that we call content shift. To be specific, there are content differences between features and the input image, such as the exact shape of a certain object. We locate the cause of content shift as one inherent characteristic of diffusion models, which suggests the broad existence of this phenomenon in diffusion feature. Further empirical study also indicates that its negative impact is not negligible even when content shift is not visually perceivable. Hence, we propose to suppress content shift to enhance the overall quality of diffusion features. Specifically, content shift is related to the information drift during the process of recovering an image from the noisy input, pointing out the possibility of turning off-the-shelf generation techniques into tools for content shift suppression. We further propose a practical guideline named GATE to efficiently evaluate the potential benefit of a technique and provide an implementation of our methodology. Despite the simplicity, the proposed approach has achieved superior results on various tasks and datasets, validating its potential as a generic booster for diffusion features. Our code is available at https://***/Darkbblue/diffusion-content-shift.
Multi-agent reinforcement learning (MARL) algorithms have achieved great breakthroughs in many aspects. The MARL algorithms can learn effective policies in ideal simulation environments. But different from the ideal s...
详细信息
Deep learning methods have shown significant performance in medical image analysis tasks. However, they generally act like ”black box” without explanations in both feature extraction and decision processes, leading ...
详细信息
ISBN:
(数字)9781665468190
ISBN:
(纸本)9781665468206
Deep learning methods have shown significant performance in medical image analysis tasks. However, they generally act like ”black box” without explanations in both feature extraction and decision processes, leading to lack of clinical insights and high risk assessments. To aid deep learning in envisioning diseases with visual clues, we propose Representation Group-Disentangling Network (RGD-Net), which can completely disentangle feature space of input X-ray images into several independent feature groups, each corresponding to a specific disease. Taking several semantically related and labeled X-ray images as input, RGD-Net firstly extracts completely group-disentangled representations of diseases through Group-Disentangle Module, which applies group-swap and linking operations to construct latent space by enforcing semantic consistency of attributes. To prevent learning degenerate representations defined as shortcut problem, we further introduce adversarial constricts on mapping from features to diseases, thus avoiding model collapse with former free-form disentanglement. Experiments on chestxray-14 and ChestXpert datasets demonstrate that RGD-Net are effective in predicting diseases with remarkable advantages, which leverage potential factors contributing to different diseases, thus enhancing interpretability in working patterns of deep learning methods.
To prevent JavaScript developers from reinventing wheels, npm ecosystem provides numerous third-party libraries for developers to realize relevant functionalities. Npm displays the tags provided by the creators for th...
详细信息
ISBN:
(纸本)9781665437851
To prevent JavaScript developers from reinventing wheels, npm ecosystem provides numerous third-party libraries for developers to realize relevant functionalities. Npm displays the tags provided by the creators for these packages to help developers find suitable ones. However, not all creators have the habit of tagging their packages, and thus npm cannot provide tag information of a lot of packages for developers to help them understand the package functionalities effectively. Considering that many tags are unrelated to the functionality of packages, we propose a method to find out the tags that are important to distinguish the functionality categories of packages and assign them to untagged packages for assisting developers in the process of retrieving the packages. Firstly, we analyze the attribute of existing tags in npm to establish category tags (functionality categories). Then, we further mine the readme of tagged packages to generate keywords for each category tag. Finally, our method identifies category tags for untagged packages by measuring the similarity between their readme and the keywords of category tags. The evaluation demonstrates that our approach has a good performance in assigning category tags to untagged packages.
With the boosting of neural networks, recommendation methods become significantly improved by their powerful ability of prediction and inference. Existing neural-network based recommender systems (NN-RSs) usually firs...
详细信息
With the boosting of neural networks, recommendation methods become significantly improved by their powerful ability of prediction and inference. Existing neural-network based recommender systems (NN-RSs) usually first employ matrix embedding (ME) as a pre-process to learn users’ and items’ representations (latent vectors), then input these representations to a specific modified neural network framework to make accurate Top-k recommendations. Obviously, the performance of ME has a significant effect on RS models. However, most NN-RSs focus on accuracy by building representations from the direct user-item interactions (e.g., user-item rating matrix), while ignoring the underlying relatedness between users and items (e.g., users who rate the same ratings for the same items should be embedded into similar representations), which is an ideological disadvantage. On the other hand, ME models directly employ inner products as a default loss function metric that cannot project users and items into a proper latent space, which is a methodological disadvantage. In this paper, we propose a supervised collaborative representation learning model - Magnetic Metric Learning (MML) - to map users and items into a unified latent vector space, enhancing the representation learning for NN-RSs. Firstly, MML utilizes dual triplets to model not only the observed relationships between users and items, but also the underlying relationships between users as well as items to overcome the ideological disadvantage. Specifically, a modified metric-based dual loss function is proposed in MML to gather similar entities and disperse the dissimilar ones. With MML, we can easily compare all the relationships (user to user, item to item, user to item) according to the weighted metric, which overcomes the methodological disadvantage. We conduct extensive experiments on four real-world datasets with large item space. The results demonstrate that MML can learn a proper unified latent space for representa
Unmanned aerial vehicles (UAVs) have gained considerable attention as a platform for establishing aerial wireless networks and communications. However, the line-of-sight dominance in air-to-ground communications often...
详细信息
Partial Multi-label Learning (PML) aims to induce the multi-label predictor from datasets with noisy supervision, where each training instance is associated with several candidate labels but only partially valid. To a...
详细信息
A process of gene expression which is designed and modeled by machine language, and by using to process complex data. At the molecular biological level, classifying and sorting the related substances involved in the t...
详细信息
With the prevalence of online review websites, large-scale data promote the necessity of focused analysis. This task aims to capture the information that is highly relevant to a specific aspect. However, the broad sco...
详细信息
The proliferation of fake news on online social media has severely misled public perception of event authenticity. To combat this, various Fake News Detection (FND) methods have been developed for specific domains, ty...
详细信息
暂无评论