Bug fixing holds significant importance in software development and maintenance. Recent research has made substantial strides in exploring the potential of large language models (LLMs) for automatically resolving soft...
Automatic Music Transcription (AMT) entails creating an algorithm that converts an acoustic signal from an audio file into the corresponding sheet music representation. This paper uses deep learning met...
ISBN (print): 9798350323481
We present DeepSAT, a novel end-to-end learning framework for the Boolean satisfiability (SAT) problem. Unlike existing solutions trained on random SAT instances with relatively weak supervision, we propose applying the knowledge of the well-developed electronic design automation (EDA) field to SAT solving. Specifically, we first use logic synthesis algorithms to pre-process SAT instances into optimized and-inverter graphs (AIGs). Doing so dramatically reduces the distribution diversity among SAT instances, which improves the generalization capability of the learned model. Next, we regard the distribution of SAT solutions as a product of conditional Bernoulli distributions. Based on this observation, we approximate the SAT solving procedure with a conditional generative model, leveraging a novel directed acyclic graph neural network (DAGNN) with two polarity prototypes for conditional SAT modeling. To train the generative model effectively, we use logic simulation tools to obtain the probabilities of nodes in the AIG being logic '1' as rich supervision. We conduct comprehensive experiments on various SAT problems. Our results show that DeepSAT achieves significant accuracy improvements over state-of-the-art learning-based SAT solutions, especially when generalized to SAT instances that are relatively large or have diverse distributions.
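The supervision signal described above (the probability of each AIG node being logic '1') can be illustrated with a minimal sketch. The AIG encoding, the function name `simulate_probabilities`, and the use of uniformly random input patterns are all assumptions made for illustration; the abstract does not specify how the simulation is configured or whether the probabilities are conditioned on the circuit output.

```python
import random

# Hypothetical toy AIG encoding (not taken from the paper):
# node ids 0..n_inputs-1 are primary inputs; each AND node is a tuple
# (fanin0, invert0, fanin1, invert1) and is listed in topological order.
def simulate_probabilities(n_inputs, and_nodes, n_patterns=10000, seed=0):
    """Estimate P(node == 1) for every node by random logic simulation."""
    rng = random.Random(seed)
    ones = [0] * (n_inputs + len(and_nodes))
    for _ in range(n_patterns):
        # Assumption: primary inputs are independent and uniformly random.
        values = [rng.random() < 0.5 for _ in range(n_inputs)]
        for a, inv_a, b, inv_b in and_nodes:
            # XOR with the invert flag models an inverter on the edge.
            values.append((values[a] ^ inv_a) and (values[b] ^ inv_b))
        for i, v in enumerate(values):
            ones[i] += v
    return [count / n_patterns for count in ones]

# Example: a single AND gate over two inputs should be '1' about 25% of the time.
probs = simulate_probabilities(2, [(0, False, 1, False)])
```

Random simulation is used here only because it scales to circuits where exhaustive enumeration of input patterns is infeasible; for a handful of inputs, enumerating all patterns would give exact values.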
ISBN (digital): 9798350368369
ISBN (print): 9798350368376
This paper proposes ETC-MTCTR, a novel method designed to enable more accurate, versatile, and efficient traffic classification for multi-scenario, low-resource encrypted traffic. The method consists of three modules: Datagram Token conversion, pretraining, and fine-tuning. It pretrains on large-scale unlabeled encrypted traffic to learn the traffic context and transmission relationships relevant to encrypted traffic classification tasks, so that a small number of labeled samples can be used effectively in the fine-tuning stage. This significantly improves the performance of the model on specific downstream classification tasks, enhances its accuracy, adaptability, and robustness under diverse environments, limited resources, and new encryption security protocols, and achieves efficient encrypted traffic classification in multi-scenario, low-resource settings. The results show that ETC-MTCTR achieves the best performance on three tasks: encrypted malware classification, VPN encrypted traffic classification, and TLS 1.3 encrypted application classification. Its F1 score improves by 0.22% on encrypted malware classification, 1.4% on VPN encrypted traffic App classification, 4.56% on VPN encrypted traffic Service classification, and 9.89% on TLS 1.3 encrypted application classification, significantly outperforming the comparison methods.
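The abstract does not say how the Datagram Token conversion module turns raw bytes into tokens. As a rough illustration only, the sketch below uses one plausible scheme, overlapping two-byte hexadecimal tokens; the function name `datagram_to_tokens`, the bigram scheme, and the `max_tokens` cutoff are assumptions, not details confirmed by the source.

```python
def datagram_to_tokens(payload: bytes, max_tokens: int = 128) -> list:
    """Convert raw datagram bytes into overlapping two-byte hex tokens.

    Hypothetical scheme for illustration: the tokenization actually used
    by ETC-MTCTR is not described in the abstract.
    """
    hex_bytes = [f"{b:02x}" for b in payload]
    # Each token joins a byte with its successor, so adjacent tokens overlap.
    tokens = [hex_bytes[i] + hex_bytes[i + 1] for i in range(len(hex_bytes) - 1)]
    # Truncate so the sequence fits a fixed-length model input.
    return tokens[:max_tokens]

# Example with four arbitrary bytes.
tokens = datagram_to_tokens(bytes([0x16, 0x03, 0x03, 0x00]))
```

A token sequence like this could then be fed to a pretrained sequence model in the same way as words in text, which is the general idea behind pretraining on unlabeled traffic and fine-tuning on a few labeled samples.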
Generative Adversarial Network (GAN) inversion has demonstrated excellent performance in image inpainting, which aims to restore lost or damaged image texture using the unmasked content. Previous GAN inversion-based methods usually use well-trained GAN models as effective priors to generate realistic regions for missing holes. Despite their excellence, they ignore a hard constraint that the unmasked regions in the input and the output should be the same, resulting in a gap between GAN inversion and image inpainting and thus degrading performance. Besides, existing GAN inversion approaches often consider a single modality of the input image, neglecting other auxiliary cues in images that could bring improvements. To address these problems, we propose a novel GAN inversion approach, dubbed MMInvertFill, for image inpainting. MMInvertFill primarily contains a multimodal guided encoder with a pre-modulation and a GAN generator with F&W+ latent space. Specifically, the multimodal encoder aims to enhance the multi-scale structures with additional semantic segmentation and edge texture modalities through a gated mask-aware attention module. Afterwards, a pre-modulation is presented to encode these structures into style vectors. To mitigate conspicuous color discrepancy and semantic inconsistency, we introduce the F&W+ latent space to bridge the gap between GAN inversion and image inpainting. Furthermore, to reconstruct faithful and photorealistic images, we devise a simple yet effective Soft-update Mean Latent module that captures more diversified in-domain patterns for generating high-fidelity textures for massive corruptions. In extensive experiments on six challenging datasets, including CelebA-HQ [25], Places2 [75], OST [51], CityScapes [8], MetFaces [22] and Scenery [62], we show that MMInvertFill qualitatively and quantitatively outperforms other state-of-the-art methods and effectively supports the completion of out-of-domain images. Our project webpage incl
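The hard constraint mentioned in this abstract (unmasked pixels must be identical in the input and the output) can be stated concretely. The sketch below checks the constraint and shows plain compositing, a generic post-hoc way to enforce it; this is not the F&W+ latent space mechanism that MMInvertFill proposes. The function names and the convention that `known` is 1 for unmasked pixels are assumptions for illustration.

```python
def violates_constraint(original, output, known):
    """Return True if any unmasked (known) pixel differs between input and output."""
    return any(
        k and o != p
        for o_row, p_row, k_row in zip(original, output, known)
        for o, p, k in zip(o_row, p_row, k_row)
    )

def composite(original, generated, known):
    """Keep known pixels from the input and take only hole pixels from the generator."""
    return [
        [o if k else g for o, g, k in zip(o_row, g_row, k_row)]
        for o_row, g_row, k_row in zip(original, generated, known)
    ]

# Toy 2x2 grayscale example; known == 0 marks the single missing pixel.
original = [[1, 2], [3, 4]]
generated = [[9, 8], [7, 6]]
known = [[1, 0], [1, 1]]
result = composite(original, generated, known)
```

Compositing satisfies the constraint by construction, but it can leave visible seams when the generated content disagrees in color or semantics with its surroundings, which is the kind of discrepancy the abstract says the proposed latent space is meant to mitigate.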
Medical image segmentation is a crucial task in medical image analysis, but it can be very challenging, especially when labeled data are scarce and unlabeled data are plentiful. Contrastive learning has proven to be e...
The Industrial Internet of Things (IIoT) enables communication among automation systems, machinery, and sensors in an industrial setting. To optimize critical industrial operations, a substantial volume of data concer...
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Modeling stochastic multi-ship trajectories is vital for maritime safety and interaction efficiency. Recent research shows that diffusion models excel in trajectory prediction, surpassing GANs and VAEs in generation quality, diversity, and stability. However, their slow sampling speed remains a major limitation, as producing high-quality trajectories typically requires hundreds of denoising steps. We introduce DDA, a novel method that accelerates multi-ship trajectory generation by distilling the reverse diffusion process, progressively halving the number of sampling steps while minimizing quality loss. We use a CVAE-based encoder to map multimodal inputs into state embeddings in the latent space, and apply distillation diffusion in that latent space to represent multi-ship trajectories faster and more accurately. The diffusion model uses a Transformer-based core, and we incorporate SO(2) invariance and equivariance to enhance model representation. Validation on real-world AIS datasets shows that the student model retains high-quality trajectory generation while sampling approximately 30 times faster.
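The idea of halving the number of sampling steps by distillation can be sketched on a toy problem. Here the "denoiser" is a scalar linear map, the student is fitted by least squares so that one student step matches two teacher steps, and the student then becomes the next teacher. The linear model, the function names, and the starting coefficient are assumptions for illustration; DDA itself distills a Transformer-based latent diffusion model.

```python
import random

def fit_student(teacher_step, samples):
    """Least-squares fit of a linear student x -> w * x to two teacher steps."""
    targets = [teacher_step(teacher_step(x)) for x in samples]
    numerator = sum(x * t for x, t in zip(samples, targets))
    denominator = sum(x * x for x in samples)
    return numerator / denominator

def progressive_distillation(initial_coeff, initial_steps, target_steps, samples):
    """Repeatedly halve the step count; each student becomes the next teacher."""
    coeff, steps = initial_coeff, initial_steps
    while steps > target_steps:
        # Bind the current coefficient so the lambda acts as the fixed teacher.
        coeff = fit_student(lambda x, c=coeff: c * x, samples)
        steps //= 2
    return coeff, steps

rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(256)]
# Toy teacher: 8 steps, each multiplying the state by 0.9 (hypothetical values).
coeff, steps = progressive_distillation(0.9, 8, 1, samples)
```

In this toy setting the final single-step student reproduces the effect of all eight teacher steps exactly, because a linear map composed with itself is still linear. With a neural denoiser each halving round introduces some approximation error, which is why the abstract speaks of minimizing quality loss.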
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
In real-world scenarios, pedestrians frequently emerge suddenly from blind spots or occlusions, leaving only a handful of observable trajectory points. This presents a significant challenge for autonomous driving and robotic navigation, where pedestrian safety and timely response are critical. To address this challenge, we propose a framework for instantaneous trajectory prediction using Latent bidirectional Cooperative Diffusion (LCD). The framework uses a complementary mechanism that constructs a coupled bidirectional cooperative diffusion model. LCD simultaneously and progressively generates unobserved past trajectories and future trajectories, each fed as a condition into the cross-attention module of the other for mutual guidance. The framework employs a CVAE as its encoder to map the observed multi-modal trajectories into a high-dimensional latent space, which enhances the representation of complex patterns. Experiments on the ETH/UCY and SDD datasets demonstrate the superiority of our framework.
Graph learning-based multi-modal integration and classification is one of the most challenging tasks in disease prediction. To offset the negative impact among modalities during multi-modal integration and to extract heterogeneous information from graphs, we propose a novel method called Multi-modal Multi-Kernel Graph Learning (MMKGL). To address the negative impact among modalities, we propose a multi-modal graph embedding module that constructs a multi-modal graph. Unlike conventional methods that manually construct static graphs for all modalities, our method lets each modality generate a separate graph by adaptive learning, and introduces a function graph and a supervision graph for optimization during the multi-graph fusion embedding process. We then propose a multi-kernel graph learning module to extract heterogeneous information from the multi-modal graph. Information at different levels of the multi-modal graph is aggregated by convolutional kernels with different receptive field sizes, after which a cross-kernel discovery tensor is generated for disease prediction. Our method is evaluated on the benchmark Autism Brain Imaging Data Exchange (ABIDE) dataset and outperforms state-of-the-art methods. In addition, our model identifies discriminative brain regions associated with autism, providing guidance for the study of autism pathology.
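The notion of graph kernels with different receptive field sizes can be illustrated with repeated neighborhood averaging: after k propagation rounds, each node has aggregated information from nodes up to k hops away. This is a minimal sketch under that assumption; the row normalization, the hop counts, and the function names are hypothetical, and the abstract does not describe the exact kernels or the cross-kernel discovery tensor used by MMKGL.

```python
def normalize_rows(adj):
    """Row-normalize an adjacency matrix so each row sums to 1 (or stays all zero)."""
    normalized = []
    for row in adj:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized

def propagate(adj, feats):
    """One round of neighborhood averaging: adj @ feats with plain lists."""
    n, dim = len(adj), len(feats[0])
    return [
        [sum(adj[i][j] * feats[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]

def multi_hop_features(adj, feats, hops=(1, 2, 3)):
    """Return node features aggregated over each requested receptive field size."""
    norm = normalize_rows(adj)
    outputs, current = {}, feats
    for k in range(1, max(hops) + 1):
        current = propagate(norm, current)
        if k in hops:
            outputs[k] = current
    return outputs

# Path graph 0 - 1 - 2 with self-loops; only node 0 carries a nonzero feature.
adj = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
feats = [[1.0], [0.0], [0.0]]
out = multi_hop_features(adj, feats, hops=(1, 2))
```

In the example, node 2 receives nothing from node 0 after one hop but a nonzero value after two, which shows how a larger receptive field exposes information that a smaller one cannot reach.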