Exploration is a critical challenge for deep reinforcement learning methods. Although existing works such as actor-critic algorithms have made much progress, most still suffer from sample inefficiency in complex environments where rewards are sparse. Parallel sampling, in which multiple actors with the same policy interact with the environment, is an effective way to improve sample efficiency. However, parallel parameter-sharing actors collect similar samples, which generally hinders the overall exploration process. In this paper, we propose a Policy Diversity enhanced approach for parallel Actor-Critic (PDAC). Specifically, we extend the parallel actor-critic architecture to the PDAC framework, composed of a shared critic and parallel distinct actors. We then introduce the KL-divergence between the action probability distributions of parallel actors as an intrinsic reward that encourages actors to explore diverse strategies. We evaluate our approach on multiple challenging procedurally-generated tasks and compare it with state-of-the-art algorithms. Experiments show that PDAC improves markedly over them in both cumulative reward and sample efficiency.
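The KL-based intrinsic reward described above can be illustrated with a minimal sketch. The abstract does not state the direction of the KL term, how it is aggregated across actors, or how it is weighted, so the averaging over the other actors, the weight `beta`, and all function names below are assumptions made for illustration only.

```python
import math


def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for two discrete distributions over the same action set."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def diversity_bonus(action_probs, actor_index):
    """Mean KL from one actor's action distribution to each other actor's.

    Averaging over the other actors is an assumption, not the paper's rule.
    """
    own = action_probs[actor_index]
    others = [q for j, q in enumerate(action_probs) if j != actor_index]
    if not others:
        return 0.0
    return sum(kl_divergence(own, q) for q in others) / len(others)


def shaped_reward(extrinsic, action_probs, actor_index, beta=0.1):
    """Extrinsic reward plus a weighted diversity bonus (beta is hypothetical)."""
    return extrinsic + beta * diversity_bonus(action_probs, actor_index)


# Three actors evaluated in the same state; actor 2 behaves differently.
probs = [
    [0.7, 0.2, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.2, 0.7],
]
bonuses = [diversity_bonus(probs, i) for i in range(len(probs))]
```

In this toy setting the actor whose distribution differs from the other two receives the largest bonus, which is the behaviour an intrinsic diversity reward is meant to encourage.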
ISBN (digital): 9798350359312
ISBN (print): 9798350359329
Code representation learning is an important way to encode the semantics of source code through pre-training. The learned representation supports a variety of downstream tasks, such as natural language code search and code defect detection. Inspired by pre-trained models for natural language representation learning, existing approaches often treat the source code or its structural information (e.g., the Abstract Syntax Tree, or AST) as a plain token sequence. Unlike natural language, a programming language carries its own code unit information (e.g., identifiers and expressions) and logic information (e.g., the functionality of a code snippet). To further exploit these properties, we propose Abstract Code Embedding (AbCE), a self-supervised learning method that considers the abstract semantics of code logic. Instead of scattered tokens, AbCE treats an entire node or subtree of an AST as a basic code unit during pre-training, which keeps each code unit intact. Moreover, AbCE learns the abstract semantics of AST nodes via self-distillation. Experimental results show that it achieves significant improvements over state-of-the-art baselines on code search tasks, and comparable performance on code clone detection and defect detection tasks, even without using contrastive learning or curriculum learning.
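To make the idea of treating a whole AST subtree as one unit concrete, the following sketch masks the full source span of a single subtree using Python's standard `ast` module. It is not AbCE's pre-training procedure: the choice of Python as the target language, the `<MASK>` placeholder, the restriction to single-line ASCII nodes, and the function name `mask_subtree` are all assumptions for illustration.

```python
import ast


def mask_subtree(source, node_type=ast.Call, mask="<MASK>"):
    """Replace the span of the first single-line subtree of `node_type`.

    Returns (masked_source, masked_text). Assumes ASCII source, because
    `col_offset` values in the `ast` module are UTF-8 byte offsets.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, node_type) and node.lineno == node.end_lineno:
            lines = source.splitlines()
            line = lines[node.lineno - 1]
            lines[node.lineno - 1] = (
                line[: node.col_offset] + mask + line[node.end_col_offset:]
            )
            return "\n".join(lines), ast.get_source_segment(source, node)
    return source, None


masked, target = mask_subtree("total = add(x, y) * 2\n")
```

Here the whole call expression is hidden as one unit, rather than its individual tokens being masked independently.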
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Self-supervised time series anomaly detection (TSAD) achieves remarkable performance improvements by extracting high-level data semantics through proxy tasks. Nonetheless, most existing self-supervised TSAD techniques rely on manual or neural transformations when designing proxy tasks, overlooking the intrinsic temporal patterns of time series. This paper proposes a local temporal pattern learning-based time series anomaly detection method (LTPAD). LTPAD first generates sub-sequences. Pairs of sub-sequences naturally exhibit proximity relationships along the time axis, and these relationships can be used to construct supervision signals and train neural networks to learn temporal patterns. The time interval between two sub-sequences serves as the label for that pair. By classifying these labeled pairs, our model captures the local temporal patterns of the time series, thereby modeling temporal pattern-aware "normality". Anomaly scores for test data are obtained by evaluating how well the data conform to the patterns learned from the training data. Extensive experiments show that LTPAD significantly outperforms state-of-the-art competitors.
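The proxy task described above, labeling pairs of sub-sequences by the time interval between them, can be sketched as follows. The window length, the maximum gap, the random sampling scheme, and the function names are assumptions; the abstract does not specify how LTPAD samples pairs or discretizes intervals.

```python
import random


def sliding_windows(series, window, stride=1):
    """Cut a series into overlapping sub-sequences of length `window`."""
    return [
        series[i : i + window]
        for i in range(0, len(series) - window + 1, stride)
    ]


def make_labeled_pairs(series, window, max_gap, num_pairs, seed=0):
    """Sample (sub_a, sub_b, label) triples for interval classification.

    `label` is the gap between the two window start positions minus one,
    giving class indices in the range 0 .. max_gap - 1.
    """
    rng = random.Random(seed)
    subs = sliding_windows(series, window)
    if len(subs) <= max_gap:
        raise ValueError("series too short for the requested max_gap")
    pairs = []
    for _ in range(num_pairs):
        gap = rng.randint(1, max_gap)
        start = rng.randint(0, len(subs) - 1 - gap)
        pairs.append((subs[start], subs[start + gap], gap - 1))
    return pairs


# Toy series whose values equal their time index, so gaps are easy to read.
pairs = make_labeled_pairs(list(range(20)), window=4, max_gap=3, num_pairs=5)
```

A classifier trained on such triples would have to learn how local patterns evolve over short horizons; at test time, pairs whose predicted interval disagrees with the true one would indicate a departure from the learned patterns.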
Talking head generation aims to synthesize a speech video of the source identity from a driving video or audio or text data irrelevant to the source identity. It can not only be applied to games and virtual realit...
With the rapid development of Internet technology, new network attack methods emerge one after another. SQL injection has become one of the most severe threats to Web applications and seriously threatens vario...
Network traffic classification is crucial for network security and network management and is one of the most important network tasks. Current state-of-the-art traffic classifiers use deep learning models to automatically extract features from packet streams. Unfortunately, current approaches fail to effectively combine the structural information of traffic packets with the content features of the packets, resulting in limited classification accuracy. In this paper, we propose a graph neural network model for network traffic classification that captures the interaction features among packets in traffic. First, we design a graph structure for packet flows that holds the interaction information between packets, embedding both packet contents and sequence relationships into a unified graph. Second, we propose a graph neural network framework for graph classification that automatically learns the structural features of packet flows together with the features of the packets themselves. Extensive evaluation on real-world traffic data shows that the proposed model improves prediction accuracy by 2% to 37% for malicious traffic classification.
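One simple way to picture a graph that holds both packet contents and sequence relationships is sketched below. The abstract does not define the actual graph structure, so everything here is an assumption: each packet becomes a node whose features are its first few payload bytes scaled to [0, 1], and edges link consecutive packets in the flow. The function name `flow_to_graph` and the parameter `feature_len` are hypothetical.

```python
def flow_to_graph(packets, feature_len=8):
    """Turn a list of packet payloads (bytes) into (node_features, edges).

    Node features: first `feature_len` bytes, zero-padded, scaled to [0, 1].
    Edges: directed links between consecutive packets (sequence order).
    """
    nodes = []
    for payload in packets:
        head = list(payload[:feature_len])
        head += [0] * (feature_len - len(head))
        nodes.append([b / 255.0 for b in head])
    edges = [(i, i + 1) for i in range(len(packets) - 1)]
    return nodes, edges


flow = [b"\x16\x03\x01", b"\x16\x03\x03\x00\x5a", b"\x17\x03\x03"]
nodes, edges = flow_to_graph(flow, feature_len=4)
```

The resulting node feature matrix and edge list are the usual inputs to a graph classification model, which can then combine per-packet content with flow structure.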
ISBN (digital): 9798331509712
ISBN (print): 9798331509729
Blockchain technology has been extensively utilized in decentralized data-sharing applications, with the immutability of blockchain providing a witness for the circulation of data. However, current blockchain data-sharing solutions still fail to address the simultaneous screening needs of both the sender and the receiver with multiple keywords. Without support for bilateral simultaneous filtering, disclosing the reasons for matching failures could inadvertently expose sensitive user data. The challenge therefore lies in enabling ciphertexts with multiple keywords and receivers with multiple interests to achieve mutual and simultaneous matching. Based on the technical foundations of SE (Searchable Encryption), MABE (Multi-Attribute Based Encryption), and polynomial fitting, this paper proposes a scheme called DMSA (Decentralized and Multi-keyword selective Sharing and selective Acquisition). The scheme satisfies soundness, enabling ciphertexts carrying multiple keywords and receivers representing multiple interests to match each other simultaneously. We conducted a security analysis that confirms the security of DMSA against chosen-plaintext attacks. Our experimental results demonstrate a significant efficiency improvement, with a 67% increase over single-keyword data-sharing schemes and a 16% enhancement compared to the existing multi-keyword data-sharing solution.
Relation extraction, an important Natural Language Processing (NLP) task, aims to identify relations between named entities in text. Recently, graph convolutional networks over dependency trees have been widely used to...
Federated learning (FL) is a decentralized machine learning framework that prioritizes privacy by allowing clients to train statistical models without sharing their private data, thus eliminating the impact of data fortresses. However, Byzantine attacks, such as data poisoning and backdoor attacks, threaten the robustness of FL schemes. Existing mainstream defense methods are susceptible to multiple adaptive attacks, and some even violate the privacy principle of FL. Furthermore, these defense schemes become less robust when subjected to targeted poisoning attacks with highly non-IID data distributions. In this work, we propose FedNAT, a novel Byzantine-robust FL framework that addresses these limitations. Specifically, FedNAT first performs a privacy-respecting attention refinement on the activation layer outputs of the local uploads. Then, the server scores the local attentions by calculating their Wasserstein distances and clusters them with the k-median algorithm for global attention aggregation, thus rejecting poisoned local attentions in untargeted attacks. After this process, the global attention is transferred to the local attention through the FedNAT loss function, which erases backdoors through distillation. We conduct a comprehensive experimental evaluation to demonstrate that FedNAT significantly outperforms existing robust FL schemes in defending against Byzantine poisoning attacks under both IID and highly non-IID data distributions.
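The server-side scoring and clustering step can be illustrated with a small sketch. It is not FedNAT's implementation: it assumes each local attention is flattened to an equal-length list of floats, uses the closed form of the 1-D Wasserstein-1 distance between equal-size samples, scores each client by its mean distance to the others, and splits the scores with a two-cluster k-median loop. The scoring rule, the choice of k = 2, and all function names are assumptions.

```python
from statistics import median


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(a) != len(b):
        raise ValueError("samples must have the same length")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def score_clients(attentions):
    """Score each client by its mean distance to all other clients."""
    n = len(attentions)
    return [
        sum(
            wasserstein_1d(attentions[i], attentions[j])
            for j in range(n)
            if j != i
        ) / (n - 1)
        for i in range(n)
    ]


def two_median_split(scores, iters=20):
    """Two-cluster k-median on 1-D scores; returns (low_group, high_group)."""
    centers = [min(scores), max(scores)]
    groups = [[], []]
    for _ in range(iters):
        groups = [[], []]
        for idx, s in enumerate(scores):
            k = 0 if abs(s - centers[0]) <= abs(s - centers[1]) else 1
            groups[k].append(idx)
        new_centers = [
            median([scores[i] for i in g]) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return groups[0], groups[1]


attentions = [
    [0.10, 0.20, 0.30],
    [0.11, 0.19, 0.31],
    [0.10, 0.21, 0.29],
    [0.90, 0.80, 0.95],  # outlier, standing in for a poisoned upload
]
kept, rejected = two_median_split(score_clients(attentions))
```

In this toy example the three similar clients fall into the low-score cluster and the outlier is isolated in the high-score cluster, so only the former would contribute to the aggregated global attention.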
Image deblurring is an ill-posed task: infinitely many feasible solutions exist for a given blurry image. Modern deep learning approaches usually discard the learning of blur kernels and directly employ end-to-end supervi...