One-shot voice conversion (VC) aims to alter the timbre of speech from a source speaker to match that of a target speaker using just a single reference speech from the target, while preserving the semantic content of ...
ISBN (digital): 9798331522100
ISBN (print): 9798331522117
Attention mechanisms are known to improve a model's performance by focusing on the sections of the input most relevant to the task at hand. The Squeeze-and-Excitation module (SE-Net) and the Efficient Channel Attention module (ECA-Net) are well-known channel attention networks that achieve clear performance gains with only a few additional parameters. To further improve the model's accuracy, Binarized Neural Networks (BNNs) can be introduced: BNNs are deep learning models that use binarized values for activations and weights instead of full-precision values, which leads to much faster computation and lower memory and power consumption. Motivated by this, this paper proposes a method for binarizing the convolutional attention network that can enhance overall model accuracy. Our work focuses on evaluating the performance of our binarized attention module against pre-existing ones. The objective of this work is thus to design, implement, and evaluate efficient neural network models that balance accuracy and computational efficiency for image classification tasks on datasets of differing complexity, such as CIFAR-10 and CIFAR-100. By exploring various backbone architectures (e.g., MobileNetV2 and ResNet50), attention mechanisms (e.g., ECANet and SENet), and binarization techniques, this study aims to understand the impact of these components on model performance and resource requirements. Ultimately, the goal is to develop optimized models that achieve high accuracy with minimal computational overhead, making them suitable for deployment in resource-constrained environments.
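As a rough illustration of the idea, the following PyTorch sketch binarizes the weights of an SE-style channel attention block using a straight-through estimator; the class and parameter names are ours, and the paper's exact binarization scheme may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator (STE) backward pass."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Pass gradients only where |x| <= 1 (hard-tanh clipping), as in BNNs.
        return grad_out * (x.abs() <= 1).float()

class BinarySEBlock(nn.Module):
    """SE-style channel attention whose FC weights are binarized on the fly."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction, bias=False)
        self.fc2 = nn.Linear(channels // reduction, channels, bias=False)

    def forward(self, x):                        # x: (N, C, H, W)
        n, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))                   # squeeze: global average pooling
        w1 = BinarizeSTE.apply(self.fc1.weight)  # weights constrained to {-1, +1}
        w2 = BinarizeSTE.apply(self.fc2.weight)
        a = torch.sigmoid(F.linear(torch.relu(F.linear(s, w1)), w2))
        return x * a.view(n, c, 1, 1)            # excite: per-channel rescaling
```

Dropping such a block after selected convolutional stages of MobileNetV2 or ResNet50 would mirror the kind of evaluation setup described above.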
In the fast-paced realm of global financial markets, characterized by rapid trading of both stocks and cryptocurrencies, it has become essential to grasp the influence of sentiment on market dynamics. With more than ...
Recently, image forgery has become an alarming trend with the growth of easy-to-use editing and generation tools. Modern DeepFake methods have achieved extraordinary progress in realistic face manipulation, raising public concern about the misuse of such technologies. Unfortunately, given the extremely wide range of possible manipulation and artifact-concealment methods, most existing state-of-the-art detection methods lack the generalization capability to handle the resulting output variations. To address this issue, a noticeable shift has emerged towards attention mechanisms, trained on balanced portions of the latest challenging datasets, to detect intra- and inter-spatial relations. Our paper provides a comprehensive analysis of modern deep learning-based methods, showing the benefits of this shift. In addition, we offer propositions for future research directions and dataset-building methodology.
Edge server placement is a hot issue in mobile edge computing. It is a key prerequisite for deploying edge servers that can meet computing needs and improve resource utilization. This paper studies the joint location ...
Recent advancements in prompt-driven image segmentation exemplified by the Segment Anything Model (SAM) have shown remarkable potential for universal medical image segmentation. However, their reliance on manual promp...
A recent line of works showed regret bounds in reinforcement learning (RL) can be (nearly) independent of planning horizon, a.k.a. the horizon-free bounds. However, these regret bounds only apply to settings where a p...
ISBN (print): 9783030695439
This paper studies the recognition of oracle characters, the earliest known hieroglyphs in China. Oracle character recognition fundamentally suffers from data limitation and imbalance. Recognizing oracle characters from extremely limited samples should naturally be treated as a few-shot learning task. Unlike the standard few-shot learning setting, our model only has access to large-scale unlabeled source Chinese characters and a few labeled oracle characters. In such a setting, meta-based or metric-based few-shot methods cannot be efficiently trained on the unlabeled source data; thus the only viable methodologies are self-supervised learning and data augmentation. Unfortunately, conventional geometric augmentation always applies the same global transformations to all samples in pixel format, without considering the diversity of each part within a sample. Moreover, to the best of our knowledge, there is no effective self-supervised learning method for few-shot learning. To this end, this paper integrates the idea of self-supervised learning into data augmentation, and we propose a novel data augmentation approach, named Orc-Bert Augmentor, pre-trained by self-supervised learning for few-shot oracle character recognition. Specifically, Orc-Bert Augmentor leverages a self-supervised BERT model, pre-trained on large unlabeled Chinese character datasets, to generate sample-wise augmentations. Given a masked input in vector format, Orc-Bert Augmentor recovers it and outputs a pixel-format image as augmented data. Different mask proportions yield diverse reconstructed outputs. Adding Gaussian noise, the augmentor further performs point-wise displacement to improve diversity. Experimentally, we collect two large-scale datasets of oracle characters and other ancient Chinese characters for few-shot oracle character recognition and Orc-Bert Augmentor pre-training. Extensive experiments on few-shot learning demonstrate the effectiveness of our augmentation approach.
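To make the mask-reconstruct-displace pipeline concrete, here is a minimal NumPy sketch; `reconstruct_fn` stands in for the pre-trained Orc-Bert model (its interface here is our assumption), and all parameter names and defaults are illustrative, not the paper's.

```python
import numpy as np

def augment_strokes(strokes, reconstruct_fn, mask_ratio=0.3, noise_std=0.5, rng=None):
    """Sketch of an Orc-Bert-style augmentation pass over a stroke sequence.

    strokes:        (T, 2) array of 2-D stroke points in vector format.
    reconstruct_fn: placeholder for the self-supervised BERT model that
                    recovers masked points (assumed interface).
    mask_ratio:     fraction of points to mask; varying this across passes
                    yields the diverse reconstructions described above.
    noise_std:      std of the Gaussian point-wise displacement added after.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = len(strokes)
    masked = strokes.copy()
    idx = rng.choice(t, size=max(1, int(mask_ratio * t)), replace=False)
    masked[idx] = 0.0                        # zero out the masked points
    recovered = reconstruct_fn(masked, idx)  # model fills in the masked points
    # Point-wise Gaussian displacement for extra diversity.
    return recovered + rng.normal(0.0, noise_std, size=recovered.shape)
```

Rendering the displaced stroke sequence to a pixel image would then produce one augmented training sample per (mask_ratio, noise) draw.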
Currently, protocol fuzzing techniques mainly employ two approaches: mutation-based greybox fuzzing and generation-based blackbox fuzzing. Greybox fuzzing uses message exchanges between the protocol server and actual clients as seeds, and generates test cases through mutation. Although this approach can provide coverage information about the SUT's code and state space through instrumentation and feedback, its drawback lies in the relatively random mutation strategy, which makes it difficult for test cases to pass the SUT's message verification. This paper addresses this limitation by using artificial intelligence techniques to extract protocol state machines, aiming to overcome the reliance on manual work in generation-based blackbox fuzzing while leveraging its strengths to generate more effective fuzzing test cases. The study applies Prompt-Learning technology to analyze the semantic information in protocol RFC documents, obtain corresponding intermediate representations, and extract protocol state machines from these representations. Taking the BGP protocol as the experimental subject, the results show that the extracted protocol state machines achieve a reasonable level of accuracy and can be used to generate test cases, thereby improving the automation of protocol fuzzing.
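For intuition, the sketch below shows one plausible shape for an extracted state machine and how message sequences (test-case skeletons) could be enumerated from it. The transition table is a simplified BGP-like fragment of our own invention, not the intermediate representation actually produced in the paper.

```python
from dataclasses import dataclass, field

@dataclass
class StateMachine:
    """Minimal protocol state machine (structure is our assumption)."""
    initial: str
    # transitions[state][message_type] -> next state
    transitions: dict = field(default_factory=dict)

    def walks(self, max_depth=4):
        """Enumerate message sequences up to max_depth; each sequence can
        seed one generation-based fuzzing test case."""
        stack = [(self.initial, [])]
        while stack:
            state, path = stack.pop()
            if path:
                yield path
            if len(path) >= max_depth:
                continue
            for msg, nxt in self.transitions.get(state, {}).items():
                stack.append((nxt, path + [msg]))

# Toy BGP-like fragment: OPEN -> KEEPALIVE reaches Established, where
# UPDATE loops and NOTIFICATION resets the session.
bgp = StateMachine(
    initial="Idle",
    transitions={
        "Idle":        {"OPEN": "OpenSent"},
        "OpenSent":    {"KEEPALIVE": "Established"},
        "Established": {"UPDATE": "Established", "NOTIFICATION": "Idle"},
    },
)
for seq in bgp.walks(max_depth=3):
    print(" -> ".join(seq))
```

Because the sequences follow valid transitions, test cases built from them are more likely to pass the SUT's message verification than randomly mutated seeds.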
In this paper, we propose a novel graph representation learning (GRL) model that aims to improve both representation accuracy and learning efficiency. We design a Two-Level GRL architecture based on graph partitioning: 1) local GRL on the nodes within each partitioned subgraph and 2) global GRL on the subgraphs. By partitioning the graph through community detection, we enable elaborate node learning within each community. Building on Two-Level GRL, we introduce an abstracted graph, the Community-as-a-Node graph (CaaN), to effectively maintain the high-level structure with a significantly reduced graph. By applying the CaaN graph to local and global GRL, we propose Two-Level GRL with Community-as-a-Node (CaaN 2L), which effectively maintains the global structure of the entire graph while accurately representing the nodes in each community. A salient point of the proposed model is that it can be applied to any existing GRL model by adopting that model as the base for local and global GRL. Through extensive experiments employing seven popular GRL models, we show that our model outperforms them in both accuracy and efficiency.
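The following Python sketch outlines the two-level pipeline under stated assumptions: `embed_fn` stands in for any base GRL model (e.g. node2vec), Louvain community detection is used for partitioning (the paper may use a different method), and concatenating local and global embeddings is our guess at the combination step.

```python
import networkx as nx
import numpy as np

def two_level_grl(G, embed_fn, dim=64):
    """Two-Level GRL with a Community-as-a-Node (CaaN) abstraction.

    embed_fn(graph, dim) -> {node: np.ndarray} is any base GRL model.
    """
    # 1) Partition the graph via community detection.
    communities = list(nx.algorithms.community.louvain_communities(G))
    # 2) Local GRL: embed nodes within each community subgraph.
    local = {}
    for com in communities:
        local.update(embed_fn(G.subgraph(com), dim))
    # 3) Build the CaaN graph: one node per community, with an edge
    #    wherever any inter-community edge exists in G.
    node2com = {n: i for i, com in enumerate(communities) for n in com}
    caan = nx.Graph()
    caan.add_nodes_from(range(len(communities)))
    for u, v in G.edges():
        cu, cv = node2com[u], node2com[v]
        if cu != cv:
            caan.add_edge(cu, cv)
    # 4) Global GRL on the much smaller CaaN graph.
    global_emb = embed_fn(caan, dim)
    # 5) Final representation: local embedding + its community's embedding.
    return {n: np.concatenate([local[n], global_emb[node2com[n]]]) for n in G}
```

Since the CaaN graph has one node per community, the global pass runs on a drastically reduced graph, which is where the claimed efficiency gain comes from.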