Authors:
Zhang, Wei; Chen, Rongrong; Wang, Bin
Dalian Univ, Sch Software Engn, Key Lab Adv Design & Intelligent Comp, Dalian, Peoples R China
Dalian Univ, Sch Software Engn, Key Lab Adv Design and Intelligent Comp, Dalian 116622, Peoples R China
The network structure of deep-learning-based digital watermarking algorithms is usually encoder, noise layer, decoder. Most existing encoders suffer from insufficient feature extraction, and introducing simulated differentiable Joint Photographic Experts Group (JPEG) compression in the noise layer cannot ensure robustness under real JPEG compression. In this paper, a watermarking algorithm based on a multi-scale auto-encoder is proposed, which effectively extracts image feature information by combining it with a channel attention mechanism. At the same time, some parameters of the decoder and encoder are shared to reduce redundant feature embedding and improve extraction accuracy. This paper also proposes a robust training scheme against JPEG compression, which guides the model to store the watermark in the low-frequency region needed for decoding. Experimental results show that the peak signal-to-noise ratio (PSNR) of the proposed algorithm is above 48 dB and the decoding rate is above 99% under JPEG compression with quality factor Q = 50. Moreover, the scheme can be combined effectively with the noise layer during training. In addition, the proposed algorithm is also robust to other common network noises.
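To put the reported PSNR figure in context, the following is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE). It is not the authors' evaluation code: the function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(cover, stego, max_value=255.0):
    """Return PSNR in dB between two equal-length sequences of pixel values.

    Assumption: pixels are given as flat sequences and max_value is the peak
    pixel value (255 for 8-bit images).
    """
    if len(cover) != len(stego) or len(cover) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error between the cover image and the watermarked image.
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    cover = [0] * 64   # hypothetical 8x8 cover image, flattened
    stego = [1] * 64   # every pixel changed by one gray level
    print(round(psnr(cover, stego), 2))
```

Under these assumptions, changing every pixel by exactly one gray level gives an MSE of 1 and a PSNR of about 48.13 dB, so a PSNR above 48 corresponds to an average distortion on the order of one gray level per pixel.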
Automatic API recommendation can accelerate developers' programming and has been studied for years. There are two orthogonal lines of approaches for this task: information retrieval-based (IR-based) approaches and sequence-to-sequence (seq2seq) model-based approaches. Although these approaches have been reported to perform remarkably well, we observe two major drawbacks: IR-based approaches do not consider the relations among the recommended APIs, and seq2seq models do not model the semantic meaning of APIs. To alleviate these two problems, we propose APIGens, a retrieval-enhanced large language model (LLM)-based approach that recommends an API sequence for a natural language query. The approach first retrieves similar historical programming questions based on the input natural language query, and then scores the results against API documents via a scorer model. Finally, these results are used as samples for few-shot learning with the LLM. To reduce the risk of encountering local optima, we also extract API seeds from the retrieved results to widen the search scope during LLM generation. The results show that our approach achieves 48.41% ROUGE@10 on API sequence recommendation and 82.61% MAP on API set recommendation, largely outperforming the state-of-the-art baselines.
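The retrieve-then-prompt idea in this abstract can be sketched roughly as follows. This is an illustration only, not the APIGens implementation: it substitutes a simple token-overlap (Jaccard) similarity for the retrieval step, omits the scorer model and the API seed extraction, and the function names, prompt layout, and example history are all hypothetical.

```python
def jaccard(a, b):
    """Token-overlap similarity between two questions (stand-in for real retrieval)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def build_few_shot_prompt(query, history, k=2):
    """Pick the k most similar past questions and format them as few-shot samples.

    history is a list of (question, api_sequence) pairs; the layout of the
    prompt is an assumption made for this sketch.
    """
    ranked = sorted(history, key=lambda qa: jaccard(query, qa[0]), reverse=True)[:k]
    lines = []
    for question, apis in ranked:
        lines.append(f"Question: {question}")
        lines.append("APIs: " + " -> ".join(apis))
    # The query itself goes last, leaving the API sequence for the LLM to complete.
    lines.append(f"Question: {query}")
    lines.append("APIs:")
    return "\n".join(lines)


# Hypothetical history of answered programming questions.
HISTORY = [
    ("read a text file line by line",
     ["FileReader.new", "BufferedReader.new", "BufferedReader.readLine"]),
    ("parse a json string", ["JSONObject.new", "JSONObject.get"]),
    ("write bytes to a file", ["FileOutputStream.new", "FileOutputStream.write"]),
]

if __name__ == "__main__":
    print(build_few_shot_prompt("how to read a file line by line", HISTORY))
```

The resulting string would then be passed to an LLM of choice; that call is deliberately left out because the abstract does not specify the model or its interface.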
ISBN: (print) 9783031706417; 9783031706424
We present a reformulation of the Key-Information Extraction (KIE) problem from document images as a Next-Sentence Prediction (NSP) task for identifying information in hierarchically structured data. KIE implemented as a Key-Value extraction task is limited to one-to-one information extraction (a single key mapping to a single value) and thus does not apply to hierarchical information, e.g. information present in complex semi-structured or unstructured tables. The Visual-Question-Answering (VQA) approach tries to solve information extraction from such semi-structured formats, but it uses visual information extraction backbone architectures along with heavy language models. In the proposed work, we use only a backbone language feature extractor for semantic entity extraction. In addition to the four entity types in FUNSD ('question', 'answer', 'header' and 'other'), for semi-structured tabular information we define additional classes for hierarchical elements, such as column-header, table-footer, cells, merged-cell, and table-summary. For these additional entities, we define hierarchical relations such as a tuple of entities {table-header entity, column-header entity, row-header entity} that points to a unique entity referred to as the value-entity. We treat the tuple-entity and the value-entity as two sentences and formulate the task as estimating how likely the value-entity is to follow the tuple-entity. Empirically, we show that the proposed method, called Tuple-Value Identification (TVI), can exhaustively identify all the information in the hierarchical structures. Additionally, TVI opens up potential use in Table Structure Recognition (TSR) for scanned documents such as bank statements or medical bills, where narration columns span multiple lines and are challenging for existing TSR systems.
Authors:
Qian, Chao
Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
Clustering is a fundamental problem in many areas. It aims to partition a given data set into groups based on some distance measure, such that data points in the same group are similar while those in different groups are dissimilar. Due to its importance and NP-hardness, many methods have been proposed, among which evolutionary algorithms (EAs) are a popular class. Evolutionary clustering has found many successful applications, but all the results are empirical and lack theoretical support. This article fills this gap by proving that the approximation performance of the global simple evolutionary multiobjective optimizer (GSEMO), a simple multiobjective EA, can be theoretically guaranteed for four formulations of clustering: k-tMM, k-center, discrete k-median, and k-means. Furthermore, we consider clustering under fairness, which tries to avoid algorithmic bias and has recently become an important research topic in machine learning. We prove that for discrete k-median clustering under individual fairness, the approximation performance of the GSEMO can be theoretically guaranteed with respect to both the objective function and the fairness constraint.
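For readers unfamiliar with GSEMO, the sketch below shows the commonly described form of the algorithm: keep a population of mutually non-dominated bit strings, pick a parent uniformly at random, flip each bit with probability 1/n, and keep the child unless it is strictly dominated. The bi-objective used here (covering radius and number of selected centers on six points of a line) is a toy stand-in in the spirit of k-center, not the formulation analysed in the article; `POINTS`, `k_center_objective`, and the iteration budget are assumptions.

```python
import math
import random


def dominates_weakly(a, b):
    """True if a is no worse than b in every objective (minimization)."""
    return all(x <= y for x, y in zip(a, b))


def dominates_strictly(a, b):
    """True if a weakly dominates b and differs from it in some objective."""
    return dominates_weakly(a, b) and a != b


def gsemo(objective, n_bits, iterations, rng):
    """Run GSEMO and return a dict mapping each kept solution to its objectives."""
    start = tuple(0 for _ in range(n_bits))
    population = {start: objective(start)}
    for _ in range(iterations):
        parent = rng.choice(list(population))
        # Bit-wise mutation: flip each bit independently with probability 1/n.
        child = tuple(bit ^ (rng.random() < 1.0 / n_bits) for bit in parent)
        value = objective(child)
        if any(dominates_strictly(v, value) for v in population.values()):
            continue  # child is strictly dominated, discard it
        # Remove everything the child weakly dominates, then insert the child.
        population = {s: v for s, v in population.items()
                      if not dominates_weakly(value, v)}
        population[child] = value
    return population


# Toy data (assumption): two well-separated groups of points on a line.
POINTS = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]


def k_center_objective(selection):
    """Return (covering radius, number of centers); both are minimized."""
    centers = [p for p, bit in zip(POINTS, selection) if bit]
    if not centers:
        return (math.inf, 0)
    radius = max(min(abs(p - c) for c in centers) for p in POINTS)
    return (radius, len(centers))


if __name__ == "__main__":
    front = gsemo(k_center_objective, len(POINTS), 2000, random.Random(0))
    for solution, value in sorted(front.items(), key=lambda item: item[1][1]):
        print(solution, value)
```

Treating the number of centers as a second objective, rather than a hard constraint k, is one common way to cast a constrained problem for a multiobjective EA; whether it matches the article's exact bi-objective reformulation is not stated in the abstract.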
Various communication methods allow modern people to communicate with each other more frequently, resulting in a more diverse network structure in today's social network due to its own dynamics. Compared with the ...
Mobile applications have become a ubiquitous part of our daily life, providing users with access to various services and utilities. Text input, as an important interaction channel between users and applications, plays...
Obfuscation is a method to hide coding strategies for security and privacy. Despite its positive use, malware experts have also used this technique to develop malware applications. A variety of malware has taken over ...
Reducing energy consumption while guaranteeing quality of service (QoS) in cloud data centers is a challenging task for cloud providers. Dynamic virtual machine (VM) consolidation is regarded as a promising approach to meeting both goals. However, the dynamic workload of a physical machine (PM) triggers VM migrations, and high resource utilization on a PM causes resource contention among VMs that degrades their performance. Hence, it is vital to provide an efficient approach to dynamic VM placement during consolidation that achieves these objectives while alleviating resource contention among VMs in the data center. In this paper, the proposed strategy, called LBVMP, builds on two constructs: a balancing plane of a PM in terms of CPU, RAM, and bandwidth (BW), and a proportion plane obtained by dividing the remaining resource capacity of the target PM by the resources (CPU, RAM, and BW) requested by a VM. LBVMP then calculates the distance between the two planes to evaluate VM allocation solutions. Extensive experiments on the CloudSim simulator demonstrate that, compared with the state-of-the-art algorithm BCAVMP, the proposed strategy reduces the energy consumption, the number of migrations, SLAV, and ESV of cloud data centers by an average of 3.50%, 9.40%, 78.40%, and 79.91%, respectively.
ISBN: (print) 9781665452786
Sentiment analysis has a wide range of promising applications in software engineering, and the development of deep learning has demonstrated that a uniform representation of different modalities can improve the performance of sentiment analysis models. However, in practical applications, multimodal sentiment analysis often faces unsatisfactory conditions; in particular, most models may fail when a modality has missing samples. For example, the social dynamics of technicians in developer communities can face modality unavailability due to privacy settings. Several existing works based on deep learning and regularization methods have explored the missing-modality problem, but these works cannot balance the cases of general missing (rate < 50%) and severe missing (rate ≥ 50%), and do not consider resource consumption during model inference. Therefore, in this paper, we propose a prototype augmented multimodal teacher-student network (PAMD) to address these issues. Specifically, a multi-level and multi-origin distillation strategy is used to minimize the required resources and inference time, and prototype augmentation is used to guarantee the performance of the model when a modality is severely missing. Extensive experiments are conducted on different benchmark datasets to explore a network that balances performance and resource consumption. The network achieves good results across different missing-modality cases.
In recent years, the rapid development of the Internet of Things (IoT) has attracted significant interest in smart healthcare. However, such collaborative IoT applications still face three major challenges: multi-task...