ISBN: 1577358872 (print)
As a prominent parameter-efficient fine-tuning technique in NLP, prompt tuning is now being explored for its potential in computer vision. Typical methods for visual prompt tuning follow the sequential modeling paradigm from NLP: they represent an input image as a flattened sequence of token embeddings and then learn a set of unordered parameterized tokens, prefixed to the sequence representation, as the visual prompts for adapting large vision models to a task. While this sequential modeling paradigm has shown great promise, it has two potential limitations. First, the learned visual prompts cannot model the underlying spatial relations in the input image, which are crucial for image encoding. Second, since all prompt tokens play the same role of prompting all image tokens without distinction, the paradigm lacks fine-grained prompting capability, i.e., individual prompting for different image tokens. In this work, we propose the Spatially Aligned-and-Adapted Visual Prompt model (SA2VP), which learns a two-dimensional prompt token map with the same size as the image token map (or a scaled size), so that it can be spatially aligned with the image map. Each prompt token is designated to prompt knowledge only for the spatially corresponding image tokens. As a result, our model can prompt different image tokens individually in a fine-grained manner. Moreover, because the learned prompt token map preserves spatial structure, SA2VP can model the spatial relations in the input image, leading to more effective prompting. Extensive experiments on three challenging image classification benchmarks demonstrate the superiority of our model over other state-of-the-art methods for visual prompt tuning. Code is available at https://***/tommy-xq/SA2VP.
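The spatial alignment described in this abstract can be illustrated with a minimal sketch. It is not the SA2VP implementation: the abstract does not say how a prompt token is fused with its image tokens, so the additive fusion, the integer `scale` factor, and the names `align_prompt_index` and `apply_spatial_prompts` are all assumptions made for illustration.

```python
from typing import List, Tuple

Vector = List[float]
TokenMap = List[List[Vector]]  # indexed as [row][col][channel]


def align_prompt_index(row: int, col: int, scale: int) -> Tuple[int, int]:
    """Map an image-token position to its spatially corresponding prompt token."""
    return row // scale, col // scale


def apply_spatial_prompts(image_map: TokenMap, prompt_map: TokenMap,
                          scale: int = 1) -> TokenMap:
    """Prompt each image token with only the prompt token at the aligned position.

    Assumption: element-wise addition stands in for the real fusion step,
    which the abstract does not specify.
    """
    rows, cols = len(image_map), len(image_map[0])
    if len(prompt_map) * scale != rows or len(prompt_map[0]) * scale != cols:
        raise ValueError("prompt map must equal the image map size divided by scale")
    prompted = []
    for r in range(rows):
        prompted_row = []
        for c in range(cols):
            pr, pc = align_prompt_index(r, c, scale)
            prompt = prompt_map[pr][pc]
            prompted_row.append([x + p for x, p in zip(image_map[r][c], prompt)])
        prompted.append(prompted_row)
    return prompted


# A 4x4 image token map with 2 channels, all zeros.
image_map = [[[0.0, 0.0] for _ in range(4)] for _ in range(4)]
# A 2x2 prompt map (scale 2); each prompt token encodes its own grid position.
prompt_map = [[[float(r), float(c)] for c in range(2)] for r in range(2)]
prompted = apply_spatial_prompts(image_map, prompt_map, scale=2)
```

In this toy setup each 2x2 block of image tokens receives a different prompt token, whereas prefix-style prompting would expose every image token to the same unordered set of prompts.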
In today's day and age, where huge quantities of textual data are generated every second, it has become difficult to keep ourselves abreast of new information. Documents in the financial sector tell a quantitati...
ISBN: 9781510689251 (print)
During enhanced magnetic memory detection of defects, many interference signals appear in the detection signal, which makes it difficult to accurately extract the characteristics of the defect signals and significantly reduces detection effectiveness. When complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is used alone for signal denoising, the noise and the feature signals in the transition components are either retained together or removed together. When wavelet threshold denoising (WTD) is used alone, the denoising effect is limited by the difficulty of determining the number of decomposition layers m and the wavelet basis function. To solve these problems, this paper proposes a denoising method for enhanced magnetic memory detection signals that combines CEEMDAN and WTD, called CEEMDAN–WTD. First, the correlation coefficient R of each signal component obtained by CEEMDAN decomposition is calculated, and the components are divided into noise-dominant components, transition components, and useful signal components based on R. Next, WTD is applied to the transition components and the two orders of components on either side of them, IMF(k − 2) ~ IMF(k + 2). Finally, the denoised components IMF(k − 2) ~ IMF(k + 2) and the high-order useful signal components obtained by CEEMDAN are combined to reconstruct the denoised signal. To validate the proposed method, the denoising effects of the CEEMDAN–WTD, CEEMDAN, and WTD methods were compared based on the signal-to-noise ratio (SNR). The comparison indicated that the CEEMDAN–WTD method significantly enhanced the denoising effect, and the SNRs of the components of the magnetic field signal could be increased by up to 8.24% and 45.88%, respectively, indicating that the CEEMDAN–WTD denoising method is relatively...
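The component selection and reconstruction steps in this abstract can be sketched as follows. This is a hedged illustration, not the authors' code. It assumes the IMFs have already been produced by a CEEMDAN implementation, that R is the Pearson correlation between each IMF and the original signal (the abstract does not state this), and that the transition order `k` is supplied by the caller because the abstract gives no rule for choosing it. The `denoise` callable is a placeholder for a real wavelet-threshold denoiser; `soft_threshold` shows only the thresholding rule such a denoiser would apply to wavelet coefficients.

```python
import math
from typing import Callable, List, Sequence

Signal = List[float]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def soft_threshold(coeffs: Sequence[float], t: float) -> Signal:
    """Shrink each coefficient toward zero by t (soft-thresholding rule)."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def reconstruct(imfs: Sequence[Signal], k: int,
                denoise: Callable[[Signal], Signal]) -> Signal:
    """Rebuild a signal from IMFs around a transition order k (1-based).

    Orders below k - 2 are treated as noise-dominant and dropped, orders
    k - 2 .. k + 2 are passed through `denoise`, and orders above k + 2 are
    kept unchanged as useful signal components.
    """
    out = [0.0] * len(imfs[0])
    for order, imf in enumerate(imfs, start=1):
        if order < k - 2:
            continue
        component = denoise(list(imf)) if order <= k + 2 else list(imf)
        out = [o + c for o, c in zip(out, component)]
    return out


# Toy data: seven constant "IMFs" whose value equals their order.
toy_imfs = [[float(order)] * 2 for order in range(1, 8)]
kept_as_is = reconstruct(toy_imfs, k=4, denoise=lambda s: s)
window_zeroed = reconstruct(toy_imfs, k=4, denoise=lambda s: [0.0] * len(s))
```

With `k=4`, order 1 is dropped, orders 2 to 6 go through the denoiser, and order 7 is kept unchanged, which is why `window_zeroed` contains only the contribution of the seventh component.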
Adversarial text attack is an effective way to investigate the vulnerability. Recently, several text attack strategies have been proposed. However, the samples produced through word-level and character-level attacks e...
The multidisciplinary project MonumenTAL aims to identify and list the names of classical works of visual art in French texts published from the 18th to the 21st century using NLP methods. It is based on a cl...
The proceedings contain 14 papers. The topics discussed include: AIR TOUCH: human machine interface using electromyography signals; robotic arm for brake performance testing; the fuzzy-based systems in the communication between a human and a humanoid robot; rocker-bogie stair climbing robot; the design of an intelligent chatbot with natural language processing capabilities to support learners; the empirical analysis for effective prediction of crop price using neuro evolutionary algorithm based on machine learning approach; Azerbaijani sign language recognition using machine learning approach; DograNet – a comprehensive offline Dogra handwriting character dataset; congestion-free routing based on a hybrid meta-heuristic algorithm to provide an effective routing protocol by analyzing the significant impacts of QoS parameters in a dynamic VANET environment; and face recognition using nearest neighbor and nearest mean classification framework: empirical analysis, conclusions and future directions.
ISBN: 9798350364699 (digital); 9798350364705 (print)
Natural language processing (NLP) has emerged as a game-changing technique in the field of strategic planning analysis, offering improved methods for obtaining meaningful insights from massive amounts of textual data. This research investigates the application of NLP technology to enhance the strategic planning processes carried out within enterprises. NLP approaches such as semantic analysis, sentiment identification, and topic modeling can be used to gain a deeper understanding of market tendencies, consumer feedback, and internal communications. The study demonstrates how NLP can facilitate the discovery of emergent patterns, simplify the understanding of data, and encourage data-driven decision-making. Case studies demonstrate the incorporation of NLP into strategic frameworks and show improvements in the accuracy of forecasting, risk management, and competitive analysis processes. The findings demonstrate the potential for NLP to revolutionize strategic planning by providing a more nuanced and comprehensive evaluation of qualitative data, ultimately resulting in strategic decisions that are more informed and agile.
ISBN: 9798350387780; 9798350387797 (print)
In the domain of axial compressor stall warnings, existing methods largely focus on time or frequency domain analyses, leading to excessive dependence on signal quality, increased rates of false alarms, and only partial extraction of the characteristics that signal an imminent stall. To overcome these challenges, it is imperative to develop a model that can swiftly and effectively identify early anomalies that signal an impending stall within a complex time series. Such a model ideally enables a thorough extraction of early stall indicators. This paper introduces the Time-Frequency Generative Adversarial Network (TFGAN), which it presents as the first fusion of adversarial training with time-frequency analysis. Using the Wigner-Ville distribution function, the TFGAN achieves enhanced fidelity in information exchange between the time and frequency domains. Through adversarial training, the TFGAN discerns the characteristic states of various sequences following time-frequency analysis and precisely identifies conditions that suggest pre-stall precursors, thus fulfilling the goal of accurate stall warnings. Empirical results show that the TFGAN model effectively captures the complex time-frequency relationships present in signal data, facilitating the identification of subtle temporal changes across different features. This enables the swift extraction of stall characteristics, anomaly detection, and stall prediction within complex time series data. Compared with a range of generative models, the proposed method markedly outperforms the experimental baselines in generative accuracy and predictive capability, as demonstrated by both qualitative and quantitative analyses.
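For readers unfamiliar with the Wigner-Ville distribution mentioned in this abstract, the sketch below computes one time slice of a discrete pseudo Wigner-Ville distribution. It is a generic textbook-style illustration and makes no claim about how TFGAN uses the distribution. The function name `wigner_ville_slice`, the symmetric lag window, and the test signal are assumptions chosen for illustration.

```python
import cmath
import math
from typing import List, Sequence


def wigner_ville_slice(x: Sequence[complex], n: int, half_window: int) -> List[float]:
    """Discrete pseudo Wigner-Ville distribution of x at time index n.

    Lags m run over -half_window .. half_window, giving M = 2 * half_window + 1
    frequency bins. Bin k corresponds to k / (2 * M) cycles per sample, because
    the kernel x[n + m] * conj(x[n - m]) oscillates at twice the signal frequency.
    """
    L = half_window
    if n - L < 0 or n + L >= len(x):
        raise ValueError("lag window does not fit inside the signal")
    M = 2 * L + 1
    lags = range(-L, L + 1)
    # Instantaneous autocorrelation kernel around time n.
    kernel = {m: x[n + m] * x[n - m].conjugate() for m in lags}
    slice_values = []
    for k in range(M):
        acc = sum(kernel[m] * cmath.exp(-2j * math.pi * k * m / M) for m in lags)
        # The symmetric lag window makes the sum real up to rounding error.
        slice_values.append(acc.real)
    return slice_values


# Complex exponential at 0.1 cycles per sample (an assumed test signal).
f0 = 0.1
signal = [cmath.exp(2j * math.pi * f0 * t) for t in range(32)]
wvd_slice = wigner_ville_slice(signal, n=16, half_window=7)
peak_bin = max(range(len(wvd_slice)), key=lambda k: wvd_slice[k])
peak_frequency = peak_bin / (2 * len(wvd_slice))
```

For this single-tone signal the energy concentrates in one bin whose frequency matches `f0`. For real-valued measurements, the analytic signal is commonly used as input to reduce cross-terms between positive and negative frequencies.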
Tandem mass spectrometry has played a pivotal role in advancing proteomics, enabling the high-throughput analysis of protein composition in biological tissues. Many deep learning methods have been developed for de nov...
Artificial intelligence (AI) has a subfield called information extraction (IE). In unstructured information sources, IE recognises data that conforms to specified semantics, such as persons, places, etc. An important t...