Localizing discriminative object parts (e.g., bird head) is crucial for fine-grained classification tasks, especially for the more challenging fine-grained few-shot *** work always relies on the learned object parts in a unified manner, where they attend the same object parts (even with common attention weights) for different few-shot episodic *** this paper, we propose that it should adaptively capture the task-specific object parts that require attention for each few-shot task, since the parts that can distinguish different tasks are naturally *** for a few-shot task, after obtaining part-level deep features, we learn a task-specific part-based dictionary for both aligning and reweighting part features in an ***, part-level categorical prototypes are generated based on the part features of support data, which are later employed by calculating distances to classify query data for *** retain the discriminative ability of the part-level representations (i.e., part features and part prototypes), we design an optimal transport solution that also utilizes query data in a transductive way to optimize the aforementioned distance calculation for the final *** experiments on five fine-grained benchmarks show the superiority of our method, especially for the 1-shot setting, gaining 0.12%, 8.56%, and 5.87% improvements over state-of-the-art methods on CUB, Stanford Dogs, and Stanford Cars, respectively.
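The prototype-and-distance step mentioned in this abstract can be illustrated with a minimal sketch. It assumes mean-pooled class prototypes and squared Euclidean distance, neither of which the abstract specifies, and it leaves out the task-specific dictionary and the optimal transport refinement entirely. The names `build_prototypes` and `classify` are hypothetical.

```python
from collections import defaultdict


def build_prototypes(support):
    """Average the support feature vectors of each class (assumed pooling rule)."""
    grouped = defaultdict(list)
    for label, feature in support:
        grouped[label].append(feature)
    return {
        label: [sum(dim) / len(feats) for dim in zip(*feats)]
        for label, feats in grouped.items()
    }


def classify(query, prototypes):
    """Return the label whose prototype is nearest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(prototypes, key=lambda label: sq_dist(query, prototypes[label]))


# Toy 2-way 2-shot episode with 2-dimensional features (made-up numbers).
support = [
    ("a", [0.0, 0.0]),
    ("a", [0.2, 0.0]),
    ("b", [1.0, 1.0]),
    ("b", [0.8, 1.0]),
]
prototypes = build_prototypes(support)
prediction = classify([0.9, 0.8], prototypes)
```

In a part-based variant, one such prototype would presumably be kept per part and per class, with the per-part distances combined before the final decision; how the paper combines them is not stated in the abstract.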
The advance in Non-Volatile Memory (NVM) has changed the traditional *** to DRAM, NVM has the advantages of nonvolatility and large ***, as the read/write speed of NVM is still lower than that of DRAM, building DRAM/NVM-based hybrid memory systems is a feasible way of adding NVM into the current computer *** paper aims to optimize the well-known B+-tree for hybrid *** novelty of this study is ***, we observed that the space utilization of internal nodes in a B+-tree is generally below 70%. Inspired by this observation, we propose to maintain hot keys in the free space within internal nodes, yielding a new index named HATree (Hotness-Aware Tree). The new idea of HATree is to use the unused space of the parents of leaf nodes (PLNs) as the hotspot data ***, no extra space is needed, and the in-node hotspot cache can efficiently improve query ***, to further improve the update performance of HATree, we propose to utilize the eADR technology supported by the third-generation Intel Xeon Scalable Processors to enhance HATree with instant log persistence, which results in the new HATree-Log *** conduct extensive experiments on a real hybrid memory architecture involving DRAM and Intel Optane Persistent Memory to evaluate the performance of HATree and *** state-of-the-art indices for hybrid memory, namely NBTree, LBTree, and FPTree, are included in the experiments, and the results suggest the efficiency of HATree and HATree-Log.
Thanks to its ubiquity, using radio frequency (RF) signals for sensing has found widespread *** traditional integrated sensing and communication systems, such as joint radar-communication systems, common sensing tasks include target localization and ***, increasingly intelligent systems, such as smart agriculture, low-altitude economy, and smart healthcare, have demanded more comprehensive and continuous information sensing capabilities to support higher-level *** sensing has the potential to offer both spatial and temporal continuity, meeting the multi-dimensional sensing needs of these intelligent ***, numerous advanced systems have been proposed, expanding the application scope of RF sensing to be more pervasive, including discrete state ubiquitous sensing tasks (such as material identification [1]) and continuous state ubiquitous sensing tasks (such as health monitoring [2]). With the advent of the 6G era, it is anticipated that the sensing potential of RF systems will be further unleashed.
In the video captioning methods based on an encoder-decoder, limited visual features are extracted by an encoder, and a natural sentence of the video content is generated using a ***, this kind of method is dependent on a single video input source and few visual labels, and there is a problem with semantic alignment between video contents and generated natural sentences, which are not suitable for accurately comprehending and describing the video *** address this issue, this paper proposes a video captioning method by semantic topic-guided ***, a 3D convolutional neural network is utilized to extract the spatiotemporal features of videos during the ***, the semantic topics of video data are extracted using the visual labels retrieved from similar video *** the decoding, a decoder is constructed by combining a novel Enhance-TopK sampling algorithm with a Generative Pre-trained Transformer-2 deep neural network, which decreases the influence of "deviation" in the semantic mapping process between videos and texts by jointly decoding a baseline and semantic topics of video *** this process, the designed Enhance-TopK sampling algorithm can alleviate a long-tail problem by dynamically adjusting the probability distribution of the predicted ***, the experiments are conducted on two publicly used Microsoft Research Video Description and Microsoft Research-Video to Text *** experimental results demonstrate that the proposed method outperforms several state-of-the-art ***, the performance indicators Bilingual Evaluation Understudy, Metric for Evaluation of Translation with Explicit Ordering, Recall Oriented Understudy for Gisting Evaluation-longest common subsequence, and Consensus-based Image Description Evaluation of the proposed method are improved by 1.2%, 0.1%, 0.3%, and 2.4% on the Microsoft Research Video Description dataset, and 0.1%, 1.0%, 0.1%, and 2.8% on the Microsoft Research-Video to Text dataset.
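For context, the standard top-k sampling step that a method such as Enhance-TopK would build on can be sketched as follows. This is plain top-k filtering, not the paper's Enhance-TopK algorithm, whose dynamic adjustment rule is not described in the abstract. The function name `top_k_filter` and the toy distribution are assumptions made for illustration.

```python
import random


def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero out the rest, and renormalize."""
    if not 1 <= k <= len(probs):
        raise ValueError("k must be between 1 and len(probs)")
    # Indices of the k largest probabilities (ties broken by lower index).
    ranked = sorted(range(len(probs)), key=lambda i: (-probs[i], i))
    keep = set(ranked[:k])
    total = sum(probs[i] for i in keep)
    return [p / total if i in keep else 0.0 for i, p in enumerate(probs)]


# Toy next-token distribution over a 4-token vocabulary (made-up numbers).
filtered = top_k_filter([0.5, 0.3, 0.1, 0.1], k=2)

# Draw one token index from the truncated distribution.
token = random.choices(range(len(filtered)), weights=filtered)[0]
```

Because tokens outside the top k receive zero probability, rare but valid words can never be sampled, which is one way a long-tail problem can arise in plain top-k decoding.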
In a crowd density estimation dataset, the annotation of crowd locations is an extremely laborious task, and they are not taken into the evaluation *** this paper, we aim to reduce the annotation cost of crowd datasets, and propose a crowd density estimation method based on weakly-supervised learning, in the absence of crowd position supervision information, which directly reduces the number of crowds by using the number of pedestrians in the image as the supervised *** this purpose, we design a new training method, which exploits the correlation between global and local image features by incremental learning to train the ***, we design a parent-child network (PC-Net) focusing on the global and local image respectively, and propose a linear feature calibration structure to train the PC-Net simultaneously. The child network learns feature transfer factors and feature bias weights, and uses them to linearly calibrate the features extracted from the parent network, improving the convergence of the network by using local features hidden in the crowd *** addition, we use the pyramid vision transformer as the backbone of the PC-Net to extract crowd features at different levels, and design a global-local feature loss function (L2). We combine it with a crowd counting loss (LC) to enhance the sensitivity of the network to crowd features during the training process, which effectively improves the accuracy of crowd density *** experimental results show that the PC-Net significantly reduces the gap between fully-supervised and weakly-supervised crowd density estimation, and outperforms the comparison methods on five datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50, UCF_QNRF, and JHU-CROWD++.
The slow development of traditional computing has prompted the search for new materials to replace silicon-based computers. Bio-computers, which use molecules as the basis of computation, are highly parallel and information capable, attracting a lot of attention. In this study, we designed a NAND logic gate based on the DNA strand displacement mechanism. We assembled a molecular calculation model, a 4-wire-2-wire priority encoder logic circuit, by cascading the proposed NAND gates. Different concentrations of input DNA chains were added into the system, resulting in corresponding output, through DNA hybridization and strand displacement. Therefore, it achieved the function of a priority encoder. Simulation results verify the effectiveness and accuracy of the molecular NAND logic gate and the priority coding system presented in this study. The unique point of this proposed circuit is that we cascaded only one kind of logic gate, which provides a beneficial exploration for the subsequent development of complex DNA cascade circuits and the realization of the logical coding function of information.
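The idea of cascading a single gate type into a 4-input, 2-output priority encoder can be checked at the Boolean level with a short sketch. It models ideal 0/1 logic only, not DNA concentrations or strand displacement kinetics. The wiring shown is one standard NAND-only realization and is an assumption, since the abstract does not give the paper's circuit topology. The names `nand` and `priority_encoder` are hypothetical.

```python
from itertools import product


def nand(a, b):
    """Two-input NAND on 0/1 values; the only primitive gate used below."""
    return 1 - (a & b)


def priority_encoder(i3, i2, i1, i0):
    """4-to-2 priority encoder built only from NAND gates (i3 has top priority).

    Returns (y1, y0), the binary index of the highest active input.
    With no active input the output is (0, 0); i0 therefore never changes
    the code and would only matter for a separate 'valid' flag, omitted here.
    """
    n3 = nand(i3, i3)                # NOT i3
    n2 = nand(i2, i2)                # NOT i2
    y1 = nand(n3, n2)                # i3 OR i2
    y0 = nand(n3, nand(n2, i1))      # i3 OR ((NOT i2) AND i1)
    return y1, y0


def reference(i3, i2, i1, i0):
    """Plain-Python reference: index of the highest set input, else 0."""
    for index, bit in ((3, i3), (2, i2), (1, i1)):
        if bit:
            return index >> 1, index & 1
    return 0, 0


# Exhaustively compare the NAND circuit with the reference on all 16 inputs.
all_match = all(
    priority_encoder(*bits) == reference(*bits)
    for bits in product((0, 1), repeat=4)
)
```

This realization uses five NAND gates in total: two as inverters and three to form the outputs.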
This paper examines fault-tolerant quantized control for neural networks under persistent dwell-time switching, considering the presence of actuator faults and dynamic output quantization. The dynamic scaling factor (...
WiFi-based gait recognition technologies have seen significant advancements in recent years. However, most existing approaches rely on a critical assumption: users must walk continuously and maintain a consistent body...
Images obtained from hyperspectral sensors provide information about the target area that extends beyond the visible portions of the electromagnetic ***, due to sensor limitations and imperfections during the image acquisition and transmission phases, noise is introduced into the acquired image, which can have a negative impact on downstream analyses such as classification, target tracking, and spectral *** in hyperspectral images (HSI) is modelled as a combination from several sources, including Gaussian/impulse noise, stripes, and *** HSI restoration method for such a mixed noise model is ***, a joint optimisation framework is proposed for recovering hyperspectral data corrupted by mixed Gaussian-impulse noise by estimating both the clean data as well as the sparse/impulse noise ***, a hyper-Laplacian prior is used along both the spatial and spectral dimensions to express sparsity in clean image ***, to model the sparse nature of impulse noise, an ℓ1-norm over the impulse noise gradient is *** the proposed methodology employs two distinct priors, the authors refer to it as the hyperspectral dual prior (HySpDualP) *** the best of the authors' knowledge, this joint optimisation framework is the first attempt in this *** handle the non-smooth and nonconvex nature of the general ℓp-norm-based regularisation term, a generalised shrinkage/thresholding (GST) solver is ***, an efficient split-Bregman approach is used to solve the resulting optimisation *** results on synthetic data and a real HSI datacube obtained from hyperspectral sensors demonstrate that the authors' proposed model outperforms state-of-the-art methods, both visually and in terms of various image quality assessment metrics.
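As a point of reference for the shrinkage/thresholding step, the soft-thresholding operator below is the well-known closed-form solution of the scalar ℓ1-regularised least-squares problem, min over x of 0.5 (x - y)^2 + lam |x|. It is shown only as the convex p = 1 case. The GST solver for general ℓp-norms mentioned in the abstract is not reproduced here. The names `soft_threshold` and `lam` are hypothetical.

```python
import math


def soft_threshold(y, lam):
    """Scalar soft-thresholding: shrink y towards zero by lam, clipping at zero."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    # copysign restores the sign of y after shrinking its magnitude.
    return math.copysign(max(abs(y) - lam, 0.0), y)


def soft_threshold_all(values, lam):
    """Apply soft-thresholding elementwise, e.g. to a flattened gradient image."""
    return [soft_threshold(v, lam) for v in values]


# Small entries are set to zero; large ones are shrunk by lam (made-up inputs).
shrunk = soft_threshold_all([3.0, -0.5, -3.0, 0.25], lam=1.0)
```

In split-Bregman schemes, an elementwise operator of this kind typically solves the sub-problem for each auxiliary variable in closed form, while the remaining quadratic sub-problem is handled separately.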
In the evolving landscape of surveillance and security applications, the task of person re-identification (re-ID) has significant importance, but also presents notable difficulties. This task entails accurately matching and identifying persons across several camera views that do not overlap with one another. This is of utmost importance to video surveillance, public safety, and person-tracking applications. However, vision-related difficulties, such as variations in appearance, occlusions, viewpoint changes, cloth changes, scalability, limited robustness to environmental factors, and lack of generalization, still hinder the development of reliable person re-ID methods. A few approaches that address these difficulties have been developed using traditional deep-learning techniques. Nevertheless, transformer-based methods have recently gained widespread adoption in various domains owing to their unique architectural properties. Recently, a few transformer-based person re-ID methods have been developed to address these difficulties and have achieved good results. To develop reliable solutions for person re-ID, a comprehensive analysis of transformer-based methods is necessary. However, few studies consider transformer-based techniques for further investigation. This review surveys recent literature on transformer-based approaches, examining their effectiveness, advantages, and potential challenges. This review is the first of its kind to provide insights into the revolutionary transformer-based methodologies used to tackle many obstacles in person re-ID, providing a forward-thinking outlook on current research and potentially guiding the creation of viable applications in real-world scenarios. The main objective is to provide a useful resource for academics and practitioners engaged in person re-ID. IEEE