Wafer map defect pattern (WMDP) recognition is critical to evaluate the semiconductor manufacturing process. Traditional WMDP recognition algorithms, such as deep learning-based methods, only focus on improving the ac...
Data-driven learning models have demonstrated strong benefits in capturing subtle facial movements for micro-expression recognition (MER), but are limited by the available data. Generative models can generate a variety of new data, but are typically computationally prohibitive compared to efficient Mixup-like methods. In this paper, we propose a novel Facial Micro-Motion-Aware Mixup approach for MER, namely MEMix. MEMix constructs a micro-motion-aware mask to select the most salient facial motions and generates a new sample with a mixed motion feature. This mixed motion feature can effectively expand the data distribution, leading to smoother decision boundaries for MER models. To demonstrate the generality of MEMix, we integrate it with three advanced vision transformer-based models. The results show that the three integrated models consistently achieve performance improvements ranging from 4.07% to 7.32% in accuracy and from 6.54% to 9.18% in F1-score. To further explore the ability of MEMix, we also propose a two-stream network called MixMeFormer, which unlocks the potential of the transformer by simply integrating mixed motion features with facial semantics for MER. Extensive experiments demonstrate that MixMeFormer outperforms other state-of-the-art methods on three well-known micro-expression datasets.
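The abstract does not spell out how the micro-motion-aware mask is built, so the following is only a minimal sketch of a generic mask-guided mixup, not the MEMix implementation. It assumes the motion feature is a small 2D grid of motion values, that "salient" means largest absolute motion, and that labels are mixed by the mask's area ratio as in CutMix-style methods. The names `motion_mask`, `mix_motion`, and `keep_ratio` are hypothetical.

```python
def motion_mask(motion, keep_ratio):
    """Binary mask over a 2D motion grid (hypothetical helper).

    Marks the cells whose absolute motion is among the largest
    keep_ratio fraction. Ties at the threshold are all kept, so the
    mask may cover slightly more than keep_ratio of the cells.
    """
    magnitudes = sorted((abs(v) for row in motion for v in row), reverse=True)
    k = max(1, round(keep_ratio * len(magnitudes)))
    threshold = magnitudes[k - 1]
    return [[1 if abs(v) >= threshold else 0 for v in row] for row in motion]


def mix_motion(motion_a, motion_b, keep_ratio=0.5):
    """Mix two motion grids of equal shape (assumed, not from the paper).

    Cells selected by the mask come from sample A, the rest from
    sample B. Returns the mixed grid and lam, the fraction of cells
    taken from A, which would weight the label as
    lam * y_a + (1 - lam) * y_b.
    """
    mask = motion_mask(motion_a, keep_ratio)
    mixed = [
        [a if m else b for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(motion_a, motion_b, mask)
    ]
    total = sum(len(row) for row in mask)
    lam = sum(sum(row) for row in mask) / total
    return mixed, lam


if __name__ == "__main__":
    a = [[0.9, 0.1], [0.0, -0.8]]   # toy motion grid for sample A
    b = [[0.2, 0.3], [0.4, 0.5]]    # toy motion grid for sample B
    mixed, lam = mix_motion(a, b, keep_ratio=0.5)
    print(mixed, lam)
```

In this toy setting the two strongest motions of sample A are kept and the remaining cells are filled from sample B; a real pipeline would operate on optical-flow or learned motion features rather than hand-written grids.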
Tissue P systems are computational models inspired by the way of biochemical substance movement/exchange between two cells or between a cell and the environment, where all communication (symport/antiport) rules used i...
This paper aims to enhance the efficiency of teaching and learning by leveraging computer vision technology for automated analysis of student behavior in the classroom. We propose a feature-enhanced method for recogni...
ISBN: 9781509066315 (digital)
ISBN: 9781509066322 (print)
Vehicle re-identification (re-id) is challenging due to the small inter-class distance. The differences between similar vehicles can be extremely subtle and are only captured at particular scales and semantic levels. In this paper, we propose a novel Multi-Scale Deep Feature Fusion Network (MSDeep) that combines multi-scale and multi-level features for precise vehicle re-id. Built on a backbone deep CNN, MSDeep mainly consists of two modules: 1) a Multi-Scale Fusion (MSF) Block, which encapsulates a combination of multi-scale streams as an MSF feature; 2) a Multi-Level Fusion (MLF) Block, which fuses MSF features of multiple levels to build the final descriptor. Importantly, in MSF, Multi-Scale Attention (MSA) is introduced to dynamically emphasize the important channels of each scale, and Level-Wise Attention (LWA) is utilized in MLF to determine the weighting of each MSF feature at different levels. Experiments show that, with its abundant and hierarchical hyper-descriptors, MSDeep outperforms state-of-the-art algorithms on the challenging VeRi and VehicleID benchmarks.
Deep convolutional neural networks have made significant breakthroughs in medical image classification, under the assumption that training samples from all classes are simultaneously available. However, in real-world ...
Identifying full names/abbreviations for entities is a challenging problem in many applications, e.g., question answering and information retrieval. In this paper, we propose a general extraction method of extracting f...
ISBN: 9781467307734; 9781467307758 (print)
Complex orthogonal designs (CODs) are used to construct space-time block codes in wireless transmission. A COD $\mathcal{O}_z$ with parameter $[p, n, k]$ is a $p \times n$ matrix whose nonzero entries are $\pm z_i$ or $\pm z_i^*$, $i = 1, 2, \ldots, k$, such that $\mathcal{O}_z^H \mathcal{O}_z = (|z_1|^2 + \cdots + |z_k|^2) I_n$. In practice, $n$ is the number of antennas, $k/p$ the code rate, and $p$ the decoding delay. One fundamental problem is to construct CODs that maximize $k/p$ and minimize $p$ for a given $n$. Recently, this problem was completely solved by Liang and Adams et al. It was proved that when $n = 2m$ or $2m - 1$, the maximal possible rate is $(m + 1)/(2m)$ and the minimum delay is $\binom{2m}{m-1}$ (with the only exception $n \equiv 2 \pmod 4$, where it is $2\binom{2m}{m-1}$). However, as the number of antennas increases, the minimum delay grows quickly and erodes the otherwise fast decoding. For example, when $n = 14$ the minimal delay for a code with maximal rate is 6006. It is therefore important to study whether the decoding delay can be shortened considerably by lowering the rate slightly. In this paper, we demonstrate this possibility by constructing a series of CODs with parameter $[p, n, k] = \left[\binom{n}{w-1} + \binom{n}{w+1},\, n,\, \binom{n}{w}\right]$, where $0 \le w \le n$. In addition, all optimal CODs, which achieve the maximal rate and minimal delay, are contained in our explicit-form constructions. This is the first explicit-form construction; previous constructions are recursive or algorithmic.
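The arithmetic in this abstract can be checked with a few lines of Python. The sketch below only evaluates the formulas quoted above; it does not construct any design. The function names `optimal_cod` and `constructed_cod` are hypothetical, and out-of-range binomial coefficients (for example at $w = 0$) are assumed to be zero.

```python
from fractions import Fraction
from math import comb


def binom(n, r):
    """Binomial coefficient, taken to be 0 outside 0 <= r <= n (assumption)."""
    return comb(n, r) if 0 <= r <= n else 0


def optimal_cod(n):
    """Maximal rate and minimal delay quoted in the abstract for n antennas."""
    m = (n + 1) // 2                 # n = 2m or n = 2m - 1
    rate = Fraction(m + 1, 2 * m)
    delay = binom(2 * m, m - 1)
    if n % 4 == 2:                   # the stated exception doubles the delay
        delay *= 2
    return rate, delay


def constructed_cod(n, w):
    """Parameters [p, n, k] of the family described in the abstract."""
    p = binom(n, w - 1) + binom(n, w + 1)
    k = binom(n, w)
    return p, n, k


if __name__ == "__main__":
    print(optimal_cod(14))           # rate 4/7 with delay 6006
    for w in range(15):
        p, n, k = constructed_cod(14, w)
        print(w, p, k, Fraction(k, p))
```

For $n = 14$ the family member with $w = 7$ reproduces the optimal parameters (delay 6006, rate 4/7), while $w = 5$ gives delay 4004 at rate 1/2, which illustrates the rate-for-delay trade the abstract describes.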
1 Introduction
With the sharply rising number of urban vehicles over the past few decades, traffic jams and safety have gradually become outstanding *** recent years, the Vehicular Ad hoc Network (VANET) has emerged and been utilized to solve traffic related *** is a type of self-organized and open-structured network, and provides Vehicle-to-Everything (V2X) *** all kinds of applications in VANET rely on efficient data transmission and interaction [1-3].
Current speech large language models build upon discrete speech representations, which can be categorized into semantic tokens and acoustic tokens. However, existing speech tokens are not specifically designed for spe...