Loop-free routing approaches based on distance information combine a local Ordering Condition (LOC) with a distributed reordering computation (DRC). The LOC allows routers to determine whether they can independently c...
With the widespread application of federated learning across various domains, backdoor attacks pose a serious threat to the security of models. In this paper, we propose a text watermarking-based federated learning b...
Motor imagery (MI) decoding methods are pivotal in advancing rehabilitation and motor control research. Effective extraction of spectral-spatial-temporal features is crucial for MI decoding from limited, low signal-to-noise-ratio electroencephalogram (EEG) samples in brain-computer interface (BCI) systems. In this paper, we propose a lightweight Multi-Feature Attention Neural Network (M-FANet) for feature extraction and selection from multi-feature data. M-FANet employs several unique attention modules to eliminate redundant information in the frequency domain, enhance local spatial feature extraction, and calibrate feature maps. We introduce a training method called Regularized Dropout (R-Drop) to address the training-inference inconsistency caused by dropout and to improve the model's generalization capability. We conduct extensive experiments on the BCI Competition IV 2a (BCIC-IV-2a) dataset and the 2019 World Robot Conference Contest-BCI Robot Contest MI (WBCIC-MI) dataset. M-FANet achieves superior performance compared to state-of-the-art MI decoding methods, with 79.28% 4-class classification accuracy (kappa: 0.7259) on the BCIC-IV-2a dataset and 77.86% 3-class classification accuracy (kappa: 0.6650) on the WBCIC-MI dataset. The application of multi-feature attention modules and R-Drop in our lightweight model significantly enhances its performance, as validated through comprehensive ablation experiments and visualizations.
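The abstract does not spell out the exact R-Drop objective adopted here; as a minimal sketch, assuming the generic formulation (two stochastic forward passes through the same dropout-bearing model, tied together by a symmetric KL term on top of the usual cross-entropy), the training loss might look like the following. `model`, `alpha`, and the batch variables are illustrative placeholders, not the paper's code.

```python
import torch
import torch.nn.functional as F

def r_drop_loss(model, x, y, alpha=1.0):
    """Generic R-Drop objective (background sketch, not M-FANet's exact loss).

    Two forward passes through the same model with dropout active produce
    different logits; the loss is the average cross-entropy plus a symmetric
    KL term that penalizes divergence between the two predictive distributions.
    `alpha` weights the consistency term and its default here is hypothetical.
    """
    logits1 = model(x)  # first stochastic pass (one dropout mask)
    logits2 = model(x)  # second stochastic pass (a different dropout mask)

    ce = 0.5 * (F.cross_entropy(logits1, y) + F.cross_entropy(logits2, y))

    log_p1 = F.log_softmax(logits1, dim=-1)
    log_p2 = F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (F.kl_div(log_p1, log_p2, log_target=True, reduction="batchmean")
                + F.kl_div(log_p2, log_p1, log_target=True, reduction="batchmean"))

    return ce + alpha * kl
```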
Recent research in hyperspectral image (HSI) reconstruction mainly emphasizes developing intricate mappings using convolutional neural networks (CNNs). Although CNNs excel at capturing local feature in...
This study aims to enhance the accuracy and generalization of motor imagery-based brain-computer interface (MI-BCI) systems using a novel frequency-based graph convolutional neural network technique called Chebyshev G...
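The method name is cut off above; assuming it refers to the standard Chebyshev graph convolution commonly used as the building block of such networks, that background operation (not this study's exact layer) can be sketched as follows, with `L_norm` the rescaled graph Laplacian and `weights` an illustrative list of per-order filter matrices.

```python
import torch

def cheb_conv(x, L_norm, weights):
    """K-order Chebyshev graph convolution (generic background sketch).

    x:       (num_nodes, in_feats) node features
    L_norm:  (num_nodes, num_nodes) rescaled Laplacian, 2 * L / lambda_max - I
    weights: list of K tensors, each (in_feats, out_feats)

    Uses the Chebyshev recurrence T_0(L) = I, T_1(L) = L,
    T_k(L) = 2 L T_{k-1}(L) - T_{k-2}(L), applied to the feature matrix.
    """
    t_prev, t_curr = x, L_norm @ x          # T_0 x and T_1 x
    out = t_prev @ weights[0]
    if len(weights) > 1:
        out = out + t_curr @ weights[1]
    for k in range(2, len(weights)):
        t_next = 2 * (L_norm @ t_curr) - t_prev
        out = out + t_next @ weights[k]
        t_prev, t_curr = t_curr, t_next
    return out
```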
ISBN (print): 9798350323658
Correspondence selection, a crucial step in many computer vision tasks, aims to distinguish between inliers and outliers from putative correspondences. The coherence of correspondences is often used for predicting inlier probability, but it is difficult for neural networks to extract coherence contexts based only on quadruple coordinates. To overcome this difficulty, we propose enhancing the preliminary features using local and global handcrafted coherent characteristics before model learning, which strengthens the discrimination of each correspondence and guides the model to prune obvious outliers. Furthermore, to fully utilize local information, neighbors are searched in coordinate space as well as feature space. These two kinds of neighbors provide complementary and plentiful contexts for inlier probability prediction. Finally, a novel neighbor representation and a fusion architecture are proposed to retain detailed features. Experiments demonstrate that our method achieves state-of-the-art performance on relative camera pose estimation and correspondence selection metrics on the outdoor YFCC100M [1] and the indoor SUN3D [2] datasets.
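As an illustration of the dual neighborhood search described above, a k-nearest-neighbor lookup in both coordinate space and feature space could be sketched as below; `k`, the tensor shapes, and how the two neighbor sets are later fused are assumptions, not the paper's implementation.

```python
import torch

def dual_space_neighbors(coords, feats, k=8):
    """Find neighbors of each putative correspondence in two spaces (sketch).

    coords: (N, 4) quadruple coordinates of correspondences (x1, y1, x2, y2)
    feats:  (N, C) learned per-correspondence features
    Returns two (N, k) index tensors, one per space; the surrounding network
    would consume both neighborhoods to predict inlier probabilities.
    """
    coord_dist = torch.cdist(coords, coords)   # pairwise distances in coordinate space
    feat_dist = torch.cdist(feats, feats)      # pairwise distances in feature space
    # Take the k+1 smallest distances and drop the first column (each point itself).
    coord_idx = coord_dist.topk(k + 1, largest=False).indices[:, 1:]
    feat_idx = feat_dist.topk(k + 1, largest=False).indices[:, 1:]
    return coord_idx, feat_idx
```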
ISBN (print): 9798350307443
The aim of intrinsic image decomposition (IID) is to recover the reflectance and the shading from a given image. As different combinations are possible, IID is an under-constrained problem. Previous approaches try to constrain the search space using hand-crafted priors. However, these priors are based on strong imaging assumptions and fall short when these do not hold. Deep learning-based methods learn the problem end-to-end from the data, but these networks lack any explicit information about the image formation model. In this paper, an IID transformer approach (IDTransformer) is proposed by learning photometric invariant attention, derived from the image formation model and integrated into the transformer framework. The combination of invariant features in both a global and a local setting allows the network not only to learn reflectance transitions but also to group similar reflectance regions, irrespective of their spatial arrangement. Illumination- and geometry-invariant attention is exploited to generate the reflectance map, while illumination-invariant and geometry-variant attention is used to compute the shading map. Enabling physics-based explicit attention allows the network to be trained on a relatively small dataset. Ablation studies show that adding invariant attention improves performance. Experiments on the Intrinsic In the Wild dataset show results competitive with existing methods. The project page with the code is available at https://***/***/.
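One plausible reading of photometric-invariant attention under a Lambertian image formation model (image = reflectance x shading) is to compute attention weights from chromaticity, which cancels the scalar shading term; the sketch below illustrates that idea only and is not IDTransformer's actual attention module.

```python
import torch

def chromaticity_attention(rgb_tokens, value_tokens, eps=1e-6):
    """Illustrative illumination-invariant attention (assumed formulation).

    rgb_tokens:   (N, 3) mean RGB of each token/patch
    value_tokens: (N, C) token embeddings to be aggregated

    Under I = R * S with scalar shading S, channel-normalized RGB
    (chromaticity) is independent of S, so similarity computed on it responds
    to reflectance changes rather than illumination or geometry.
    """
    chroma = rgb_tokens / (rgb_tokens.sum(dim=-1, keepdim=True) + eps)
    logits = (chroma @ chroma.t()) / chroma.shape[-1] ** 0.5  # scaled similarity
    attn = torch.softmax(logits, dim=-1)
    return attn @ value_tokens
```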
ISBN (print): 9781665493468
Diffeomorphic image registration, offering smooth transformation and topology preservation, is required in many medical image analysis tasks. Traditional methods impose certain modeling constraints on the space of admissible transformations and use optimization to find the optimal transformation between two images. Specifying the right space of admissible transformations is challenging: the registration quality can be poor if the space is too restrictive, while the optimization can be hard to solve if the space is too general. Recent learning-based methods, which utilize deep neural networks to learn the transformation directly, achieve fast inference but face accuracy challenges because small local deformations are difficult to capture and generalization is limited. Here we propose a new optimization-based method named DNVF (Diffeomorphic Image Registration with Neural Velocity Field), which utilizes a deep neural network to model the space of admissible transformations. A multilayer perceptron (MLP) with a sinusoidal activation function represents the continuous velocity field, assigning a velocity vector to every point in space and providing the flexibility to model complex deformations as well as the convenience of optimization. Moreover, we propose a cascaded image registration framework (Cas-DNVF) that combines the benefits of both optimization- and learning-based methods, where a fully convolutional neural network (FCN) is trained to predict the initial deformation, followed by DNVF for further refinement. Experiments on two large-scale 3D MR brain scan datasets demonstrate that our proposed methods significantly outperform state-of-the-art registration methods.
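A sketch of such a neural velocity field, i.e., a sinusoidal-activation MLP mapping a 3D coordinate to a 3D velocity vector (which would then be integrated, e.g. by scaling and squaring, into a diffeomorphic deformation), might look like the following; the layer sizes and frequency scale `w0` are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class NeuralVelocityField(nn.Module):
    """Sinusoidal-activation MLP from 3D coordinates to 3D velocities (sketch)."""

    def __init__(self, hidden=256, layers=4, w0=30.0):
        super().__init__()
        self.w0 = w0  # frequency scale of the sinusoidal activation (assumed value)
        dims = [3] + [hidden] * layers + [3]
        self.linears = nn.ModuleList(
            [nn.Linear(dims[i], dims[i + 1]) for i in range(len(dims) - 1)])

    def forward(self, xyz):
        # xyz: (num_points, 3) spatial coordinates (e.g., a normalized voxel grid)
        h = xyz
        for linear in self.linears[:-1]:
            h = torch.sin(self.w0 * linear(h))  # sinusoidal activation
        return self.linears[-1](h)              # (num_points, 3) velocity vectors
```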
ISBN (print): 9781665493468
Vision transformers have achieved remarkable progress in vision tasks such as image classification and detection. However, in instance-level image retrieval, transformers have not yet shown good performance compared to convolutional networks. We propose a number of improvements that make transformers outperform the state of the art for the first time. (1) We show that a hybrid architecture is more effective than plain transformers, by a large margin. (2) We introduce two branches collecting global (classification token) and local (patch tokens) information, from which we form a global image representation. (3) In each branch, we collect multi-layer features from the transformer encoder, corresponding to skip connections across distant layers. (4) We enhance locality of interactions at the deeper layers of the encoder, which is the relative weakness of vision transformers. We train our model on all commonly used training sets and, for the first time, we make fair comparisons separately per training set. In all cases, we outperform previous models based on global representation. Public code is available at https://***/dealicious-inc/DToP.
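As an illustration of fusing the two branches into a single descriptor, the sketch below takes the classification token (global branch) and mean-pooled patch tokens (local branch) from several encoder layers and projects their concatenation; the pooling choice, layer selection, and `proj` head are assumptions rather than DToP's actual design.

```python
import torch
import torch.nn.functional as F

def global_descriptor(layer_tokens, proj):
    """Aggregate multi-layer global and local transformer features (sketch).

    layer_tokens: list of (B, 1 + num_patches, C) hidden states taken from
                  selected encoder layers (the skip connections mentioned above)
    proj:         a linear head mapping the concatenated features to the final
                  descriptor dimension (a placeholder module)
    """
    cls_feats = [t[:, 0] for t in layer_tokens]                 # global branch
    patch_feats = [t[:, 1:].mean(dim=1) for t in layer_tokens]  # local branch
    fused = torch.cat(cls_feats + patch_feats, dim=-1)
    return F.normalize(proj(fused), dim=-1)                     # L2-normalized descriptor
```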
To address the issue of insufficient extraction of global features in CNN monocular depth estimation networks, leading to scale ambiguity and object edge ambiguity in predicted depth maps, this paper presents a pixel-...