ISBN (print): 9798400701085
Dynamic facial expression recognition (DFER) is essential to the development of intelligent and empathetic machines. Prior efforts in this field mainly fall into the supervised learning paradigm, which is severely restricted by the limited labeled data in existing datasets. Inspired by the recent unprecedented success of masked autoencoders (e.g., VideoMAE), this paper proposes MAE-DFER, a novel self-supervised method which leverages large-scale self-supervised pre-training on abundant unlabeled data to largely advance the development of DFER. Since the vanilla Vision Transformer (ViT) employed in VideoMAE requires substantial computation during fine-tuning, MAE-DFER develops an efficient local-global interaction Transformer (LGI-Former) as the encoder. Moreover, in addition to the standalone appearance content reconstruction in VideoMAE, MAE-DFER also introduces explicit temporal facial motion modeling to encourage LGI-Former to excavate both static appearance and dynamic motion information. Extensive experiments on six datasets show that MAE-DFER consistently outperforms state-of-the-art supervised methods by significant margins (e.g., +6.30% UAR on DFEW and +8.34% UAR on MAFW), verifying that it can learn powerful dynamic facial representations via large-scale self-supervised pre-training. Besides, it has comparable or even better performance than VideoMAE, while largely reducing the computational cost (requiring only about 38% of the FLOPs). We believe MAE-DFER has paved a new way for the advancement of DFER and can inspire more relevant research in this field and even other related tasks. Codes and models are publicly available at https://***/sunlicai/MAE-DFER.
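The joint objective described above (reconstructing masked appearance content together with temporal facial motion) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the LGI-Former encoder and decoder are omitted, and the tensor shapes, frame-difference motion target, and equal loss weighting are assumptions.

```python
import torch
import torch.nn as nn

def dual_reconstruction_loss(pred_appearance, pred_motion, frames, mask):
    """frames: (B, T, N, D) patch pixels per frame; mask: (B, T*N) bool, True = masked."""
    B, T, N, D = frames.shape
    appearance_target = frames.reshape(B, T * N, D)
    # Temporal motion target: difference between consecutive frames
    # (the last frame has no successor, so its predecessor's difference is repeated).
    diff = frames[:, 1:] - frames[:, :-1]                          # (B, T-1, N, D)
    motion_target = torch.cat([diff, diff[:, -1:]], dim=1).reshape(B, T * N, D)

    mse = nn.MSELoss(reduction="none")
    loss_app = mse(pred_appearance, appearance_target).mean(-1)    # (B, T*N)
    loss_mot = mse(pred_motion, motion_target).mean(-1)
    # Average only over masked tokens, as in masked autoencoding.
    denom = mask.float().sum().clamp(min=1.0)
    return ((loss_app + loss_mot) * mask.float()).sum() / denom

# e.g. 2 clips, 8 frames, 196 patches of 768-dim pixels, 90% masking (illustrative numbers)
frames = torch.rand(2, 8, 196, 768)
mask = torch.rand(2, 8 * 196) < 0.9
loss = dual_reconstruction_loss(torch.rand(2, 8 * 196, 768), torch.rand(2, 8 * 196, 768), frames, mask)
```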
Malware traffic classification (MTC) is one of the promising methods for ensuring cybersecurity; it involves identifying and categorizing network traffic to distinguish between benign and malicious activity. Tradi...
Accurate cancer survival prediction enables clinicians to tailor treatment regimens based on individual patient prognoses, effectively mitigating over-treatment and inefficient medical resource allocation. Recently, t...
It is difficult to establish a classification and recognition model for machinery and equipment based on labeled samples in an actual industrial environment because of incomplete fault modes and missing data. To solve this problem, a semisupervised anomaly detection method based on masked autoencoders for distribution estimation (MADE) is designed. First, the Mel-frequency cepstrum coefficient (MFCC) is employed to extract fault features from vibration signals of rolling bearings. Then, a group of mask matrices is applied to each hidden layer to overcome the perfect-reconstruction problem of autoencoders, and the full reconstruction probability, rather than the reconstruction error, is adopted as the anomaly score. Finally, the diagnostic threshold is determined according to the Youden index. Experimental results show that the MADE method can extract fault-sensitive features in a noisy industrial environment, and that introducing the mask matrices renders the network autoregressive, thus solving the perfect-reconstruction problem of autoencoders. On three rolling bearing datasets, the accuracy, precision, recall, and F1-score of the proposed method all reach 100%. Moreover, the accuracy of the proposed method is 17.19% higher than that of the memory-inhibition method on the rolling bearing dataset provided by the Center for Intelligent Maintenance Systems (IMS) at the University of Cincinnati (USA). The proposed method also achieves higher accuracy than other state-of-the-art anomaly detection methods.
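A minimal sketch of the MADE idea referenced above: binary mask matrices on the hidden layer make the network autoregressive, so it outputs per-dimension conditional densities and the negative log-likelihood can serve as the anomaly score. The Gaussian output head, layer sizes, and deterministic degree assignment are assumptions for brevity; in the described pipeline, the diagnostic threshold on this score would then be chosen via the Youden index.

```python
import math
import torch
import torch.nn as nn

class MaskedLinear(nn.Linear):
    def __init__(self, in_features, out_features, mask):
        super().__init__(in_features, out_features)
        self.register_buffer("mask", mask)                 # (out_features, in_features)

    def forward(self, x):
        return nn.functional.linear(x, self.weight * self.mask, self.bias)

def build_made(n_dims, n_hidden):
    # Degrees: input dimensions get 1..D, hidden units cycle through 1..D-1.
    deg_in = torch.arange(1, n_dims + 1)
    deg_h = torch.arange(n_hidden) % (n_dims - 1) + 1
    mask_h = (deg_h[:, None] >= deg_in[None, :]).float()   # hidden unit sees inputs up to its degree
    mask_out = (deg_in[:, None] > deg_h[None, :]).float()  # output d only sees degrees < d
    return (MaskedLinear(n_dims, n_hidden, mask_h),
            MaskedLinear(n_hidden, n_dims, mask_out),      # per-dimension mean
            MaskedLinear(n_hidden, n_dims, mask_out))      # per-dimension log-variance

def anomaly_score(x, hidden, mean_head, logvar_head):
    """Gaussian negative log-likelihood per sample; higher means more anomalous."""
    h = torch.relu(hidden(x))
    mu, logvar = mean_head(h), logvar_head(h)
    nll = 0.5 * (logvar + (x - mu) ** 2 / logvar.exp() + math.log(2 * math.pi))
    return nll.sum(dim=-1)

hidden, mean_head, logvar_head = build_made(n_dims=13, n_hidden=64)   # e.g. 13 MFCCs per frame
scores = anomaly_score(torch.randn(32, 13), hidden, mean_head, logvar_head)
```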
Accurate hyperspectral remote sensing information is essential for feature identification and detection. Nevertheless, the hyperspectral imaging mechanism poses challenges in balancing the trade-off between spatial and spectral resolution. Hardware improvements are cost-intensive and depend on strict environmental conditions and extra equipment. Recent spectral imaging methods have attempted to directly reconstruct hyperspectral information from widely available multispectral images. However, the fixed mapping approaches used in previous spectral reconstruction models limit their reconstruction quality and generalizability, especially when dealing with missing or contaminated bands. Moreover, data-hungry issues plague increasingly complex data-driven spectral reconstruction methods. This paper proposes SpectralMAE, a novel spectral reconstruction model that can take arbitrary combinations of bands as input and improve the utilization of data sources. In contrast to previous spectral reconstruction techniques, SpectralMAE explores the application of a self-supervised learning paradigm and proposes a masked autoencoder architecture for the spectral dimension. To further enhance the performance for specific sensor inputs, we propose a training strategy that combines random-masking pre-training with fixed-masking fine-tuning. Empirical evaluations on five remote sensing datasets demonstrate that SpectralMAE outperforms state-of-the-art methods in both qualitative and quantitative metrics.
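The band-masking strategy described above (random masking for pre-training, fixed masking for sensor-specific fine-tuning) can be illustrated as below. The helper names, band counts, and the convention that True marks a masked band are assumptions; the transformer encoder and decoder are omitted.

```python
import torch

def random_band_mask(n_bands, mask_ratio, batch_size):
    """Pre-training: drop a random subset of spectral bands per sample."""
    n_keep = max(1, int(n_bands * (1 - mask_ratio)))
    scores = torch.rand(batch_size, n_bands)
    keep_idx = scores.argsort(dim=1)[:, :n_keep]
    mask = torch.ones(batch_size, n_bands, dtype=torch.bool)
    mask.scatter_(1, keep_idx, False)              # True = masked (to be reconstructed)
    return mask

def fixed_band_mask(n_bands, available_bands, batch_size):
    """Fine-tuning: mask exactly the bands the target sensor does not provide."""
    mask = torch.ones(batch_size, n_bands, dtype=torch.bool)
    mask[:, available_bands] = False
    return mask

# e.g. reconstruct a 31-band hyperspectral cube from 4 multispectral bands (illustrative indices)
pretrain_mask = random_band_mask(31, mask_ratio=0.75, batch_size=8)
finetune_mask = fixed_band_mask(31, available_bands=[2, 9, 17, 26], batch_size=8)
```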
Self-supervised learning is attracting considerable attention in point cloud understanding. However, learning discriminative and transferable features remains challenging due to the irregular nature of point clouds. We propose a geometrically and adaptively masked autoencoder on point clouds for self-supervised learning, termed PointGame. PointGame contains two core components: GATE and EAT. GATE stands for the geometrical and adaptive token embedding module; it not only absorbs the conventional wisdom of geometric descriptors that capture surface shape effectively, but also exploits adaptive saliency to focus on the salient parts of a point cloud. EAT stands for the external attention-based transformer encoder with linear computational complexity, which increases the efficiency of the whole pipeline. Unlike cutting-edge unsupervised learning models, PointGame leverages geometric descriptors to perceive surface shapes and adaptively mines discriminative features from training data. PointGame shows clear advantages over its competitors on various downstream tasks under both global and local fine-tuning strategies. The code and pretrained models will be publicly available.
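As a rough illustration of the linear-complexity attention mentioned for EAT, here is a generic external-attention block: tokens attend to a small learnable external memory instead of to each other, so the cost grows linearly with the number of point tokens. The memory size and double-normalization scheme follow the generic external-attention formulation and are assumptions about this particular model.

```python
import torch
import torch.nn as nn

class ExternalAttention(nn.Module):
    def __init__(self, dim, memory_size=64):
        super().__init__()
        self.mk = nn.Linear(dim, memory_size, bias=False)     # project tokens onto the key memory
        self.mv = nn.Linear(memory_size, dim, bias=False)     # read back from the value memory

    def forward(self, x):                                     # x: (B, N, dim)
        attn = self.mk(x)                                     # (B, N, S)
        attn = attn.softmax(dim=1)                            # normalize over tokens
        attn = attn / (attn.sum(dim=2, keepdim=True) + 1e-9)  # l1-normalize over memory slots
        return self.mv(attn)                                  # (B, N, dim)

tokens = torch.randn(2, 1024, 256)                            # 1024 point-patch tokens (illustrative)
out = ExternalAttention(256)(tokens)
print(out.shape)                                              # torch.Size([2, 1024, 256])
```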
Convolutional neural networks (CNNs) may not be ideal for extracting global temporal features from nonstationary electroencephalogram (EEG) signals. The application of masking-based methods in EEG classification is not well studied, and there is a shortage of commonly accepted models for verifying inter-individual results in motor imagery classification tasks. The MAE-EEG-Transformer, a transformer with a masking mechanism, is proposed in this article. It pre-trains by randomly masking signals, forcing the model to learn semantic features. The pre-trained encoder module is then fine-tuned and transferred to the classification task to obtain the category of EEG signals. The effectiveness of features with and without pre-training is compared using t-SNE visualization to demonstrate the inter-subject efficacy of pre-training. The MAE-EEG-Transformer was extensively evaluated across three prevalent EEG-based motor imagery datasets, demonstrating performance comparable to state-of-the-art models while requiring only approximately 20% of the computational cost (results in Tables 1-4).
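The two-stage flow described above (mask-and-reconstruct pre-training, then fine-tuning the encoder with a classification head) might look roughly like this. The tiny encoder, patch size, masking ratio, and head dimensions are illustrative stand-ins, not the model's actual configuration.

```python
import torch
import torch.nn as nn

class TinyEEGEncoder(nn.Module):
    def __init__(self, patch_dim=64, d_model=128, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(patch_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, patches):                     # (B, N, patch_dim)
        return self.encoder(self.embed(patches))

def pretrain_step(encoder, decoder, patches, mask_ratio=0.5):
    """Randomly mask EEG patches and reconstruct them (self-supervised)."""
    masked = patches.clone()
    mask = torch.rand(patches.shape[:2]) < mask_ratio
    masked[mask] = 0.0                              # zero out masked patches
    recon = decoder(encoder(masked))                # (B, N, patch_dim)
    return ((recon - patches) ** 2)[mask].mean()

def finetune_logits(encoder, head, patches):
    """Reuse the pre-trained encoder; mean-pool tokens and classify."""
    return head(encoder(patches).mean(dim=1))

encoder = TinyEEGEncoder()
decoder = nn.Linear(128, 64)                        # lightweight reconstruction head
head = nn.Linear(128, 4)                            # e.g. four motor-imagery classes
loss = pretrain_step(encoder, decoder, torch.randn(8, 50, 64))
```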
High-quality data is essential for effective operation and maintenance of wind farms. However, data missing is a persistent issue in the supervisory control and data acquisition (SCADA) system, which seriously affects data quality. To tackle two limitations of current missing data imputation methods, namely the gap between training tasks and imputation tasks and the inadequate extraction of correlations within SCADA data, this work proposes a data-driven framework named multiscale-attention masked autoencoder (MAMAE) for missing data imputation of wind turbines. MAMAE employs masked autoencoding as a self-supervised training method, bridging the gap between the training and imputation tasks. Additionally, considering the importance of correlations for imputing SCADA data, a multiscale attention architecture built upon the transformer is employed. Comprising four transformer stages, each applying attention mechanisms at a distinct scale, the multiscale attention efficiently extracts feature, turbine, and temporal correlations. To mitigate the large computational cost caused by the increased sequence length at different scales, localized attention is implemented within shifted windows, reducing the computational complexity from quadratic to linear in the sequence length. Furthermore, a turbine correlation-based feature combination method is proposed to coordinate with the multiscale attention and introduce turbine correlations into the imputation process. Experiments were conducted on a SCADA dataset collected from a real-world wind farm. The results show that the proposed method achieves higher accuracy than existing methods in most cases (especially in cases of band missing and feature missing), and the ablation experiments verify the effectiveness of each proposed modification in improving accuracy or efficiency.
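The shifted-window mechanism mentioned above, which keeps attention linear in the sequence length, can be sketched as follows: attention is restricted to fixed-size windows, and alternating blocks shift the windows by half their size so information can cross window borders. Window size, shift, and tensor shapes are illustrative assumptions.

```python
import torch

def window_partition(x, window_size, shift=0):
    """x: (B, L, D) SCADA token sequence -> (B * L//window_size, window_size, D)."""
    B, L, D = x.shape
    if shift:
        x = torch.roll(x, shifts=-shift, dims=1)   # shifted windows in alternating blocks
    assert L % window_size == 0, "pad the sequence to a multiple of the window size"
    return x.reshape(B * (L // window_size), window_size, D)

def window_merge(windows, batch_size, shift=0):
    x = windows.reshape(batch_size, -1, windows.shape[-1])
    if shift:
        x = torch.roll(x, shifts=shift, dims=1)
    return x

# Attention inside each window costs O(window_size^2) and there are L/window_size windows,
# so the total cost grows linearly with L instead of quadratically.
tokens = torch.randn(4, 288, 96)                   # e.g. one day of 5-min SCADA samples (illustrative)
wins = window_partition(tokens, window_size=48, shift=24)
restored = window_merge(wins, batch_size=4, shift=24)
assert torch.allclose(restored, tokens)
```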
Pansharpening requires the fusion of a low-spatial-resolution multispectral (LRMS) image and a panchromatic (PAN) image with rich spatial details to obtain a high-spatial-resolution multispectral (HRMS) image. Recently, deep learning (DL)-based models have been proposed to tackle this problem and have made considerable progress. However, most existing methods rely on the conventional observation model, which treats LRMS as a blurred and downsampled version of HRMS. This observation model may lead to unsatisfactory performance and limited generalization ability in full-resolution evaluation, resulting in severe spectral and spatial distortion: while DL-based models show significant improvement over traditional models in reduced-resolution evaluation, their performance deteriorates significantly at full resolution. In this article, we rethink the observation model, present a novel perspective from HRMS to LRMS, and propose a pixel-wise ensembled masked autoencoder (PEMAE) to restore HRMS. Specifically, we consider LRMS as the result of pixel-wise masking on HRMS. Thus, LRMS can be seen as a natural input of a masked autoencoder. By ensembling the reconstruction results of multiple masking patterns, PEMAE obtains an HRMS image with both the spectral information of LRMS and the spatial details of PAN. In addition, we employ a linear cross-attention mechanism to replace regular self-attention, reducing the computation to linear time complexity. Extensive experiments demonstrate that PEMAE outperforms state-of-the-art (SOTA) methods in terms of quantitative and visual performance at both reduced- and full-resolution evaluations. The codes are available at https://***/yc-cui/PEMAE.
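The "LRMS as pixel-wise masked HRMS" view and the ensembling over masking patterns can be sketched as below; the reconstruction network itself is stubbed out, and the scale factor and offsets are assumptions for illustration.

```python
import torch

def place_lrms_on_hrms_grid(lrms, scale, offset):
    """Embed an LRMS image into an HRMS-sized canvas, filling one pixel per
    scale x scale block at the given (row, col) offset; the rest stay masked (0)."""
    B, C, h, w = lrms.shape
    canvas = torch.zeros(B, C, h * scale, w * scale)
    canvas[:, :, offset[0]::scale, offset[1]::scale] = lrms
    return canvas

def pixelwise_ensemble(lrms, pan, model, scale=4):
    """Average the reconstructions obtained from several pixel-wise masking patterns."""
    offsets = [(i, j) for i in range(scale) for j in range(scale)]
    preds = [model(place_lrms_on_hrms_grid(lrms, scale, o), pan) for o in offsets]
    return torch.stack(preds).mean(dim=0)

# Stub network: a real model would be a masked autoencoder with linear cross-attention.
model = lambda masked_hrms, pan: masked_hrms + 0.0 * pan.mean()
hrms = pixelwise_ensemble(torch.rand(1, 4, 64, 64), torch.rand(1, 1, 256, 256), model)
print(hrms.shape)   # torch.Size([1, 4, 256, 256])
```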
Identifying the cognitive workload of operators is crucial in complex human-automation collaboration systems. An excessive workload can lead to fatigue or accidents, while an insufficient workload may diminish situational awareness and efficiency. However, existing supervised learning-based methods for workload recognition are ineffective when dealing with imperfect input data, such as missing or noisy data, which is not practical in real applications. This study introduces a robust electroencephalogram (EEG)-enabled cognitive workload recognition model using self-supervised learning. The proposed method, DMAEEG, combines the training strategies of denoising autoencoders and masked autoencoders, demonstrating strong robustness against noisy and incomplete data. More specifically, we adopt the temporal convolutional network and multi-head self-attention mechanisms as the backbone, effectively capturing both the temporal and spatial features of EEG. Extensive experiments are conducted to verify the effectiveness and robustness of the proposed method on an open dataset and a self-collected dataset. The results indicate that DMAEEG outperforms other state-of-the-art methods across various evaluation metrics. Moreover, DMAEEG maintains high accuracy in workload inference even when EEG signals are corrupted with a high masking ratio or strong noise. This signifies its superiority in capturing robust intrinsic patterns from imperfect EEG data. The proposed method significantly contributes to decoding EEG signals for workload recognition in real-world applications, thereby enhancing the safety and reliability of human-automation interactions.
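The combined corruption strategy described above (Gaussian noise as in a denoising autoencoder plus random masking as in a masked autoencoder, with the clean signal as the reconstruction target) might be sketched as follows. The noise level, masking ratio, loss weighting, and the stand-in network are assumptions.

```python
import torch

def corrupt(eeg, mask_ratio=0.5, noise_std=0.1):
    """eeg: (B, C, T) -> corrupted copy plus the boolean mask of hidden samples."""
    noisy = eeg + noise_std * torch.randn_like(eeg)    # denoising-autoencoder corruption
    mask = torch.rand_like(eeg) < mask_ratio           # masked-autoencoder corruption (True = masked)
    noisy[mask] = 0.0
    return noisy, mask

def reconstruction_loss(model, eeg):
    corrupted, mask = corrupt(eeg)
    recon = model(corrupted)
    # Emphasize the masked positions but keep a small denoising term everywhere.
    return ((recon - eeg) ** 2)[mask].mean() + 0.1 * ((recon - eeg) ** 2).mean()

# Stand-in for the TCN + multi-head self-attention backbone described in the abstract.
model = torch.nn.Sequential(
    torch.nn.Flatten(), torch.nn.Linear(8 * 128, 8 * 128), torch.nn.Unflatten(1, (8, 128))
)
loss = reconstruction_loss(model, torch.randn(16, 8, 128))   # 16 segments, 8 channels, 128 samples
```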