Search results - Inner Mongolia University Library

Towards efficient RGB-T semantic segmentation via feature generative distillation strategy

INFORMATION FUSION, 2025, vol. 123

Authors: Zhao, Shenlu; Wang, Jingyi; Zhang, Qiang; Han, Jungong. Affiliations: Xidian Univ State Key Lab Electromech Integrated Mfg High Perf Xian 710071 Shaanxi Peoples R China; Xidian Univ Ctr Complex Syst Sch Mechanoelect Engn Xian 710071 Shaanxi Peoples R China; Tsinghua Univ Dept Automat Beijing 100084 Peoples R China

Recently, multimodal knowledge distillation-based methods for RGB-T semantic segmentation have been developed to enhance segmentation performance and inference speed. Technically, the crux of these models lies in feature imitative distillation-based strategies, where the student models imitate the working principles of the teacher models through loss functions. Unfortunately, due to the significant gaps in representation capability between the student and teacher models, such feature imitative distillation-based strategies may not achieve the anticipated knowledge transfer performance efficiently. In this paper, we propose a novel feature generative distillation strategy for efficient RGB-T semantic segmentation, embodied in the Feature Generative Distillation-based Network (FGDNet), which includes a teacher model (FGDNet-T) and a student model (FGDNet-S). This strategy bridges the gaps between multimodal feature extraction and complementary information excavation by using a conditional variational auto-encoder (CVAE) to generate teacher features from student features. Additionally, Multimodal Complementarity Separation modules (MCS-L and MCS-H) are introduced to separate complementary features at different levels. Comprehensive experimental results on four public benchmarks demonstrate that, compared with mainstream RGB-T semantic segmentation methods, our FGDNet-S achieves competitive segmentation performance with fewer parameters and lower computational complexity.

Keywords: RGB-T semantic segmentation; Feature imitative distillation; Feature generative distillation; conditional variational auto-encoder

CVAE-LAYOUT: automatic furniture layout with constraints

VISUAL COMPUTER, 2024, vol. 40, no. 11, pp. 7731-7745

Authors: Xuan, Yixin; Song, Chao; Jin, Jianqiu; Yang, Bailin. Affiliations: Zhejiang Univ Polytech Inst Hangzhou 310015 Zhejiang Peoples R China; Zhejiang Gongshang Univ Sch Comp Sci & Technol Hangzhou 310018 Zhejiang Peoples R China

We propose an automatic layout method for indoor scenes that effectively satisfies specific constraints. Our approach involves enhancing the existing scene representation method to accommodate complex constraints, including the precise placement of doors, windows, and user-specified furniture. To achieve this, we construct a conditional vector that encapsulates the necessary constraints. Moreover, our automatically constrained layout approach is implemented by training a conditional variational autoencoder model. Given the constraints and randomly sampled vectors, the decoder module can generate diversified reasonable indoor layout results. Evaluations show that our model outperforms the existing methods. Furthermore, our model exhibits a lower parameter count and faster execution speed compared with the existing approaches.

Keywords: conditional variational auto-encoder; Interior design; automatic layout; Deep learning

Rethinking Inconsistent Context and Imbalanced Regression in Depression Severity Prediction

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2024, vol. 15, no. 4, pp. 2154-2168

Authors: Huang, Guanhe; Li, Jing; Lu, Heli; Guo, Ming; Chen, Shengyong. Affiliations: Nanchang Univ Sch Informat Engn Nanchang 330031 Peoples R China; Tianjin Univ Technol Sch Comp Sci & Engn Tianjin 300384 Peoples R China; Nanchang Univ Affiliated Hosp 2 Dept Psychosomat Med Nanchang 330006 Peoples R China

As one of the world's most prevalent mental illnesses, depression is not easy to detect since it affects different people in different ways. Recently, linguistic features extracted from transcribed texts have been widely explored in depression detection because they contain a variety of cues about psychological activities. However, the detection performance is limited for the following two reasons: 1) the dialogue structure is ignored, which causes the Inconsistent Context problem; and 2) Imbalanced Regression occurs due to the long-tailed distribution of depression datasets. To this end, in this paper we investigate the relationship between the local topic and global context in interview transcripts, and bridge the gap between depression symptoms and depression severity. In particular, we propose a model called Conditional Variational Topic-enriched Auto-Encoder (CVTAE), which can capture the spatial features from local topics via variational inference, and the temporal features from the global context with an attention mechanism. In addition, we apply re-weighting strategies to assign weights to the depression labels with different values. Extensive experiments on the DAIC-WOZ dataset in English and a self-constructed database NCUDID in Chinese demonstrate the effectiveness and robustness of CVTAE, while the comprehensive ablation study and case study show its interpretability.
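The abstract does not say which re-weighting strategies are used. As a rough illustration only, the sketch below assumes inverse-frequency weighting over binned severity scores; the function names, the `bin_width` parameter, and the sample scores are all hypothetical and not taken from the paper.

```python
from collections import Counter


def inverse_frequency_weights(labels, bin_width=5):
    """Weight each sample by the inverse frequency of its label bin.

    Assumption: labels are non-negative severity scores grouped into
    fixed-width bins; rare bins (the long tail) get larger weights.
    """
    bins = [int(y // bin_width) for y in labels]
    counts = Counter(bins)
    raw = [1.0 / counts[b] for b in bins]
    # Rescale so the weights average to 1 and the loss scale stays comparable.
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def weighted_mse(preds, labels, weights):
    """Mean squared error in which each sample's error is scaled by its weight."""
    total = sum(w * (p - y) ** 2 for p, y, w in zip(preds, labels, weights))
    return total / len(labels)


# Hypothetical long-tailed scores: most are low, a few are high.
scores = [2, 3, 4, 1, 0, 3, 12, 21]
weights = inverse_frequency_weights(scores)
```

With these scores, the two high-severity samples each sit alone in their bin and therefore receive larger weights than the six low-severity samples that share one bin.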

Keywords: Depression; Feature extraction; Vectors; Predictive models; Context modeling; Affective computing; Linguistics; Depression severity prediction; natural language processing; topic modeling; conditional variational auto-encoder; imbalanced regression; attention mechanism

Sensing anomaly of photovoltaic systems with sequential conditional variational autoencoder

APPLIED ENERGY, 2024, vol. 353

Authors: Li, Ding; Zhang, Yufei; Yang, Zheng; Jin, Yaohui; Xu, Yanyan. Affiliations: Shanghai Jiao Tong Univ AI Inst MoE Key Lab Artificial Intelligence Shanghai 200240 Peoples R China; Shanghai Jiao Tong Univ Data Driven Management Decis Making Lab Shanghai 200240 Peoples R China; ENGIE 4 Rue Josephine Baker F-93240 Stains France

The market for urban distributed photovoltaics (DPV) is expected to take off in the next decade. However, these systems are often subject to complex urban contexts and sub-optimal conditions, requiring scalable and comprehensive solutions to detect their underperformances. In recent years, deep generative models (DGMs) have exhibited outstanding performance in the anomaly detection domain, dealing with generic high-dimensional time series data. Nevertheless, the existing applications of DGMs in the photovoltaic (PV) sector are still unable to account for environmental information, limiting their performance under various environmental conditions. This study proposes the Sequential conditional variational autoencoder (SCVAE), which can cope with the sequential impacts of the environment on PV power generation. Using real-world data collected from 30 rooftop PV sites located across China, a data processing pipeline is developed to construct the training datasets which contain mostly normal samples for unsupervised SCVAE model training. This work also constructs a synthetic dataset with a wide variety of artificial anomalies in reference to the domain insights and engineering practice of DPV systems. After checking and refining by experts, the synthetic dataset can finally be used to validate the anomaly detection models. The results demonstrate that the SCVAE model outperforms existing state-of-the-art unsupervised anomaly detection models and can be effectively generalized to unseen PV sites. Moreover, the latent variables of SCVAE could be used to identify the type of DPV failure, thereby enabling more targeted diagnostics of anomaly mechanisms.

Keywords: Anomaly detection; Anomaly diagnosis; Photovoltaic (PV) system; Time series; Deep generative model; conditional variational auto-encoder

Adaptive Augmentation of Medical Data Using Independently Conditional Variational Auto-Encoders

IEEE TRANSACTIONS ON MEDICAL IMAGING, 2019, vol. 38, no. 12, pp. 2807-2820

Authors: Pesteie, Mehran; Abolmaesumi, Purang; Rohling, Robert N. Affiliations: Univ British Columbia Dept Elect & Comp Engn Vancouver BC V6T 1Z4 Canada; Univ British Columbia Dept Mech Engn Vancouver BC V6T 1Z4 Canada

Current deep supervised learning methods typically require large amounts of labeled data for training. Since there is a significant cost associated with clinical data acquisition and labeling, medical datasets used for training these models are relatively small in size. In this paper, we aim to alleviate this limitation by proposing a variational generative model along with an effective data augmentation approach that utilizes the generative model to synthesize data. In our approach, the model learns the probability distribution of image data conditioned on a latent variable and the corresponding labels. The trained model can then be used to synthesize new images for data augmentation. We demonstrate the effectiveness of the approach on two independent clinical datasets consisting of ultrasound images of the spine and magnetic resonance images of the brain. For the spine dataset, a baseline and a residual model achieve an accuracy of 85% and 92%, respectively, using our method compared to 78% and 83% using a conventional training approach for the image classification task. For the brain dataset, a baseline and a U-net network achieve an accuracy of 84% and 88%, respectively, in Dice coefficient in tumor segmentation compared to 80% and 83% for the conventional training approach.

Keywords: Data augmentation; deep learning; conditional variational auto-encoder; magnetic resonance; ultrasound; center-line identification; tumor segmentation

Soil property recovery from incomplete in-situ geotechnical test data using a hybrid deep generative framework

ENGINEERING GEOLOGY, 2023, vol. 326

Authors: Chen, Weihang; Ding, Jianwen; Wang, Tengfei; Connolly, David P.; Wan, Xing. Affiliations: Southeast Univ Sch Transportat Nanjing 210096 Peoples R China; Southwest Jiaotong Univ Sch Civil Engn Chengdu 610031 Peoples R China; Southwest Jiaotong Univ MOE Key Lab High Speed Railway Engn Chengdu 610031 Peoples R China; Univ Leeds Sch Civil Engn Leeds LS2 9JT England

Geotechnical testing serves to assess the strength and stiffness of in-situ soils, for purposes such as informing foundation design. Despite its importance, time constraints, financial considerations, and site-specific limitations often restrict testing to isolated locations with limited horizontal resolution. Therefore, this paper presents a novel hybrid generative deep learning model designed to approximate soil properties across sites based on sparsely sampled geotechnical data. The model uses geological subsurface samples derived from random field theory as 'a priori' data for a conditional variational auto-encoder (CVAE) model. By doing so, it attempts to map the relationship between in-situ data and the corresponding spatial coordinates, as well as the inherent link between in-situ data and spatial distribution. Then, in the post-processing phase, a Kriging model interpolates minor discrepancies between the measured and predicted values. To demonstrate its practical application, this paper focuses on cone penetration testing (CPT) as the geotechnical test method. The model's development is thoroughly discussed, followed by the validation using in-situ data and an analysis conducted with synthetic data. It is shown that the uncertainty associated with CVAE-Kriging depends upon both the distance from the sample point and the site's inherent complexity. The proposed methodology not only offers refined subsurface modeling but also expands the understanding of uncertainty in geotechnical testing. Practically, it can assist geotechnical engineers with insights during the survey phase.

Keywords: Geological subsurface; conditional variational auto-encoder; Kriging; Cone penetration testing; Deep learning

Posterior estimation using deep learning: a simulation study of compartmental modeling in dynamic positron emission tomography

MEDICAL PHYSICS, 2023, vol. 50, no. 3, pp. 1539-1548

Authors: Liu, Xiaofeng; Marin, Thibault; Amal, Tiss; Woo, Jonghye; Fakhri, Georges El; Ouyang, Jinsong. Affiliations: Massachusetts Gen Hosp Gordon Ctr Med Imaging Radiol Dept Boston MA USA; Harvard Med Sch Radiol Dept Boston MA USA; Massachusetts Gen Hosp Gordon Ctr Med Imaging Radiol Dept Boston MA 02114 USA

Background: In medical imaging, images are usually treated as deterministic, while their uncertainties are largely underexplored. Purpose: This work aims at using deep learning to efficiently estimate posterior distributions of imaging parameters, which in turn can be used to derive the most probable parameters as well as their uncertainties. Methods: Our deep learning-based approaches are based on a variational Bayesian inference framework, which is implemented using two different deep neural networks based on conditional variational auto-encoder (CVAE), CVAE-dual-encoder, and CVAE-dual-decoder. The conventional CVAE framework, that is, CVAE-vanilla, can be regarded as a simplified case of these two neural networks. We applied these approaches to a simulation study of dynamic brain PET imaging using a reference region-based kinetic model. Results: In the simulation study, we estimated posterior distributions of PET kinetic parameters given a measurement of the time-activity curve. Our proposed CVAE-dual-encoder and CVAE-dual-decoder yield results that are in good agreement with the asymptotically unbiased posterior distributions sampled by Markov Chain Monte Carlo (MCMC). The CVAE-vanilla can also be used for estimating posterior distributions, although it has an inferior performance to both CVAE-dual-encoder and CVAE-dual-decoder. Conclusions: We have evaluated the performance of our deep learning approaches for estimating posterior distributions in dynamic brain PET. Our deep learning approaches yield posterior distributions, which are in good agreement with unbiased distributions estimated by MCMC. All these neural networks have different characteristics and can be chosen by the user for specific applications. The proposed methods are general and can be adapted to other problems.

Keywords: conditional variational auto-encoder; deep learning; dynamic brain PET imaging; MCMC; posterior; variational inference

Data augmentation for fault diagnosis of oil-immersed power transformer

ENERGY REPORTS, 2023, vol. 9, pp. 1211-1219

Authors: Li, Ke; Li, Jian; Huang, Qi; Chen, Yuhui. Affiliations: Univ Elect Sci & Technol China Sch Mech & Elect Engn Sichuan Prov Key Lab Power Syst Wide Area Measurem Chengdu 611731 Sichuan Peoples R China; Chengdu Univ Technol Coll Nucl Technol & Automation Engn Chengdu 610059 Sichuan Peoples R China

The 110 kV oil-immersed transformer is a key part of the power transmission and transformation system, which determines the power quality and transmission efficiency. Its fault diagnosis can greatly reduce the maintenance cost and improve the economy. At present, transformer fault diagnosis methods depend strongly on the original data, and the size of the original data directly affects the quality of fault diagnosis. To change this situation and achieve higher accuracy in transformer fault diagnosis, this paper first uses a conditional variational auto-encoder (CVAE) composed of fully connected layers to expand the original samples under each fault category. After data augmentation, a convolutional neural network (CNN) with strong feature extraction ability is selected as the classifier. Finally, the CVAE-CNN model is validated using a public dataset and the result is compared with other machine learning algorithms. (c) 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://***/licenses/by/4.0/).

Keywords: Fault diagnosis; Convolution neural network; conditional variational auto-encoder; Data augmentation

Boosting Variational Inference With Margin Learning for Few-Shot Scene-Adaptive Anomaly Detection

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, vol. 33, no. 6, pp. 2813-2825

Authors: Huang, Xin; Hu, Yutao; Luo, Xiaoyan; Han, Jungong; Zhang, Baochang; Cao, Xianbin. Affiliations: Beihang Univ Sch Elect & Informat Engn Beijing 100191 Peoples R China; Beihang Univ Sch Astronaut Beijing 100191 Peoples R China; Aberystwyth Univ Dept Comp Sci Aberystwyth SY23 3FL Wales; Beihang Univ Sch Inst Artificial Intelligence Beijing 100191 Peoples R China; Zhongguancun Lab Beijing 100190 Peoples R China; Beihang Univ Key Lab Adv Technol Near Space Informat Syst Minist Ind & Informat Technol China Beijing 100191 Peoples R China

Anomaly detection in surveillance videos aims to identify frames where abnormal events happen. Existing approaches assume that the training and testing videos are from the same scene, exhibiting poor generalization performance when encountering an unseen scene. In this paper, we propose a Variational Anomaly Detection Network (VADNet), which is characterized by its high scene adaptability: it can identify abnormal events in a new scene by referring to only a few normal samples, without fine-tuning. Our model embodies two major innovations. First, a novel Variational Normal Inference (VNI) module is proposed to formulate image reconstruction in a conditional variational auto-encoder (CVAE) framework, which learns a probabilistic decision model instead of a traditional deterministic one. Second, a Margin Learning Embedding (MLE) module is leveraged to boost the variational inference and aid in distinguishing normal events. We theoretically demonstrate that minimizing the triplet loss in the MLE module facilitates maximizing the evidence lower bound (ELBO) of the CVAE, which promotes the convergence of VNI. By incorporating variational inference with margin learning, VADNet becomes much more generative and is able to handle the uncertainty caused by the changed scene and limited reference data. Extensive experiments on several datasets demonstrate that the proposed VADNet can adapt to a new scene effectively without fine-tuning and achieve remarkable performance, which outperforms other methods significantly and establishes a new state of the art in few-shot scene-adaptive anomaly detection. We believe our method is closer to real-world application due to its strong generalization ability. All code is released at https://***/huangxx156/VADNet.
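For orientation, the ELBO of a CVAE in its standard textbook form is given below. The notation is an assumption made here and is not taken from the paper: x is the observation being reconstructed, c is the conditioning input, z is the latent variable, q_phi is the encoder (approximate posterior), and p_theta covers the decoder and the conditional prior. The paper's own objective, including its triplet-loss term, may differ.

```latex
\log p_\theta(x \mid c) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x, c)}\!\left[ \log p_\theta(x \mid z, c) \right]
\;-\; \mathrm{KL}\!\left( q_\phi(z \mid x, c) \,\|\, p_\theta(z \mid c) \right)
```

The first term rewards accurate reconstruction of x from z and c; the second keeps the approximate posterior close to the conditional prior.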

Keywords: Few-shot scene-adaptive anomaly detection; conditional variational auto-encoder; margin learning embedding

A CVAE-within-Gibbs sampler for Bayesian linear inverse problems with hyperparameters

COMPUTATIONAL & APPLIED MATHEMATICS, 2023, vol. 42, no. 3, pp. 1-23

Authors: Yang, Jingya; Niu, Yuanling; Zhou, Qingping. Affiliations: Cent South Univ Sch Math & Stat HNP LAMA Changsha 410083 Hunan Peoples R China

We propose a conditional variational auto-encoder within Gibbs sampling (CVAE-within-Gibbs) for Bayesian linear inverse problems where the prior or the likelihood function depends on ambiguous hyperparameters. The method builds on ideas from classical sampling theory and recent advances in deep generative models to approximate complicated probability distributions. Specifically, we use a CVAE model which is trained with a large amount of data to learn the conditional density of hyperparameters in the original Gibbs sampler. The learned property of the conditional posterior provides more flexibility than classical Gibbs sampling because it avoids manually or experimentally determining the hyperpriors and their hyperparameters. We demonstrate the performance of the proposed method for three linear inverse problems, i.e., image deblurring, signal denoising, and boundary heat flux identification in a heat conduction problem.

Keywords: Inverse problems; Bayesian inference; Gibbs sampling; conditional variational auto-encoder; Hyperparameters
