Search Results - Inner Mongolia University Library

TSC-PCAC: Voxel Transformer and Sparse Convolution-Based Point Cloud Attribute Compression for 3D Broadcasting

IEEE TRANSACTIONS ON BROADCASTING, 2025, vol. 71, no. 1, pp. 154-166

Authors: Guo, Zixi; Zhang, Yun; Zhu, Linwei; Wang, Hanli; Jiang, Gangyi. Affiliations: Sun Yat sen Univ Sch Elect & Commun Engn Shenzhen Campus Shenzhen 518017 Peoples R China; Chinese Acad Sci Shenzhen Inst Adv Technol Shenzhen 518055 Peoples R China; Tongji Univ Dept Comp Sci & Technol Shanghai 200092 Peoples R China; Ningbo Univ Fac Informat & Sci & Engn Ningbo 315211 Peoples R China

Point clouds have become the mainstream representation for advanced 3D applications, such as virtual reality and augmented reality. However, the massive data volume of point clouds is one of the most challenging issues for transmission and storage. In this paper, we propose an end-to-end voxel Transformer and Sparse Convolution based Point Cloud Attribute Compression (TSC-PCAC) for 3D broadcasting. Firstly, we present a framework of the TSC-PCAC, which includes a Transformer and Sparse Convolutional Module (TSCM) based variational autoencoder and a channel context module. Secondly, we propose a two-stage TSCM, where the first stage focuses on modeling local dependencies and feature representations of the point clouds, and the second stage captures global features through spatial and channel pooling encompassing larger receptive fields. This module effectively extracts global and local inter-point relevance to reduce informational redundancy. Thirdly, we design a TSCM based channel context module to exploit inter-channel correlations, which improves the predicted probability distribution of quantized latent representations and thus reduces the bitrate. Experimental results indicate that the proposed TSC-PCAC method achieves an average of 38.53%, 21.30%, and 11.19% bitrate reductions on datasets 8iVFB, Owlii, 8iVSLF, Volograms, and MVUB compared to the Sparse-PCAC, NF-PCAC, and G-PCC v23 methods, respectively. The encoding/decoding time costs are reduced by 97.68%/98.78% on average compared to the Sparse-PCAC. The source code and the trained TSC-PCAC models are available at https://***/igizuxo/TSC-PCAC.

Keywords: Point cloud compression; Image coding; Convolution; Three-dimensional displays; Geometry; Transforms; Transformers; voxel transformer; sparse convolution; variational autoencoder; channel context module

A generalized zero-shot semantic learning model for batch process fault diagnosis

MEASUREMENT SCIENCE AND TECHNOLOGY, 2025, vol. 36, no. 1, p. 016228

Authors: Liu, Kai; Zhao, Xiaoqiang; Mou, Miao; Hui, Yongyong. Affiliations: Lanzhou Univ Technol Coll Elect & Informat Engn Lanzhou Peoples R China; Gansu Key Lab Adv Control Ind Proc Lanzhou Peoples R China; Lanzhou Univ Technol Natl Expt Teaching Ctr Elect & Control Engn Lanzhou Peoples R China

In industrial monitoring, although zero-shot learning successfully solves the problem of diagnosing unseen faults, it is difficult to diagnose both unseen and seen faults. Motivated by this, we propose a generalized zero-shot semantic learning fault diagnosis model for batch processes called joint low-rank manifold distributional semantic embedding and multimodal variational autoencoder (mVAE). Firstly, joint low-rank representation and manifold learning map the training samples to the low-rank space, which obtains the global-local features of the samples while reducing the redundancy in the inputs for the training model. Secondly, the bias of human-defined semantic attributes is corrected by predicting the attribute error rate. Then, fault samples and corrected semantic vectors are embedded into the consistency space, in which the samples are reconstructed using the mVAE to fully integrate the cross-modal information; meanwhile, a Barlow matrix is designed to measure the consistency between the fault samples and the attribute vectors: the higher the consistency, the higher the learning efficiency of attribute classifiers. Finally, generalized zero-shot fault diagnosis experiments are designed and conducted on the penicillin fermentation process and the semiconductor etching process to validate the effectiveness; the results show that the proposed model can indeed diagnose target faults without their samples.

Keywords: fault diagnosis; batch process; generalized zero-shot; semantic correction; variational autoencoder; distributed semantic embedding

Predicting new drug indications based on double variational autoencoders

COMPUTERS IN BIOLOGY AND MEDICINE, 2023, vol. 164, no. 1, p. 107261

Authors: Huang, Zhaoyang; Chen, Shengjian; Yu, Liang. Affiliation: Xidian Univ Sch Comp Sci & Technol Xian 710071 Shaanxi Peoples R China

Experimental drug development is costly, complex, and time-consuming, and the number of drugs that have been put into application treatment is small. The identification of drug-disease correlations can provide important information for drug discovery and drug repurposing. Computational drug repurposing is an important and effective method that can be used to determine novel treatments for diseases. In recent years, an increasing number of large databases have been utilized for biological data research, particularly in the fields of drugs and diseases. Consequently, researchers have begun to explore the application of deep neural networks in biological data development. One particularly promising method for unsupervised learning is the deep generative model, with the variational autoencoder (VAE) being among the mainstream models. Here, we propose a drug indication prediction algorithm called DIDVAE (predicting new drug indications based on double variational autoencoders), which generates new data by learning the latent variable distribution of known data to achieve the goal of predicting drug-disease associations. In the experiment, we compared the DIDVAE algorithm with the BBNR, DrugNet, MBiRW and DRRS algorithms on a unified dataset. The comprehensive experimental results show that, compared with these prediction algorithms, the DIDVAE algorithm provides an overall improved prediction. In addition, further analysis and verification of the predicted unknown drug-disease association also proved the practicality of the method.

Keywords: Drug repurposing; Drug indications; Generative model; variational autoencoder

ALCR: Adaptive loss based critic ranking toward variational autoencoders with multinomial likelihood and condition for collaborative filtering

KNOWLEDGE-BASED SYSTEMS, 2023, vol. 278, no. 1

Authors: Feng, Jiamei; Liu, Mengchi; Liang, Xiang; Nie, Tingkun. Affiliation: South China Normal Univ Sch Comp Sci Guangzhou 510631 Peoples R China

Research on variational autoencoders for collaborative filtering is gradually focusing on implicit feedback. However, most existing studies have two limitations: (1) they overlook the impact of user-item interaction data in implicit feedback on the representations of both users and items, which can affect the latent representations; (2) their attention is mainly focused on the immediate feedback of recommended items, ignoring interactions between feedback and ground-truth values, and neglecting the difference in loss functions between different training processes. To address these limitations, we first propose a condition for variational autoencoders to control user and item representations to learn more useful information from the latent representations. Then, we train an adaptive loss critic ranking to directly provide ranking scores in collaborative filtering recommendations, which aims to minimize loss and improve interactions during different critic training processes. Extensive experiments on three big real-world social media datasets demonstrate that this approach outperforms the existing twelve models under NDCG and Recall metric estimation settings and significantly improves the performance of a variety of prediction models. © 2023 Elsevier B.V. All rights reserved.

Keywords: Collaborative filtering; variational autoencoder; Rank; Actor-critic; Adaptive loss

Anomaly detection in Fourier transform infrared spectroscopy of geological specimens using variational autoencoders

ORE GEOLOGY REVIEWS, 2023, vol. 158

Authors: Gonzalez, C. M.; Horrocks, T.; Wedge, D.; Holden, E. J.; Hackman, N.; Green, T. Affiliations: Univ Western Australia Ctr Data driven Geosci Sch Earth Sci Crawley WA 6009 Australia; Rio Tinto Iron Ore Perth WA 6000 Australia

Fourier Transform infrared spectroscopy (FTIR) is an emerging cost-effective and rapid mineralogical characterization technique being applied in the geosciences. Detecting anomalous FTIR spectra is especially relevant to the geoscience domain, as it may indicate abrupt changes in geology or mineralogical composition of the rock sample being examined. Given a large volume of data, detecting anomalies that exhibit significant and abrupt spatial and compositional variability is a time-consuming and challenging task. This paper explores the use of an unsupervised variational autoencoder (VAE) for determining anomalies that may exist within a set of FTIR spectra collected from reverse circulation (RC) drill chip samples spanning several iron ore deposits from the Pilbara region in Western Australia. Diffuse reflectance infrared Fourier transform spectroscopy (DRIFTS) spectra were measured from 1,579 two-metre composite samples. Our results showed that the VAE was effective in separating anomalous spectra from spectra typical of unmineralized banded iron formation by leveraging the probabilistic latent representation of the spectra in as few as two latent dimensions. To validate our results, detected anomalous samples were compared with their respective geochemical assays to analyse their mineralogical differences, which may have led to the anomalous spectra. In the iron ore sample data used in this study, the observed spectral anomalies were shown to have elevated concentrations of Al2O3 and TiO2 wt.% while being several standard deviations below the mean Fe2O3 wt.%, indicating mineralogies rich in shale as opposed to iron oxide rich mineralogies. While the paper demonstrates the efficacy of the VAE in anomaly detection, it can also be effective in assuring the quality of the FTIR data as a pre-processing step, which is critically important for machine learning applications.

Keywords: variational autoencoder; Fourier transform infrared spectroscopy; Anomaly detection; Iron ore; Pilbara

Learning and Predicting Photonic Responses of Plasmonic Nanoparticle Assemblies via Dual variational autoencoders

SMALL, 2023, vol. 19, no. 25, p. 2205893

Authors: Yaman, Muammer Y.; Kalinin, Sergei V.; Guye, Kathryn N.; Ginger, David S.; Ziatdinov, Maxim. Affiliations: Univ Washington Dept Chem Seattle WA 98195 USA; Univ Tennessee Dept Mat Sci & Engn Knoxville TN 37996 USA; Pacific Northwest Natl Lab Phys Sci Div Phys & Computat Sci Directorate Richland WA 99354 USA; Oak Ridge Natl Lab Ctr Nanophase Mat Sci Oak Ridge TN 37831 USA; Oak Ridge Natl Lab Computat Sci & Engn Div Oak Ridge TN 37831 USA

The application of machine learning is demonstrated for rapid and accurate extraction of plasmonic particle cluster geometries from hyperspectral image data via a dual variational autoencoder (dual-VAE). In this approach, the information is shared between the latent spaces of two VAEs acting on the particle shape data and spectral data, respectively, but enforcing a common encoding on the shape-spectra pairs. It is shown that this approach can establish the relationship between the geometric characteristics of nanoparticles and their far-field photonic responses, demonstrating that hyperspectral darkfield microscopy can be used to accurately predict the geometry (number of particles, arrangement) of multiparticle assemblies below the diffraction limit in an automated fashion with high fidelity (for monomers (0.96), dimers (0.86), and trimers (0.58)). This approach of building structure-property relationships via shared encoding is universal and should have applications to a broader range of materials science and physics problems in imaging of both molecular and nanomaterial systems.

Keywords: darkfield scattering spectra; machine learning; plasmonic gold particles; scanning electron microscopy; structure-property prediction; variational autoencoder

Finding simplicity: unsupervised discovery of features, patterns, and order parameters via shift-invariant variational autoencoders

MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2023, vol. 4, no. 4, p. 045033

Authors: Ziatdinov, Maxim; Wong, Chun Yin (Tommy); Kalinin, Sergei V. Affiliations: Oak Ridge Natl Lab Ctr Nanophase Mat Sci Oak Ridge TN 37831 USA; Oak Ridge Natl Lab Computat Sci & Engn Div Oak Ridge TN 37831 USA; Univ Tennessee Bredesen Ctr Interdisciplinary Res & Grad Educ Knoxville TN 37996 USA; Univ Tennessee Dept Mat Sci & Engn Knoxville TN 37996 USA

Recent advances in scanning tunneling and transmission electron microscopies (STM and STEM) have allowed routine generation of large volumes of imaging data containing information on the structure and functionality of materials. The experimental data sets contain signatures of long-range phenomena such as physical order parameter fields, polarization, and strain gradients in STEM, or standing electronic waves and carrier-mediated exchange interactions in STM, all superimposed onto scanning system distortions and gradual changes of contrast due to drift and/or mis-tilt effects. Correspondingly, while the human eye can readily identify certain patterns in the images such as lattice periodicities, repeating structural elements, or microstructures, their automatic extraction and classification are highly non-trivial and universal pathways to accomplish such analyses are absent. We pose that the most distinctive elements of the patterns observed in STM and (S)TEM images are similarity and (almost-) periodicity, behaviors stemming directly from the parsimony of elementary atomic structures, superimposed on the gradual changes reflective of order parameter distributions. However, the discovery of these elements via global Fourier methods is non-trivial due to variability and lack of ideal discrete translation symmetry. To address this problem, we explore the shift-invariant variational autoencoders (shift-VAEs) that allow disentangling characteristic repeating features in the images, their variations, and shifts that inevitably occur when randomly sampling the image space. Shift-VAEs balance the uncertainty in the position of the object of interest with the uncertainty in shape reconstruction. This approach is illustrated for model 1D data, and further extended to synthetic and experimental STM and STEM 2D data. We further introduce an approach for training shift-VAEs that allows finding the latent variables that comport to known physical behavior. In this specific case, t

Keywords: variational autoencoder; scanning tunneling microscopy; scanning transmission electron microscopy; unsupervised learning; latent representation

Data Augmentation with Cross-Modal variational autoencoders (DACMVA) for Cancer Survival Prediction

INFORMATION, 2024, vol. 15, no. 1, p. 7

Authors: Rajaram, Sara; Mitchell, Cassie S. Affiliations: Georgia Inst Technol Lab Pathol Dynam Atlanta GA 30332 USA; Emory Univ Atlanta GA 30332 USA; Georgia Inst Technol Ctr Machine Learning Georgia Tech Atlanta GA 30332 USA

The ability to translate Generative Adversarial Networks (GANs) and variational autoencoders (VAEs) into different modalities and data types is essential to improve Deep Learning (DL) for predictive medicine. This work presents DACMVA, a novel framework to conduct data augmentation in a cross-modal dataset by translating between modalities and oversampling imputations of missing data. DACMVA was inspired by previous work on the alignment of latent spaces in autoencoders. DACMVA is a DL data augmentation pipeline that improves the performance in a downstream prediction task. The unique DACMVA framework leverages a cross-modal loss to improve the imputation quality and employs training strategies to enable regularized latent spaces. Oversampling of augmented data is integrated into the prediction training. It is empirically demonstrated that the new DACMVA framework is effective in the often-neglected scenario of DL training on tabular data with continuous labels. Specifically, DACMVA is applied towards cancer survival prediction on tabular gene expression data where there is a portion of missing data in a given modality. DACMVA significantly (p << 0.001, one-sided Wilcoxon signed-rank test) outperformed the non-augmented baseline and competing augmentation methods with varying percentages of missing data (4%, 90%, 95% missing). As such, DACMVA provides significant performance improvements, even in very-low-data regimes, over existing state-of-the-art methods, including TDImpute and oversampling alone.

Keywords: data augmentation; variational autoencoder; Generative Adversarial Network; cancer survival prediction

A hierarchical and interlamination graph self-attention mechanism-based knowledge graph reasoning architecture

INFORMATION SCIENCES, 2025, vol. 686

Authors: Wu, Yuejia; Zhou, Jian-tao. Affiliations: Inner Mongolia Univ Coll Comp Sci Hohhot Peoples R China; Natl & Local Joint Engn Res Ctr Intelligent Infor Hohhot Peoples R China; Minist Educ Engn Res Ctr Ecol Big Data Hohhot Peoples R China; Inner Mongolia Engn Lab Cloud Comp & Serv Softwar Hohhot Peoples R China; Inner Mongolia Key Lab Social Comp & Data Proc Hohhot Peoples R China; Inner Mongolia Key Lab Discipline Inspect & Super Hohhot Peoples R China; Inner Mongolia Engn Lab Big Data Anal Technol Hohhot Peoples R China

Knowledge Graph (KG) is an essential research field in graph theory, but its inherent incompleteness and sparsity influence its performance in several fields. Knowledge Graph Reasoning (KGR) aims to ameliorate those problems by mining new knowledge from subsistent knowledge. As one of the downstream tasks of KGR, link prediction is of great significance for improving the quality of KG. Recently, the Graph Neural Network (GNN)-based method became the most effective way to achieve the link prediction task. However, it still suffers from problems such as incomplete neighbor and relation-level information aggregation and unstable learning of the entity's features. To address those issues, a Hierarchical and Interlamination Graph Self-attention Mechanism-based (HIGSM) plug-and-play architecture is proposed for KGR in this paper. It is composed of three-level layers: feature extractor, encoder, and decoder. The feature extractor makes our architecture more effective and stable for the retrieval of new features. The encoder is equipped with a two-stage encoding mechanism accompanied by two mixture-of-expert strategies, which enables our architecture to capture more practical reasoning information to improve prediction accuracy and generalization of the model. The decoder can use existing KGR models and compute the scores of triples in KG. The extensive experimental results and ablation studies on four KGs unambiguously demonstrate the state-of-the-art prediction performance of the proposed HIGSM architecture compared to current GNN-based methods.

Keywords: Knowledge graph reasoning; Graph self-attention mechanism; variational autoencoder; Mixture-of-expert; Link prediction

Application of Artificial Intelligence Virtual Image Technology in Photography Art Creation Under Deep Learning

IEEE ACCESS, 2025, vol. 13, pp. 14542-14556

Author: Yao, Qiong. Affiliation: Mahasarakham Univ Fac Fine Appl Arts & Cultural Sci Maha Sarakham 44150 Thailand

With the continuous advancement of artificial intelligence (AI) and deep learning technologies, virtual image generation exhibits significant potential for application in photographic art creation. The primary objective of this study is to investigate the use of AI virtual image technology in photography, particularly focusing on achieving creative expression and artistic style transfer through deep learning models. Consequently, this study proposes a novel model that integrates conditional generative adversarial networks (cGANs) with variational autoencoders (VAEs). This model aims to effectively address the challenges associated with image generation and style conversion in photographic art by leveraging the realistic generation capabilities of cGANs alongside the diversity maintenance features of VAEs. In the experimental section, the proposed cGANs + VAEs model is systematically compared with traditional Deep Convolutional GANs (DCGAN) and Pix2Pix models through empirical analysis. The experimental results indicate that the cGANs + VAEs model significantly outperforms traditional models in terms of image quality, artistic expression, and user satisfaction. Expert reviews further confirm the model's superiority in artistic style imitation and creative generation. Additionally, user surveys reveal that most participants are highly satisfied with the images generated by the model, particularly regarding artistic perception and visual effects. Moreover, the cGANs + VAEs model demonstrates strong performance in Frechet Inception Distance (FID) and Inception Score (IS) across multiple datasets, yielding FID values of 13.67, 9.45, and 11.90 on the COCO, CelebA, and WikiArt datasets, respectively. In summary, the proposed cGANs + VAEs model not only achieves remarkable advancements in the technical performance of image generation but also exhibits considerable potential for practical applications in photographic art creation.

Keywords: Image synthesis; Deep learning; Art; Artificial intelligence; Photography; Training; Diversity reception; Data models; Analytical models; Translation; conditional generative adversarial networks; variational autoencoder; photography artistic creation; virtual image technology
