Search Results - Inner Mongolia University Library

18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017)

Authors: Hsu, Chin-Cheng; Hwang, Hsin-Te; Wu, Yi-Chiao; Tsao, Yu; Wang, Hsin-Min. Affiliations: Acad Sinica, Inst Informat Sci, Taipei, Taiwan; Acad Sinica, Res Ctr Informat Technol Innovat, Taipei, Taiwan

ISBN: (print) 9781510848764

Building a voice conversion (VC) system from non-parallel speech corpora is challenging but highly valuable in real application scenarios. In most situations, the source and the target speakers do not repeat the same texts or they may even speak different languages. In this case, one possible, although indirect, solution is to build a generative model for speech. Generative models focus on explaining the observations with latent variables instead of learning a pairwise transformation function, thereby bypassing the requirement of speech frame alignment. In this paper, we propose a non-parallel VC framework with a variational autoencoding Wasserstein generative adversarial network (VAW-GAN) that explicitly considers a VC objective when building the speech model. Experimental results corroborate the capability of our framework for building a VC system from unaligned data, and demonstrate improved conversion quality.

Keywords: non-parallel voice conversion; Wasserstein generative adversarial network; GAN; variational autoencoder; VAE


Enhanced medical image generation through advanced latent space diffusion

Materials and Emerging Technologies for Sustainability


Authors: Khoi M. Nguyen; Hung C. Nguyen; H. Huy Nguyen; The-Bang Nguyen; Phuoc-Thinh Nguyen; Vu Tran; Phu Pham; Quang-Thinh Bui. Affiliations: University of Information Technology, Vietnam National University Ho Chi Minh City, Ho Chi Minh City, Vietnam; Faculty of Information Technology, HUTECH University, Ho Chi Minh City, Vietnam; Ho Chi Minh City University of Technology (HCMUT), Vietnam National University Ho Chi Minh City, Ho Chi Minh City, Vietnam; School of Business Information Technology, University of Economics Ho Chi Minh City, Ho Chi Minh City, Vietnam; Office of Scientific Research Technology Management and International Cooperation, Tien Giang University, Tien Giang, Vietnam

Collecting high-quality medical image data for machine learning applications remains a significant challenge due to data scarcity, privacy concerns, and high annotation costs. To address these issues, vision generative models, particularly Latent Diffusion Models (LDMs), have emerged as state-of-the-art solutions that reduce computational demands while maintaining superior performance in data generation tasks. In this study, we propose an enhanced LDM-based approach that integrates separable self-attention mechanisms within the diffusion process, positioned after residual blocks, to improve the capture of detailed features and maintain spatial consistency. This modification reduces memory usage by 82.94% and decreases the Fréchet Inception Distance (FID) by 25.01% compared to traditional self-attention models, all while preserving image quality. Our method addresses critical challenges such as data scarcity and computational efficiency in medical imaging by combining variational autoencoders (VAEs) for latent space mapping with U-Net for noise prediction. Evaluations on five datasets — PneumoniaMNIST, BloodMNIST, ChestMNIST, Dental4k, and HandMNIST — demonstrate significant improvements in computational efficiency, memory usage, and the quality of generated images, showcasing the potential of our approach for scalable and effective medical image synthesis.

Keywords: diffusion models; latent space; latent diffusion models; variational autoencoder; medical image generation


An Overview of Deep Generative Models in Functional and Evolutionary Genomics

Biomedical Data Science, Vol. 6, No. 1, pp. 173-189

Authors: Burak Yelmen; Flora Jay. Affiliations: (1) Laboratoire Interdisciplinaire des Sciences du Numérique, CNRS UMR 9015, INRIA, Université Paris-Saclay, Orsay, France; (2) Institute of Genomics, University of Tartu, Tartu, Estonia. Email: flora.jay@lisn.fr

Following the widespread use of deep learning for genomics, deep generative modeling is also becoming a viable methodology for the broad field. Deep generative models (DGMs) can learn the complex structure of genomic data and allow researchers to generate novel genomic instances that retain the real characteristics of the original dataset. Aside from data generation, DGMs can also be used for dimensionality reduction by mapping the data space to a latent space, as well as for prediction tasks via exploitation of this learned mapping or supervised/semi-supervised DGM designs. In this review, we briefly introduce generative modeling and two currently prevailing architectures, we present conceptual applications along with notable examples in functional and evolutionary genomics, and we provide our perspective on potential challenges and future directions.

Keywords: deep generative models; variational autoencoder; generative adversarial network; functional genomics; evolutionary genetics


GMG-NCDVAE: Guided de novo Molecule Generation using NLP Techniques and Constrained Diverse variational autoencoder

ACM Transactions on Asian and Low-Resource Language Information Processing

Authors: Arun Singh Bhadwal; Kamal Kumar; Neeraj Kumar. Affiliations: National Institute of Technology, India; Thapar Institute of Technology, India

Text processing techniques in Natural Language Processing (NLP) find applications in many industries, such as pharmaceutical, automation, and automotive. Drug design using variational autoencoders is a popular data-assisted technique for designing drug molecules with control over molecular properties. It generates a continuous latent space, which can be optimized. This paper introduces a constrained variational autoencoder-based molecular generation structure using the SMILES format. The proposal covers generating molecules, filtering them based on scores, and subsequently determining the optimal molecules using mature NLP techniques. To generate a more meaningful latent space, a condition vector of molecular properties is combined with the SMILES representation of molecules. A tunable parameter (diversity, D) is also used to control the diversity of the generated molecules. The proposed architecture is evaluated using standard datasets. Validity, uniqueness, and FCD are the evaluation metrics used to assess the performance of the model. The validity of the proposed model is highest (92.11%) at diversity level 1. As the diversity level increases, the validity of the generated molecules decreases. This is intuitively consistent because increased diversity reduces replicas and improves variety in the generated molecules. Thus, the proposed model provides control over the diversity of the generated molecules. The results indicate that the proposed method outperforms other SMILES-based methods and gives a new direction for the generation of desired molecules.

Keywords: variational autoencoder; de novo molecule generation; Natural Language Processing; SMILES
