Search query: subject term = "vector-quantized variational autoencoder"
8 records

Vector-quantized Variational Autoencoder for Phase-aware Speech Enhancement
23rd Interspeech Conference
Authors: Tuan Vu Ho; Quoc Huy Nguyen; Akagi, Masato; Unoki, Masashi
Affiliation: Japan Adv Inst Sci & Technol, Nomi, Japan
Speech-enhancement methods based on the complex ideal ratio mask (cIRM) have achieved promising results. These methods often deploy a deep neural network to jointly estimate the real and imaginary components of the cI...

CRANK: AN OPEN-SOURCE SOFTWARE FOR NONPARALLEL VOICE CONVERSION BASED ON VECTOR-QUANTIZED VARIATIONAL AUTOENCODER
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Authors: Kobayashi, Kazuhiro; Huang, Wen-Chin; Wu, Yi-Chiao; Tobing, Patrick Lumban; Hayashi, Tomoki; Toda, Tomoki
Affiliations: Nagoya Univ, Nagoya, Aichi, Japan; TARVO Inc, Tokyo, Japan
In this paper, we present an open-source software for developing a nonparallel voice conversion (VC) system named crank. Although we have released an open-source VC software based on the Gaussian mixture model named s...

EM-LAST: Effective Multidimensional Latent Space Transport for an Unpaired Image-to-Image Translation With an Energy-Based Model
IEEE ACCESS, 2022, vol. 10, pp. 72839-72849
Authors: Han, Giwoong; Min, Jinhong; Han, Sung Won
Affiliation: Korea Univ, Sch Ind & Management Engn, Seoul 02841, South Korea
For an unpaired image-to-image translation to work effectively, the latent space of each image domain must be well-designed. The codes of each style must be translated toward the target while preserving the parts corr...

Phase-Aware Speech Enhancement With Complex Wiener Filter
IEEE ACCESS, 2023, vol. 11, pp. 141573-141584
Authors: Nguyen, Huy; Ho, Tuan Vu; Akagi, Masato; Unoki, Masashi
Affiliations: Japan Adv Inst Sci & Technol (JAIST), Grad Sch Adv Sci & Technol, Nomi, Ishikawa 9231292, Japan; Hitachi Ltd, Adv Artificial Intelligent Innovat Ctr, Media Intelligent Proc Research Dept, Tokyo 1858601, Japan
In speech enhancement, accurate phase reconstruction can significantly improve speech quality. While phase-aware speech enhancement methods using the complex ideal ratio mask (cIRM) have shown promise, the estimation ...

End-to-End Image-to-Speech Generation for Untranscribed Unknown Languages
IEEE ACCESS, 2021, vol. 9, pp. 55144-55154
Authors: Effendi, Johanes; Sakti, Sakriani; Nakamura, Satoshi
Affiliations: Nara Inst Sci & Technol, Ikoma 6300192, Japan; RIKEN Ctr Adv Intelligence Project AIP, Tokyo 1030027, Japan
Describing orally what we are seeing is a simple task we do in our daily life. However, in the natural language processing field, this simple task needs to be bridged by a textual modality that helps the system to gen...

Weakly-supervised Speech-to-text Mapping with Visually Connected Non-parallel Speech-text Data using Cyclic Partially-aligned Transformer
22nd Interspeech Conference
Authors: Effendi, Johanes; Sakti, Sakriani; Nakamura, Satoshi
Affiliations: Nara Inst Sci & Technol, Ikoma, Nara, Japan; RIKEN Ctr Adv Intelligence Project AIP, Tokyo, Japan
Despite the successful development of automatic speech recognition (ASR) systems for several of the world's major languages, they require a tremendous amount of parallel speech-text data. Unfortunately, for many o...

A VECTOR QUANTIZED MASKED AUTOENCODER FOR SPEECH EMOTION RECOGNITION
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Sadok, Samir; Leglaive, Simon; Seguier, Renaud
Affiliation: CentraleSupelec, IETR UMR CNRS 6164, Gif Sur Yvette, France
Recent years have seen remarkable progress in speech emotion recognition (SER), thanks to advances in deep learning techniques. However, the limited availability of labeled data remains a significant challenge in the ...

PCGen: A Fully Parallelizable Point Cloud Generative Model
SENSORS, 2024, vol. 24, no. 5, p. 1414
Authors: Vercheval, Nicolas; Royen, Remco; Munteanu, Adrian; Pizurica, Aleksandra
Affiliations: Univ Ghent, Fac Engn & Architecture, Dept Telecommun & Informat Proc, Res Grp Artificial Intelligence & Sparse Modelling, B-9000 Ghent, Belgium; Univ Ghent, Fac Engn & Architecture, Dept Elect & Informat Syst, Clifford Res Grp, B-9000 Ghent, Belgium; Vrije Univ Brussel, Dept Elect & Informat ETRO, Fac Engn, B-1050 Brussels, Belgium
Generative models have the potential to revolutionize 3D extended reality. A primary obstacle is that augmented and virtual reality need real-time computing. Current state-of-the-art point cloud random generation meth...