ISBN (digital): 9798331522667
ISBN (print): 9798331522674
Existing research on speaker and emotional voice conversion often treats the two as separate tasks and neglects their joint exploration. Furthermore, the limited availability of emotional corpora for target speakers poses a significant challenge for training robust and generalizable models. This paper proposes an improved scheme for speaker-emotion voice conversion with a limited emotional corpus for the target speaker, integrating a large language model and a pre-trained emotional speech synthesis model. It introduces several enhancements that improve the quality of converted speech in terms of speaker similarity and emotional expressiveness. First, emotionally tagged text is generated using a large language model, and emotional speech is synthesized from this text using a fine-tuned pre-trained emotional speech synthesis model. Then, a speaker-emotion voice conversion model is co-trained with both synthesized and real target emotional speech. Finally, the model is fine-tuned with the real target emotional speech to further boost speaker and emotional similarity.
We introduce an enumeration-free method based on mathematical programming to precisely characterize various properties such as fairness or sparsity within the set of "good models", known as the Rashomon set. Thi...
Classical routing heuristics, e.g., Open Shortest Path First, have several significant limitations: they cannot generalize or adapt to heterogeneous environments with changing topology, traffic patterns, and Quality of Service (QoS) requirements. To generalize solutions, network operators have recently applied machine learning algorithms at centralized controllers. However, centralized machine learning solutions do not scale well for several reasons, such as slow data transfer to the central controller in a large network. Distributed multi-agent systems do not require a complex central controller, and they reduce the data storage and computation burden because tasks are divided and handled at local servers/computers. In this paper, we present a fully distributed multi-agent system named MADQN that addresses the request provisioning problem, i.e., provisioning a request on a dedicated path that satisfies latency and bandwidth requirements. MADQN applies a Deep Q-Network reinforcement learning algorithm to train the agents. Although each agent keeps its data and policy locally, the agents can still cooperate to complete a common routing task that maximizes the total reward. We evaluate the effectiveness of MADQN on a benchmark network that consists of 100 nodes and 432 directed links, with a dynamic set of thousands of requests. The results show that the agents in MADQN can learn to cooperate and provision 99% of the requests, which is about a 9% improvement over the centralized single-agent scheme.
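To make the request provisioning constraint concrete, the following minimal sketch checks whether a candidate path meets a request's latency and bandwidth requirements. It is an illustration only and not the MADQN implementation: the `Link` class, the `path_satisfies_request` function, the field names, and the toy topology are all hypothetical, and it assumes that path latency is the sum of link latencies and that usable bandwidth is limited by the bottleneck link.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical directed link; field names are assumptions for illustration."""
    latency_ms: float      # delay contributed by this link
    free_bandwidth: float  # bandwidth still available on this link


def path_satisfies_request(path, links, max_latency_ms, min_bandwidth):
    """Return True if the node sequence `path` meets both requirements.

    Assumes additive latency along the path and bottleneck-limited bandwidth.
    """
    hops = list(zip(path, path[1:]))
    # A path needs at least one hop, and every hop must be an existing link.
    if not hops or any(hop not in links for hop in hops):
        return False
    total_latency = sum(links[hop].latency_ms for hop in hops)
    bottleneck = min(links[hop].free_bandwidth for hop in hops)
    return total_latency <= max_latency_ms and bottleneck >= min_bandwidth


# Toy topology with three directed links (values are made up).
links = {
    ("A", "B"): Link(latency_ms=5.0, free_bandwidth=100.0),
    ("B", "C"): Link(latency_ms=7.0, free_bandwidth=40.0),
    ("A", "C"): Link(latency_ms=20.0, free_bandwidth=200.0),
}

feasible = path_satisfies_request(["A", "B", "C"], links, 15.0, 30.0)
```

In a learning-based scheme such a check could, for instance, feed into the reward that agents receive for a provisioned request; how MADQN actually defines its reward is not stated in the abstract.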
As image processing techniques develop at a growing pace and machine learning and deep learning are increasingly applied to visual inspection and physical assessment, this article reviews the existing literature. It provides a detailed synthesis covering surface pavement conditions, computer-vision-based technologies for road damage detection, and various datasets and data collection methods. We analyse and compare the machine-learning methods and models proposed in the literature and identify challenges in road surface defect detection that need to be addressed in the future.
Authors: Tu, Deng-Yao; Lin, Peng-Chan; Chou, Hsin-Hung; Shen, Meng-Ru; Hsieh, Sun-Yuan
Affiliations:
National Cheng Kung University, Master Degree Program on Artificial Intelligence, Tainan City 70101, Taiwan
National Cheng Kung University, Institute of Medical Informatics, Department of Oncology, Department of Genomic Medicine, National Cheng Kung University Hospital, College of Medicine, Department of Computer Science and Information Engineering, Tainan City 70101, Taiwan
National Chi Nan University, Department of Computer Science and Information Engineering, Nantou County 54561, Taiwan
National Cheng Kung University, Graduate Institute of Clinical Medicine, Department of Obstetrics and Gynecology, Department of Pharmacology, National Cheng Kung University Hospital, College of Medicine, Tainan City 70101, Taiwan
National Cheng Kung University, Institute of Medical Information, Institute of Manufacturing Information and Systems, Center for Innovative FinTech Business Models, International Center for the Scientific Development of Shrimp Aquaculture, Department of Computer Science and Information Engineering, Tainan City 70101, Taiwan
Automatic liver tumor detection from computed tomography (CT) makes clinical examinations more accurate. However, deep learning-based detection algorithms are characterized by high sensitivity and low precision, which...
ISBN (digital): 9798350348958
ISBN (print): 9798350348965
The electrically evoked compound action potential (ECAP) has been used in various clinical studies and has become a key physiological signal for cochlear implants (CI). This study used four sensing electrodes to record ECAP signals based on the alternating polarity approach. An electrical field imaging (EFI) result based on the finite element method was used to obtain the interface impedance; ECAP simulation results were then computed and compared with a patient's clinical ECAP measurements. Preliminary modeling results show that the interface impedance obtained by this EFI-based technique can improve the simulation accuracy of the ECAP model. The ECAP modeling result will be compared with clinical ECAP measurements to validate the model in the full paper.
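The idea behind alternating polarity recording can be illustrated with a short sketch. It rests on an idealized assumption that is not taken from the abstract's own data: the stimulus artifact flips sign when the stimulus polarity is inverted, while the neural response is the same for both polarities, so averaging the two recordings cancels the artifact. The function name and the sample values are hypothetical.

```python
def alternating_polarity_average(anodic_response, cathodic_response):
    """Average two recordings taken with opposite stimulus polarities.

    Under the idealized assumption that the artifact inverts with polarity
    and the neural response does not, the average keeps only the response.
    """
    if len(anodic_response) != len(cathodic_response):
        raise ValueError("recordings must have the same number of samples")
    return [(a + c) / 2.0 for a, c in zip(anodic_response, cathodic_response)]


# Toy signals (made-up values): a neural response plus a decaying artifact.
neural = [0.0, -0.25, 0.5, 0.125]
artifact = [1.0, 0.5, 0.25, 0.125]

anodic = [n + s for n, s in zip(neural, artifact)]    # artifact with + sign
cathodic = [n - s for n, s in zip(neural, artifact)]  # artifact with - sign

recovered = alternating_polarity_average(anodic, cathodic)
```

In real recordings the neural response may differ somewhat between polarities, so the cancellation is only approximate.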
ISBN (digital): 9798331504465
ISBN (print): 9798331504472
Voice recognition systems are crucial because they allow seamless human-computer interaction and improve accessibility for users of all abilities. The use of these technologies in hands-free control, language translation, virtual assistants, and transcription services is revolutionising how we engage with technology and enhancing convenience and productivity in general. Several attendance systems based on voice recognition exist, but we wanted to deploy an attendance system with a good graphical user interface specifically for students of GIK Institute. For this purpose, we set out to build a user-friendly and accurate voice recognition system trained on self-provided data from ten students. This study introduces an AI-driven attendance system, which demonstrates high efficiency and accuracy in identifying students' daily class attendance. To achieve this, the Gaussian Mixture Model approach was employed. The paper also covers the libraries and methods used, including the training and validation of well-known machine learning models. Additionally, the system's performance, strengths, weaknesses, and potential areas for improvement are discussed.
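As a rough illustration of how a Gaussian Mixture Model can identify a speaker, the sketch below scores a sequence of feature vectors against one diagonal-covariance GMM per enrolled student and picks the model with the highest average log-likelihood. This is not the system described in the abstract: the function names, the two-dimensional toy features, and the hand-set model parameters are all hypothetical. The training step (typically expectation-maximization) and the feature extraction are omitted, and a real system would likely use richer acoustic features, which is an assumption here.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of `frames` under one GMM.

    `gmm` is a list of (weight, mean, variance) components.
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)


def identify_speaker(frames, speaker_models):
    """Return the enrolled name whose GMM best explains the frames."""
    return max(
        speaker_models,
        key=lambda name: gmm_log_likelihood(frames, speaker_models[name]),
    )


# Hypothetical, hand-set models; real parameters would come from training.
speaker_models = {
    "student_a": [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [2.0, 2.0], [0.5, 0.5])],
    "student_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [7.0, 4.0], [0.5, 0.5])],
}

predicted = identify_speaker([[0.1, -0.2], [1.9, 2.1]], speaker_models)
```

An attendance system would additionally need a rejection rule, for example a minimum likelihood threshold, so that a voice outside the enrolled set is not forced onto the closest student.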
Introduction: Computing Salient Feature Points (SFP) of 3D models has important application value in the field of computer graphics. In order to extract the SFP more effectively, a novel SFP computing algorithm based ...
Acute chest pain is a common symptom of cardiovascular disease, and its data have important research value. However, the presence of missing values in medical datasets is almost inevitable, which may adversely affect t...