Search results - Inner Mongolia University Library

IEEE Workshop on Multimedia Signal Processing

Authors: Ching-Yu Chiu, Wen-Yi Hsiao, Yin-Cheng Yeh, Yi-Hsuan Yang, Alvin Wen-Yu Su. Affiliations: Graduate Program of Multimedia Systems and Intelligent Computing, National Cheng Kung University and Academia Sinica, Taiwan; Yating Music Team, Taiwan AI Labs, Taiwan; Research Center for IT Innovation, Academia Sinica, Taiwan; National Cheng Kung University, Taiwan

ISBN: (digital) 9781728193205

ISBN: (print) 9781728193236

Blind music source separation has been a popular and active subject of research in both the music information retrieval and signal processing communities. To counter the lack of available multi-track data for supervised model training, a data augmentation method that creates artificial mixtures by combining tracks from different songs has been shown to be useful in recent works. Following this line of work, we further examine in this paper extended data augmentation methods that consider more sophisticated mixing settings employed in the modern music production routine, the relationship between the tracks to be combined, and factors of silence. As a case study, we consider the separation of violin and piano tracks in a violin piano ensemble, evaluating the performance in terms of common metrics, namely SDR, SIR, and SAR. In addition to examining the effectiveness of these new data augmentation methods, we also study the influence of the amount of training data. Our evaluation shows that the proposed mixing-specific data augmentation methods can help improve the performance of a deep learning-based model for source separation, especially in the case of small training data.
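As a rough illustration of the baseline augmentation this abstract builds on (artificial mixtures made by combining tracks from different songs), the following is a minimal sketch in plain Python. The function name `make_random_mixture`, the gain range, and the toy signals are assumptions made for illustration; this is not the authors' pipeline, and the mixing-specific extensions studied in the paper are not reproduced here.

```python
import random

def make_random_mixture(violin_tracks, piano_tracks, rng, gain_range=(0.5, 1.0)):
    """Draw one violin and one piano track independently (so they may come
    from different songs), scale each by a random gain, and sum them.

    Returns the mixture and the two scaled sources, which would serve as
    the separation targets during supervised training.
    """
    violin = rng.choice(violin_tracks)
    piano = rng.choice(piano_tracks)
    n = min(len(violin), len(piano))  # truncate to the shorter track
    g_v = rng.uniform(*gain_range)    # hypothetical gain range
    g_p = rng.uniform(*gain_range)
    violin_target = [g_v * x for x in violin[:n]]
    piano_target = [g_p * x for x in piano[:n]]
    mixture = [v + p for v, p in zip(violin_target, piano_target)]
    return mixture, violin_target, piano_target

# Toy "tracks": short lists of samples standing in for audio waveforms.
rng = random.Random(0)
violin_tracks = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.1]]
piano_tracks = [[0.5, 0.5, 0.5], [0.2, 0.0, -0.2, 0.0, 0.2]]
mixture, violin_target, piano_target = make_random_mixture(
    violin_tracks, piano_tracks, rng
)
```

Because the mixture is the exact sum of the two scaled sources, each generated example comes with its own ground-truth targets, which is what makes this kind of augmentation usable for supervised separation models.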

Keywords: Training; Source separation; Training data; Production; Data models; Multiple signal classification; Music information retrieval


Mixing-specific data augmentation techniques for improved blind violin/piano source separation

arXiv, 2020

Authors: Chiu, Ching-Yu; Hsiao, Wen-Yi; Yeh, Yin-Cheng; Yang, Yi-Hsuan; Su, Alvin Wen-Yu. Affiliations: Graduate Program of Multimedia Systems and Intelligent Computing, National Cheng Kung University and Academia Sinica, Taiwan; Yating Music Team, Taiwan AI Labs, Taiwan; Research Center for IT Innovation, Academia Sinica, Taiwan; Dept. CSIE, National Cheng Kung University, Taiwan

Keywords: Mixing


Improving the Vocabulary Learning by Personalized Proficiency


IEEE International Conference on Ubi-Media Computing

Authors: Yi-Zhen Lin, Kai-Hsiang Chen, Jen-Wei Huang. Affiliations: Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan; Graduate Program of Multimedia Systems and Intelligent Computing, National Cheng Kung University and Academia Sinica, Taiwan

ISBN: (digital) 9781728128207

ISBN: (print) 9781728128214

Improving vocabulary is very important in language learning: if we can learn and remember words effectively, we can master a language more quickly, and many scholars have therefore proposed related research. Because of the learning mechanism of the human brain, people who learn something new may forget it within a short time. To make the model more complete, after analyzing Hermann Ebbinghaus's forgetting curve experiment, we add two variables: the acceptance of each word by the same person, and the ability of different people to remember vocabulary. With these two parameters, we aim to design a system that helps users review vocabulary they may have forgotten. The forgetting curve can thus be personalized, which makes the calculation of the best time for each user to review the vocabulary more accurate.
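The abstract does not state the formula it uses, so the following is only a sketch of how a personalized forgetting curve could be set up. It assumes the common exponential form of the Ebbinghaus curve, R(t) = exp(-t / S), and a purely hypothetical rule that scales the stability S by a per-word acceptance factor and a per-user ability factor; the function names and all numeric values are illustrative assumptions.

```python
import math

def retention(t_hours, stability_hours):
    """Exponential forgetting curve: predicted fraction retained after t hours."""
    return math.exp(-t_hours / stability_hours)

def personalized_stability(base_hours, word_acceptance, user_ability):
    """Hypothetical personalization: scale a base stability by a per-word
    factor and a per-user factor (both assumed to be positive numbers)."""
    return base_hours * word_acceptance * user_ability

def next_review_time(stability_hours, threshold=0.5):
    """Hours until predicted retention decays to the threshold.

    Solves exp(-t / S) = threshold for t, giving t = -S * ln(threshold).
    """
    return -stability_hours * math.log(threshold)

# Example: a slightly hard word (0.8) for a learner with good recall (1.25).
stability = personalized_stability(24.0, word_acceptance=0.8, user_ability=1.25)
review_at = next_review_time(stability, threshold=0.5)
```

Under these assumptions, a word that is harder for a given user gets a smaller stability and is therefore scheduled for review sooner.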


LSTM-based Text Emotion Recognition Using Semantic and Emotional Word Vectors


Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia)

Authors: Ming-Hsiang Su, Chung-Hsien Wu, Kun-Yi Huang, Qian-Bei Hong. Affiliations: Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan; Graduate Program of Multimedia Systems and Intelligent Computing, National Cheng Kung University and Academia Sinica, Tainan, Taiwan

ISBN: (print) 9781538653128

This study proposes a long short-term memory (LSTM)-based approach to text emotion recognition based on the semantic word vector and emotional word vector of the input text. For each word in an input text, the semantic word vector is extracted from the word2vec model. In addition, each lexical word is projected to all the emotional words defined in an affective lexicon to derive an emotional word vector. An autoencoder is then adopted to obtain the bottleneck features from the emotional word vector for dimensionality reduction. The autoencoder bottleneck features are then concatenated with the features in the semantic word vector to form the final textual features for emotion recognition. Finally, given the textual feature sequence of the entire sentence, the LSTM is used for emotion recognition by modeling the contextual emotion evolution of the input text. For evaluation, the NLPCC-MHMC-TE database, containing seven emotion categories (anger, boredom, disgust, anxiety, happiness, sadness, and surprise), was constructed and used. Five-fold cross-validation was employed to evaluate the performance of the proposed method. Experimental results show that the proposed LSTM-based method achieved a recognition accuracy of 70.66%, an improvement of 5.33% over the CNN-based method. Moreover, the proposed method based on integration of the semantic word vector and emotional word vector of the input text outperformed that using either feature vector individually.

Keywords: Emotion recognition; Databases; Logic gates; Feature extraction; Semantics; Speech recognition; Computer science


SyncGAN: Synchronize the Latent Spaces of Cross-Modal Generative Adversarial Networks


IEEE International Conference on Multimedia and Expo (ICME)

Authors: Wen-Cheng Chen, Chien-Wen Chen, Min-Chun Hu. Affiliations: Dept. of Computer Science and Information Engineering, National Cheng Kung University, Taiwan; Graduate Program of Multimedia Systems and Intelligent Computing, National Cheng Kung University and Academia Sinica, Taiwan

The generative adversarial network (GAN) has achieved impressive success in cross-domain generation, but it faces difficulty in cross-modal generation due to the lack of a common distribution between heterogeneous data. Most existing conditional cross-modal GANs adopt the strategy of one-directional transfer and have achieved preliminary success on text-to-image transfer. Instead of learning the transfer between different modalities, we aim to learn a synchronous latent space representing the cross-modal common concept. A novel network component named the synchronizer is proposed in this work to judge whether the paired data is synchronous/corresponding or not, which can constrain the latent space of generators in the GANs. Our GAN model, named SyncGAN, can successfully generate synchronous data (e.g., a pair of image and sound) from identical random noise. For transforming data from one modality to another, we recover the latent code by inverting the mappings of a generator and use it to generate data of a different modality. In addition, the proposed model can achieve semi-supervised learning, which makes our model more flexible for practical applications.

Keywords: Synchronization; Gallium nitride; Generators; Data models; Training; Generative adversarial networks; Training data


Exploring Macroscopic Fluctuation of Facial Expression for Mood Disorder Classification


Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia)

Authors: Qian-Bei Hong, Chung-Hsien Wu, Ming-Hsiang Su, Kun-Yi Huang. Affiliations: Graduate Program of Multimedia Systems and Intelligent Computing, National Cheng Kung University and Academia Sinica, Tainan, Taiwan; Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan

ISBN: (print) 9781538653128

In clinical diagnosis of mood disorders, a large portion of bipolar disorder patients (BDs) are misdiagnosed as having unipolar depression (UDs). Clinicians have confirmed that BDs generally show "reduced affect" during clinical treatment. Thus, an objective, one-time diagnosis system built with machine-learning techniques is expected to assist diagnosis. In this study, facial expressions of the BD, UD and control (C) groups elicited by emotional video clips are collected to explore the temporal fluctuation characteristics of facial muscle expression intensities among the three groups. The differences in facial expressions among mood disorders are investigated by observing macroscopic fluctuations. To deal with these problems, corresponding methods for feature extraction and modeling are proposed. From the viewpoint of macroscopic facial expression, action units (AUs) are applied to describe the temporal transformation of muscles. Then, the modulation spectrum is used to extract the short-term variation of the AUs. A multilayer perceptron (MLP)-based disorder prediction model is then applied to obtain the prediction results. For evaluation of the proposed method, 12 subjects for three groups are included in the K-fold (K=12) cross-validation experiments. The experimental results reached 61.1% classification accuracy and outperformed the other baseline methods.

Keywords: Mood; Gold; Fluctuations; Databases; Feature extraction; Modulation; Predictive models


Speech Emotion Recognition using Convolutional Neural Network with Audio Word-based Embedding


International Symposium on Chinese Spoken Language Processing

Authors: Kun-Yi Huang, Chung-Hsien Wu, Qian-Bei Hong, Ming-Hsiang Su, Yuan-Rong Zeng. Affiliations: Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan; PhD Program for Multimedia Systems and Intelligent Computing, National Cheng Kung University and Academia Sinica, Taiwan

ISBN: (print) 9781538656280; 9781538656273

A complete emotional expression typically contains a complex temporal course in a natural conversation. Related research on utterance-level, segment-level and multi-level processing lacks understanding of the underlying relation of emotional speech. In this work, a convolutional neural network (CNN) with audio word-based embedding is proposed for emotion modeling. Vector quantization is first applied to convert the low-level features of each speech frame into audio words using the k-means algorithm. Word2vec is then adopted to convert an input speech utterance into the corresponding audio word vector sequence. Finally, the audio word vector sequences of the training emotional speech data with emotion annotation are used to construct the CNN-based emotion model. The NCKU-ES database, containing seven emotion categories (happiness, boredom, anger, anxiety, sadness, surprise and disgust), was collected, and five-fold cross-validation was used to evaluate the performance of the proposed CNN-based method for speech emotion recognition. Experimental results show that the proposed method achieved an emotion recognition accuracy of 82.34%, an improvement of 8.7% over the long short-term memory (LSTM)-based method, which faced the challenging issue of long input sequences. Compared with raw features, the audio word-based embedding achieved an improvement of 3.4% for speech emotion recognition.
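To make the vector quantization step in this abstract concrete, here is a minimal sketch of mapping per-frame feature vectors to "audio word" indices. It assumes a codebook of centroids has already been learned (for instance by k-means, which is not implemented here); the function name `quantize_frames`, the two-dimensional features, and the codebook values are illustrative assumptions and do not come from the paper.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def quantize_frames(frames, codebook):
    """Replace each frame's feature vector with the index of its nearest
    codebook centroid. The resulting index sequence plays the role of a
    sentence of "audio words" that an embedding model could then consume."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(frame, codebook[k]))
        for frame in frames
    ]

# Hypothetical codebook with three centroids and four toy speech frames.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [0.2, 1.9], [1.1, 0.8]]
audio_words = quantize_frames(frames, codebook)
```

With this toy data the first frame falls nearest centroid 0, the second and fourth nearest centroid 1, and the third nearest centroid 2, so the utterance becomes a short sequence of discrete tokens.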

Keywords: Emotion recognition; Feature extraction; Speech recognition; Acoustics; Training; Databases; Neural networks


Spatial-Temporal pattern analysis and prediction of air quality in Taiwan


IEEE International Conference on Ubi-Media Computing

Authors: Ping-Wei Soh, Kai-Hsiang Chen, Jen-Wei Huang, Hone-Jay Chu. Affiliations: Institute of Computer and Communication Engineering, National Cheng Kung University, Tainan, Taiwan; PhD Program for Multimedia Systems and Intelligent Computing, National Cheng Kung University and Academia Sinica, Tainan, Taiwan; Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan; Department of Geomatics, National Cheng Kung University, Tainan, Taiwan

This study explores the spatial-temporal patterns of particulate matter (PM) in Taiwan. A probability map of PM and daily patterns are discussed. Data mining provides more detailed spatial-temporal information on PM variations and trends. The proposed model will show that data mining provides a relatively high goodness of fit and sufficient space-time explanatory power, particularly for air pollution frequency and affected areas. In the proposed model, a method using Dynamic Time Warping is proposed to analyse temporal similarity between stations. The proposed model can eliminate the global effect on a single station by drawing on the performance of multiple stations. The proposed model will further be used for prediction of PM2.5. The prediction results will be used to discuss the spatial-temporal relations between stations. This study will also investigate the distribution of PM and its cyclicality.
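For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of the textbook DTW distance between two time series, as it might be applied to readings from two monitoring stations. The absolute-difference local cost, the function name `dtw_distance`, and the station values are assumptions for illustration; the abstract does not specify the variant the authors used.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute difference as the
    local cost and no window constraint."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# Toy PM readings: station_b is station_a with a one-step lag at the start,
# station_c follows the opposite trend.
station_a = [10.0, 12.0, 20.0, 35.0, 30.0]
station_b = [10.0, 10.0, 12.0, 20.0, 35.0, 30.0]
station_c = [30.0, 28.0, 15.0, 12.0, 10.0]

lagged = dtw_distance(station_a, station_b)
opposite = dtw_distance(station_a, station_c)
```

Because DTW allows one series to repeat a sample while the other advances, the lagged station aligns perfectly with the first one, whereas the station with the opposite trend does not.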

Keywords: Predictive models; Time series analysis; Atmospheric modeling; Air pollution; Conferences; Data mining


BigNeuron: a resource to benchmark and predict performance of algorithms for automated tracing of neurons in light microscopy datasets


Nature Methods, 2023, vol. 20, no. 6, pp. 824-835

Authors: Linus Manubens-Gil, Zhi Zhou, Hanbo Chen, Arvind Ramanathan, Xiaoxiao Liu, Yufeng Liu, Alessandro Bria, Todd Gillette, Zongcai Ruan, Jian Yang, Miroslav Radojević, Ting Zhao, Li Cheng, Lei Qu, Siqi Liu, Kristofer E Bouchard, Lin Gu, Weidong Cai, Shuiwang Ji, Badrinath Roysam, Ching-Wei Wang, Hongchuan Yu, Amos Sironi, Daniel Maxim Iascone, Jie Zhou, Erhan Bas, Eduardo Conde-Sousa, Paulo Aguiar, Xiang Li, Yujie Li, Sumit Nanda, Yuan Wang, Leila Muresan, Pascal Fua, Bing Ye, Hai-Yan He, Jochen F Staiger, Manuel Peter, Daniel N Cox, Michel Simonneau, Marcel Oberlaender, Gregory Jefferis, Kei Ito, Paloma Gonzalez-Bellido, Jinhyun Kim, Edwin Rubel, Hollis T Cline, Hongkui Zeng, Aljoscha Nern, Ann-Shyn Chiang, Jianhua Yao, Jane Roskams, Rick Livesey, Janine Stevens, Tianming Liu, Chinh Dang, Yike Guo, Ning Zhong, Georgia Tourassi, Sean Hill, Michael Hawrylycz, Christof Koch, Erik Meijering, Giorgio A Ascoli, Hanchuan Peng. Affiliations: Institute for Brain and Intelligence Southeast University Nanjing China. Microsoft Corporation Redmond WA USA. Tencent AI Lab Bellevue WA USA. Computing Environment and Life Sciences Directorate Argonne National Laboratory Lemont IL USA. Kaya Medical Seattle WA USA. University of Cassino and Southern Lazio Cassino Italy. Center for Neural Informatics Structures and Plasticity Krasnow Institute for Advanced Study George Mason University Fairfax VA USA. Faculty of Information Technology Beijing University of Technology Beijing China. Beijing International Collaboration Base on Brain Informatics and Wisdom Services Beijing China. Nuctech Netherlands Rotterdam the Netherlands. Janelia Research Campus Howard Hughes Medical Institute Ashburn VA USA. Department of Electrical and Computer Engineering University of Alberta Edmonton Alberta Canada. Ministry of Education Key Laboratory of Intelligent Computation and Signal Processing Anhui University Hefei China. Paige AI New York NY USA. Scientific Data Division and Biological Systems and Engineering Division Lawrence Berkeley National Lab Berkeley CA USA.
Helen Wills Neuroscience Institute and Redwood Center for Theoretical Neuroscience UC Berkeley Berkeley CA USA. RIKEN AIP Tokyo Japan. Research Center for Advanced Science and Technology (RCAST) The University of Tokyo Tokyo Japan. School of Computer Science University of Sydney Sydney New South Wales Australia. Texas A&M University College Station TX USA. Cullen College of Engineering University of Houston Houston TX USA. Graduate Institute of Biomedical Engineering National Taiwan University of Science and Technology Taipei Taiwan. National Centre for Computer Animation Bournemouth University Poole UK. PROPHESEE Paris France. Department of Neuroscience Columbia University New York NY USA. Mortimer B. Zuckerman Mind Brain Behavior Institute Columbia University New York NY USA. Department of Computer Science Northern Illinois Universit

BigNeuron is an open community bench-testing platform with the goal of setting open standards for accurate and fast automatic neuron tracing. We gathered a diverse set of image volumes across several species that is representative of the data obtained in many neuroscience laboratories interested in neuron tracing. Here, we report generated gold standard manual annotations for a subset of the available imaging datasets and quantified tracing quality for 35 automatic tracing algorithms. The goal of generating such a hand-curated diverse dataset is to advance the development of tracing algorithms and enable generalizable benchmarking. Together with image quality features, we pooled the data in an interactive web application that enables users and developers to perform principal component analysis, t-distributed stochastic neighbor embedding, correlation and clustering, visualization of imaging and tracing data, and benchmarking of automatic tracing algorithms in user-defined data subsets. The image quality metrics explain most of the variance in the data, followed by neuromorphological features related to neuron size. We observed that diverse algorithms can provide complementary information to obtain accurate results and developed a method to iteratively combine methods and generate consensus reconstructions. The consensus trees obtained provide estimates of the neuron structure ground truth that typically outperform single algorithms in noisy datasets. However, specific algorithms may outperform the consensus tree strategy in specific imaging conditions. Finally, to aid users in predicting the most accurate automatic tracing results without manual annotations for comparison, we used support vector machine regression to predict reconstruction quality given an image volume and a set of automatic tracings.

Keywords: Computational neuroscience; Computational platforms and environments


Author Correction: BigNeuron: a resource to benchmark and predict performance of algorithms for automated tracing of neurons in light microscopy datasets


Nature Methods, 2024, vol. 21, no. 10, p. 1959
