Most complex systems in various fields, such as social networks in social science, the Internet in engineering, and signaling pathways in biology, can be formulated as networks in which nodes represent entities and links represent relationships between nodes. Individual entities in a complex system seldom exist in isolation; rather, they are often organized in groups to perform functions. For example, an organization typically consists of units with different but related functions that interconnect in particular structures to maximize the overall performance of the organization. In biology, a group of proteins in a cell interact to form an RNA polymerase for the transcription of genes. Therefore, a critical step toward understanding complex systems is to uncover organizational or community structures in the networks [1]. Communities, also referred to as clusters or modules, are groups of nodes that share common properties or play similar roles. A primary objective of community detection is to identify sets of nodes with common functions by using all the information available in the network. Many methods for community detection have been proposed, and most of them use only information about network topology. However, since these methods rely on the accuracy of the relational networks, their ability to discover the true community structures degrades rapidly as the networks are perturbed by noise. Further, the problem becomes more complicated when the networks contain complex structures, such as hierarchical and/or overlapping communities. These problems may be mitigated by incorporating additional knowledge from sources beyond network topology. Firstly, a lot of content on nodes and/or edges is often available in many real applications, e.g. Twitter, Flickr, and Facebook in social media. The performance of community detection will be significantly improved if one considers this content information, especially when the network has
Speakers with speech disorders are people whose pronunciation is incorrect because of disease or damage affecting the central or peripheral nervous system. In the clinical manifestation, it usually performs like frozen and cerebral *** damage influences the control of the speech organs without impairing comprehension. Many previous studies focus on speech characteristics rather than articulatory movements. To support rehabilitation training, it is necessary to understand the mechanism of articulation in speakers with speech disorders. In this paper, analysis of acoustic parameters and movement trajectories shows how the tongue movement of disordered speakers differs from that of normal speakers, which affects the shape of the oral cavity and further affects the acoustic *** electromagnetic articulography (EMA) procedures capture the motions of external and internal articulators. We construct the TORGO database by labeling and collating, and integrate the EMA data of disordered and normal speakers to analyze the acoustics and movement trajectories. For the trajectory analysis, we extract specific phonemes from continuous speech to study the speaker's articular *** can be seen from the picture of the movement of the sensor attached to the tongue, the tongue of a disordered speaker swings back and forth during pronunciation, and the speaker cannot hold the tongue stable for any length of time. For the acoustic analysis, we use Mel Frequency Cepstrum Coefficients and linear predictive coding to extract feature parameters from the audio data. On this basis, we further extract specific phonemes from continuous speech to study the speaker's acoustic parameters. We mainly adopt the HMM method to analyze the parameters of the acoustic model. The experimental results indicate that the HMM method is effective for investigating the pronunciation mechanisms of disordered and normal speakers. Besides the second and third formants of disorder speakers are lower tha
In this paper, we propose a neurophysiological computational model combined with a physiological articulatory model to simulate the entire neurophysiological process of speech production. First, a set of speech data for vowels and consonant-vowel (CV) syllables was generated using the physiological articulatory model driven by muscle activations; the generated dataset was then used as training data to learn four self-organizing maps (SOMs): speech planning, motor control, auditory feedback, and somatosensory feedback. Experimental results demonstrate that implementing the physiological articulatory model in the neurophysiological computational model faithfully reflects the neurophysiological process of human speech production. The multi-connection of neurons is verified through the "one-to-many" projection between the SOMs. The distribution of tongue and jaw in the well-learned motor control map is comparable with that observed in electrocorticographic (ECoG) studies.
Speaker clustering is an important speech analysis task, considered mainly on large-scale data. State-of-the-art methods often first convert each speech signal into a high-dimensional vector representation using techniques such as GMM supervectors, Joint Factor Analysis and i-vectors, and then employ agglomerative hierarchical clustering algorithms for recognition. However, mapping an entire corpus of speech utterances to a set of vectors raises questions about the structure of the underlying manifold. Most data clustering methods work in Euclidean space and hence often fail to discover the intrinsic geometrical and discriminating structure of the data space, which limits their application in some complicated speaker recognition situations. To model the underlying manifold structure of the data space, we convert the i-vector representation of speech signals in Euclidean space into a network constructed from the local (k) nearest neighbor relationships among these signals. We then propose a community detection model for the recognition of speakers, built on the assumption that the group of speech signals corresponding to the same speaker will be densely connected relative to the rest of the network. Furthermore, the similarities between speech signals are also informative. We refine the model with the following idea: if two speech signals have a high similarity in the local Euclidean space, their community membership distributions should be made close in the model. This further enhances its local invariance, which is essential to respect the manifold structure. To sum up, the proposed method can not only effectively model the intrinsic Riemannian structure of the data space with the idea of local invariance, but also be very efficient because it works on highly sparse networks. The most relevant previous work is the method proposed by Shum, Campbell & Reynolds, which also used
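The network construction step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `knn_graph`, the use of cosine similarity, the symmetrization rule, and the toy vectors standing in for i-vectors are all assumptions, and the community detection model itself is not reproduced because the abstract does not specify it.

```python
import numpy as np

def knn_graph(vectors, k):
    """Symmetric 0/1 adjacency linking each row to its k most cosine-similar rows.

    Hypothetical helper: the abstract only says the network is built from
    local k-nearest-neighbor relationships; the similarity measure is assumed.
    """
    x = np.asarray(vectors, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit rows, so dot product = cosine
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)                    # a signal is not its own neighbor
    nearest = np.argsort(-sim, axis=1)[:, :k]         # indices of the k most similar rows
    adj = np.zeros(sim.shape, dtype=int)
    rows = np.repeat(np.arange(len(x)), k)
    adj[rows, nearest.ravel()] = 1
    return np.maximum(adj, adj.T)                     # keep an edge if either end chose it

# Toy stand-in for i-vectors: two "speakers", five utterances each.
rng = np.random.default_rng(0)
centers = np.zeros((2, 8))
centers[0, 0] = 10.0
centers[1, 1] = 10.0
vectors = np.vstack([c + 0.1 * rng.standard_normal((5, 8)) for c in centers])

adj = knn_graph(vectors, k=2)
cross_speaker_edges = int(adj[:5, 5:].sum())
print("edges between the two toy speakers:", cross_speaker_edges)
```

Because each node keeps only about k edges, the resulting graph is sparse, which is the property the abstract credits for efficiency.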
ISBN: (print) 9781450327923
Artists usually select colors carefully in artistic work so as to convey particular visual and emotional feelings. Color theme extraction techniques can help users acquire the color styles in an image. However, current color theme extraction methods ignore emotional factors and can provide only a single theme result for an image, which does not reflect people's preferences for different colors under different mood states. This paper introduces the concept of the emotional color theme, bringing color emotion theory into color theme extraction, and proposes a novel emotional color theme extraction framework. Our color theme extraction method can be applied in color transfer, image enhancement, etc.
ISBN: (print) 9781450325981
Recent research has shown that improving mean retrieval effectiveness (e.g., MAP) may sacrifice retrieval stability across queries, implying a tradeoff between effectiveness and stability. The evaluation of both effectiveness and stability is often based on a baseline model, which could be weak or biased. In addition, the effectiveness-stability tradeoff has not been systematically or quantitatively evaluated over TREC participating systems. These two problems, to some extent, limit our awareness of the tradeoff and its impact on developing future IR models. In this paper, motivated by a recently proposed bias-variance based evaluation, we adopt a strong and unbiased "baseline", which is a virtual target model constructed from the best performance (for each query) among all the participating systems in a retrieval task. We also propose generalized bias-variance metrics, based on which a systematic and quantitative evaluation of the effectiveness-stability tradeoff is carried out over the participating systems in the TREC Ad-hoc Track (1993-1999) and Web Track (2010-2012). We observe a clear effectiveness-stability tradeoff, which becomes more pronounced in more recent years. This implies that, as more effective IR systems have been pursued over the years, stability has become problematic and may have been largely overlooked. Copyright 2014 ACM.
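The virtual target model described above can be illustrated with a small sketch. The per-query best score follows the abstract; the bias and variance formulas below (mean and variance of each system's per-query gap to the target) are a simplified assumption for illustration, since the abstract does not give the generalized metrics. The function name and the toy scores are hypothetical.

```python
import numpy as np

def bias_variance_against_target(scores):
    """scores: array of shape (n_systems, n_queries), e.g. per-query average precision.

    Returns the virtual target plus a per-system bias and variance.
    Assumed definitions, not the paper's generalized metrics:
      bias     = mean gap to the per-query best score
      variance = variance of that gap across queries
    """
    scores = np.asarray(scores, dtype=float)
    target = scores.max(axis=0)      # best performance for each query among all systems
    gap = target - scores            # non-negative shortfall per system and query
    return target, gap.mean(axis=1), gap.var(axis=1)

# Toy scores for two hypothetical systems over three queries.
scores = [
    [0.5, 0.5, 0.5],   # steady but mediocre
    [0.9, 0.1, 0.8],   # better on average, erratic across queries
]
target, bias, variance = bias_variance_against_target(scores)
print("target:", target)
print("bias:", bias)
print("variance:", variance)
```

In this toy case the second system has the lower bias but the higher variance, which is the kind of effectiveness-stability tradeoff the abstract refers to.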
This paper proposes a new approach to non-rigid structure from motion with occlusion, based on sparse representation. We introduce a sparse transform into the joint estimation of 3D shapes and motions. The 3D shape trajectory space is fitted with a wavelet basis to achieve better modeling of complex motion. We address the occlusion problem using a recent development in sparse representation, matrix completion, which can recover an observation matrix that has a high percentage of missing data and can also reduce noise and outliers in the known elements. Experimental results on datasets with and without occlusion show that our method estimates 3D shapes and motions better than state-of-the-art algorithms.
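To make the idea of matrix completion concrete, here is a minimal sketch that fills missing entries of a low-rank matrix by repeated truncated SVD. It is a generic textbook-style scheme chosen for illustration, not the algorithm used in the paper (the abstract does not name one); the function name, the rank-1 toy matrix, and the iteration count are assumptions. Unlike the method described above, this sketch keeps the known entries fixed and so does not reduce noise or outliers in them.

```python
import numpy as np

def complete_low_rank(observed, mask, rank, n_iter=500):
    """Fill entries where mask is False using repeated rank-`rank` SVD truncation.

    Hypothetical helper for illustration only.
    """
    filled = np.where(mask, observed, 0.0)            # start with zeros in the gaps
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        filled = np.where(mask, observed, low_rank)   # keep known entries, update the gaps
    return filled

# Toy rank-1 "observation matrix" with five occluded entries.
truth = np.outer([1.0, 2.0, 1.0, 2.0, 1.0, 2.0], [1.0, 2.0, 1.0, 2.0, 1.0])
mask = np.ones(truth.shape, dtype=bool)
for i, j in [(0, 0), (1, 2), (2, 4), (3, 1), (5, 3)]:
    mask[i, j] = False

recovered = complete_low_rank(np.where(mask, truth, 0.0), mask, rank=1)
max_error = float(np.abs(recovered - truth).max())
print("largest recovery error:", max_error)
```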
Speech production is complex for the brain to control, since it involves many neural processes such as speech planning, motor control, and auditory and somatosensory feedback. These functions are thought to work both in cascade and in parallel, and the control signals are transformed from one brain area to others with "one-to-many" relations. To describe this situation, in this study we developed a new framework for a neuro-computational model of speech production based on our previous studies. The proposed model is used to deal with the dynamic properties of speech articulation for consonant-vowel (CV) syllables. In our simulation, the neuronal groups (i.e., motor, auditory and somatosensory) were acquired by learning and stored in self-organizing maps (SOMs), and the relations between the SOMs were investigated. The results show that the time-varying properties were represented properly. In the control signal flow, the model demonstrated "one-to-many" projections between the SOMs, where one neuron in an SOM was projected, on average, onto 1.64 neurons in another SOM.