This paper presents the VLSI implementation of the continuous restricted Boltzmann machine (CRBM), a probabilistic generative model that is able to model continuous-valued data with a simple and hardware-amenable training algorithm. The full CRBM system consists of stochastic neurons whose continuous-valued probabilistic behavior is mediated by injected noise. Integrating on-chip training circuits, the full CRBM system provides a platform for exploring computation with continuous-valued probabilistic behavior in VLSI. The VLSI CRBM's ability both to model and to regenerate continuous-valued data distributions is examined and limitations on its performance are highlighted and discussed.
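As a rough illustration of what a stochastic neuron "mediated by injected noise" can look like in software, the sketch below follows the CRBM formulation commonly given in the literature: a weighted input sum plus zero-mean Gaussian noise, passed through a sigmoid with adjustable slope and asymptotes. The abstract does not state the chip's actual equations, so the function name `crbm_neuron` and the parameters `sigma`, `a`, `lo`, and `hi` are assumptions made for illustration only.

```python
import math
import random


def crbm_neuron(inputs, weights, sigma=0.2, a=1.0, lo=-1.0, hi=1.0, rng=random):
    """Sample one continuous stochastic neuron state (illustrative sketch).

    inputs, weights: equal-length sequences of floats.
    sigma: standard deviation of the injected Gaussian noise (assumed name).
    a: slope ("noise-control") parameter of the sigmoid (assumed name).
    lo, hi: lower and upper asymptotes of the sigmoid (assumed names).
    """
    # Weighted sum of the inputs plus injected zero-mean Gaussian noise.
    x = sum(w * s for w, s in zip(weights, inputs)) + sigma * rng.gauss(0.0, 1.0)
    # Sigmoid squashed between the two asymptotes.
    return lo + (hi - lo) / (1.0 + math.exp(-a * x))
```

With `sigma=0` the unit is deterministic; larger `sigma` or smaller `a` makes the output spread more widely around the noiseless value, which is the sense in which the noise shapes the neuron's probabilistic behavior in this sketch.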
Background: Incorrectly annotated sequence data are becoming more commonplace as databases increasingly rely on automated techniques for annotation. Hence, there is an urgent need for computational methods for checking consistency of such annotations against independent sources of evidence and detecting potential annotation errors. We show how a machine learning approach designed to automatically predict a protein's Gene Ontology (GO) functional class can be employed to identify potential gene annotation errors. Results: In a set of 211 previously annotated mouse protein kinases, we found that 201 of the GO annotations returned by AmiGO appear to be inconsistent with the UniProt functions assigned to their human counterparts. In contrast, 97% of the predicted annotations generated using a machine learning approach were consistent with the UniProt annotations of the human counterparts, as well as with available annotations for these mouse protein kinases in the Mouse Kinome database. Conclusion: We conjecture that most of our predicted annotations are, therefore, correct and suggest that the machine learning approach developed here could be routinely used to detect potential errors in GO annotations generated by high-throughput gene annotation projects.
ISBN: 9781424417452 (print)
Extractive summarization usually automatically selects indicative sentences from a document according to a certain target summarization ratio, and then sequences them to form a summary. In this paper, we investigate the use of information from relevant documents retrieved from a contemporary text collection for each sentence of a spoken document to be summarized in a probabilistic generative framework for extractive spoken document summarization. In the proposed methods, the probability of a document being generated by a sentence is modeled by a hidden Markov model (HMM), while the retrieved relevant text documents are used to estimate the HMM's parameters and the sentence's prior probability. The results of experiments on Chinese broadcast news compiled in Taiwan show that the new methods outperform the previous HMM approach.
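The generative ranking idea can be sketched as follows, under assumptions the abstract does not spell out: each sentence is treated as a unigram model interpolated with a smoothed background collection model, sentences are scored by log P(D|S) + log P(S), and the top fraction given by the summarization ratio is returned in document order. The function names, the interpolation weight `lam`, and the add-one smoothing are illustrative choices, not the authors' implementation. In the paper's setting, the token list used to estimate a sentence's model could additionally include text from its retrieved relevant documents.

```python
import math
from collections import Counter


def sentence_log_score(sentence, document, collection, lam=0.5, log_prior=0.0):
    """Return log P(D|S) + log P(S) for one sentence (illustrative sketch).

    sentence, document, collection: lists of word tokens.
    lam: weight of the sentence model versus the background model (assumed).
    log_prior: log of the sentence prior; 0.0 corresponds to a uniform prior.
    """
    s_counts, c_counts = Counter(sentence), Counter(collection)
    s_len, c_len = len(sentence), len(collection)
    vocab = len(c_counts) + 1  # one extra slot for unseen words
    score = log_prior
    for w in document:
        p_s = s_counts[w] / s_len if s_len else 0.0
        p_c = (c_counts[w] + 1) / (c_len + vocab)  # add-one smoothing
        score += math.log(lam * p_s + (1.0 - lam) * p_c)
    return score


def select_summary(sentences, collection, ratio=0.3, lam=0.5):
    """Pick the top-scoring sentences and return their indices in document order."""
    document = [w for s in sentences for w in s]
    scores = [sentence_log_score(s, document, collection, lam) for s in sentences]
    k = max(1, round(ratio * len(sentences)))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)  # restore original sentence order for the summary
```

A call such as `select_summary(sentences, collection, ratio=0.3)` returns sentence indices; joining the corresponding sentences in that order yields the extractive summary.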