Search results - Inner Mongolia University Library

An open source multistep model to predict mutagenicity from statistical analysis and relevant structural alerts

CHEMISTRY CENTRAL JOURNAL, 2010, Vol. 4, No. 1, pp. 1-6

Authors: Ferrari, Thomas; Gini, Giuseppina. Affiliation: Politecn Milan, Dept Elect & Informat DEI, I-20133 Milan, Italy

Background: Mutagenicity is the capability of a substance to cause genetic mutations. This property is of high public concern because it has a close relationship with carcinogenicity and potentially with reproductive toxicity. Experimentally, mutagenicity can be assessed by the Ames test on Salmonella with an estimated experimental reproducibility of 85%; this intrinsic limitation of the in vitro test, along with the need for faster and cheaper alternatives, opens the road to other types of assessment methods, such as in silico structure-activity prediction models. A widely used method checks for the presence of known structural alerts for mutagenicity. However, the presence of such alerts alone is not a definitive method to prove the mutagenicity of a compound towards Salmonella, since other parts of the molecule can influence and potentially change the classification. Hence statistically based methods are proposed, with the final objective of obtaining a cascade of modeling steps with custom-made properties, such as the reduction of false negatives. Results: A cascade model has been developed and validated on a large public set of molecular structures and their associated Salmonella mutagenicity outcomes. The first step consists of the derivation of a statistical model and mutagenicity prediction, followed by further checks for specific structural alerts in the "safe" subset of the prediction outcome space. In terms of accuracy (i.e., overall correct predictions of both negatives and positives), the obtained model approached the 85% reproducibility of the experimental Ames mutagenicity test. Conclusions: The model and the documentation for regulatory purposes are freely available on the CAESAR website. The input is simply a file of molecular structures and the output is the classification result.

Keywords: Support Vector Machine; Support Vector Machine Classifier; machine learning algorithm; Support Vector Machine Model; Ames Test
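
The two-step cascade described in this abstract (a statistical prediction first, then structural-alert checks applied only to molecules predicted "safe") might be sketched as follows. This is a minimal illustration, not the CAESAR implementation: `cascade_predict`, `toy_model`, and `toy_alerts` are hypothetical names, and the substring tests stand in for a real statistical model and real structural alerts.

```python
from typing import Callable, Iterable

def cascade_predict(molecule: str,
                    statistical_model: Callable[[str], bool],
                    structural_alerts: Iterable[Callable[[str], bool]]) -> str:
    """Return 'mutagenic' or 'non-mutagenic' for one molecule."""
    # Step 1: the statistical model makes the first call.
    if statistical_model(molecule):
        return "mutagenic"
    # Step 2: molecules predicted "safe" are re-checked against the alerts.
    # An alert can only turn a negative into a positive, which is how the
    # cascade trades some false positives for fewer false negatives.
    if any(alert(molecule) for alert in structural_alerts):
        return "mutagenic"
    return "non-mutagenic"

# Toy stand-ins (assumptions): substring tests on SMILES-like strings.
toy_model = lambda smiles: "N=N" in smiles
toy_alerts = [lambda smiles: "[N+](=O)[O-]" in smiles]

molecules = ["CCO", "c1ccccc1N=Nc1ccccc1", "c1ccccc1[N+](=O)[O-]"]
results = [cascade_predict(m, toy_model, toy_alerts) for m in molecules]
print(results)
```

The third molecule is missed by the toy model but caught by the alert step, which is the behaviour the cascade is designed to provide.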

Real-time Context-aware Network Security Policy Enforcement System (RC-NSPES)

5th International Conference on Networking and Services (ICNS)

Authors: Badii, A.; Carter, A.; Handzlik, A.; Bojanic, S.; Englert, T.; Patel, D.; Pejovic, V.; Chorazyczewski, A.; Hameed, K.; Bankovic, Z. Affiliations: Univ Reading, Reading RG6 2AH, Berks, England; Netexpose, Creteil, France; Microtech Int Ltd, Wroclaw, Poland; Univ Politecn Madrid, Madrid, Spain

ISBN: (print) 9781424436880

The major technical objectives of the RC-NSPES are to provide a framework for the concurrent operation of reactive and pro-active security functions to deliver efficient and optimised intrusion detection schemes as well as enhanced and highly correlated rule sets for more effective alerts management and root-cause analysis. The design and implementation of the RC-NSPES solution includes a number of innovative features in terms of real-time programmable embedded hardware (FPGA) deployment as well as in the integrated management station. These have been devised so as to deliver enhanced detection of attacks and contextualised alerts against threats that can arise from both the network layer and the application layer protocols. The resulting architecture represents an efficient and effective framework for the future deployment of network security systems.

Keywords: Network Security Policy Enforcement; Intrusion detection system; FPGA; contextualised alerts; communication protocols (generic); security rules; rules language; string matching algorithm; network management station; machine learning algorithm; RC-NSPES; real-time IDS

Application of two machine learning algorithms to genetic association studies in the presence of covariates

BMC GENETICS, 2008, Vol. 9, No. 1, pp. 1-13

Authors: Nonyane, Bareng A. S.; Foulkes, Andrea S. Affiliation: Univ Massachusetts, Div Biostat & Epidemiol, Sch Publ Hlth & Hlth Sci, Amherst, MA 01003, USA

Background: Population-based investigations aimed at uncovering genotype-trait associations often involve high-dimensional genetic polymorphism data as well as information on multiple environmental and clinical parameters. Machine learning (ML) algorithms offer a straightforward analytic approach for selecting subsets of these inputs that are most predictive of a pre-defined trait. The performance of these algorithms in the presence of covariates, however, is not well characterized. Methods and Results: In this manuscript, we investigate two approaches: Random Forests (RFs) and Multivariate Adaptive Regression Splines (MARS). Through multiple simulation studies, the performance under several underlying models is evaluated. An application to a cohort of HIV-1 infected individuals receiving anti-retroviral therapies is also provided. Conclusion: Consistent with more traditional regression modeling theory, our findings highlight the importance of considering the nature of underlying gene-covariate-trait relationships before applying ML algorithms, particularly when there is potential confounding or effect mediation.

Keywords: machine learning; False Discovery Rate; Random Forest; machine learning algorithm; Multivariate Adaptive Regression Spline

Early identifying application traffic with application characteristics

IEEE International Conference on Communications (ICC 2008)

Authors: Huang, Nen-Fu; Jai, Gin-Yuan; Chao, Han-Chieh. Affiliations: Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 30043, Taiwan; Natl Ilan Univ, Dept Elect, Yilan, Taiwan

ISBN: (print) 9781424420742

To extract the characteristics of application flows more accurately, this paper proposes a set of flow attributes that characterize the possible negotiation behaviors of each flow from an application-layer perspective. The discriminators are available in the early stage of a flow, so they are suitable for supporting real-time traffic classification and engineering. The discriminating ability of the flow attributes was tested with several machine learning algorithms. We also compare the accuracy of our method with that of related works that addressed the real-time traffic classification problem on the same sample traffic. The results show that our method outperforms previous works in protocol-level identification, with an accuracy improvement of more than 8% to 21% on fixed-ratio sample flow sets. Furthermore, the proposed method is also suitable for identifying encrypted protocols.

Keywords: P2P; traffic identification/classification; machine learning algorithm

Using machine learning algorithms to guide rehabilitation planning for home care clients

BMC MEDICAL INFORMATICS AND DECISION MAKING, 2007, Vol. 7, No. 1, pp. 1-13

Authors: Zhu, Mu; Zhang, Zhanyang; Hirdes, John P.; Stolee, Paul. Affiliations: Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON N2L 3G1, Canada; Univ Waterloo, Dept Hlth Studies & Gerontol, Waterloo, ON N2L 3G1, Canada; Homewood Hlth Ctr, Homewood Res Inst, Guelph, ON, Canada; Univ Waterloo, Sch Optometry, Waterloo, ON N2L 3G1, Canada; Univ Waterloo, Res Inst Aging, Waterloo, ON N2L 3G1, Canada

Background: Targeting older clients for rehabilitation is a clinical challenge and a research priority. We investigate the potential of machine learning algorithms - Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) - to guide rehabilitation planning for home care clients. Methods: This study is a secondary analysis of data on 24,724 longer-term clients from eight home care programs in Ontario. Data were collected with the RAI-HC assessment system, in which the Activities of Daily Living Clinical Assessment Protocol (ADLCAP) is used to identify clients with rehabilitation potential. For study purposes, a client is defined as having rehabilitation potential if there was: i) improvement in ADL functioning, or ii) discharge home. SVM and KNN results are compared with those obtained using the ADLCAP. For comparison, the machine learning algorithms use the same functional and health status indicators as the ADLCAP. Results: The KNN and SVM algorithms achieved similar, substantially improved performance over the ADLCAP, although false positive and false negative rates were still fairly high (FP > .18, FN > .34 versus FP > .29, FN > .58 for ADLCAP). Results are used to suggest potential revisions to the ADLCAP. Conclusion: Machine learning algorithms achieved better predictions than the current protocol. Machine learning results are less readily interpretable, but can also be used to guide development of improved clinical protocols.

Keywords: Support Vector Machine; machine learning algorithm; Support Vector Machine Model; Original Scale; Support Vector Machine algorithm

Engineering proteinase K using machine learning and synthetic genes

BMC BIOTECHNOLOGY, 2007, Vol. 7, No. 1, pp. 1-19

Authors: Liao, Jun; Warmuth, Manfred K.; Govindarajan, Sridhar; Ness, Jon E.; Wang, Rebecca P.; Gustafsson, Claes; Minshull, Jeremy. Affiliations: DNA 20, Menlo Pk, CA 94025, USA; Univ Calif Santa Cruz, Dept Comp Sci, Santa Cruz, CA 95064, USA

Background: Altering a protein's function by changing its sequence allows natural proteins to be converted into useful molecular tools. Current protein engineering methods are limited by a lack of high throughput physical or computational tests that can accurately predict protein activity under conditions relevant to its final application. Here we describe a new synthetic biology approach to protein engineering that avoids these limitations by combining high throughput gene synthesis with machine learning-based design algorithms. Results: We selected 24 amino acid substitutions to make in proteinase K from alignments of homologous sequences. We then designed and synthesized 59 specific proteinase K variants containing different combinations of the selected substitutions. The 59 variants were tested for their ability to hydrolyze a tetrapeptide substrate after the enzyme was first heated to 68 °C for 5 minutes. Sequence and activity data were analyzed using machine learning algorithms. This analysis was used to design a new set of variants predicted to have increased activity over the training set, which were then synthesized and tested. By performing two cycles of machine learning analysis and variant design we obtained 20-fold improved proteinase K variants while only testing a total of 95 variant enzymes. Conclusion: The number of protein variants that must be tested to obtain significant functional improvements determines the type of tests that can be performed. Protein engineers wishing to modify the property of a protein to shrink tumours or catalyze chemical reactions under industrial conditions have until now been forced to accept high throughput surrogate screens to measure protein properties that they hope will correlate with the functionalities that they intend to modify. By reducing the number of variants that must be tested to fewer than 100, machine learning algorithms make it possible to use more complex and expensive tests so that only protein properties

Keywords: machine learning; Amino Acid Substitution; Active Variant; Partial Less Square Regression; machine learning algorithm

A new approach in learning for intelligent multi agent systems

21st European Conference on Modelling and Simulation

Author: Elmahalawy, Ahmed M. Affiliation: Czech Tech Univ, Dept Cybernet, Karlovo Namesti 13, Prague 12135 2, Czech Republic

ISBN: (print) 9780955301827

Agent technology has recently become one of the most vibrant and fastest growing areas in information technology. The technology has been developed to use more than one agent in a system; such systems are called Multi Agent Systems (MAS). Systems that depend on this technology should be studied extensively. One of the most important characteristics of such a system is its ability to learn and adapt itself, which is achieved using a machine learning algorithm. Repertory Grid (RG) has become a widely used and accepted technique for knowledge elicitation and has been implemented as a major component of many computer-based knowledge acquisition systems. In this paper, RG is developed into a machine learning algorithm and then used in MAS.

Keywords: intelligent agent technology; intelligent multi agent system; machine learning algorithm; Repertory Grid algorithm

Development of new in silico methods to identify ligands for orphan GPCR

BMC Chemistry, 2008, Vol. 2, No. 1, p. 1

Authors: Nathanael Weill; Didier Rognan. Affiliation: Bioinformatics of the Drug, UMR7175/LC1, Université Louis Pasteur Strasbourg, 74 route du Rhin, B.P.24, F-67401 Illkirch, France

An incremental EM-based learning approach for on-line prediction of hospital resource utilization

ARTIFICIAL INTELLIGENCE IN MEDICINE, 2006, Vol. 36, No. 3, pp. 257-267

Authors: Ng, SK; McLachlan, GJ; Lee, AH. Affiliations: Univ Queensland, Dept Math, Brisbane, Qld 4072, Australia; Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia; Curtin Univ Technol, Sch Publ Hlth, Perth, WA 6845, Australia

Objective: Inpatient length of stay (LOS) is an important measure of hospital activity, health care resource consumption, and patient acuity. This research work aims at developing an incremental expectation maximization (EM) based learning approach on a mixture of experts (ME) system for on-line prediction of LOS. The use of a batch-mode learning process in most existing artificial neural networks to predict LOS is unrealistic, as the data become available over time and their patterns change dynamically. In contrast, an on-line process is capable of providing an output whenever a new datum becomes available. This on-the-spot information is therefore more useful and practical for making decisions, especially when one deals with a tremendous amount of data. Methods and material: The proposed approach is illustrated using a real example of gastroenteritis LOS data. The data set was extracted from a retrospective cohort study on all infants born in 1995-1997 and their subsequent admissions for gastroenteritis. The total number of admissions in this data set was n = 692. Linked hospitalization records of the cohort were retrieved retrospectively to derive the outcome measure, patient demographics, and associated co-morbidities information. A comparative study of the incremental learning and the batch-mode learning algorithms is considered. The performances of the learning algorithms are compared based on the mean absolute difference (MAD) between the predictions and the actual LOS, and the proportion of predictions with MAD < 1 day (Prop(MAD < 1)). The significance of the comparison is assessed through a regression analysis. Results: The incremental learning algorithm provides better on-line prediction of LOS when the system has gained sufficient training from more examples (MAD = 1.77 days and Prop(MAD < 1) = 54.3%), compared to that using the batch-mode learning. The regression analysis indicates a significant decrease of MAD (p-value = 0.063) and a significant (p-value =

Keywords: EM algorithm; mixture of experts; incremental update; length of stay; machine learning algorithm; on-line prediction
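
The two evaluation measures named in this abstract, the mean absolute difference (MAD) and the proportion of predictions within one day of the actual LOS, can be computed as in the sketch below. The function name `mad_and_prop` and the sample values are made up for illustration and are not data from the study.

```python
def mad_and_prop(predicted, actual, tolerance=1.0):
    """Return (MAD, proportion of predictions with absolute error < tolerance)."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    mad = sum(errors) / len(errors)
    prop = sum(e < tolerance for e in errors) / len(errors)
    return mad, prop

# Hypothetical predicted and actual lengths of stay, in days.
predicted = [2.0, 3.5, 1.0, 6.0]
actual = [2.5, 3.0, 4.0, 6.5]

mad, prop = mad_and_prop(predicted, actual)
print(f"MAD = {mad} days, Prop(MAD < 1) = {prop:.1%}")
```

Reporting both measures is useful because a single large miss (here, the third admission) inflates the MAD while leaving the within-one-day proportion largely unchanged.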

Clustered linear regression

KNOWLEDGE-BASED SYSTEMS, 2002, Vol. 15, No. 3, pp. 169-175

Authors: Ari, B; Güvenir, HA. Affiliation: Bilkent Univ, Dept Comp Engn, TR-06533 Ankara, Turkey

Clustered linear regression (CLR) is a new machine learning algorithm that improves the accuracy of classical linear regression by partitioning the training space into subspaces. CLR makes some assumptions about the domain and the data set. First, the target value is assumed to be a function of the feature values. The second assumption is that there are some linear approximations for this function in each subspace. Finally, there are enough training instances to determine the subspaces and their linear approximations successfully. Tests indicate that if these approximations hold, CLR outperforms all other well-known machine-learning algorithms. Partitioning may continue until the linear approximation fits all the instances in the training set, which generally occurs when the number of instances in the subspace is less than or equal to the number of features plus one. Otherwise, each new subspace will have a better fitting linear approximation. However, this will cause overfitting and give less accurate results for the test instances. The stopping condition can be defined as no significant decrease, or an increase, in relative error. CLR uses a small portion of the training instances to determine the number of subspaces. The need for a high number of training instances makes this algorithm suitable for data mining applications. (C) 2002 Elsevier Science B.V. All rights reserved.

Keywords: clustering; linear regression; machine learning algorithm; eager approach
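
The core idea in this abstract, fitting a separate linear approximation in each subspace of the training space, can be illustrated with a one-feature sketch. This is not the CLR algorithm itself: the abstract does not say how subspaces are chosen, so the split point below is fixed by hand, and `fit_line`, `fit_clustered`, and `predict` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def fit_clustered(xs, ys, split):
    """Fit one line per subspace; the split point is an assumed input."""
    left = [(x, y) for x, y in zip(xs, ys) if x < split]
    right = [(x, y) for x, y in zip(xs, ys) if x >= split]
    return fit_line(*zip(*left)), fit_line(*zip(*right))

def predict(model, split, x):
    """Use the line belonging to the subspace that contains x."""
    a, b = model[0] if x < split else model[1]
    return a * x + b

# Toy data that rises and then falls, so no single line fits well.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 1, 2, 3, 3, 2, 1, 0]

global_line = fit_line(xs, ys)          # one line for the whole space
model = fit_clustered(xs, ys, split=4)  # one line per subspace
print(global_line, model, predict(model, 4, 5))
```

On this toy data the single global line is flat, while the two per-subspace lines follow the rise and the fall exactly; with real data, splitting further would eventually overfit, as the abstract notes.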
