Nowadays, botnet-based attacks are the most prevalent cyber-threats type. It is therefore essential to detect this kind of malware using efficient bots detection techniques. This paper presents our security anomalies ...
详细信息
ISBN:
(纸本)9781728181042
Nowadays, botnet-based attacks are the most prevalent cyber-threats type. It is therefore essential to detect this kind of malware using efficient bots detection techniques. This paper presents our security anomalies detection system, based on a model that we named Combined Forest. Our approach consists of merging some pre-processed Decision Trees to highlight different kinds of botnet by detecting their intrinsic exchanges. Using a supervised data approach, each tree is built from a labelled dataset. In order to achieve this, we aggregate the IP-flows into Traffic-flows to extract key features and avoid over-fitting. Then, we tested different machine learning algorithms and selected the most suitable one. After that, many experiments have been done to determine the best parameters and design the most accurate, adaptative and efficient model.
Background: Accurately predicted contacts allow to compute the 3D structure of a protein. Since the solution space of native residue-residue contact pairs is very large, it is necessary to leverage information to iden...
详细信息
Background: Accurately predicted contacts allow to compute the 3D structure of a protein. Since the solution space of native residue-residue contact pairs is very large, it is necessary to leverage information to identify relevant regions of the solution space, i.e. correct contacts. Every additional source of information can contribute to narrowing down candidate regions. Therefore, recent methods combined evolutionary and sequence-based information as well as evolutionary and physicochemical information. We develop a new contact predictor (EPSILON-CP) that goes beyond current methods by combining evolutionary, physicochemical, and sequence-based information. The problems resulting from the increased dimensionality and complexity of the learning problem are combated with a careful feature analysis, which results in a drastically reduced feature set. The different information sources are combined using deep neural networks. Results: On 21 hard CASP11 FM targets, EPSILON-CP achieves a mean precision of 35.7% for top-L/10 predicted long-range contacts, which is 11% better than the CASP11 winning version of metaPSICOV. The improvement on 1.5L is 17%. Furthermore, in this study we find that the amino acid composition, a commonly used feature, is rendered ineffective in the context of meta approaches. The size of the refined feature set decreased by 75%, enabling a significant increase in training data for machine learning, contributing significantly to the observed improvements. Conclusions: Exploiting as much and diverse information as possible is key to accurate contact prediction. Simply merging the information introduces new challenges. Our study suggests that critical feature analysis can improve the performance of contact prediction methods that combine multiple information sources. EPSILON-CP is available as a webservice: http://***/epsilon/
暂无评论