Background: Several Computerized Adaptive Tests (CATs) have been proposed to facilitate assessments in mental health. These tests are built in a standard way, disregarding useful and usually available information not included in the assessment scales that could increase the precision and utility of CATs, such as the history of suicide attempts. Methods: Using the items of a previously developed scale for suicidal risk, we compared the performance of a standard CAT and a decision tree in a decision support system to identify suicidal behavior. We included the history of past suicide attempts as a class for the separation of patients in the decision tree. Results: The decision tree needed an average of four items to achieve an accuracy similar to that of a standard CAT with nine items. The accuracy of the decision tree, obtained after 25 cross-validations, was 81.4%. A shortened test adapted for the separation of suicidal and non-suicidal patients was developed. Conclusion: CATs can be very useful tools for the assessment of suicidal risk. However, standard CATs do not use all the information that is available. A decision tree can improve the precision of the assessment since it is constructed using a priori information. (C) 2016 Elsevier B.V. All rights reserved.
In the framework of toxicology, a testing strategy can be viewed as a series of steps taken to reach a final prediction about a characteristic of a compound under study. The testing strategy is performed either as a single-step procedure, usually called a test battery, which simultaneously uses all information collected on different endpoints, or as a tiered approach in which a decision tree is followed. Design of a testing strategy involves statistical considerations, such as the development of a statistical prediction model. During the EU FP6 ACuteTox project, several prediction models were proposed on the basis of statistical classification algorithms, which we illustrate here. The final choice of testing strategies was not based on statistical considerations alone. However, without thorough statistical evaluations a testing strategy cannot be identified. We present here a number of observations made from the statistical viewpoint which relate to the development of testing strategies. The points we make were derived from problems we had to deal with during the evaluation of this large research project. A central issue during the development of a prediction model is the danger of overfitting. Procedures are presented to deal with this challenge. (C) 2012 Elsevier Ltd. All rights reserved.
Protein secondary structure prediction (PSSP) is a fundamental task in protein science and computational biology. It can be used to understand protein 3-dimensional (3-D) structures and, further, to learn their biological functions. In the past decade, a large number of methods have been proposed for PSSP. To capture the latest progress in PSSP, this paper provides a survey of the development of this field. It first introduces the background and related knowledge of PSSP, including basic concepts, data sets, input data features and prediction accuracy assessment. Then, it reviews the recent algorithmic developments in PSSP, focusing mainly on the past decade. Finally, it summarizes the corresponding tendencies and challenges in this field. This survey concludes that although various PSSP methods have been proposed, there is still room for several further improvements and potential research directions. We hope that the presented guidelines will help nonspecialists and specialists to learn the critical progress in PSSP in recent years. (C) 2017 Elsevier Inc. All rights reserved.
Knee joints play an indispensable role in the activities of daily living. In particular, the knee joints of the elderly and the physically challenged require continuous care in order to ensure a healthy daily life. This study proposes a health monitoring system for knee joints, which is able to classify lower extremity movements using the angle and acceleration components of these joints. The proposed monitoring system consists of a wearable frame placed on the knee joint, which comprises a sensor part for monitoring the knee joint angle and acceleration and a wireless communication part for transferring biosignals to a smart device. Knee joint angles and accelerations are measured using potentiometers installed at the hinges of the upper and lower parts of the wearable frame and an inertial sensor (IMU) attached to the thigh. The measured data are transferred via Bluetooth to an application on a smart device. The proposed system incorporates a classification algorithm for lower extremity movements, which can distinguish users' actions such as sitting, lying, and standing by using real-time measurements of knee joint angles and accelerations. This study shows that the proposed monitoring system detects postures that negatively affect knee joints and informs a user when these postures are adopted, thereby helping to maintain healthy knee joints.
To obtain more general results on fuzzy implications induced by aggregation functions, we relax the definition of general overlap functions by removing the right-continuity requirement, and introduce a new kind of aggregation function, called the semi-overlap function. Subsequently, we explore some of its algebraic properties and its corresponding residual implications. Moreover, several scholars have provided methods for fuzzy modus ponens (FMP, for short) problems, such as Zadeh's compositional rule of inference (CRI, for short), Wang's triple I method (TIM, for short) and the quintuple implication principle (QIP, for short). Compared with CRI and TIM, QIP has some advantages in solving FMP problems. Based on the above theoretical foundation of semi-overlap functions and their residual implications, we further consider the QIP for FMP problems. Finally, we propose a new classification algorithm based on semi-overlap functions and QIP, called the SO5I-FRC algorithm. In comparative tests, the average accuracy of the SO5I-FRC algorithm is higher than that of the FARC-HD algorithm. The experimental results indicate that semi-overlap functions and QIP have certain advantages and a wide range of applications in classification problems. (c) 2022 Elsevier Inc. All rights reserved.
Background: Protein function is closely related to its location within the cell. Determination of protein subcellular location is helpful in uncovering its functions. However, traditional biological experiments to determine the subcellular location are costly and inefficient, and cannot meet today's needs. In recent years, many computational models have been set up to identify the subcellular location of proteins. Most models use features derived from protein sequences. Recently, features extracted from the protein-protein interaction (PPI) network have become popular in studying various protein-related problems. Objective: A novel model with features derived from multiple PPI networks was proposed to predict protein subcellular location. Methods: Protein features were obtained by a newly designed network embedding algorithm, Mnode2vec, which is a generalized version of the classic Node2vec algorithm. Two classic classification algorithms, support vector machine and random forest, were employed to build the model. Results: This model provided good performance and was superior to the model with features extracted by Node2vec. Also, this model outperformed some classic models. Furthermore, Mnode2vec was found to produce powerful features when the path length was small. Conclusion: The proposed model can be a powerful tool to determine protein subcellular location, and Mnode2vec can efficiently extract informative features from multiple networks.
The locations at which genomic DNA replication is initiated are defined as origins of replication sites (ORIs), which regulate the onset of DNA replication and play significant roles in the DNA replication process. The study of ORIs is essential for understanding the cell-division cycle and gene expression regulation. Accurate identification of ORIs by computational methods will provide important clues for DNA replication research and drug development. In this paper, the first integrated predictor, named iORI-Euk, was built to identify ORIs in multiple eukaryotes and multiple cell types. For the predictor, ORI data from seven eukaryotes (Homo sapiens, Mus musculus, Drosophila melanogaster, Arabidopsis thaliana, Pichia pastoris, Schizosaccharomyces pombe and Kluyveromyces lactis) were collected from a public database to construct benchmark datasets. Subsequently, three feature extraction strategies (k-mer, binary encoding, and a combination of k-mer and binary encoding) were used to formulate DNA sequence samples. We also compared the performance of different classification algorithms. The best results were obtained using a support vector machine in both the 5-fold cross-validation test and the independent dataset test. Based on the optimal model, an online web server called iORI-Euk (http://***/server/iORI-Euk/) was established for novel ORI identification.
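The three feature extraction strategies named in this abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function names (`kmer_frequencies`, `binary_encoding`, `combined_features`), the choice of k, the use of normalized frequencies rather than raw counts, and the one-hot layout for the binary encoding. It should not be read as the iORI-Euk implementation.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed nucleotide alphabet


def kmer_frequencies(seq, k=2):
    """Return normalized frequencies of all 4**k k-mers, in lexicographic order."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def binary_encoding(seq):
    """One-hot encode each base as 4 bits; unknown bases become all zeros."""
    one_hot = {
        "A": [1, 0, 0, 0],
        "C": [0, 1, 0, 0],
        "G": [0, 0, 1, 0],
        "T": [0, 0, 0, 1],
    }
    vec = []
    for base in seq:
        vec.extend(one_hot.get(base, [0, 0, 0, 0]))
    return vec


def combined_features(seq, k=2):
    """Concatenate the k-mer and binary feature vectors."""
    return kmer_frequencies(seq, k) + binary_encoding(seq)
```

The k-mer vector has a fixed length of 4^k regardless of sequence length, whereas the binary vector grows with the sequence, so the combined representation assumes fixed-length input samples.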
In this paper, we formulate and estimate a flexible model of job mobility and wages with two-sided heterogeneity. The analysis extends the finite mixture approach of Bonhomme, Lamadon, and Manresa (2019) and Abowd, McKinney, and Schmutte (2019) to develop a new classification Expectation-Maximization algorithm that ensures both worker and firm latent-type identification using wage and mobility variations in the data. Workers receive job offers in worker-type segmented labor markets. Offers are accepted according to a logit form that compares the value of the current job with that of the new job. In combination with flexibly estimated layoff and job finding rates, the analysis quantifies the four different sources of sorting: job preferences, segmentation, layoffs, and job finding. Job preferences are identified through job-to-job moves in a revealed preference argument. In the model, they are structurally independent of the identified job wages, possibly as a reflection of the presence of amenities. We find evidence of a strong pecuniary motive in job preferences. While the correlation between preferences and current job wages is positive, the net present value of the future earnings stream given the current job correlates much more strongly with preferences for it. This is more so for short- than long-tenure workers. In the analysis, we distinguish between type sorting and wage sorting. Type sorting is quantified by means of the mutual information index. Wage sorting is captured through correlation between identified wage types. While layoffs are less important than the other channels, we find all channels to contribute substantially to sorting. As workers age, job arrival processes are the key determinant of wage sorting, whereas job preferences dictate type sorting. Over the life cycle, job preferences intensify, type sorting increases, and pecuniary considerations wane.
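As an illustration of how a mutual information index can quantify type sorting, the sketch below computes the standard mutual information (in nats) from a table of match counts between worker types (rows) and firm types (columns). The function name `mutual_information` and the count-table input are assumptions for this example; the abstract does not state the exact definition or normalization the authors use, which may differ.

```python
import math


def mutual_information(joint_counts):
    """Mutual information (nats) of a worker-type by firm-type count table."""
    total = sum(sum(row) for row in joint_counts)
    n_cols = len(joint_counts[0])
    # Marginal distributions over worker types (rows) and firm types (columns).
    row_p = [sum(row) / total for row in joint_counts]
    col_p = [sum(row[j] for row in joint_counts) / total for j in range(n_cols)]
    mi = 0.0
    for i, row in enumerate(joint_counts):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing to the sum
                p = count / total
                mi += p * math.log(p / (row_p[i] * col_p[j]))
    return mi
```

Under this definition the index is zero when worker and firm types are matched independently (no sorting) and reaches log 2 for two equally sized types that are perfectly sorted.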
DNA-binding proteins (DBPs) are responsible for several cellular functions, ranging from the immune system to the transport of oxygen. In recent studies, scientists have used supervised machine learning methods that classify DBPs using information from the protein sequence only. Most of these methods work effectively on the training sets, but their performance degrades on the independent test set. This leaves room for improving the prediction method by reducing over-fitting. In this paper, we extracted several features using the protein sequence alone and carried out two different types of feature selection on them. Our results proved comparable on the training set and significantly improved on the independent test set. On the independent test set our accuracy was 82.26%, an improvement of 1.62% over the previous best state-of-the-art methods. Sensitivity and area under the receiver operating characteristic curve on the independent test set were also higher, at 0.95 and 0.823, respectively. (C) 2018 Elsevier Ltd. All rights reserved.
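For readers less familiar with the reported metrics, the following sketch shows how accuracy and sensitivity are conventionally computed from binary predictions on a held-out test set. The helper names and the toy labels are hypothetical and are not taken from the paper; label 1 is assumed to mean "DNA-binding".

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """Fraction of true positives that were detected (recall)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy labels, for illustration only.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
```

Computing these metrics on an independent test set, rather than on the training data, is what exposes the over-fitting the abstract describes.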
Wireless networks have become integral to society as they provide mobility and scalability advantages. However, their disadvantage is that they cannot control the transmission medium, which makes them vulnerable to various types of attacks. One example of such attacks is the evil twin access point (AP) attack, in which an authorized AP is impersonated by mimicking its service set identifier (SSID) and media access control (MAC) address. Evil twin APs are a major source of deception in wireless networks, facilitating message forgery and eavesdropping. Hence, it is necessary to detect them rapidly. To this end, numerous methods using clock skew have been proposed for evil twin AP detection. However, clock skew is difficult to calculate precisely because wireless networks are vulnerable to noise. This paper proposes an evil twin AP detection method that uses a multiple-feature-based machine learning classification algorithm. The features used in the proposed method are clock skew, channel, received signal strength, and duration. Experimental results indicate that the proposed method achieves an evil twin AP detection accuracy of 100% using the random forest algorithm.
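Clock skew is commonly estimated as the slope of the drift between an AP's beacon timestamps and the receiver's own clock. The sketch below fits that slope by ordinary least squares. It is a simplified illustration under stated assumptions: the function name `estimate_clock_skew`, the input format (paired local arrival times and beacon timestamps in seconds), and the use of plain least squares are not taken from the paper, which does not describe its estimator in the abstract.

```python
def estimate_clock_skew(local_times, beacon_timestamps):
    """Least-squares slope of (beacon timestamp - local time) against local time.

    Inputs are assumed to be in seconds; a return value of 50e-6 means 50 ppm.
    """
    offsets = [b - t for t, b in zip(local_times, beacon_timestamps)]
    n = len(local_times)
    mean_t = sum(local_times) / n
    mean_o = sum(offsets) / n
    # Ordinary least-squares slope: covariance(t, offset) / variance(t).
    num = sum((t - mean_t) * (o - mean_o) for t, o in zip(local_times, offsets))
    den = sum((t - mean_t) ** 2 for t in local_times)
    return num / den


# Synthetic, noise-free example: an AP clock running 50 ppm fast with a 5 s offset.
local = [0.0, 1.0, 2.0, 3.0]
beacons = [t * (1 + 50e-6) + 5.0 for t in local]
skew = estimate_clock_skew(local, beacons)
```

With real traffic the offsets are noisy, which is the difficulty the abstract points to and the reason the proposed method combines clock skew with channel, received signal strength, and duration rather than relying on skew alone.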