The protein-protein interface underlies the protein-protein *** mutation of protein-protein interface residues has shown that binding free energy is not evenly distributed across the interface ***, there are hot spots in protein interfaces that contribute most of the binding *** we provide a new method based on integer quadratic programming that systematically aligns protein surface structures shared by a set of *** method incorporates protein sequence and structure data, and can correctly identify residues that are evolutionarily and structurally conserved between different *** is sequence-order independent, so it can unravel the evolutionary similarity between distant ***, it can be used to predict hot spots with ROC area AUC=*** with most hot spot prediction methods, our method does not need prior knowledge of the structure of the protein complex or even the structure of the binding partner.
ISBN: (print) 9781581139648
This paper studies haplotype inference by maximum parsimony using population data. We define the optimal haplotype inference (OHI) problem as follows: given a set of genotypes and a set of related haplotypes, find a minimum subset of haplotypes that can resolve all the genotypes. We prove that OHI is NP-hard and can be formulated as an integer quadratic programming (IQP) problem. To solve the IQP problem, we propose an iterative semidefinite-programming-based approximation algorithm called SDPHapInfer. We show that this algorithm finds a solution within a factor of O(log n) of the optimal solution, where n is the number of genotypes. The algorithm has been implemented and tested on a variety of simulated and biological data. In comparison with three other methods (HAPAR, HAPLOTYPER, and PHASE), the experimental results indicate that SDPHapInfer and HAPLOTYPER have similar error rates. The results generated by PHASE have lower error rates on some data sets but higher error rates on others, and the error rates of HAPAR are higher than those of the other methods on biological data. In terms of efficiency, SDPHapInfer, HAPLOTYPER, and PHASE output a solution in a stable and consistent way, and they run much faster than HAPAR when the number of genotypes becomes large.
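To make the OHI problem statement concrete, here is a minimal sketch of an exact brute-force search for a smallest resolving subset. It is not SDPHapInfer and does not use the IQP or semidefinite relaxation from the abstract; it simply enumerates subsets and is exponential in the number of candidate haplotypes, so it only suits toy inputs. The encoding is an assumption made for illustration: haplotypes are tuples of 0/1 alleles, and genotypes use 0 and 1 for homozygous sites and 2 for heterozygous sites. The function names `resolves` and `min_resolving_subset` are hypothetical.

```python
from itertools import combinations, combinations_with_replacement


def resolves(h1, h2, genotype):
    """Return True if the haplotype pair (h1, h2) explains the genotype.

    Assumed encoding: genotype site 0 or 1 means both haplotypes carry
    that allele; site 2 means the two haplotypes differ at that site.
    """
    for a, b, site in zip(h1, h2, genotype):
        if site == 2:
            if a == b:
                return False
        elif a != site or b != site:
            return False
    return True


def min_resolving_subset(genotypes, haplotypes):
    """Exhaustively find a smallest subset of haplotypes resolving all genotypes.

    Tries subset sizes in increasing order, so the first hit is minimum.
    Returns a tuple of haplotypes, or None if no subset works.
    """
    for k in range(1, len(haplotypes) + 1):
        for subset in combinations(haplotypes, k):
            # Pairs are drawn with replacement so that a homozygous
            # genotype can be resolved by two copies of one haplotype.
            pairs = list(combinations_with_replacement(subset, 2))
            if all(any(resolves(h1, h2, g) for h1, h2 in pairs)
                   for g in genotypes):
                return subset
    return None


# Toy data, invented purely for illustration.
candidate_haplotypes = [(0, 0, 0), (1, 1, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
toy_genotypes = [(2, 2, 0), (0, 0, 0), (1, 1, 0)]

best = min_resolving_subset(toy_genotypes, candidate_haplotypes)
print(best)
```

On this toy input, two haplotypes are enough: `(0, 0, 0)` and `(1, 1, 0)` together resolve the heterozygous genotype, and each resolves one homozygous genotype on its own. The NP-hardness result in the abstract is the reason an approximation algorithm is needed once the candidate set grows.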