Bioinformatics applications constitute an emerging data-intensive, high-performance computing (HPC) domain. While there is much research on algorithmic improvements, (2004), the actual performance of an application al...
详细信息
Bioinformatics applications constitute an emerging data-intensive, high-performance computing (HPC) domain. While there is much research on algorithmic improvements, (2004), the actual performance of an application also depends on how well the program maps to the target hardware. This paper presents a performance study of two parallel bioinformatics applications HMMER (sequence alignment) and SVM-RFE (gene expression analysis), on Intel x86 based hyperthread-capable (2002) shared-memory multiprocessor systems. The performance characteristics varied according to the application and target hardware characteristics. For instance, HMMER is compute intensive and showed better scalability on a 3.0 GHz system versus a 2.2 GHz system. However, SVM-RFE is memory intensive and showed better absolute performance on the 2.2 GHz machine which has better memory bandwidth. The performance is also impacted by processor features, e.g. hyperthreading (HT) (2002) and prefetching. With HMMER we could obtain -75% of the performance with HT enabled with respect to doubling the number of CPUs. While load balancing optimizations can provide speedup of -30% for HMMER on a hyperthreading-enabled system, the load balancing has to adapt to the target number of processors and threads. SVM-RFE benefits differently from the same load-balancing and thread scheduling tuning. We conclude that compiler and runtime optimizations play an important role to achieve the best performance for a given bioinformatics algorithm.
暂无评论