With the integration of physical space and cyberspace, distributing large-scale data to diverse, geographically dispersed terminals has become a major challenge. When the data volume exceeds what traditional techniques can handle and resources become limited, guaranteeing user quality of service while using system resources efficiently becomes an important concern. This paper presents a data-driven mechanism for large-scale data distribution that consists of four core parts: data production, data collection and pre-processing, a data analysis engine, and data consumption. The mechanism aims to mine valuable information in order to improve resource-usage efficiency and enable accurate fault location in a large-scale data distribution system. The paper further studies data-driven resource scheduling optimization based on analysis of system behavior, as well as data-driven fault location, and demonstrates the effectiveness of data-driven operation in optimizing a large-scale data distribution system.
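The abstract names four core stages but gives no implementation details; the following is a minimal, hypothetical C sketch of such a production, collection/pre-processing, analysis, and consumption pipeline. All function names and the record type are illustrative assumptions, not the paper's actual components.

```c
/* Hypothetical four-stage pipeline sketch; names and record type are illustrative only. */
#include <stdio.h>

typedef struct { int terminal_id; double bytes_sent; } record_t;

static record_t produce(int id)           { record_t r = { id, id * 1024.0 }; return r; }
static record_t preprocess(record_t r)    { r.bytes_sent /= 1024.0; return r; }   /* e.g. normalize units   */
static double   analyze(record_t r)       { return r.bytes_sent; }                /* e.g. derive a metric   */
static void     consume(int id, double m) { printf("terminal %d: %.1f KiB\n", id, m); }

int main(void) {
    for (int id = 0; id < 3; id++) {              /* data production              */
        record_t r = preprocess(produce(id));     /* collection + pre-processing  */
        double metric = analyze(r);               /* data analysis engine         */
        consume(id, metric);                      /* data consumption             */
    }
    return 0;
}
```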
In this paper, we present the Tianhe-2 interconnect network and message passing services. We describe the architecture of the router and network interface chips, and highlight a set of hardware and software features that effectively support high-performance communication, including remote direct memory access, collective optimization, hardware-enabled reliable end-to-end communication, and user-level message passing services. Measured hardware performance results are also presented.
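The abstract mentions remote direct memory access and user-level message passing; as a hedged illustration of that style of communication, the sketch below uses standard MPI one-sided operations rather than Tianhe-2's own low-level interface, which is not described here.

```c
/* Minimal, hypothetical sketch of RDMA-style one-sided communication via MPI_Put. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double buf = 0.0;
    MPI_Win win;
    /* Expose one double per process for remote access. */
    MPI_Win_create(&buf, sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (rank == 0 && size > 1) {
        double val = 42.0;
        /* Write directly into rank 1's window without involving rank 1's CPU. */
        MPI_Put(&val, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
    }
    MPI_Win_fence(0, win);

    if (rank == 1) printf("rank 1 received %f via MPI_Put\n", buf);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```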
Fingerprints have been widely used in a variety of biometric identification systems. However, with the rapid development of fingerprint identification systems, the amount of fingerprint information stored in systems ha...
ISBN (print): 9781509053827
We develop a simulator for 3D tissue of the human cardiac ventricle with a physiologically realistic cell model and deploy it on the supercomputer Tianhe-2. In order to attain the full performance of the heterogeneous CPU-Xeon Phi design, we use carefully optimized codes for both devices and combine them to obtain suitable load balancing. Using a large number of nodes, we are able to perform tissue-scale simulations of the electrical activity and calcium handling in millions of cells, at a level of detail that tracks the states of trillions of ryanodine receptors. We can thus simulate arrhythmogenic spiral waves and other complex arrhythmogenic patterns which arise from calcium handling deficiencies in human cardiac ventricle tissue. Due to extensive code tuning and parallelization via OpenMP, MPI, and SCIF/COI, large-scale simulations of 10 heartbeats can be performed in a matter of hours. Test results indicate excellent scalability, thus paving the way for detailed whole-heart simulations in future generations of leadership class supercomputers.
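As a hedged sketch of the hybrid parallelization pattern the abstract describes (MPI across nodes plus OpenMP within a node), the toy C program below updates a block of cells per rank in an OpenMP-parallel loop. The cell model, time step, and constants are placeholders, not the simulator's actual physiology, and the Xeon Phi offload via SCIF/COI is not reproduced.

```c
/* Hypothetical MPI+OpenMP cell-update skeleton; physiology is a placeholder. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define CELLS_PER_RANK 100000
#define DT 0.01   /* assumed time step (ms) */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *v = malloc(CELLS_PER_RANK * sizeof(double)); /* membrane potential per cell */
    for (int i = 0; i < CELLS_PER_RANK; i++) v[i] = -85.0; /* resting value */

    for (int step = 0; step < 1000; step++) {
        /* Update every cell independently; a real cell model would evaluate
         * many ionic currents and ryanodine-receptor states here. */
        #pragma omp parallel for
        for (int i = 0; i < CELLS_PER_RANK; i++)
            v[i] += DT * (-0.1 * (v[i] + 85.0)); /* placeholder relaxation term */

        /* Halo exchange between neighbouring ranks (e.g. MPI_Sendrecv) would go here. */
    }

    if (rank == 0) printf("v[0] = %f after 1000 steps\n", v[0]);
    free(v);
    MPI_Finalize();
    return 0;
}
```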
OpenCL is an open heterogeneous programming framework. Although OpenCL programs are functionally portable, they do not provide performance portability, so code transformation often plays an irreplaceable role. When adapting GPU-specific OpenCL kernels to run on multi-core/many-core CPUs, coarsening the thread granularity is necessary and thus has been extensively used. However, locality concerns exposed in GPU-specific OpenCL code are usually inherited without analysis, which may have side effects on CPU performance. Typically, the use of OpenCL's local memory on multi-core/many-core CPUs may lead to an opposite performance effect, because local-memory arrays no longer match well with the hardware and the associated synchronizations are costly. To solve this dilemma, we actively analyze the memory access patterns using array-access descriptors derived from GPU-specific kernels, which can thus be adapted for CPUs by (1) removing all the unwanted local-memory arrays together with the obsolete barrier statements and (2) optimizing the coalesced kernel code with vectorization and locality re-exploitation. Moreover, we have developed an automated tool chain that performs this transformation of GPU-specific OpenCL kernels into a CPU-friendly form, accompanied by a scheduler that forms a new OpenCL runtime. Experiments show that the automated transformation can improve OpenCL kernel performance on a multi-core CPU by an average factor of 3.24. Satisfactory performance improvements are also achieved on Intel's many-integrated-core coprocessor. The resultant performance on both architectures is better than or comparable with the corresponding OpenMP performance.
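A hypothetical before/after illustration of the transformation described above: a GPU-oriented kernel that stages data through local memory and synchronizes with a barrier, followed by a thread-coarsened CPU-friendly variant with the local array and barrier removed. These are toy kernels, not output of the paper's tool chain.

```c
/* GPU-oriented kernel: stages data in __local memory and uses a barrier,
 * which is cheap on GPUs but costly when emulated on multi-core CPUs. */
__kernel void scale_gpu(__global const float *in, __global float *out,
                        __local float *tile) {
    int lid = get_local_id(0);
    int gid = get_global_id(0);
    tile[lid] = in[gid];               /* stage through local memory */
    barrier(CLK_LOCAL_MEM_FENCE);      /* work-group synchronization */
    out[gid] = 2.0f * tile[lid];
}

/* CPU-friendly, thread-coarsened kernel: no local memory, no barrier;
 * one work-item handles CHUNK consecutive elements, which also helps the
 * compiler vectorize the inner loop. */
#define CHUNK 16
__kernel void scale_cpu(__global const float *in, __global float *out) {
    int base = get_global_id(0) * CHUNK;
    for (int i = 0; i < CHUNK; i++)
        out[base + i] = 2.0f * in[base + i];
}
```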
Cybercrime caused by malware has become a persistent and damaging threat, making trusted security solutions urgently needed, especially for resource-constrained end devices. The existing industry and academic approache...
Cooperation between the CPU and hardware accelerators on an SoC FPGA to accomplish computationally intensive tasks provides significant advantages in performance and energy efficiency. However, current operating systems provide lit...
As the de facto Internet inter-domain routing protocol, BGP has a number of vulnerabilities and weaknesses. Monitoring BGP is an effective way to improve the security of inter-domain routing. This paper present...
The volume of malware is growing at an exponential rate nowadays. This huge growth makes it extremely hard to analyse malware manually. Most existing signature-extraction methods are based on string signatures, and...
In data center networks, resource allocation based on workload is an effective way to allocate infrastructure resources to diverse cloud applications and satisfy the quality of service for the users; it refers to mapping a large number of workloads provided by cloud users/tenants onto the substrate network provided by cloud providers. Although the existing heuristic approaches are able to find a feasible solution, the quality of the solution is not guaranteed. To address this issue, this paper models the resource allocation problem as a distributed constraint optimization problem based on the minimum mapping cost. An efficient approach is then proposed to solve the resource allocation problem, aiming to find a feasible solution while ensuring its optimality. Finally, theoretical analysis and extensive experiments demonstrate the effectiveness and efficiency of the proposed approach.
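As a hedged sketch of what a minimum-mapping-cost formulation of this kind typically looks like (the notation below is assumed for illustration and is not taken from the paper): each workload node is placed on a substrate node, and the objective combines node placement costs and virtual-link routing costs subject to substrate capacity constraints.

```latex
% Illustrative minimum-cost mapping objective (notation assumed, not the paper's):
%   x_i        -- substrate node hosting workload node i
%   c(i, x_i)  -- cost of placing workload node i on substrate node x_i
%   d(x_i,x_j) -- cost of routing the virtual link (i,j) over the substrate
%   r_i, C_u   -- resource demand of node i, capacity of substrate node u
\min_{x} \;\; \sum_{i \in V_w} c(i, x_i) \;+\; \sum_{(i,j) \in E_w} b_{ij}\, d(x_i, x_j)
\quad \text{s.t.} \quad \sum_{i:\, x_i = u} r_i \le C_u \;\; \forall u \in V_s
```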