ISBN (Print): 9781509050826
Summary form only given. After a brief review of the current status of HPC research and development in China, this talk introduces the plan for HPC development under the new Key R&D Program of China in the 13th five-year plan. The motivation for developing an exascale computer and the need to establish an ecosystem for high-performance computing in China are presented. The major challenges in developing the exascale computer are discussed, along with the major R&D activities of the new key HPC project over the next five years.
ISBN (Print): 9781509012886
The flexibility of Field Programmable Gate Arrays (FPGAs), as well as their parallel-processing capabilities, makes them a good choice for digital signal processing in communication systems. However, further performance improvements have stalled as designs run into the frequency wall, with FPGA-based devices clocked below 1 GHz. New methodologies that can deliver performance optimization within the frequency-wall limitation have become highly essential. In this context, this paper uses efficient modulation techniques such as Quadrature Amplitude Modulation (QAM) together with a mixed time- and frequency-domain approach to build a generic, scalable FPGA-based QAM transmitter, with the filter parallelization executed in the mixed domain. The system achieves a throughput of 4 Gb/s for the QAM-16 format at a clock frequency as low as 62.5 MHz, thereby demonstrating a promising methodology for applications where higher clock frequencies are a hard limit.
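For intuition, the core of any QAM-16 transmitter is the mapping from 4-bit groups to constellation points. The following is a minimal software sketch of a Gray-coded QAM-16 symbol mapper, not the paper's FPGA implementation; the specific Gray assignment and amplitude levels are illustrative assumptions.

```python
# Gray mapping of a 2-bit value to one of four amplitude levels per axis.
# (Illustrative assumption; real transmitters may use a different labeling.)
GRAY_LEVELS = {0b00: -3, 0b01: -1, 0b11: 1, 0b10: 3}

def qam16_map(bits):
    """Map a bit sequence (length divisible by 4) to (I, Q) symbol pairs."""
    if len(bits) % 4 != 0:
        raise ValueError("bit count must be a multiple of 4")
    symbols = []
    for k in range(0, len(bits), 4):
        i_bits = bits[k] << 1 | bits[k + 1]      # first two bits -> I level
        q_bits = bits[k + 2] << 1 | bits[k + 3]  # last two bits  -> Q level
        symbols.append((GRAY_LEVELS[i_bits], GRAY_LEVELS[q_bits]))
    return symbols

print(qam16_map([0, 0, 1, 1, 1, 0, 0, 1]))  # -> [(-3, 1), (3, -1)]
```

Since each 4-bit group maps independently, this step parallelizes naturally, which is what an FPGA exploits by instantiating many mappers side by side.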
Big Data is now a widely studied concept in the field of massive information processing, but testing systems and applications in this field is difficult because the real environments involved are large-scale and often impossible to reproduce. Usually, executions in virtual settings, called simulations, are used for this task. This article presents a simulator for applications developed to be executed on the Apache S4 distributed computing platform under multiple hardware and software scenarios. A version in which mobile devices are also used as processing units is proposed. The results show that the simulator without mobile devices achieves good prediction performance, considering the processing and communication times between the elements of the S4 applications, while the version with mobile devices shows a decrease in performance across different simulation configurations. However, adding replication to the elements that communicate with the cellphones yields substantial improvements in application performance.
Low-power Systems-on-Chip (SoCs), originally developed in the context of mobile and embedded technologies, are becoming attractive to the scientific community given their increasing computing performance coupled with relatively low cost and power demand. In this work, we investigate the potential of SoCs for realistic scientific workloads, in particular from the bioinformatics and astrophysics domains. We selected a series of parallel, computationally intensive scientific applications and ported them to a cluster of development boards based on low-power SoCs. The performance results obtained for the different applications are reported and compared with those obtained on a typical x86 HPC node.
ISBN (Print): 9783038355250
The design of durable structure components requires durability analysis in CAE systems. Such analysis requires batch processing of multiple same-type calculation routines, and cloud computation is a favorable choice for this task. To develop CAE durability modules, an analytical approach to estimating the efficiency of different batch-processing systems is necessary. These modules are intended for durability analysis of pre-hydrogenated and statically loaded structure components with initial defects. The durability estimate is defined as the crack growth time elapsed from the initial defect state to fracture of the structure component. A crack kinetics model was used to simulate the fracture process, as required for safe operation of a structure; crack length curves were obtained and analyzed, and the results were verified against published experimental data on the subject. The empirically specified crack kinetics model assumes a dependence between its parameters and the durability of structure components, so many similar, separate computational tasks have to be performed; to save design time, these can be executed in parallel. An approach to organizing parallel batch processing is described: a distributed cloud application, built on top of Microsoft Azure services, which engages multiple computational resources from a remote cloud server to execute simulations in parallel. Additionally, an efficiency criterion for batch processing of parallel simulation tasks is suggested. The authors' criterion can be applied to differentiate between various implementations of batch processing of similar computational tasks, thus making it possible to reach sufficient cost-effectiveness of computational resource utilization. This approach is important for planning design-work budgets, because prolonged use of cloud resources gradually increases the cost of the designed products. Using the criterion, the most efficient source of computational power for any given task can be selected both automatically and manually.
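The batch-processing pattern described above can be sketched in a few lines: many independent, same-type crack-growth simulations dispatched to a worker pool. The crack kinetics model below (a simple power-law growth rate integrated step by step) and all parameter values are illustrative assumptions, not the authors' model; a thread pool stands in for the Azure cloud workers the paper's system uses.

```python
from concurrent.futures import ThreadPoolExecutor

def crack_growth_time(a0, a_crit, c=1e-3, m=2.0, da=1e-3):
    """Hypothetical cycle count for a crack to grow from length a0 to a_crit,
    integrating a power-law growth rate da/dN = c * a**m over fixed steps."""
    n, a = 0.0, a0
    while a < a_crit:
        n += da / (c * a ** m)  # cycles spent on this length increment
        a += da
    return n

# One simulation per initial defect size; all runs are independent,
# so they can be executed in parallel as a batch.
initial_defects = [0.10, 0.15, 0.20, 0.25]
with ThreadPoolExecutor() as pool:
    durations = list(pool.map(lambda a0: crack_growth_time(a0, 1.0),
                              initial_defects))

for a0, n in zip(initial_defects, durations):
    print(f"a0={a0:.2f} -> {n:.0f} cycles")  # larger defects fail sooner
```

The efficiency criterion from the paper would then compare configurations (pool size, cloud instance type) by cost per completed batch rather than raw speed.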
Avionics applications need to be certified for the highest criticality standard. This certification includes schedulability analysis and worst-case execution time (WCET) analysis. WCET analysis is only possible when the software is written to be WCET analyzable and when the platform is time-predictable. In this paper we present prototype avionics applications that have been ported to the time-predictable T-CREST platform. The applications are WCET analyzable, and T-CREST is supported by the aiT WCET analyzer. This combination allows us to provide WCET bounds of avionic tasks, even when executing on a multicore processor.
ISBN (Print): 9783319508610
The proceedings contain 14 papers. The special focus in this conference is on Brain-Inspired Computing. The topics include: human brainnetome atlas and its potential applications in brain-inspired computing; workflows for ultra-high resolution 3D models of the human brain on massively parallel supercomputers; including gap junctions into distributed neuronal network simulations; finite-difference time-domain simulation for three-dimensional polarized light imaging; visual processing in cortical architecture from neuroscience to neuromorphic computing; bio-inspired filters for audio analysis; sophisticated LVQ classification models - beyond accuracy optimization; classification of FDG-PET brain data by generalized matrix relevance LVQ; a cephalomorph real-time computer; towards the ultimate display for neuroscientific data analysis; sentiment analysis and affective computing; methods and applications and deep representations for collaborative robotics.
The MapReduce programming model is a popular way to simplify and speed up data-parallel applications. However, it is not efficient for iterative applications because of its repeated data transmission through HDFS (Hadoop Distributed File System). Conch, a cyclic MapReduce model, is designed for efficient processing of iterative applications. To minimize network overhead, shared data is cached locally, and a "map-shuffle" phase with a combined transmission mechanism is presented. Meanwhile, a prediction scheduler for iterative applications is introduced to achieve better data locality based on runtime information. The experiments show that Conch supports iterative applications transparently and efficiently. Compared with Hadoop and HaLoop in a single-job environment, Conch achieves 13%-17% improvements on K-Means and fuzzy C-Means. In a multi-job environment, improvements of 63.6% and 28.6% are obtained over Hadoop and HaLoop, respectively.
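To see why iterative jobs benefit from caching, consider K-Means written in map/reduce style: the input points never change between rounds, so a system like Conch can keep them local instead of re-reading them from HDFS each iteration. The toy, single-process sketch below illustrates only that structure; the names and simplified one-dimensional data are assumptions, not Conch's API.

```python
def kmeans_iteration(points, centers):
    """One map-shuffle-reduce round: assign points, then recompute centers."""
    # Map + shuffle: key each point by the index of its nearest center.
    groups = {i: [] for i in range(len(centers))}
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: (p - centers[i]) ** 2)
        groups[nearest].append(p)
    # Reduce: each new center is the mean of its assigned points.
    return [sum(g) / len(g) if g else centers[i] for i, g in groups.items()]

points = [1.0, 2.0, 9.0, 10.0]   # loop-invariant input: cache it locally
centers = [0.0, 5.0]
for _ in range(5):               # every iteration reuses the cached points
    centers = kmeans_iteration(points, centers)
print(centers)                   # -> [1.5, 9.5]
```

In plain Hadoop, each of those five iterations would write its output to HDFS and read the full point set back, which is exactly the overhead Conch's local cache and combined "map-shuffle" transmission are designed to avoid.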
ISBN (Print): 9781509016235
Computer vision has played a key role in developing object detection and tracking techniques for surveillance systems. Most implementations currently employed are based on serial execution on general-purpose processors, but the high cost and complexity of such implementations make them an unattractive option for real-time surveillance. The system proposed here is implemented on a Field Programmable Gate Array (FPGA), the Zynq XC7Z020 board, using a modified background subtraction algorithm for real-time object detection and tracking. The presence of numerous configurable logic blocks, distributed memory, and hard Digital Signal Processing (DSP) modules offers great flexibility in achieving temporal and spatial parallelism. The implementation uses Xilinx ISE software and is programmed in VHDL. The OV7670 camera used in the paper has a resolution of 0.3 megapixels and captures video at 30 fps. The reference frame and the subsequent incoming frames are stored in different memory modules before the modified background subtraction algorithm is applied to these frames to obtain the difference image. After comparison with a threshold, the resulting image is displayed, and its addresses are stored in order to track the object. The system works in real time with minimal lag between capture and display. Moreover, the entire system is optimized in terms of speed, memory requirements, and the number of logic elements used, which makes it suitable for application in real-time surveillance systems.
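The basic step the FPGA design parallelizes - differencing each incoming frame against a reference frame and thresholding into a binary foreground mask - can be sketched in software as follows. Plain Python lists stand in for the pixel memory modules, and the threshold value is an assumption; the paper's modified algorithm is not reproduced here.

```python
def background_subtract(reference, frame, threshold=30):
    """Return a binary mask: 1 where |frame - reference| exceeds threshold."""
    return [[1 if abs(f - r) > threshold else 0
             for f, r in zip(frow, rrow)]
            for frow, rrow in zip(frame, reference)]

reference = [[10, 10, 10],
             [10, 10, 10]]
frame     = [[10, 200, 10],
             [10, 199, 12]]
print(background_subtract(reference, frame))  # -> [[0, 1, 0], [0, 1, 0]]
```

Because each pixel's comparison is independent, an FPGA can evaluate whole rows of this mask per clock cycle, which is why even a 62.5-125 MHz fabric clock keeps up with a 30 fps camera.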
ISBN (Print): 9781450346177
The proceedings contain 39 papers. The topics discussed include: H2F: a hierarchical Hadoop framework for big data processing in geo-distributed environments; performance characterization of Hadoop workloads on SR-IOV-enabled virtualized InfiniBand clusters; a visual analytics approach to author name disambiguation; applying big data warehousing and visualization techniques on PingER data; efficient service discovery in decentralized online social networks; towards longitudinal analysis of a population's electronic health records using factor graphs; disease gene discovery of single-gene disorders based on complex network; identifying patient experience from online resources via sentiment analysis and topic modelling; a study of factuality, objectivity and relevance: three desiderata in large-scale information retrieval; on exploiting data locality for iterative MapReduce applications in hybrid clouds; spatial and temporal analysis of urban space utilization with renewable wireless sensor network; spatial big data for designing large scale infrastructure: a case-study of electrical road systems; a real-time big data analysis framework on a CPU/GPU heterogeneous cluster: a meteorological application case study; a benchmarking platform for analyzing corpora of traces: the recognition of the users' involvement in fields of competencies; and not too late to identify potential churners: early churn prediction in telecommunication industry.