A working single system image distributed operating system is presented. Dubbed Kerrighed, it provides unified support for both the MPI and the shared memory programming models. The system is operational on a 16-processor cluster at the Institut de Recherche en Informatique et Systemes Aleatoires in Rennes, France. In this paper, the system is described with emphasis on its main distinguishing contributions, namely its DSM based on memory containers, its flexible handling of scheduling and checkpointing strategies, and its efficient and unified communications layer. Because of the importance and popularity of data parallel applications on these systems, we present a brief discussion of the mapping of two well-known and established data parallel algorithms. It is shown that ShearSort is remarkably well suited to the architecture/system pair, as is the popular and important two-dimensional fast Fourier transform (2D FFT).
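For readers unfamiliar with ShearSort, the following is a minimal sequential sketch of the algorithm in Python. It is not the Kerrighed mapping described in the paper: the function name `shear_sort` and the list-of-lists grid are assumptions made for illustration. On a parallel machine each row or column sort would run concurrently, whereas here they simply run one after another.

```python
import math


def shear_sort(grid):
    """Return a copy of `grid` sorted in snake-like (boustrophedon) order.

    Illustrative sequential sketch: alternate row phases (even-indexed rows
    ascending, odd-indexed rows descending) with column phases (ascending).
    """
    g = [list(row) for row in grid]
    rows = len(g)
    # ceil(log2(rows)) + 1 row/column rounds are sufficient; extra rounds
    # leave an already snake-sorted grid unchanged.
    rounds = math.ceil(math.log2(rows)) + 1 if rows > 1 else 1
    for _ in range(rounds):
        # Row phase: alternate the sort direction from one row to the next.
        for i, row in enumerate(g):
            row.sort(reverse=(i % 2 == 1))
        # Column phase: sort every column top to bottom.
        sorted_cols = [sorted(col) for col in zip(*g)]
        g = [list(row) for row in zip(*sorted_cols)]
    # Final row phase to finish the snake ordering.
    for i, row in enumerate(g):
        row.sort(reverse=(i % 2 == 1))
    return g
```

Reading the result row by row, left to right on even-indexed rows and right to left on odd-indexed rows, yields the values in ascending order.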
ISBN: 9781509060375 (print)
While standalone Flash memories (NAND) are facing their physical limitations, resistive switching memories (RRAM) are seen as a high-density, low-cost, low-energy candidate to replace NAND. However, it has been shown that deeply scaled, high-density RRAM architectures, such as crosspoint, suffer from voltage drop effects (IR drop) in metal lines, periphery overhead and metal line charging time, due to the current injected during programming operations and to sneak currents through unselected bitcells. In this work, we first propose several innovative models for IR drop, periphery overhead and array-line charging time that account for in-array multiple-bit write operations. Then, we introduce a new methodology for crosspoint memory design to determine the IR drop, periphery overhead and timing associated with the optimal characteristics of a 1 selector-1 resistance (1S1R) device. We apply the proposed methodology to various half metal pitch memory technology nodes (from 50 nm to 15 nm) and to several written word sizes (from 1 to 32 bits). We show that for 1 bit programmed per array, the RRAM programming current has to be lower than 30 µA and the selector leakage current lower than 10 nA, and that these limitations tighten as soon as multiple bits are written simultaneously in the same array. This suggests that a massively parallel multi-bank write of a small number of bits per array is the best solution for RRAM memories to be competitive with NAND memories.
ISBN: 9798350387490 (digital)
ISBN: 9798350387506 (print)
In the current era of unparalleled data expansion, effective handling of large datasets has emerged as a crucial obstacle. On enormous datasets that are terabytes or petabytes in size, the conventional k-means clustering approach runs into computational time limits. We assess the performance of the k-means algorithm in the MapReduce framework using multiple methods: simple K-means, K-means with Initial Equidistant Centres (IEC), and a Java implementation of K-means on MapReduce. We use the newsgroup dataset and evaluate the methods' performance in various infrastructure settings. We also perform a comparative study of the above-mentioned K-means methods at different iteration levels. We use this study to demonstrate improvements in computation time across the various infrastructures. Additionally, we analyze the behavior of the k-means algorithms with respect to centroids and iteration levels, and hence provide deeper insights into their dynamics. Our paper offers useful benchmarks for further research and practice in large-scale data clustering, illuminating the best ways to make use of parallel computation.
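To make the MapReduce formulation of k-means concrete, here is a minimal single-process sketch in Python of one iteration. It is not the paper's Hadoop or Java implementation, and it does not implement the IEC initialisation; the function names `map_assign`, `reduce_mean` and `kmeans_iteration` are hypothetical. The map step emits a (centroid index, point) pair for each point, and the reduce step averages the points grouped under each index.

```python
from collections import defaultdict


def map_assign(point, centroids):
    """Map step: emit (index of nearest centroid, (point, 1))."""
    nearest = min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )
    return nearest, (point, 1)


def reduce_mean(values):
    """Reduce step: average the points emitted for one centroid index."""
    dim = len(values[0][0])
    total = [0.0] * dim
    count = 0
    for point, n in values:
        for d in range(dim):
            total[d] += point[d]
        count += n  # n is always 1 in this sketch
    return tuple(t / count for t in total)


def kmeans_iteration(points, centroids):
    """Run one map/shuffle/reduce round and return the updated centroids."""
    groups = defaultdict(list)  # stands in for the shuffle phase
    for point in points:
        key, value = map_assign(point, centroids)
        groups[key].append(value)
    # A centroid that attracted no points keeps its previous position.
    return [
        reduce_mean(groups[k]) if k in groups else tuple(centroids[k])
        for k in range(len(centroids))
    ]
```

Calling `kmeans_iteration` repeatedly, feeding each result back in as the new centroids, gives the iterative behavior that the abstract compares across iteration levels.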
ISBN: 9798331541279 (digital)
ISBN: 9798331541286 (print)
In this work, we present HiAER-Spike, a modular, reconfigurable, event-driven neuromorphic computing platform designed to execute large spiking neural networks with up to 160 million neurons and 40 billion synapses (roughly twice the neurons of a mouse brain) at faster than real time. This system, which is currently under construction at the UC San Diego Supercomputing Center, comprises a co-designed hard- and software stack that is optimized for run-time massively parallel processing and hierarchical address-event routing (HiAER) of spikes while promoting memory-efficient network storage and execution. Our architecture efficiently handles both sparse connectivity and sparse activity for robust and low-latency event-driven inference for both edge and cloud computing. A Python programming interface to HiAER-Spike, agnostic to hardware-level detail, shields the user from complexity in the configuration and execution of general spiking neural networks with virtually no constraints on topology. The system is made easily available over a web portal for use by the wider community. In the following, we provide an overview of the hard- and software stack, explain the underlying design principles, demonstrate some of the system's capabilities and solicit feedback from the broader neuromorphic community.