Recent advances in geo-distributed systems have made distributed data processing possible, where tasks are decomposed into subtasks, deployed across multiple data centers, and run in parallel. Compared to conventional appr...
ISBN:
(Print) 9781538620878
Modern deep learning has significantly improved performance and has been used in a variety of applications. Due to the heavy processing cost, major platforms for deep learning have migrated from commodity computers to the cloud, which offers a huge amount of resources. However, this situation slows down response times due to severe congestion of the network traffic. To alleviate the overconcentration of data traffic and power consumption, many researchers have paid attention to edge computing. We tackle the parallel processing model using a Deep Convolutional Neural Network (DCNN) deployed on multiple devices, and the size reduction of network traffic among the devices. We propose a technique that compresses the intermediate data and aggregates common computation in AlexNet for video recognition. Our experiments demonstrate that Zip lossless compression reduces the amount of data to as little as 1/24 of the original, and HEVC lossy compression reduces it to 1/208 with only 3.5% degradation of the recognition accuracy. Moreover, aggregation of common calculation reduces the amount of computation for 30 DCNNs by 90%.
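As a rough illustration of the lossless path, the sketch below compresses an intermediate feature map with zlib before it would be handed to the next edge device; the layer split point, tensor shape, precision reduction, and helper names are all assumptions made for illustration, not details taken from the paper.

```python
# Hedged sketch: zlib (Deflate/"Zip") compression of an intermediate DCNN
# feature map before transferring it between edge devices.
# The split point, shape and float16 packing are illustrative assumptions.
import zlib
import numpy as np

def compress_feature_map(fmap: np.ndarray, level: int = 6) -> bytes:
    """Serialize and losslessly compress an intermediate activation tensor."""
    raw = fmap.astype(np.float16).tobytes()  # assumed half-precision packing
    return zlib.compress(raw, level)

def decompress_feature_map(blob: bytes, shape, dtype=np.float16) -> np.ndarray:
    """Inverse operation on the receiving device."""
    return np.frombuffer(zlib.decompress(blob), dtype=dtype).reshape(shape)

if __name__ == "__main__":
    # e.g. a ReLU output of an AlexNet-like conv layer: sparse, hence compressible
    fmap = np.maximum(np.random.randn(256, 13, 13), 0).astype(np.float32)
    blob = compress_feature_map(fmap)
    ratio = fmap.astype(np.float16).nbytes / len(blob)
    print(f"packed {fmap.astype(np.float16).nbytes} B -> {len(blob)} B (~{ratio:.1f}x)")
    restored = decompress_feature_map(blob, fmap.shape)
    assert np.allclose(restored, fmap.astype(np.float16))
```

The actual ratios reported above (up to 1/24 lossless, 1/208 with HEVC) depend on the layer chosen and the video content; this snippet only shows where such compression would sit in the pipeline.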
As part of the NSF-funded XMS project, we are actively researching automatic detection of poorly performing HPC jobs. To aid the analysis, we have generated a taxonomy of the temporal I/O patterns for HPC jobs. In this...
Virtual Reality has traditionally been explored in many robotics systems, with applications such as off-line programming, trajectory planning, teleoperation, education, design, natural user interfaces and rehabilitati...
There are many large-scale graphs in the real world, such as Web graphs and social graphs, and interest in large-scale graph analysis has been growing in recent years. Breadth-First Search (BFS) is one of the most fundamental graph algorithms and is used as a component of many other graph algorithms. Our new method for distributed parallel BFS can compute BFS for a graph of one trillion vertices within half a second, using large supercomputers such as the K-Computer. Using our proposed algorithm, the K-Computer was ranked 1st in Graph500 with all 82,944 available nodes in June and November 2015 and June 2016, achieving 38,621.4 GTEPS. Based on the hybrid BFS algorithm by Beamer (Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, IPDPSW '13, IEEE Computer Society, Washington, 2013), we devise sets of optimizations for scaling to an extreme number of nodes, including a new efficient graph data structure and several optimization techniques such as vertex reordering and load balancing. Our performance evaluation on the K-Computer shows that our new BFS is 3.19 times faster on 30,720 nodes than the base version using the previously known best techniques.
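The hybrid BFS the work builds on (Beamer's direction-optimizing BFS) switches between conventional top-down expansion and bottom-up search depending on the frontier size. A minimal single-node Python sketch of that switching idea follows; the threshold, adjacency-list representation, and function name are illustrative, and none of the distributed-memory optimizations (vertex reordering, load balancing, the compressed graph data structure) are shown.

```python
# Hedged sketch of direction-optimizing (hybrid) BFS on a single node.
# adj: undirected adjacency list; the switch threshold alpha is illustrative.
def hybrid_bfs(adj, source, alpha=0.05):
    n = len(adj)
    level = [-1] * n
    level[source] = 0
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        if len(frontier) < alpha * n:
            # Top-down step: expand the edges of each frontier vertex.
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if level[v] == -1:
                        level[v] = depth
                        nxt.append(v)
        else:
            # Bottom-up step: each unvisited vertex looks for a parent in the
            # frontier, which is cheaper when the frontier is very large.
            in_frontier = [False] * n
            for u in frontier:
                in_frontier[u] = True
            nxt = []
            for v in range(n):
                if level[v] == -1:
                    for u in adj[v]:
                        if in_frontier[u]:
                            level[v] = depth
                            nxt.append(v)
                            break
        frontier = nxt
    return level

if __name__ == "__main__":
    adj = [[1, 2], [0, 3], [0, 3], [1, 2, 4], [3]]
    print(hybrid_bfs(adj, 0))  # [0, 1, 1, 2, 3]
```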
ISBN:
(Print) 9781538614655
The advent of future 5th Generation (5G) use cases, such as ultra-dense networking and ultra-low latency propelled by Smart Cities and IoT projects, will demand revolutionary network infrastructures. The need for low latency, high bandwidth, scalability, ubiquitous access and support for resource-constrained IoT devices are some of the prominent issues that networks have to face to support future 5G use cases, and they arise because current wireless and mobile infrastructures cannot fulfill them. In particular, the pervasiveness and high density of Wireless Local Area Networks (WLAN) in urban centers, together with their growing capacity and evolving standards, can be leveraged to support such demand. We argue that the integration of key 5G cornerstone technologies, such as Network Function Virtualization (NFV) and softwarization, fills some of the aforementioned gaps with regard to proper WLAN management and service orchestration. In this paper, we present a solution for slicing WLAN infrastructures, aiming to provide differentiated services on top of the same substrate through customized, isolated and independent digital building blocks. Through this proposal, we aim to handle ultra-dense networking 5G use cases efficiently and to achieve benefits at unprecedented levels. Towards this goal, we present a proof of concept realised on a real testbed and assess its feasibility.
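As a hedged illustration of what such a slice definition might look like, the sketch below models each WLAN slice as an isolated building block with its own SSID, VLAN and airtime share; every field name and value is an assumption made for illustration, not the paper's actual data model or orchestration API.

```python
# Hedged sketch: a minimal data model for WLAN slices, assuming each slice is
# an isolated virtual building block on a shared substrate. All names, fields
# and values are illustrative assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class WlanSlice:
    name: str              # slice identifier, e.g. a tenant or service class
    ssid: str              # virtual SSID exposed by the slice
    vlan_id: int           # L2 isolation between slices
    airtime_share: float   # fraction of radio airtime reserved for the slice
    max_latency_ms: float  # latency target used for admission control

def validate(slices):
    """Reject configurations that oversubscribe the shared radio."""
    total = sum(s.airtime_share for s in slices)
    if total > 1.0:
        raise ValueError(f"airtime oversubscribed: {total:.2f} > 1.00")
    return slices

slices = validate([
    WlanSlice("iot-telemetry", "city-iot",  100, 0.20, 100.0),
    WlanSlice("public-access", "city-wifi", 200, 0.50,  50.0),
    WlanSlice("emergency",     "city-ems",  300, 0.30,  10.0),
])
```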
ISBN:
(Print) 9781509051861; 9781509051854
This paper aims to build a small biconical antenna simulator with a 1/10 ns waveform (the rise time is 1 ns and the half-width is 10 ns) to meet the needs of a scale model test. First of all, on the basis of the fundamental theory of the biconical antenna, it presents a detailed analysis of the propagation characteristics of the horizontal electric field parallel to the antenna radiated from the biconical antenna; meanwhile, the influences of half-angle, radius, arm length and height on the electric field are analyzed. Then, the antenna dimensions are determined as θ=32°, R=1 m, L=7 m, H=6 m, and the test area of the simulator is calculated accordingly. The results suggest that the test area is symmetrically distributed with respect to both the X and Y axes.
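For context on the "1/10 ns" waveform, the sketch below evaluates a double-exponential pulse and estimates its rise time and half-width numerically; the double-exponential form and its decay constants are common assumptions for fast-pulse simulators and are not values taken from the paper.

```python
# Hedged sketch: double-exponential pulse E(t) = k*E0*(exp(-a*t) - exp(-b*t)),
# a common model for fast-pulse simulators. The constants below are tuned
# roughly toward a ~1 ns rise time and ~10 ns half-width and are assumptions.
import numpy as np

def pulse(t, a=7.0e7, b=2.5e9, e0=1.0):
    shape = np.exp(-a * t) - np.exp(-b * t)
    return e0 * shape / shape.max()   # normalize the peak to e0

t = np.linspace(0.0, 50e-9, 50001)
e = pulse(t)
peak = e.max()
# 10%-90% rise time and full width at half maximum, estimated from the samples
rise = t[np.argmax(e >= 0.9 * peak)] - t[np.argmax(e >= 0.1 * peak)]
above_half = t[e >= 0.5 * peak]
fwhm = above_half[-1] - above_half[0]
print(f"rise time ~{rise * 1e9:.2f} ns, half-width ~{fwhm * 1e9:.2f} ns")
```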
More and more massively parallel codes running on several hundreds of thousands of cores are entering the computational science and engineering domain, allowing high-fidelity computations on up to trillions of unknowns for very detailed analyses of the underlying problems. Such runs typically produce gigabytes of data, hindering both efficient storage and (interactive) data exploration. Advanced approaches based on inherently distributed data formats such as Hierarchical Data Format version 5 (HDF5) become necessary here to avoid long latencies when storing the data and to support fast (random) access when retrieving the data for visual processing. This paper shows considerations and implementation aspects of an I/O kernel based on HDF5 that supports fast checkpointing, restarting, and selective visualisation using a single shared output file for an existing computational fluid dynamics framework. This functionality is achieved by including the framework's hierarchical data structure in the file, which also opens the door for additional steering functionality. Finally, the performance of the kernel's write routines is presented. Bandwidths close to the theoretical peak on modern supercomputing clusters were achieved by avoiding file locking and using collective buffering.
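A minimal sketch of collective writes to a single shared HDF5 file is given below, using h5py's MPI driver; the dataset name, decomposition and file layout are assumptions, and the snippet does not reproduce the framework's actual hierarchical structure or steering hooks.

```python
# Hedged sketch: each MPI rank writes its slab of one shared dataset into a
# single HDF5 file using collective I/O (no per-rank files, no file locking).
# Dataset name, shape and decomposition are illustrative assumptions.
# Requires h5py built against parallel HDF5, plus mpi4py.
from mpi4py import MPI
import h5py
import numpy as np

comm = MPI.COMM_WORLD
rank, nranks = comm.Get_rank(), comm.Get_size()

cells_per_rank = 1024
local = np.full(cells_per_rank, rank, dtype=np.float64)  # this rank's field slab

with h5py.File("checkpoint.h5", "w", driver="mpio", comm=comm) as f:
    dset = f.create_dataset("pressure", (nranks * cells_per_rank,), dtype="f8")
    start = rank * cells_per_rank
    with dset.collective:  # collective buffering across all ranks
        dset[start:start + cells_per_rank] = local
```

Run with, for example, `mpiexec -n 4 python checkpoint_sketch.py`; every rank writes its contiguous slab into the same file, which is the single-shared-file pattern the abstract describes.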