ISBN:
(Print) 9780769549699; 9781467360050
This summary paper(1) proposes an FPGA-based array processor that performs Laplacian filtering on 40 by 40 pixel grayscale video. The architecture comprises bit-serial pixel processors interconnected in a two-dimensional mesh array. The architecture features the novel use of partial reconfiguration to transfer data to and from the array. Each processor occupies a single configurable logic block, and the design achieves a target frame rate of 10000 frames per second at an operating frequency of 0.31 MHz on the Virtex-6 ML605 Evaluation Kit. The detailed correspondence between the contents of slice lookup tables and the Virtex-6 bitstream format is also documented.
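The Laplacian filtering described in this abstract can be sketched in software. This is a minimal NumPy reference, not the paper's bit-serial FPGA implementation, and it assumes the standard 4-neighbour Laplacian kernel, which the abstract does not specify:

```python
import numpy as np

# Standard 4-neighbour Laplacian kernel (an assumption; the paper may use a
# different stencil). The mesh of pixel processors computes an equivalent
# 3x3 stencil per pixel.
LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]])

def laplacian_filter(frame):
    """Apply a 3x3 Laplacian stencil to a grayscale frame (zero padding)."""
    h, w = frame.shape
    padded = np.pad(frame.astype(np.int32), 1)
    out = np.zeros((h, w), dtype=np.int32)
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

# 40x40 grayscale frame, matching the resolution in the paper
frame = np.random.randint(0, 256, (40, 40))
edges = laplacian_filter(frame)
```

On a uniform region the kernel weights cancel, so the filter responds only at intensity discontinuities, which is what makes it an edge detector.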
ISBN:
(Print) 9781728141947
Data centers are considered the factories of the digital age, currently responsible for consuming more energy than the entire United Kingdom. On the other hand, the global installed capacity of renewable power continues to increase. The combination of these two factors calls for new approaches to designing data centers powered only by renewable energy sources. Our work focuses on task scheduling optimization under a power envelope, and on how to handle power starvation, i.e. when the available power does not provide sufficient resources to execute a given workload. To do so, we use the concept of task degradation together with cross-correlation to decide where to place tasks in order to reduce the degradation of the data center's profit. The results show that our algorithm can obtain a more than 34% increase in profit compared to algorithms from the literature, while fulfilling the power profile and resource constraints.
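The placement idea can be illustrated with a toy sketch: slide a task's power profile along the available power envelope and pick the start slot with the most headroom. The function name, the headroom objective, and the data are all illustrative assumptions; the paper's actual scheduler optimizes profit and uses task degradation:

```python
import numpy as np

def best_start(envelope, task_profile):
    """Return the start slot where the task fits under the power envelope
    with the most headroom, or None if it never fits (illustrative only)."""
    n, m = len(envelope), len(task_profile)
    best_slot, best_headroom = None, -np.inf
    for t in range(n - m + 1):
        # worst-case slack over the task's duration at this offset
        headroom = np.min(envelope[t:t + m] - task_profile)
        if headroom >= 0 and headroom > best_headroom:
            best_slot, best_headroom = t, headroom
    return best_slot

envelope = np.array([5, 8, 10, 9, 4, 3], dtype=float)  # available renewable power per slot
task = np.array([6, 6, 6], dtype=float)                # task power draw per slot
slot = best_start(envelope, task)                      # -> 1 (only feasible offset)
```

Scanning every offset like this is the discrete analogue of a cross-correlation between the task profile and the envelope.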
ISBN:
(Print) 9781728195865
The Megafly topology has recently been proposed as an efficient, hierarchical way to interconnect large-scale high-performance computing systems. Megafly networks may be constructed in various group sizes and configurations, but it is challenging to maintain high throughput on all such variants. Therefore, a robust topology-specific adaptive routing scheme is needed to exploit the topological advantages of Megafly. Currently, Progressive Adaptive Routing (PAR) is the best known routing scheme for Megafly networks, but its performance is not fully known across all scales and configurations. In this work, we show that the current PAR scheme performs sub-optimally on Megafly networks with a large number of groups. As a better alternative, we propose a new practical adaptive routing scheme, KU-GCN, that can improve the communication performance of Megafly at any scale and configuration. Using trace-driven simulation experiments, we show that our new Megafly routing scheme performs well across a wide variety of topology-workload combinations and outperforms PAR by up to 43.5 percent on a topology with a large number of groups.
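For readers unfamiliar with adaptive routing, the core per-packet decision in schemes of this family can be sketched generically. This is a plain UGAL-style choice (queue depth times hop count as the congestion estimate), not the paper's PAR or KU-GCN, both of which are Megafly-specific and more elaborate:

```python
def choose_path(min_queue, min_hops, nonmin_queue, nonmin_hops, bias=0):
    """Pick the minimal path unless a sampled non-minimal (Valiant) path
    looks cheaper; cost = queue occupancy * hop count (UGAL-style sketch)."""
    min_cost = min_queue * min_hops
    nonmin_cost = nonmin_queue * nonmin_hops + bias
    return "minimal" if min_cost <= nonmin_cost else "nonminimal"

# lightly loaded minimal path: take it
route_a = choose_path(min_queue=2, min_hops=3, nonmin_queue=2, nonmin_hops=6)
# congested minimal path: divert to the longer but emptier path
route_b = choose_path(min_queue=20, min_hops=3, nonmin_queue=2, nonmin_hops=6)
```

The `bias` term models the common trick of favoring minimal paths unless congestion clearly justifies the extra hops.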
This paper proposes a statistical method for memory bandwidth prediction in NUMA architecture. The memory bandwidth is expressed in terms of total transferred data per execution time. We first split memory bandwidth i...
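The bandwidth definition stated in this abstract is a direct ratio; a small transcription (function name and units are our choice, not the paper's) looks like:

```python
def memory_bandwidth_gbs(bytes_transferred, seconds):
    """Memory bandwidth as total transferred data per execution time, in GB/s."""
    return bytes_transferred / seconds / 1e9

# e.g. 64 GiB moved in 4 s
bw = memory_bandwidth_gbs(64 * 2**30, 4.0)
```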
In distributed hybrid computing systems, traditional sequential processors are loosely coupled with reconfigurable hardware for optimal performance. This loose coupling proves to be a communication challenge; the proce...
Several forces are driving the market to put multiple execution cores on a single processor chip. But we cannot view (or design) those cores (and the connections between them) in the same way we did when we lived in a...
ISBN:
(Print) 9781728157245
Most large modern-day applications (Facebook, Gmail, eBay, etc.) rely on widely distributed and replicated storage of data for scalability, availability, and disaster tolerance. Since maintaining high degrees of data consistency requires costly communication across distant sites, applications over such distributed and partially replicated data are complex artifacts that must carefully balance the required degrees of consistency and performance. In this paper I summarize work at the University of Illinois Assured Cloud Computing center on using rewriting logic and its associated Maude tool environment to formally model and analyze both the correctness and the performance of state-of-the-art distributed transaction system designs, as well as on how to automatically obtain a correct-by-construction distributed implementation of a promising design.
ISBN:
(Digital) 9781665488020
ISBN:
(Print) 9781665488020
Clustering algorithms are efficient tools for discovering correlations or affinities within large datasets and are the basis of several Artificial Intelligence processes based on data generated by sensor networks. Recently, such algorithms have found an active application area closely tied to the Edge computing paradigm. The final aim is to transfer intelligence and decision-making ability to the edge of sensor networks, thus avoiding the stringent low-latency and large-bandwidth network requirements typical of the Cloud computing model. In this context, the present work describes a new hybrid version of a clustering algorithm for the NVIDIA Jetson Nano board that integrates two different parallel strategies. The algorithm is then evaluated in terms of performance and energy consumption, comparing it with two high-end GPU-based computing systems. The results confirm the possibility of creating intelligent sensor networks in which decisions are taken at the data collection points.
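The abstract does not name the exact clustering algorithm, so a minimal k-means stands in here for the kind of workload being ported to the Jetson Nano; the assignment and update steps below are the usual targets for the two parallel strategies the paper combines:

```python
import numpy as np

def kmeans(points, k, iters=20):
    """Minimal Lloyd's k-means sketch (not the paper's hybrid algorithm)."""
    centroids = points[:k].astype(float).copy()  # simple deterministic init
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # assignment step: nearest centroid per point (data-parallel)
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: centroid = mean of its assigned points
        for j in range(k):
            if (labels == j).any():
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0.0, 0.1, (50, 2)),   # blob around (0, 0)
                 rng.normal(5.0, 0.1, (50, 2))])  # blob around (5, 5)
labels, centroids = kmeans(pts, 2)
```

The distance matrix in the assignment step is embarrassingly parallel, which is why clustering maps well onto both the Jetson's GPU and its multicore CPU.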
In the past few years, wavelet transforms have become a hot topic of research. Discrete and continuous wavelet transforms have been widely used in signal and multimedia processing. Due to the high performance and fl...
详细信息
ISBN:
(Print) 1424409101
For the past five years, I had the very enviable task of leading IBM's effort in DARPA's High Productivity Computing Systems (HPCS) program. IBM competed successfully against the other contestants and survived two down-selects, producing along the way ground-breaking research for peta-scale systems aimed at changing the status quo in high-end computing. The HPCS program is unique in that it states productivity as a broader definition of system value than just performance. Commercial viability is another goal, meant to add realism and produce usable systems at the end of the program, with productivity and performance goals that well exceed the projected improvements using today's technology. This unprecedented mix adds interesting and challenging constraints on the research program, and the traditional ways of approaching the problem do not apply. This talk will give an overview of the challenges of running projects of this kind and a forward-looking statement about the future of the program and its projected impact on the industry and the academic communities.