ISBN: 9781479901937 (print)
DIBCO 2013 is the international Document Image Binarization Contest organized in the context of the ICDAR 2013 conference. The general objective of the contest is to identify current advances in document image binarization for both machine-printed and handwritten document images, using evaluation performance measures that conform to document image analysis and recognition. This paper describes the contest details, including the evaluation measures used, as well as the performance of the 23 submitted methods along with a short description of each method.
ISBN: 9781479909735 (print)
The Partitioned Global Address Space (PGAS) model has been widely used on multi-core clusters as an alternative to MPI. One widely used PGAS language is Unified Parallel C (UPC). Previous research has shown that UPC performance is comparable with MPI; however, in certain cases UPC requires hand-tuning techniques such as prefetching and privatized pointers-to-shared to improve performance. In this paper we review, evaluate, and analyze the performance patterns of naive UPC, optimized UPC, and MPI on two different multi-core cluster architectures. We use matrix multiplication as the benchmark and run our experiments on two distributed-memory machines: a Cray XE6 with Gemini interconnects and a Sun cluster with InfiniBand interconnects. We analyze per-core execution time to understand the communication pattern on both machines. We also demonstrate that the gap between naive and optimized UPC depends on the compiler and its associated distributed-memory machine. We also observed that optimization is unnecessary for certain programs, depending on the HPC architecture and compiler.
ISBN: 9781479901777 (print)
In this paper, we investigate a decentralized output synchronization problem of heterogeneous linear systems. Motivated by recent results in the literature, we develop a self-triggered output broadcasting policy for the interconnected systems. In other words, each system broadcasts its outputs only when necessary in order to achieve output synchronization. Consequently, the control signal of each system is updated based on currently available (but outdated) information received from the neighbors. These broadcasting time instants adapt to the current communication topology. For a fixed topology, our broadcasting policy yields global exponential output synchronization, and L_p-stable output synchronization in the presence of disturbances. Employing a converse Lyapunov theorem for impulsive systems, we provide an average dwell-time condition that yields disturbance-to-state stable output synchronization in case of switching topology. The proposed approach is applicable to directed and unbalanced communication topologies. Finally, our results are corroborated by numerical simulations.
Cloud computing is a recently developed new technology for complex systems with massive service sharing, which is different from the resource sharing of the grid computing systems. In a cloud environment, service requ...
Load tests ensure that software systems are able to perform under the expected workloads. The current state of load test analysis requires significant manual review of performance counters and execution logs, and a high degree of system-specific expertise. In particular, memory-related issues (e.g., memory leaks or spikes), which may degrade performance and cause crashes, are difficult to diagnose. Performance analysts must correlate hundreds of megabytes or gigabytes of performance counters (to understand resource usage) with execution logs (to understand system behaviour). However, little work has been done to combine these two types of information to assist performance analysts in their diagnosis. We propose an automated approach that combines performance counters and execution logs to diagnose memory-related issues in load tests. We perform three case studies on two systems: one open-source system and one large-scale enterprise system. Our approach flags ≤ 0.1% of the execution logs with a precision ≥ 80%.
ℓ1-minimization refers to finding the minimum ℓ1-norm solution to an underdetermined linear system b = Ax. Under certain conditions as described in compressive sensing theory, the minimum ℓ1-norm solution is also the sparsest solution. In this paper, we study the speed and scalability of its algorithms. In particular, we focus on the numerical implementation of a sparsity-based classification framework in robust face recognition, where sparse representation is sought to recover human identities from high-dimensional facial images that may be corrupted by illumination, facial disguise, and pose variation. Although the underlying numerical problem is a linear program, traditional algorithms are known to suffer poor scalability for large-scale applications. We investigate a new solution based on a classical convex optimization framework, known as augmented Lagrangian methods. We conduct extensive experiments to validate and compare its performance against several popular ℓ1-minimization solvers, including interior-point method, Homotopy, FISTA, SESOP-PCD, approximate message passing, and TFOCS. To aid peer evaluation, the code for all the algorithms has been made publicly available.
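For readers unfamiliar with the setup, the problem and the augmented Lagrangian idea named in the abstract can be written out as follows. This is the generic textbook form (method of multipliers), given here as an assumption; the symbols λ (multiplier vector), μ (penalty parameter), and the iteration index k are introduced only for illustration, and the paper's exact formulation may differ.

```latex
% l1-minimization for an underdetermined system b = Ax
\min_{x} \; \|x\|_1 \quad \text{subject to} \quad b = Ax

% Augmented Lagrangian with multiplier \lambda and penalty \mu > 0
L_\mu(x, \lambda) = \|x\|_1 + \lambda^{\top}(b - Ax) + \frac{\mu}{2}\,\|b - Ax\|_2^2

% Generic method-of-multipliers iteration
x_{k+1} = \arg\min_{x} \; L_\mu(x, \lambda_k), \qquad
\lambda_{k+1} = \lambda_k + \mu\,(b - A x_{k+1})
```

The penalty term keeps the iterates close to the constraint b = Ax, while the multiplier update corrects the remaining residual at each step.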
There continue to be many advances in the theory and practice of Modeling and Simulation (M&S). However, some of these can be considered as Grand Challenges; issues whose solutions require significant focused effo...
ISBN: 9781450319003 (print)
Recently, graphics processing units (GPUs) have opened up new opportunities for speeding up general-purpose parallel applications due to their massive computational power and up to hundreds of thousands of threads enabled by programming models such as CUDA. However, due to the serial nature of existing micro-architecture simulators, these massively parallel architectures and workloads need to be simulated sequentially. As a result, simulating GPGPU architectures with typical benchmarks and input data sets is extremely time-consuming. This paper addresses the GPGPU architecture simulation challenge by generating miniature, yet representative GPGPU kernels. We first summarize the static characteristics of an existing GPGPU kernel in a profile, and analyze its dynamic behavior using the novel concept of the divergence flow statistics graph (DFSG). We subsequently use a GPGPU kernel synthesizing framework to generate a miniature proxy of the original kernel, which can reduce simulation time significantly. The key idea is to reduce the number of simulated instructions by decreasing per-thread iteration counts of loops. Our experimental results show that our approach can accelerate GPGPU architecture simulation by a factor of 88X on average and up to 589X with an average IPC relative error of 5.6%.
Network radio frequency (RF) environment sensing (NRES) systems pinpoint and track people in buildings using changes in the signal strength measurements made by a wireless sensor network. It has been shown that such systems can locate people who do not participate in the system by wearing any radio device, even through walls, because of the changes that moving people cause to the static wireless sensor network. However, many such systems cannot locate stationary people. We present and evaluate a system which can locate stationary or moving people, without calibration, by using kernel distance to quantify the difference between two histograms of signal strength measurements. From five experiments, we show that our kernel distance-based radio tomographic localization system performs better than the state-of-the-art NRES systems in different non line-of-sight environments.
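As a rough illustration of comparing two signal-strength histograms with a kernel distance, the sketch below uses the common definition D(P, Q)² = K(P, P) + K(Q, Q) − 2K(P, Q), where K sums a kernel over all pairs of bins weighted by their probabilities. The Gaussian kernel, the `sigma` parameter, and all function names are assumptions made for this example and are not taken from the paper.

```python
import math


def normalize(counts):
    """Turn raw histogram counts into probabilities that sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    return [c / total for c in counts]


def gaussian_kernel(i, j, sigma=1.0):
    """Similarity between bin indices i and j (assumed Gaussian kernel)."""
    return math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2))


def cross_similarity(p, q, sigma=1.0):
    """K(P, Q): kernel summed over all bin pairs, weighted by bin mass."""
    return sum(
        pi * qj * gaussian_kernel(i, j, sigma)
        for i, pi in enumerate(p)
        for j, qj in enumerate(q)
    )


def kernel_distance(p, q, sigma=1.0):
    """Kernel distance between two normalized histograms p and q."""
    squared = (
        cross_similarity(p, p, sigma)
        + cross_similarity(q, q, sigma)
        - 2.0 * cross_similarity(p, q, sigma)
    )
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


if __name__ == "__main__":
    # Hypothetical histograms of signal strength on one link.
    long_term = normalize([0, 5, 5, 0])
    short_term = normalize([5, 5, 0, 0])
    print(kernel_distance(long_term, short_term))
```

Because the kernel compares neighbouring bins as well as identical ones, a small shift in signal strength yields a small distance, whereas a bin-by-bin comparison would treat any shift as a full mismatch.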
Recent advances in virtualization technology provide a new approach to checkpoint/restart at the virtual machine (VM) *** contrast to traditional process-level checkpointing, checkpointing at the virtualization layer brings up several advantages, such as compatibility, transparency, flexibility and ***, because the virtualization layer has little semantic knowledge about the operating system and the applications running atop, VM-layer checkpointing requires saving the entire operating system state rather than a single *** overhead may render the approach *** reduce the size of VM checkpoint, in this paper we propose a page eviction scheme and an incremental checkpointing mechanism to avoid saving unnecessary VM pages in the *** keep the system online transparently, we propose a live checkpointing mechanism by saving the memory image in a copy-on-write (COW) *** implement the performance optimization mechanisms in a prototype system, called *** results with a group of representative applications show that our page eviction scheme and incremental checkpointing can significantly reduce the checkpoint file size by up to 87% and shorten the total checkpointing/restart time by a factor of up to 71%, in comparison with Xen's default checkpointing *** observed application downtimes due to checkpointing can be reduced to as small as 300 ms.
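The general idea behind incremental checkpointing, saving only the pages written since the previous checkpoint, can be sketched with a toy model. Everything below (the `ToyVM` class, the function names, the dictionary-based image format) is hypothetical and for illustration only; it does not reflect the prototype's design, its page eviction scheme, or Xen's checkpoint format.

```python
class ToyVM:
    """Toy stand-in for guest memory: a list of fixed-size pages."""

    def __init__(self, num_pages, page_size=4096):
        self.pages = [bytes(page_size) for _ in range(num_pages)]
        self.dirty = set()  # page numbers written since the last checkpoint

    def write(self, page_no, data):
        """Overwrite one page and record it as dirty."""
        self.pages[page_no] = data
        self.dirty.add(page_no)


def full_checkpoint(vm):
    """Save every page; afterwards no page counts as dirty."""
    vm.dirty.clear()
    return dict(enumerate(vm.pages))


def incremental_checkpoint(vm):
    """Save only the pages modified since the previous checkpoint."""
    delta = {n: vm.pages[n] for n in sorted(vm.dirty)}
    vm.dirty.clear()
    return delta


def restore(base, deltas):
    """Rebuild memory by applying deltas, oldest first, on top of the base."""
    image = dict(base)
    for delta in deltas:
        image.update(delta)
    return [image[n] for n in range(len(image))]


if __name__ == "__main__":
    vm = ToyVM(num_pages=8, page_size=4)
    base = full_checkpoint(vm)
    vm.write(2, b"abcd")
    first = incremental_checkpoint(vm)
    vm.write(5, b"wxyz")
    second = incremental_checkpoint(vm)
    print(len(base), len(first), len(second))
    print(restore(base, [first, second]) == vm.pages)
```

In this model each incremental checkpoint holds only the pages touched in its interval, which is where the reduction in checkpoint size comes from; the cost is that restore must replay the deltas in order.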