We present a new method for finding minimal-complexity interconnect models that meet a specified accuracy relative to the true waveform. The method can be described as an interconnect characterization flow that defines simple rules for finding the minimal number of segments and the required segment type, RC or RLC, by considering interconnect resistance, driver source resistance, interconnect characteristic impedance and load capacitance. To show the application of the method, segment selection rules are derived for a case with a waveform discrepancy constraint of 5%.
This paper presents a processor group membership protocol for fault-tolerant distributed real-time systems that use periodic, time-triggered scheduling for sending messages over the system's communication network. The protocol allows fault-free nodes to reach agreement on the operational state of all nodes in the presence of fail-silent or fail-reporting node failures as well as network failures (lost or corrupted messages). The protocol is based on the principle that each message sent by a node in the membership is acknowledged by k other nodes in a system of n nodes, where k can be set to any number between 2 and n - 1. Agreement on node failure (membership departure) and agreement on node recovery (membership reintegration) are handled by two different mechanisms. Agreement on departure is guaranteed if no more than f = k - 1 failures occur in the same communication round, while at most one node can be reintegrated into the membership per communication round.
As chip multiprocessors with simultaneous multithreaded cores are becoming commonplace, there is a need for simple approaches to exploit thread-level parallelism. In this paper, we consider thread-level speculation as a means to extract thread-level parallelism from application binaries. We first investigate the tradeoffs between scheduling speculative threads on the same core and on different cores. With the former approach, threads contend for the same resources, whereas the latter approach is plagued by the overhead of inter-core communication. Despite the impact of resource contention, our detailed simulations show that the first approach provides the best performance due to lower inter-thread communication cost. The key contribution of the paper is the proposed design and evaluation of the dual-thread speculation system. This design point has very low complexity and reaps most of the gains of a system supporting eight threads.
Data link compression can efficiently compress the data stream between main memory and the processor chip in single-processor systems. By dynamically updating a value cache on each side of the link with the most frequently transmitted values, frequent value encoding can compress the data stream by up to 70%. Unfortunately, the number of value caches needed grows quadratically with the number of nodes in multiprocessors, which causes a scalability problem. This paper shows that by sharing the caches between different pairs of communicating nodes, the frequent values stored at each node can be utilized more efficiently. For interconnects with point-to-point links, however, it is shown that sharing of caches introduces overhead traffic for keeping the value caches consistent. If all misses in the shared cache are broadcast to all other nodes, the generated traffic becomes so large that it is better to transmit the values uncompressed. We propose and evaluate three techniques that aim at reducing this overhead and find that most of this traffic can be eliminated, but at the cost of less efficient compression. The final result is comparable to using dedicated value caches.
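As a rough illustration of frequent value encoding on a single link, the sketch below keeps one small value cache on each side and sends a short index on a hit and the full word on a miss. It is not the paper's design: the move-to-front replacement policy, the cache capacity, the 32-bit word size, the one-bit hit/miss flag and all names (`ValueCache`, `encode`, `decode`, `encoded_bits`) are assumptions made for the example.

```python
class ValueCache:
    """Tiny value cache; most recently used value first (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.values = []

    def touch(self, value):
        # Move the value to the front, evicting the oldest entry if full.
        if value in self.values:
            self.values.remove(value)
        self.values.insert(0, value)
        if len(self.values) > self.capacity:
            self.values.pop()


def encode(stream, capacity=4):
    """Sender side: emit ("hit", index) or ("miss", value) per word."""
    cache = ValueCache(capacity)
    symbols = []
    for value in stream:
        if value in cache.values:
            symbols.append(("hit", cache.values.index(value)))
        else:
            symbols.append(("miss", value))
        cache.touch(value)
    return symbols


def decode(symbols, capacity=4):
    """Receiver side: mirror the sender's cache updates to stay in sync."""
    cache = ValueCache(capacity)
    stream = []
    for kind, payload in symbols:
        value = cache.values[payload] if kind == "hit" else payload
        stream.append(value)
        cache.touch(value)
    return stream


def encoded_bits(symbols, capacity=4, word_bits=32):
    """Link cost: 1 flag bit plus an index (hit) or a full word (miss)."""
    index_bits = max(1, (capacity - 1).bit_length())
    return sum(1 + (index_bits if kind == "hit" else word_bits)
               for kind, _ in symbols)


if __name__ == "__main__":
    words = [0, 0, 1, 0, 0xFFFFFFFF, 0, 1]
    sent = encode(words)
    assert decode(sent) == words
    print(encoded_bits(sent), "bits instead of", 32 * len(words))
```

Because both sides apply the same update rule to the same sequence of values, the two caches stay identical without extra traffic. This holds only for a dedicated cache pair per link; once caches are shared between several communicating pairs, as studied above, keeping them consistent requires additional messages.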
Parallelism plays a significant role in high-performance computing systems, from large clusters of computers to chip-multithreading (CMT) processors. The performance of parallel systems comes not only from concurrently running more processing hardware but also from utilizing the hardware efficiently. Hardware utilization is strongly influenced by how processors/processes are synchronized in the system to maximize parallelism. Synchronization between concurrent processes usually relies on shared data structures. Data structures that enhance parallelism by allowing processes to access them concurrently are known as concurrent data structures. The thesis aims at developing efficient concurrent data structures and algorithms for synchronization in asynchronous shared-memory multiprocessors. Generally speaking, simple data structures perform well in the absence of contention but perform poorly in high-contention situations. Conversely, sophisticated data structures that can scale and perform well in the presence of high contention usually suffer unnecessarily high latency when there is no contention. Efficient concurrent data structures should be able to adapt their algorithmic complexity to varying contention. This has motivated us to develop fundamental concurrent data structures such as trees, multi-word compare-and-swap and locks into reactive ones that adapt their size or algorithmic behavior in a timely manner to the contention level in the execution environment. When the contention varies rapidly, the reactive data structures must keep the cost of reaction below its benefit, avoiding unnecessary reaction due to contention oscillation. This is quite challenging since information about how the contention will vary in the future is usually not available in multiprogramming multiprocessor environments. To deal with the uncertainty, we have successfully synthesized non-blocking synchronization techniques and advanced on-line algorithmic techniques, in the context of reac
Synchronization, consistency and scalability are important issues in the design of concurrent computer system services. In this thesis we study the application of optimistic and scalable methods in concurrent system services. In a distributed setting we study scalable tracking of the causal relations between events, lightweight information dissemination in optimistic causal order in distributed systems, and fault-tolerant and dynamic resource sharing. Further, we study scalable memory allocation, memory reclamation, threading, thread synchronization and data structures in shared memory systems. For each of the services we study, we give the design of algorithms using optimistic methods, assess the correctness and analyze the behaviour of the algorithm, and in most cases describe implementations and perform experimental studies comparing the proposed algorithms to "traditional" approaches. We present a study of the accuracy of plausible timestamps for scalable event tracking in large systems. We analyze how these clocks may relate causally independent event pairs and, based on the analysis, we propose two new clock algorithms to satisfy the analysis criteria. We propose an information dissemination service providing optimistic causal order called lightweight causal cluster consistency. It offers scalable behaviour, low message size overhead and high-probability reliability guarantees for, e.g., multi-peer collaborative applications. A key component in the dissemination service is a dynamic and fault-tolerant cluster management algorithm, which manages a set of tickets/resources such that each ticket has at most one owner at a time. In the dissemination service this algorithm manages senders and enables the use of small fixed-size vector clocks. We present a lock-free concurrent memory allocator, NBMALLOC, designed to enhance performance and scalability on multiprocessors, an improvement that our experimental evaluation confirms. We present a lock-free memory reclamation scheme for u
The most fundamental task for any mobile robot is to perform self-localization in the world in which it is currently active, i.e. to determine its position relative to its world. Encoders that count wheel rotations are often used, and their readings can be turned into relative position estimates by means of integration. This process is commonly referred to as dead reckoning. Unfortunately, the errors in such position estimates grow over time due to the underlying measurement errors, which means that the dead reckoning estimates must be regularly corrected by absolute position estimates provided by other sensors. The goal of this thesis is to evaluate the possibilities of using so-called scan matching algorithms for robust position estimation of a mobile robot, especially in environments that change over time. A scan is a set of range measurements of the environment provided by, e.g., a laser scanner. By comparing a scan taken at the actual position of the robot with a scan previously taken and stored in a map of the environment, an estimate of the absolute position of the robot can be obtained. It is important that scan matching algorithms are robust against changes in the environment, are robust across different types of environments and can judge their own results. The main contributions of the thesis are threefold. First, two new sector-based scan matching algorithms are presented that are based on two existing scan matching algorithms known as the Cox and IDC algorithms. The sector-based variants, Cox-S and IDC-S, increase the performance of the existing algorithms, especially in environments containing severe changes. Second, two new methods are presented for estimating the uncertainty of the IDC algorithm. These methods improve the self-judgment of IDC and IDC-S significantly, as the existing method for estimating the uncertainty was not reliable. Third, the new sector-based scan matching algorithms are evaluated and compared to the existing algorithms on the
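The dead reckoning step mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes a differential-drive robot with one encoder per wheel. The function name `dead_reckon` and the parameters `ticks_per_rev`, `wheel_radius` and `wheel_base` are hypothetical and do not come from the thesis.

```python
import math


def dead_reckon(pose, ticks_left, ticks_right,
                ticks_per_rev, wheel_radius, wheel_base):
    """Integrate one pair of encoder readings into a new (x, y, theta) pose.

    Assumes a differential-drive robot; theta is in radians.
    """
    x, y, theta = pose

    # Convert encoder ticks to distance travelled by each wheel.
    metres_per_tick = 2.0 * math.pi * wheel_radius / ticks_per_rev
    d_left = ticks_left * metres_per_tick
    d_right = ticks_right * metres_per_tick

    # Distance moved by the robot centre and change in heading.
    d_centre = (d_left + d_right) / 2.0
    d_theta = (d_right - d_left) / wheel_base

    # Advance along the average heading during the step.
    x += d_centre * math.cos(theta + d_theta / 2.0)
    y += d_centre * math.sin(theta + d_theta / 2.0)
    theta += d_theta

    # Keep the heading in (-pi, pi].
    theta = math.atan2(math.sin(theta), math.cos(theta))
    return (x, y, theta)


if __name__ == "__main__":
    pose = (0.0, 0.0, 0.0)
    for left, right in [(100, 100), (100, 120), (100, 120)]:
        pose = dead_reckon(pose, left, right,
                           ticks_per_rev=1000, wheel_radius=0.05,
                           wheel_base=0.30)
    print(pose)
```

Each call adds a small increment to the previous pose, so any error in the tick counts or in the assumed wheel geometry is carried into every later estimate. This is why the accumulated estimate needs periodic correction from an absolute source such as scan matching.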
ISBN (print): 1595932984
In this paper we present how face tracking can be implemented on mobile devices. Our main contribution is to present how face tracking on mobile systems can be used as a multi-dimensional input technique and to demonstrate how this can be used in different mobile applications. We present a set of different applications based on the tracking, and discuss current and future advantages, challenges and problems with face tracking as an input device for mobile systems.
The use of radio frequency identification (RFID) systems is growing rapidly. Today, mostly "passive" RFID systems are used because no onboard energy source is needed on the transponders. However, "active" RFID with an onboard power source opens up a new range of opportunities not possible with passive systems. To obtain energy efficiency in an active RFID system, a protocol should be designed that is optimized with energy in mind. This paper describes the ongoing work of defining and evaluating such a protocol. The protocol's performance in terms of energy efficiency, aggregated throughput, delay, and number of air collisions is evaluated and compared to that of the medium-access layer in 802.15.4 Zigbee, and also to a commercially available protocol from Free2move.
Various aspects of modeling and tuning security from a Quality of Service (QoS) perspective are discussed. A categorization scheme that enables systematic studies of tunable security services was developed. The objective of the study is to find efficient and generic methods that will provide adequate security in future communication networks. It was found that the new approach is appropriate for multimedia applications that require tuning the security level in order to maintain performance at levels that are acceptable to users.