As the healthcare industry continues to become more distributed, healthcare organizations are increasing their reliance on mobile links to access patient information and to update their master database at the point of care. Handheld computers have evolved into a viable platform for these systems. While initial projects have shown promise, several questions remain. This article explores the unique characteristics of handheld computers with respect to user interface design and wireless access, and introduces a prototype development effort.
Summary form only given. Mechatronic systems demand high reliability, especially with respect to timing, where hard real-time capabilities are usually mandatory. Even stronger requirements concern robustness against software failures and the propagation of errors from faulty tasks to other tasks. We propose the concept of robust partitioning for reliable real-time embedded systems. The concept consists of two parts, memory space protection and time protection. Memory protection is realized by existing hardware and software mechanisms. For temporal protection, a two-step timer interrupt system realizing an imprecise computation concept is proposed: if the execution of a module exceeds a certain time limit before the deadline, the first timer interrupt is triggered and a backup routine is started to produce an imprecise result in the remaining time until the second timer expires. This time protection concept shows significant advantages compared to classical approaches for single, parallel and distributed systems. We give an extended introduction to the concept and discuss first attempts at its realization.
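For illustration, the sketch below (Python, Unix-only) mimics the two-step timer idea: a precise computation runs until a soft checkpoint timer fires, after which a backup routine produces an imprecise result within the remaining budget before the hard deadline. The timer values and the pi-refinement example are assumptions chosen for illustration, not taken from the paper.

    import signal                                         # Unix-only: SIGALRM / setitimer

    class Checkpoint(Exception):
        """Raised when the first (soft) timer fires."""

    def _soft_timer(signum, frame):
        raise Checkpoint()

    def run_with_imprecise_fallback(precise, imprecise, soft_s, deadline_s):
        signal.signal(signal.SIGALRM, _soft_timer)
        signal.setitimer(signal.ITIMER_REAL, soft_s)      # first timer: switch point
        try:
            result = precise()                            # try to finish the exact computation
            signal.setitimer(signal.ITIMER_REAL, 0)       # finished early: cancel the timer
            return result, "precise"
        except Checkpoint:
            # second timer: the budget that remains until the hard deadline
            signal.setitimer(signal.ITIMER_REAL, deadline_s - soft_s)
            try:
                return imprecise(), "imprecise"           # backup routine, cheap result
            finally:
                signal.setitimer(signal.ITIMER_REAL, 0)

    # Example: refine a series for pi; the backup simply returns the best value so far.
    state = {"acc": 0.0, "k": 0}
    def precise():
        while state["k"] < 50_000_000:                    # deliberately too long to finish
            state["acc"] += (-1) ** state["k"] / (2 * state["k"] + 1)
            state["k"] += 1
        return 4 * state["acc"]
    def imprecise():
        return 4 * state["acc"]                           # whatever has accumulated so far
    print(run_with_imprecise_fallback(precise, imprecise, soft_s=0.05, deadline_s=0.10))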
Storage clusters consisting of thousands of disk drives are now being used both for their large capacity and high throughput. However, their reliability is far worse than that of smaller storage systems due to the increased number of storage nodes. RAID technology is no longer sufficient to guarantee the necessary high data reliability for such systems, because disk rebuild time lengthens as disk capacity grows. We present the fast recovery mechanism (FARM), a distributed recovery approach that exploits excess disk capacity and reduces data recovery time. FARM works in concert with replication and erasure-coding redundancy schemes to dramatically lower the probability of data loss in large-scale storage systems. We have examined essential factors that influence system reliability, performance, and cost, such as failure detection, disk bandwidth usage for recovery, disk space utilization, disk drive replacement, and system scale, by simulating system behavior under disk failures. Our results show the reliability improvement from FARM and demonstrate the impact of various factors on system reliability. Using our techniques, system designers will be better able to build multi-petabyte storage systems with much higher reliability at lower cost than previously possible.
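As a rough illustration of why distributed recovery shortens rebuild time, the back-of-envelope sketch below compares a rebuild absorbed by a single spare disk against a rebuild spread over many helper disks. The capacity and bandwidth figures are illustrative assumptions, not measurements from the paper.

    def rebuild_hours(disk_tb, recovery_mb_s, helpers):
        """Time to rebuild one failed disk when `helpers` disks share the work."""
        total_bw = recovery_mb_s * helpers                # MB/s available for the rebuild
        return disk_tb * 1e6 / total_bw / 3600            # TB -> MB, then seconds -> hours

    print(rebuild_hours(disk_tb=4, recovery_mb_s=20, helpers=1))    # single dedicated spare
    print(rebuild_hours(disk_tb=4, recovery_mb_s=20, helpers=50))   # recovery spread over the cluster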
Summary form only given. Clustering of several storage servers is a common way to build fast and fault-tolerant storage systems. One application can be found in the context of parallel programs that already run on clustered systems and need to write and read huge amounts of data from and to disks. Another application field is Web and video streaming servers that cause intense data transfer from and to disks. A distributed storage system is reviewed under the aspect of fault tolerance and reconfiguration of the data layout after faults. Data objects are stored in a data layout according to RAID level 3 among disk subsystems of different computers. Concurrent up- and down-streaming of data is provided by a technique that ensures data consistency. This consistency has been found to be beneficial for concurrent access and reconfiguration. Moreover, the system does not need a metadata server, which often represents a bottleneck for distributed storage systems.
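The sketch below illustrates a RAID level 3 style layout of the kind described: data chunks are striped across data nodes and a dedicated parity node holds their XOR, so any single lost chunk can be reconstructed. Chunk sizes and names are assumptions for illustration, not details from the paper.

    from functools import reduce

    def xor_blocks(blocks):
        """Byte-wise XOR of equally sized blocks."""
        return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks))

    def stripe(data: bytes, n_data_nodes: int):
        """Split data into n_data_nodes chunks plus one XOR parity chunk."""
        chunk = -(-len(data) // n_data_nodes)             # ceiling division
        parts = [data[i * chunk:(i + 1) * chunk].ljust(chunk, b"\0")
                 for i in range(n_data_nodes)]
        return parts, xor_blocks(parts)                   # data chunks + parity chunk

    def reconstruct(surviving_parts, parity):
        """Rebuild the single missing data chunk from the survivors and the parity."""
        return xor_blocks(surviving_parts + [parity])

    parts, parity = stripe(b"concurrent streaming payload", 4)
    assert reconstruct(parts[:2] + parts[3:], parity) == parts[2]   # chunk on node 2 lost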
Global predicate evaluation is a fundamental problem in distributed systems. This paper views it from a different perspective, namely that of the signals and systems area of electrical engineering. It adapts a signal processing approach to address this problem in the context of monitoring the 'health' of a software system. The global state of the system is viewed as a 'state' signal which evolves over time. The distributed processes are assumed to possess roughly synchronized clocks. The states of individual processes are periodically sampled and reported to a global monitor. The observed system state constructed by the global monitor is viewed as being composed of two components: the consistent global states and an error signal due to messages in transit and differences in the local clocks. The global monitor removes the error signal by processing the observed global signal through a low-pass filter. It evaluates the predicates on the filtered signal. The approach presented is applicable to distributed systems which are semi-stationary, i.e. whose internal states of interest remain stable over comparatively long intervals of time. The paper presents the relevant signal processing concepts (p-spectrum and p-filtering), outlines an architecture for global predicate monitoring and describes the signal processing done in the global monitor. The paper then summarizes an evaluation of the approach on a small computer-aided vehicle dispatch system. The evaluation experiments are described and the results are presented and analyzed.
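A toy version of the filtering step is sketched below: the sampled global states form an observed 0/1 signal, a simple moving-average low-pass filter suppresses transient inconsistencies due to in-flight messages and clock skew, and the predicate is evaluated on the filtered signal. This is a plain moving average, not the paper's p-filter, and the sample values are invented.

    def moving_average(signal, window):
        """Simple low-pass filter: average over the last `window` samples."""
        out = []
        for i in range(len(signal)):
            lo = max(0, i - window + 1)
            out.append(sum(signal[lo:i + 1]) / (i - lo + 1))
        return out

    # Observed signal: 1 when the predicate appears to hold in a sampled global state.
    observed = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]   # isolated 0s are glitches
    filtered = moving_average(observed, window=3)
    print([v > 0.5 for v in filtered])                    # predicate evaluated on the filtered signal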
Summary form only given. Built upon new data organization and access characteristics, MEMS-based storage devices have come under consideration as an alternative to disks for large data-intensive applications. While not yet in commercial production, MEMS-based storage devices have outperformed disks in device-level simulations. Processor-embedded distributed disks have improved workload performance by offloading application-level processing to the storage. To exploit the potential benefits offered by these emerging storage technologies and offloading models, we propose a processor-embedded distributed MEMS-based storage architecture and evaluate the proposed architecture with representative database and data mining workloads. Our results show that MEMS-based storage improved the overall performance of these workloads over disk-based systems and transformed the characteristics of several workloads, impacting the design points for future storage architectures.
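The sketch below illustrates the offloading model in general terms: a processor-embedded storage node applies a selection predicate locally and ships only matching records to the host. The record format and predicate are assumptions for illustration, not the workloads used in the paper.

    def storage_node_scan(records, predicate):
        """Runs on the storage device's embedded processor (simulated here)."""
        return [r for r in records if predicate(r)]

    records = [{"id": i, "value": i % 97} for i in range(10_000)]
    hits = storage_node_scan(records, lambda r: r["value"] > 90)   # filter offloaded to storage
    print(len(hits), "records shipped to the host instead of", len(records))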
Efficient task scheduling is essential for achieving high performance in computing applications for distributed systems. Most existing real-time systems consider schedulability as the main goal and ignore other effects such as machine failures. In this work we develop an algorithm to efficiently schedule parallel task graphs (fork-join structures). Our scheduling algorithm considers more than one factor at the same time: schedulability, reliability of the participating processors, and the achieved degree of parallelism. To meet these goals, we compose an objective function that combines these different factors simultaneously. The proposed objective function is adjustable, providing the user with a way to prefer one factor over the others. The simulation results indicate that our algorithm produces schedules in which the applications' deadlines are met, reliability is maximized, and the application parallelism is exploited.
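A minimal sketch of an adjustable multi-factor objective of the kind described is shown below; the weights and the way each factor is scored are illustrative assumptions, not the paper's exact formulation.

    def objective(slack, reliability, parallelism, w=(0.5, 0.3, 0.2)):
        """Higher is better; the weights w let the user prefer one factor over the others."""
        w_sched, w_rel, w_par = w
        return w_sched * slack + w_rel * reliability + w_par * parallelism

    # Pick, for a fork-join task, the candidate assignment with the best combined score.
    candidates = [
        {"proc": "P1", "slack": 0.8, "reliability": 0.90, "parallelism": 0.5},
        {"proc": "P2", "slack": 0.6, "reliability": 0.99, "parallelism": 0.7},
    ]
    best = max(candidates,
               key=lambda c: objective(c["slack"], c["reliability"], c["parallelism"]))
    print(best["proc"])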
ISBN (print): 0780377176
Software FMEA is a means to determine whether any single failure in computer software can cause catastrophic system effects, and it additionally identifies other possible consequences of unexpected software behavior. The procedure described here was developed and used to analyze mission- and safety-critical software systems. The procedure includes using a structured approach to understanding the subject software, developing rules and tools for doing the analysis as a group effort with minimal data entry and human error, and generating a final report. Software FMEA is a kind of implementation analysis that is an intrinsically tedious process, but database tools make the process reasonably painless, highly accurate, and very thorough. The main focus here is on the development and use of these database tools.
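The sketch below suggests the flavor of such database tooling: failure modes are stored as rows keyed by software element, and the final report is a query over the table. The schema, field names, and example record are assumptions for illustration, not the tools described in the paper.

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("""CREATE TABLE fmea (
        element TEXT, failure_mode TEXT, local_effect TEXT,
        system_effect TEXT, severity TEXT, mitigation TEXT)""")
    con.execute("INSERT INTO fmea VALUES (?, ?, ?, ?, ?, ?)",
                ("telemetry_writer", "stale output value", "old frame reused",
                 "ground sees outdated state", "critical", "sequence counter check"))
    # Final-report query: every single failure whose system effect is severe.
    for row in con.execute("SELECT element, failure_mode, system_effect FROM fmea "
                           "WHERE severity IN ('catastrophic', 'critical')"):
        print(row)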
ISBN (print): 0769520693
As software distributed shared memory (DSM) systems become attractive on larger clusters, the focus of attention moves toward improving the reliability of such systems. In this paper, we propose a lightweight logging scheme, called remote logging, and a recovery protocol for home-based DSM. Remote logging stores coherence-related data to the volatile memory of a remote node. The logging overhead can be moderated with a high-speed system area network and user-level DMA operations supported by modern communication protocols. Remote logging tolerates multiple failures as long as the backup nodes of the failed nodes are alive, which makes the DSM considerably more reliable. Experimental results show that our fault-tolerant DSM has low overhead compared to conventional stable logging and that it can be effectively recovered from some concurrent failures.
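As a rough illustration of remote logging, the sketch below sends coherence records for a page to a backup node that keeps them only in volatile memory; plain TCP stands in for the user-level DMA transport described in the paper, and the record format is an assumption.

    import json, socket, threading, time

    def backup_node(port, log):
        """Backup node: keeps other nodes' coherence records in volatile memory only."""
        srv = socket.socket()
        srv.bind(("127.0.0.1", port)); srv.listen(1)
        conn, _ = srv.accept()
        with conn, srv:
            for line in conn.makefile():
                rec = json.loads(line)
                log.setdefault(rec["node"], []).append(rec)   # no stable storage involved

    log = {}
    threading.Thread(target=backup_node, args=(5007, log), daemon=True).start()
    time.sleep(0.2)                                           # crude startup wait, for the sketch only

    # A DSM node ships each coherence event (e.g., a page diff) to its backup node.
    with socket.create_connection(("127.0.0.1", 5007)) as cli:
        cli.sendall((json.dumps({"node": "N3", "page": 42, "diff": "xor-bytes"}) + "\n").encode())
    time.sleep(0.2)
    print(log)                                                # recovery would replay these records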
ISBN (print): 0769519148
Fermilab, in collaboration with the DESY laboratory in Hamburg, Germany, has created a petabyte-scale data storage infrastructure to meet the requirements of experiments to store and access large data sets. The Fermilab data storage infrastructure consists of the following major storage and data transfer components: the Enstore mass storage system, the dCache distributed data cache, and FTP and GridFTP, used primarily for external data transfers. This infrastructure provides data throughput sufficient for transferring data from the experiments' data acquisition systems. It also allows access to the data in the Grid framework.