ISBN: 9781728129877 (print)
This work is devoted to the problem of detecting and handling faults of computing nodes during the execution of parallel programs on distributed computing systems. The fault tolerance tools of PBS/TORQUE are considered. A functional model for optimizing fault handling is proposed.
ISBN: 9781467366373 (print)
Computing resources in volunteer computing grids represent a large, under-used reserve of processing capacity. However, a task scheduler has no guarantees regarding the computing power these resources can deliver. Predicting CPU availability can help to better exploit these resources and make effective scheduling decisions. In this paper, we draw up the main guidelines for developing a scalable method to predict CPU availability in a large-scale volunteer computing system. To reduce solution time and ensure precision, we use simple prediction techniques, namely autoregressive models and a tendency-based strategy. To address the limitations of autoregressive models, we propose an automated approach that checks whether a time series satisfies the assumptions of the models and constructs a prediction model by identifying its appropriate order value. At each prediction, we consider autoregressive models over three different past windows: first, the recent hours; second, the same hours of the previous days; and third, the same weekly hours of the previous weeks. We analyze the performance of multivariate vector autoregressive (VAR) models and pure autoregressive (AR) models, constructed according to our approach, against the tendency prediction technique using traces of a large-scale Internet-distributed computing system, SETI@home.
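As a rough illustration of the autoregressive prediction step described in this abstract, the sketch below fits an AR model of a fixed order to one window of past availability samples and forecasts the next value. It is an assumption-laden example, not the authors' method: the coefficients are estimated with the Yule-Walker equations (solved by the Levinson-Durbin recursion), the order is fixed by hand rather than identified automatically, and the names `autocovariance`, `fit_ar`, `predict_next`, and the sample data are all hypothetical.

```python
def autocovariance(series, max_lag):
    """Biased sample autocovariances r[0..max_lag] of a list of floats."""
    n = len(series)
    mean = sum(series) / n
    centered = [value - mean for value in series]
    return [
        sum(centered[t] * centered[t - lag] for t in range(lag, n)) / n
        for lag in range(max_lag + 1)
    ]


def fit_ar(series, order):
    """Estimate AR(order) coefficients via Yule-Walker (Levinson-Durbin)."""
    r = autocovariance(series, order)
    if r[0] == 0:
        raise ValueError("constant series: no variance to model")
    phi = []      # phi[j] multiplies the value observed j + 1 steps back
    error = r[0]  # one-step prediction error variance so far
    for k in range(1, order + 1):
        partial = sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        reflection = (r[k] - partial) / error
        phi = [phi[j] - reflection * phi[k - 2 - j] for j in range(k - 1)]
        phi.append(reflection)
        error *= 1.0 - reflection ** 2
    return phi


def predict_next(series, phi):
    """One-step-ahead forecast around the window mean."""
    mean = sum(series) / len(series)
    return mean + sum(c * (series[-1 - j] - mean) for j, c in enumerate(phi))


# Hypothetical availability fractions for the recent hours
# (made-up values, not taken from the SETI@home traces).
recent_hours = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
coefficients = fit_ar(recent_hours, order=1)
forecast = predict_next(recent_hours, coefficients)
print(coefficients, forecast)
```

The same two functions could be applied separately to each of the three past windows named in the abstract (recent hours, same hours of previous days, same weekly hours of previous weeks); how the three forecasts are then combined is not stated in the abstract and is left open here.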
ISBN: 9781467362184 (print)
Today's server environments consist of many machines that form clusters for distributed computing systems or storage area networks (SAN) in order to process or store enormous amounts of data effectively. In these kinds of server environments, backend storage is usually the bottleneck of the overall system. However, simply replacing the devices with better ones is not enough to exploit their performance benefits; proper optimizations are needed to fully realize their performance gains. In this work, we first applied a high-performance device as backend storage in an existing SAN solution and found that it could not exploit the low latency and high bandwidth of the device, especially for small random I/O patterns, even though a high-speed network was used. To address this problem, we propose a new design that contains three optimizations: 1) removing software overheads to lower I/O latency; 2) parallelism to utilize the high bandwidth of the device; 3) a temporal merge mechanism to reduce network overhead. We implemented them as a prototype and found that our solution substantially improves both latency and bandwidth.
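The abstract does not say how the temporal merge mechanism works. One plausible reading, shown here purely as an assumption, is that small requests arriving close together in time are batched into a single network message. The function name `merge_by_time_window`, the `window_us` parameter, and the request tuples are hypothetical and do not come from the paper's prototype.

```python
def merge_by_time_window(requests, window_us):
    """Group (arrival_time_us, payload) pairs into batches.

    A batch is closed once a request arrives more than `window_us`
    microseconds after the first request of that batch.
    """
    batches = []
    current = []
    batch_start = None
    for arrival_us, payload in sorted(requests, key=lambda r: r[0]):
        if current and arrival_us - batch_start > window_us:
            batches.append(current)
            current = []
        if not current:
            batch_start = arrival_us
        current.append(payload)
    if current:
        batches.append(current)
    return batches


# Hypothetical small random I/O requests with arrival times in microseconds.
requests = [
    (0, "read A"),
    (30, "read B"),
    (90, "write C"),
    (250, "read D"),
    (260, "read E"),
]
batches = merge_by_time_window(requests, window_us=100)
print(len(requests), "requests sent as", len(batches), "messages")
```

The design trade-off in any such scheme is that a wider window saves more network round trips but delays the first request of each batch, so the window would need to stay small relative to the device latency.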
ISBN: 9781467347709 (print)
A decisive concern in distributed computing systems is to schedule tasks efficiently among all processors so that the overall processing time of the submitted tasks is minimized. In this article, following the recently evolved paradigm referred to as divisible load theory (DLT), we conducted an experimental study of the time needed to process a large volume of image data on a network of workstations. We present our program model and timing mechanism for the distributed image processing and finally show the effects of the δ parameter and the test cases for our algorithm.
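For readers unfamiliar with divisible load theory, the following sketch computes load fractions for a textbook single-level tree: one originator sends each worker its share in sequence, and the shares are chosen so that all workers finish at the same time. This is a generic DLT model chosen for illustration, not necessarily the program model of the article, and it does not model the δ parameter. The names `dlt_fractions`, `finish_times`, `compute_time`, and `link_time` are hypothetical.

```python
def dlt_fractions(compute_time, link_time):
    """Load fractions that equalize finish times under sequential distribution.

    compute_time[i]: time worker i would need to process the whole load.
    link_time[i]:    time needed to send the whole load to worker i.
    Equal finish times give the recursion
        a[i] * compute_time[i] = a[i+1] * (link_time[i+1] + compute_time[i+1]).
    """
    if len(compute_time) != len(link_time) or not compute_time:
        raise ValueError("need one compute time and one link time per worker")
    weights = [1.0]
    for i in range(1, len(compute_time)):
        ratio = compute_time[i - 1] / (link_time[i] + compute_time[i])
        weights.append(weights[-1] * ratio)
    total = sum(weights)
    return [w / total for w in weights]


def finish_times(fractions, compute_time, link_time):
    """Finish time of each worker when shares are sent one after another."""
    times = []
    sent_until = 0.0
    for a, w, z in zip(fractions, compute_time, link_time):
        sent_until += a * z          # the worker receives its share
        times.append(sent_until + a * w)  # then processes it
    return times


# Hypothetical workstation speeds and link costs (illustrative values only).
compute_time = [2.0, 1.0, 4.0]
link_time = [0.5, 0.5, 0.5]
fractions = dlt_fractions(compute_time, link_time)
print(fractions, finish_times(fractions, compute_time, link_time))
```

In this model the order in which workers are served affects the result, which is one reason experimental studies such as this one measure timing on real workstation networks.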
In this paper, the authors propose a new method for solving coherent tasks in a distributed computing system based on the resources of personal computers. The parameters of such resources vary dynamically, which makes them hard to use in distributed computations. To use personal computers effectively, the proposed method takes a multiagent approach: a proactive agent controls every personal computer in the distributed computing system, and task solving is dispatched in a decentralized way through interactions among agents. To solve each incoming coherent task, the agents of the system unite into a community, which makes it easier to dispatch and perform computations. The main benefit of the proposed method is a lower cost of creating and maintaining a distributed computing system.
ISBN: 9781538674765 (print)
Information and networking technology has been revolutionized by the recent evolution of Cyber-Physical Systems (CPS) and the Internet of Things (IoT). These next-generation distributed computing systems, i.e., CPS and IoT, are highly interconnected and deeply embedded in the physical world. By capitalizing on the advantages and opportunities of these technologies, Industrial IoT has been fueling smart industrial processes. The execution and application of these processes generate huge amounts of data, which requires distributed computing systems to manage information and data reliably and securely, while also facilitating the necessary automation and ensuring timely information exchange. However, the current Internet does not guarantee network performance or secure transport; in addition, physical systems become more insecure when they are interconnected with cyber systems. These bottlenecks make it necessary to improve performance and security in cyber-physical communication. Considering those pervasive requirements, this paper models network performance as well as system security with a view to improving these components, which could help address the challenges of reliable cyber-communication.
ISBN: 9780769534046 (print)
One of today's issues in software engineering is to find new, effective ways to deal intelligently with the increasing complexity of distributed computing systems. In this context, a crucial role is played by balancing the workload among all nodes in a system composed of interconnected nodes that enter and exit the system without following any rule. To address this issue, we are experimenting with autonomic self-aggregation techniques that rewire the system into groups of homogeneous nodes, which are then able to balance the load among themselves using classical techniques. We present our approach together with some simulation experiments that show how applying self-aggregation algorithms makes it possible to balance the load even in these extreme situations. In addition, our experiments show that self-aggregation does not introduce a significant overhead in terms of execution time, even though it requires the exchange of a higher number of messages between nodes.
ISBN: 9781424432806; 9780769534497 (print)
GSML is a programming language that has been designed for grid end-users to overcome the programming hurdle and the high learning curve associated with Grid infrastructures that are complex distributed computing systems. This paper defines its formal semantics in terms of a chemical programming language called HOCL. This translation of GSML programs into HOCL gives a precise definition of the concepts of GSML, especially sessions. The semantics also bridges the GSML and chemical computing paradigms.
This paper describes multi-agent based automated negotiation between clients in *** automated negotiation model and architecture of the negotiation agent are presented. A constraint satisfaction-processing component is developed to evaluate negotiation proposals against the defined constraints and negotiation strategic rules. A preference-scoring module performs quantitative analysis of alternative negotiation conditions. A prototype of an e-commerce system is implemented to demonstrate automated negotiations among buyers and suppliers.
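The two components named in this abstract, constraint checking and preference scoring, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the abstract does not specify the constraint language or the scoring formula, so the example uses simple numeric ranges and a weighted sum of normalized attributes, and the names `satisfies`, `preference_score`, and the sample proposals are hypothetical.

```python
def satisfies(proposal, constraints):
    """True if every constrained attribute lies within its (low, high) range."""
    return all(
        low <= proposal[name] <= high
        for name, (low, high) in constraints.items()
    )


def preference_score(proposal, constraints, weights, lower_is_better):
    """Weighted sum of attributes normalized to [0, 1] over their ranges."""
    score = 0.0
    for name, weight in weights.items():
        low, high = constraints[name]
        normalized = (proposal[name] - low) / (high - low)
        if name in lower_is_better:
            normalized = 1.0 - normalized
        score += weight * normalized
    return score


# Hypothetical buyer-side settings and supplier proposals.
constraints = {"price": (80.0, 120.0), "delivery_days": (2.0, 10.0)}
weights = {"price": 0.7, "delivery_days": 0.3}
lower_is_better = {"price", "delivery_days"}
proposals = {
    "A": {"price": 100.0, "delivery_days": 4.0},
    "B": {"price": 90.0, "delivery_days": 10.0},
    "C": {"price": 130.0, "delivery_days": 3.0},  # violates the price range
}

acceptable = {
    name: preference_score(p, constraints, weights, lower_is_better)
    for name, p in proposals.items()
    if satisfies(p, constraints)
}
best = max(acceptable, key=acceptable.get)
print(best, acceptable)
```

Separating the hard constraint check from the soft scoring mirrors the split between the two components in the abstract: proposals that violate a constraint are rejected outright, and only the remaining ones are ranked.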
ISBN: 9781467312332 (print)
By introducing signaling and self-management in a Turing node, and a signaling network as an overlay over the computing network, the current von Neumann computing model is evolved to bring the architectural resiliency of cellular organisms to computing infrastructure. The DIME computing model introduces the genetic transactions of replication, repair, recombination and reconfiguration to program self-resiliency in distributed computing systems executing a managed workflow. The injection of parallelism and network-based composition of "Self" identity are the first steps in introducing the elements of homeostasis and self-management into the computing infrastructure. DIMEs inject the architectural resiliency of cellular organisms to create a new class of distributed autonomic computing systems using managed Turing machine networks.