ISBN (print): 9783319723440; 9783319723433
Priority heuristic policies have been developed for centralized and distributed real-time database systems in which cohorts or sub-transactions execute sequentially; however, these heuristics may not fit well for mobile distributed real-time database systems (MDRTDBS), where sub-transactions execute in parallel and face many wireless-specific challenges. In this paper, an MDRTDBS model is introduced in which sub-transactions execute in parallel on different mobile sites, and a heuristic based on the number of write locks is proposed. The proposed heuristic improves overall system performance by favoring the sub-transaction that demands the fewest write locks. Further, a study evaluates the impact of the proposed heuristic against earliest deadline first and a heuristic based on the total number of locks required, using the distributed high-priority two-phase locking protocol.
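The heuristic can be pictured as a priority comparator over the set of ready sub-transactions: the one requesting the fewest write locks runs first. The sketch below is a minimal illustration in Python; the field names and the deadline tie-break are illustrative assumptions, not the authors' exact formulation.

from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class SubTransaction:
    # Ordering key: fewest write locks first; an earlier deadline breaks ties
    # (field names and the tie-break rule are illustrative assumptions).
    write_locks_needed: int
    deadline: float
    name: str = field(compare=False)

def schedule(ready):
    """Yield ready sub-transactions in heuristic order: least write-lock demand first."""
    heap = list(ready)
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)

pending = [
    SubTransaction(write_locks_needed=5, deadline=120.0, name="T1"),
    SubTransaction(write_locks_needed=2, deadline=300.0, name="T2"),
    SubTransaction(write_locks_needed=2, deadline=150.0, name="T3"),
]
print([st.name for st in schedule(pending)])   # ['T3', 'T2', 'T1']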
ISBN (print): 9781728101200
In the field of industrial automation, the traditional master/slave real-time data processing (SCADA) system can no longer cope with today's massive data volumes and diversified business demands in terms of throughput, real-time performance, and scalability. This paper presents a decentralized real-time data space and a decentrally managed SCADA cluster solution based on it; object partitioning, distributed real-time data processing, dynamic data migration, and decentralized real-time data management are discussed. Finally, the scheme is applied to develop a system and verify the approach.
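The abstract does not detail the partitioning scheme, but consistent hashing is one common way to realize decentralized object partitioning with limited data migration when cluster nodes join or leave. The sketch below is a hypothetical illustration in Python, not the paper's actual mechanism; all names are invented.

import bisect
import hashlib

class ConsistentHashRing:
    """Map SCADA data-object tags to cluster nodes so that adding a node
    migrates only a small fraction of objects (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        self._ring = []                      # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node, vnodes=64):
        # Each node owns several virtual positions on the ring for balance.
        for i in range(vnodes):
            self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    def owner(self, tag):
        # The first virtual position clockwise from the tag's hash owns it.
        h = self._hash(tag)
        idx = bisect.bisect(self._ring, (h,)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["scada-node-1", "scada-node-2", "scada-node-3"])
print(ring.owner("plant7/pump3/pressure"))   # deterministic node assignment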
ISBN (print): 9781538631805
In order to better serve users, several location-based services rely on real-time spatio-temporal information. Existing location privacy-preserving methods traverse the whole dataset to anonymize k locations together and do not utilize parallel computing technology, so anonymizing a large volume of location data may incur a huge computing cost. We propose a new method called Never Wait for Long (NW4L), which protects the privacy of big-volume location data in parallel and in real time. Instead of a linear structure, a k-d tree structure is adopted for nearest-location search to speed up computation. To further improve efficiency, locations are pre-classified into several groups, so that each group can be anonymized in parallel with support from a distributed stream computation framework. In this paper, we implemented NW4L based on Spark and used a real-world dataset for performance evaluation. Experimental results show that 100,000 location samples can be processed in 2 minutes, which is feasible for real-time computation.
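A minimal sketch of the k-d-tree-based grouping step, assuming a greedy policy that anonymizes each unassigned location together with its nearest unassigned neighbours; NW4L's exact grouping, pre-classification, and Spark integration are not specified in the abstract, so this is only illustrative.

import numpy as np
from scipy.spatial import cKDTree

def kd_anonymize(points, k=4):
    """Greedily form k-anonymity groups using k-d tree nearest-neighbour
    search instead of a linear scan (an illustrative sketch)."""
    tree = cKDTree(points)
    unassigned = set(range(len(points)))
    groups = []
    for seed in range(len(points)):
        if seed not in unassigned:
            continue
        # Query more than k neighbours, since some may already be assigned.
        _, idx = tree.query(points[seed], k=min(4 * k, len(points)))
        group = [i for i in np.atleast_1d(idx) if i in unassigned][:k]
        if len(group) < k:
            break          # for brevity, stop when a seed's neighbourhood cannot fill a group
        unassigned.difference_update(group)
        groups.append(group)
    return groups

locations = np.random.rand(10_000, 2)   # synthetic (lat, lon) pairs
print(len(kd_anonymize(locations, k=5)), "anonymity groups formed")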
There are many scientific and commercial applications that require the execution of a large number of independent jobs, resulting in significant overall execution time. Therefore, such applications typically require distributed computing infrastructures and science gateways to run efficiently and to be easily accessible to end-users. Optimising the execution of such applications in a cloud computing environment, keeping resource utilisation at a minimum while still completing the experiment by a set deadline, is of paramount importance. As container-based technologies become more widespread, support for job queuing and auto-scaling in such environments is becoming important. Current container management technologies, such as Docker Swarm or Kubernetes, while providing auto-scaling based on resource consumption, do not directly support job queuing or deadline-based execution policies. This paper presents JQueuer, a cloud-agnostic queuing system that supports the scheduling of a large number of jobs in containerised cloud environments. The paper also demonstrates how JQueuer, when integrated with a cloud application-level orchestrator and auto-scaling framework called MiCADO, can be used to implement deadline-based execution policies. This novel technical solution provides an important step towards the cost-optimisation of batch processing and job submission applications. In order to test and prove the effectiveness of the solution, the paper presents experimental results from executing an agent-based simulation application using the open source REPAST simulation framework. (C) 2019 The Authors. Published by Elsevier B.V.
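The abstract does not state the scaling rule, but a deadline-based policy of this kind typically reduces to deciding how many worker containers are needed so that the remaining queued work fits into the time left. The back-of-the-envelope sketch below (all names and parameters hypothetical) illustrates that decision, not JQueuer's or MiCADO's actual implementation.

import math

def containers_needed(jobs_remaining, avg_job_seconds, seconds_to_deadline,
                      max_containers):
    """Deadline-driven scaling decision for a job queue: total remaining work
    divided by the time left gives the minimum level of parallelism."""
    if seconds_to_deadline <= 0:
        return max_containers            # deadline passed or imminent: scale out fully
    remaining_work = jobs_remaining * avg_job_seconds
    needed = math.ceil(remaining_work / seconds_to_deadline)
    return max(1, min(needed, max_containers))

# 5,000 simulation jobs of ~90 s each, 6 hours to the deadline, cap of 64 workers
print(containers_needed(5_000, 90, 6 * 3600, 64))    # -> 21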
If the last decade viewed computational services as a utility, then surely this decade has transformed computation into a commodity. Computation is now progressively integrated into physical networks in a seamless way that enables cyber-physical systems (CPS) and the Internet of Things (IoT) to meet their latency requirements. Similar to the concepts of "platform as a service" or "software as a service", both cloudlets and fog computing have found their own use cases. Edge devices (which we call end or user devices for disambiguation) play the role of personal computers, dedicated to a user and to a set of correlated applications. In this new scenario, the boundaries between the network node, the sensor, and the actuator are blurring, driven primarily by the computational power of IoT nodes such as single-board computers and smartphones. The larger volumes of data generated in these networks need clever, scalable, and possibly decentralized computing solutions that can scale independently as required. Any node can be seen as part of a graph, with the capacity to serve as a computing node, a network router node, or both. Complex applications can be distributed over this graph or network of nodes to improve overall performance, such as the amount of data processed over time. In this paper, we identify this new computing paradigm, which we call Social Dispersed Computing, analyzing key themes in it, including a new outlook on its relation to agent-based applications. We architect this new paradigm by providing supportive application examples that include next-generation electrical energy distribution networks, next-generation mobility services for transportation, and applications for distributed analysis and identification of non-recurring traffic congestion in cities. The paper analyzes the existing computing paradigms (e.g., cloud, fog, edge, mobile edge, social, etc.), resolving the ambiguity of their definitions; and analyzes and discusses the relevant foundational software.
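To make the graph-of-nodes view concrete, the toy sketch below greedily places application tasks on whichever dispersed node has the most spare capacity. It is purely illustrative; the paper surveys the paradigm rather than prescribing a placement algorithm, and all names and capacity units are invented.

def place_tasks(tasks, nodes):
    """Toy greedy placement of application tasks onto dispersed nodes:
    each task goes to the node with the most spare compute capacity."""
    placement = {}
    spare = dict(nodes)                      # node -> spare capacity (arbitrary units)
    for task, demand in sorted(tasks.items(), key=lambda kv: -kv[1]):
        node = max(spare, key=spare.get)
        if spare[node] < demand:
            raise RuntimeError(f"no node can host task {task}")
        spare[node] -= demand
        placement[task] = node
    return placement

nodes = {"edge-sbc-1": 4, "smartphone-2": 2, "fog-gw-3": 8}
tasks = {"ingest": 3, "detect-congestion": 5, "aggregate": 2}
print(place_tasks(tasks, nodes))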
Today, due to the data explosion of recent years, enterprises have massive stream data that must be processed in real time. Spark Streaming is an emerging system developed to perform real-time stream data analytics using a micro-batch approach. The unified programming model of Spark Streaming leads to some unique benefits over other traditional streaming systems, such as fast recovery from failures and better load balancing and resource usage. It treats the continuous stream as a series of micro-batches of data and continuously processes these micro-batch jobs. However, efficient scheduling of micro-batch jobs to achieve high throughput and low latency is very challenging due to the complex data dependency and dynamism inherent in streaming workloads. In this paper, we propose A-scheduler, an adaptive scheduling approach that dynamically schedules parallel micro-batch jobs in Spark Streaming and automatically adjusts scheduling parameters to improve performance and resource efficiency. Specifically, A-scheduler dynamically schedules multiple jobs concurrently using different policies based on their data dependencies, and automatically adjusts the level of job parallelism and the resource shares among jobs based on workload properties. Furthermore, we integrate a dynamic batching technique with A-scheduler to further improve the overall performance of the customized Spark Streaming system. It relies on an expert fuzzy control mechanism to dynamically adjust the length of each batch interval in response to the time-varying streaming workload and system processing rate. We implemented A-scheduler and evaluated it with a real-time security event processing workload. Our experimental results show that A-scheduler with dynamic batching can reduce end-to-end latency by 38 percent while improving workload throughput and energy efficiency by 23 and 15 percent, respectively, compared to the default Spark Streaming scheduler.
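A-scheduler's dynamic batching relies on an expert fuzzy controller; as a rough intuition only, the sketch below shows a much simpler proportional rule that adjusts the batch interval from the fraction of the previous interval spent processing. The thresholds and scaling factors are illustrative assumptions, not the paper's controller.

def next_batch_interval(current_interval, processing_time,
                        low=0.7, high=0.95,
                        min_interval=0.5, max_interval=10.0):
    """Shrink the micro-batch interval when the system is under-loaded,
    grow it when batches take too long to process (simplified rule)."""
    load = processing_time / current_interval      # fraction of the interval spent processing
    if load > high:                                # falling behind: enlarge batches
        new_interval = current_interval * 1.25
    elif load < low:                               # plenty of headroom: tighten latency
        new_interval = current_interval * 0.85
    else:
        new_interval = current_interval
    return min(max(new_interval, min_interval), max_interval)

interval = 2.0
for measured in (1.9, 2.1, 1.0, 0.9):              # seconds spent processing each batch
    interval = next_batch_interval(interval, measured)
    print(f"processing={measured:>4}s -> next interval={interval:.2f}s")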
A typical shuffle operation randomly partitions data across many computers, possibly generating a significant amount of network traffic which often dominates a job's completion time. This traffic is particularly pronounced in iterative distributed operations, where each iteration invokes a shuffle operation. We observe that data of different iterations are related according to the transformation logic of the distributed operations. If the data generated by the current iteration are partitioned to the computers where they will be processed in the next iteration, unnecessary shuffle network traffic between the two iterations can be prevented. We model general iterative distributed operations as the transform-and-shuffle primitive and define a powerful notion named the Confluence key dependency to precisely capture the data relations in the primitive. We further find that by binding key partitions between different iterations based on the Confluence key dependency, the shuffle network traffic can always be reduced by a predictable percentage. We implemented the Confluence system. Confluence provides a simple interface for programmers to express the Confluence key dependency, based on which Confluence automatically generates efficient key partitioning schemes. Evaluation results on diverse real-life applications show that Confluence greatly reduces the shuffle network traffic, resulting in as much as a 23 percent reduction in job completion time.
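The key-binding idea can be sketched as follows: place every output key of iteration i on the partition that already owns the key it will be consumed under in iteration i+1, so the next iteration reads locally instead of shuffling. The key_dependency callback below is a hypothetical stand-in for Confluence's key-dependency interface, not its actual API.

def partition_of(key, num_partitions):
    # The same deterministic key -> partition function must be used by every iteration.
    return hash(key) % num_partitions

def bound_partition(output_key, key_dependency, num_partitions):
    """Place an output key on the partition of the key it will be consumed
    under in the next iteration, avoiding a shuffle between iterations."""
    return partition_of(key_dependency(output_key), num_partitions)

# Example: iteration i emits per-edge messages keyed by (src, dst); iteration
# i+1 consumes them keyed by dst, so binding on dst keeps producer and consumer
# data co-located on the same partition.
NUM_PARTITIONS = 8
edges = [("a", "b"), ("a", "c"), ("d", "b")]
for src, dst in edges:
    p = bound_partition((src, dst),
                        key_dependency=lambda k: k[1],   # message for (src, dst) is consumed under dst
                        num_partitions=NUM_PARTITIONS)
    assert p == partition_of(dst, NUM_PARTITIONS)        # lands on the consumer's partition
print("all edge messages land on their consumers' partitions")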
High utility pattern mining (HUPM) has become a key issue in knowledge discovery since it provides retailers and managers with useful information for making decisions efficiently. However, most previous studies focused on mining high-utility patterns (HUPs) from a single database. In this paper, we present a framework that incorporates a weighted model for the parallel synthesis of HUPs discovered from various databases. The pre-large concept is also used as a buffer in order to retain more prospective HUPs, thus yielding higher accuracy of the synthesized patterns. Our experiments show that the developed model outperforms existing works; in particular, it achieves higher precision and recall in knowledge synthesis than previous approaches.
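A minimal sketch of weighted synthesis with a pre-large buffer, assuming per-database pattern utilities, per-database weights, and two utility thresholds: an upper one for high-utility patterns and a lower, pre-large one. The threshold handling and weighting are an illustrative reading of the weighted/pre-large model, not its exact definition.

def synthesize_hups(per_db_utilities, db_weights, upper_ratio, lower_ratio, total_utility):
    """Weighted synthesis of pattern utilities mined from several databases.
    Patterns above the upper threshold are high-utility; those between the
    lower and upper thresholds are buffered as 'pre-large' candidates."""
    upper = upper_ratio * total_utility
    lower = lower_ratio * total_utility
    high, pre_large = {}, {}
    patterns = set().union(*per_db_utilities.values())
    for pattern in patterns:
        synthesized = sum(db_weights[db] * utilities.get(pattern, 0)
                          for db, utilities in per_db_utilities.items())
        if synthesized >= upper:
            high[pattern] = synthesized
        elif synthesized >= lower:
            pre_large[pattern] = synthesized
    return high, pre_large

per_db = {"store_A": {("bread", "milk"): 120, ("beer",): 300},
          "store_B": {("bread", "milk"): 80}}
weights = {"store_A": 0.6, "store_B": 0.4}
print(synthesize_hups(per_db, weights, upper_ratio=0.10, lower_ratio=0.05, total_utility=1500))
# -> ({('beer',): 180.0}, {('bread', 'milk'): 104.0})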
ISBN (print): 9781450364447
The increasing pervasiveness of mobile devices, combined with their replacement rate, leaves us dealing with the disposal of an increasing number of still-working electronic devices. This work proposes an approach to mitigate this problem by extending the lifetime of mobile devices, integrating them as part of a distributed mobile computing system. Thanks also to the growing computational power of such devices, this paradigm opens up the opportunity to deploy mobile applications in a distributed manner, without neglecting energy budget management as a paramount objective. In this work, we built a proof of concept based on the extension of a run-time resource manager to support Android applications. We introduced an energy-aware device selection policy to dispatch the application workload according to both device capabilities and run-time status. Experimental results show that, in addition to increasing the utilization of the multiple mobile devices available to a single user, an energy-efficient and distributed approach can increase battery duration by between 12% and 36%.
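An energy-aware device selection policy of this kind can be sketched as scoring candidate devices by spare capacity weighted by remaining battery and dispatching the workload to the best scorer. The fields and the scoring formula below are illustrative assumptions, not the policy implemented in the extended run-time resource manager.

def pick_device(devices, task_load):
    """Dispatch a task to the eligible device with the best combination of
    spare CPU capacity and remaining battery (illustrative policy)."""
    def score(d):
        spare_cpu = max(d["cpu_capacity"] - d["cpu_load"], 0)
        return spare_cpu * (d["battery_pct"] / 100.0)
    eligible = [d for d in devices
                if d["cpu_capacity"] - d["cpu_load"] >= task_load and d["battery_pct"] > 15]
    return max(eligible, key=score) if eligible else None

fleet = [
    {"name": "old-phone-1", "cpu_capacity": 4.0, "cpu_load": 1.0, "battery_pct": 80},
    {"name": "old-phone-2", "cpu_capacity": 8.0, "cpu_load": 6.5, "battery_pct": 95},
    {"name": "tablet-3",    "cpu_capacity": 6.0, "cpu_load": 2.0, "battery_pct": 30},
]
chosen = pick_device(fleet, task_load=1.5)
print(chosen["name"] if chosen else "no eligible device")   # -> old-phone-1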