This paper explores the problem of boundary data classification ambiguity that arises when machine learning techniques are applied in the field of intrusion detection. The features and attributes of the boundary data ...
ISBN: (print) 9783030389918; 9783030389901
Skeletal parallelism is a model of parallelism in which parallel constructs are provided to the programmer as common patterns of parallel algorithms. High-level skeleton libraries often offer a global view of programs instead of the common Single Program Multiple Data view in parallel programming. A program is written as a sequential program but operates on parallel data structures. Most of the time, skeletons on a parallel data structure have counterparts on a sequential data structure. For example, the map function that applies a given function to all the elements of a sequential collection (e.g., a list) has a map skeleton counterpart that applies a sequential function to all the elements of a distributed collection. Two of the challenges a programmer faces when using a skeleton library that provides a wide variety of skeletons are which skeletons to use and how to compose them. These design decisions may have a large impact on the performance of the parallel programs. However, skeletons, especially when they do not mutate the data structure they operate on but are instead implemented as pure functions, possess algebraic properties that allow compositions of skeletons to be transformed into more efficient compositions. In this paper, we present such an automatic transformation framework for the Python skeleton library PySke and evaluate it on several example applications.
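To illustrate the kind of algebraic property the abstract refers to, here is a minimal sketch in plain Python of the classic map-fusion law, map f (map g xs) = map (f . g) xs. It does not use PySke's API and is not the paper's framework: the `Map` class and the `fuse_maps` and `run` functions are hypothetical names introduced only for this example, and an ordinary list stands in for a distributed collection.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Map:
    """A symbolic 'map' skeleton call: apply `fn` to every element."""
    fn: Callable[[Any], Any]


def fuse_maps(pipeline: List[Map]) -> List[Map]:
    """Rewrite a chain of maps into one map, using map f . map g = map (f . g).

    The rewrite is only valid when the mapped functions are pure.
    """
    if not pipeline:
        return []
    fused = pipeline[0].fn
    for stage in pipeline[1:]:
        # Default arguments capture the current functions (no late binding).
        fused = (lambda x, f=stage.fn, g=fused: f(g(x)))
    return [Map(fused)]


def run(pipeline: List[Map], data: List[Any]) -> List[Any]:
    """Execute each stage in order; every stage traverses the whole collection."""
    for stage in pipeline:
        data = [stage.fn(x) for x in data]
    return data


def add_one(x: int) -> int:
    return x + 1


def double(x: int) -> int:
    return x * 2


original = [Map(add_one), Map(double)]  # two traversals of the data
optimized = fuse_maps(original)         # one traversal, same result
```

Both pipelines compute the same result, but the fused one traverses the collection once and builds no intermediate collection. On a distributed structure this is the sort of saving such a transformation aims for.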
ISBN: (print) 9783030602451; 9783030602444
The recent prevalence of positioning sensors and mobile devices generates a massive amount of spatial-temporal data from moving objects in real time. As one of the fundamental processes in data analysis, clustering of spatial-temporal data enables various applications, such as event detection and travel pattern extraction. However, most existing works focus only on the offline scenario, which is not applicable to online time-sensitive applications because of their low efficiency and their neglect of temporal features. In this paper, we propose a distributed streaming framework for spatial-temporal data clustering, which accepts various clustering algorithms while ensuring low resource consumption and result correctness. The framework includes a dynamic partitioning strategy for continuous load balancing and a cluster-merging algorithm based on convex hulls [10], which guarantees result correctness. Extensive experiments on a real dataset demonstrate the effectiveness of our proposed framework and its advantage over existing solutions.
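The abstract does not spell out the merging rule, so the following is only a sketch of one plausible reading: two clusters found in neighbouring partitions are treated as one if their convex hulls overlap. The overlap test is deliberately simplified (it checks whether any hull vertex of one cluster lies inside the other hull, so it can miss hulls that cross without containing a vertex). All function names are hypothetical, and this is not the algorithm of reference [10].

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside(hull: List[Point], p: Point) -> bool:
    """True if p lies inside or on a counter-clockwise convex polygon."""
    if len(hull) < 3:
        return p in hull  # degenerate hull: only exact vertex matches count
    return all(cross(hull[i], hull[(i + 1) % len(hull)], p) >= 0
               for i in range(len(hull)))


def should_merge(cluster_a: List[Point], cluster_b: List[Point]) -> bool:
    """Simplified overlap test: some hull vertex of one lies in the other hull."""
    hull_a, hull_b = convex_hull(cluster_a), convex_hull(cluster_b)
    return (any(inside(hull_a, p) for p in hull_b)
            or any(inside(hull_b, p) for p in hull_a))
```

Exchanging only hull vertices between partitions, rather than every point of a cluster, is what keeps a check like this cheap in a distributed setting.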
ISBN: (print) 9783030050511; 9783030050504
The issue of finding skyline tuples over multiple relations, more commonly known as the skyline join problem, has been well studied in scenarios in which the data is static. More recently, performing skyline queries on data streams, where tuples arrive and expire continuously, has become a trend. A few algorithms have been proposed for computing skylines on two data streams. However, those works either ignore the inherent parallelism of the problem or employ serial algorithms, which cannot leverage multi-core processors. Motivated by this, we address the problem of parallel computing for skyline join over multiple data streams. We develop a Novel Iterative framework based on existing work and study its inherent parallelism. Then we propose two parallel skyline join algorithms over sliding windows, NP-SWJ and IP-SWJ. To the best of our knowledge, this is the first paper that addresses parallel computing of skyline join over multiple data streams. Extensive experimental evaluations on real and synthetic data sets show that the proposed algorithms provide large gains over the state-of-the-art serial algorithm for skyline join over data streams.
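For readers unfamiliar with the problem, the sketch below shows a naive, serial skyline join over two count-based sliding windows. It is a baseline illustration only, not NP-SWJ or IP-SWJ. The tuple layout (a join key plus numeric attributes where smaller is better), the window size and all function names are assumptions made for this example.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple

Row = Tuple[str, Tuple[float, ...]]  # (join key, attributes; smaller is better)


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """a dominates b if it is no worse everywhere and strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(rows: List[Row]) -> List[Row]:
    """Quadratic skyline: keep the rows that no other row dominates."""
    return [r for r in rows if not any(dominates(o[1], r[1]) for o in rows)]


def skyline_join(window_r: Iterable[Row], window_s: Iterable[Row]) -> List[Row]:
    """Equi-join the two windows on the key, then take the skyline of the result."""
    s_rows = list(window_s)
    joined = [(kr, ar + a_s)
              for kr, ar in window_r
              for ks, a_s in s_rows
              if kr == ks]
    return skyline(joined)


# Count-based sliding windows: appending to a full deque expires the oldest tuple.
window_r: Deque[Row] = deque(maxlen=3)
window_s: Deque[Row] = deque(maxlen=3)
window_r.extend([("a", (1.0, 5.0)), ("b", (2.0, 2.0)), ("a", (3.0, 3.0))])
window_s.extend([("a", (4.0,)), ("b", (1.0,))])
before = skyline_join(window_r, window_s)

window_r.append(("b", (0.5, 0.5)))  # evicts ("a", (1.0, 5.0))
after = skyline_join(window_r, window_s)
```

Recomputing the whole join and skyline on every arrival is the cost that incremental and parallel algorithms are designed to avoid.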
ISBN: (print) 9783030050573; 9783030050566
This paper presents new multi-objective scheduling strategies implemented in Docker SwarmKit. Docker SwarmKit is a container toolkit for orchestrating distributed systems at any scale. Currently, Docker SwarmKit has one scheduling strategy, called Spread. Spread uses a single objective to select, from a set of cloud nodes, one node on which to execute a container. However, the containers submitted by users to be scheduled in Docker SwarmKit are configured according to multi-objective criteria, such as the number of CPUs and the memory size. To better address the multi-objective configuration problem of containers, we introduce the concept and the implementation of new multi-objective scheduling strategies adapted for Cloud Computing environments and implemented in Docker SwarmKit. The principle of our multi-objective strategies is to select, for executing a container, a node that offers a good compromise between the multi-objective criteria. The proposed scheduling strategies are based on a combination of the PROMETHEE and Kung multi-objective decision algorithms in order to place containers. The implementation in Docker SwarmKit and experiments with our new strategies demonstrate the potential of our approach under different scenarios.
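The idea of choosing a node that is a good compromise across several criteria can be sketched as follows. This is not SwarmKit code and does not reproduce the paper's strategies: a simple quadratic Pareto filter stands in for Kung's algorithm, and a normalized weighted sum stands in for PROMETHEE ranking. The `Node` fields, the weights and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    free_cpus: float
    free_mem_gb: float


def dominates(a: Node, b: Node) -> bool:
    """a dominates b if it has at least as much of everything and more of something."""
    return (a.free_cpus >= b.free_cpus and a.free_mem_gb >= b.free_mem_gb
            and (a.free_cpus > b.free_cpus or a.free_mem_gb > b.free_mem_gb))


def pareto_front(nodes: List[Node]) -> List[Node]:
    """Keep the nodes that no other node dominates (quadratic stand-in for Kung)."""
    return [n for n in nodes if not any(dominates(o, n) for o in nodes)]


def select_node(nodes: List[Node], cpus: float, mem_gb: float,
                w_cpu: float = 0.5, w_mem: float = 0.5) -> Optional[Node]:
    """Pick a compromise node for a container requesting `cpus` and `mem_gb`."""
    feasible = [n for n in nodes
                if n.free_cpus >= cpus and n.free_mem_gb >= mem_gb]
    front = pareto_front(feasible)
    if not front:
        return None
    # Normalize each criterion by its best value on the front (guard against 0).
    max_cpu = max(n.free_cpus for n in front) or 1.0
    max_mem = max(n.free_mem_gb for n in front) or 1.0
    return max(front, key=lambda n: (w_cpu * n.free_cpus / max_cpu
                                     + w_mem * n.free_mem_gb / max_mem))


nodes = [Node("n1", 8, 4), Node("n2", 4, 16), Node("n3", 6, 13),
         Node("n4", 2, 2), Node("n5", 3, 3)]
chosen = select_node(nodes, cpus=2, mem_gb=2)
```

A single-objective rule that looks only at free CPUs would pick `n1` here, whereas the compromise rule prefers `n3`, which is strong on both criteria.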
ISBN: (print) 9783319038582
The proceedings contain 76 papers. The topics discussed include: clustering and change detection in multiple streaming time series; lightweight identification of captured memory for software transactional memory; layer-based scheduling of parallel tasks for heterogeneous cluster platforms; optimistic concurrency control for energy efficiency in the wireless environment; synchronization-reducing variants of the biconjugate gradient and the quasi-minimal residual methods; exploring irregular reduction support in transactional memory; coordinate task and memory management for improving power efficiency; hardware-assisted intrusion detection by preserving reference information integrity; towards automatic generation of hardware classifiers; a practical approach for finding small independent, distance dominating sets in large-scale graphs; and heterogeneous computing vs. big data: the case of cryptanalytical applications.
ISBN: (print) 9783319038889
The proceedings contain 76 papers. The topics discussed include: clustering and change detection in multiple streaming time series; lightweight identification of captured memory for software transactional memory; layer-based scheduling of parallel tasks for heterogeneous cluster platforms; optimistic concurrency control for energy efficiency in the wireless environment; synchronization-reducing variants of the biconjugate gradient and the quasi-minimal residual methods; exploring irregular reduction support in transactional memory; coordinate task and memory management for improving power efficiency; hardware-assisted intrusion detection by preserving reference information integrity; towards automatic generation of hardware classifiers; a practical approach for finding small independent, distance dominating sets in large-scale graphs; and heterogeneous computing vs. big data: the case of cryptanalytical applications.
ISBN: (print) 9783030050542; 9783030050535
Traditional Cloud computing has emerged as a new paradigm for providing computing resources on demand and outsourcing software and hardware infrastructures. Cloud computing is rapidly changing the way IT services are made available and managed. These services can be requested by several Cloud providers, hence the need for networking between IT service components distributed across geographically diverse locations. Like traditional Cloud computing, the volunteer computing paradigm has become increasingly important. In this paradigm, the resources of each personal machine are shared voluntarily by their owners. Cloud and volunteer paradigms have recently been seen as complementary technologies for better exploiting local resources. Besides execution time and cost, energy consumption is also becoming more important in Cloud computing environments. Thus, it has become a major concern for the widespread deployment of Cloud data centers. Among the methods that can overcome this problem, we are interested in planning services that improve the use of data center resources in a dynamic environment. In this context, we propose in this paper a heuristic that predicts the allocation of dynamic and independent services to reduce the total energy consumption. Our proposal respects various constraints: availability, machine capacity, and the number of application duplications. A series of experiments illustrates and validates the potential of our approach.