ISBN:
(digital) 9781728126166
ISBN:
(print) 9781728126173
Frequent Itemset Mining (FIM) from large-scale databases has emerged as an important problem in the data mining and knowledge discovery research community. However, FIM suffers from three important limitations given the rapid expansion of big data in all domains. First, it assumes that all items have the same importance. Second, it ignores the fact that data collected in a real-life environment is often inaccurate. Third, it is a data-intensive and computation-intensive process, which makes FIM algorithms very time-consuming on large datasets. To address these issues, we propose a parallel uncertain frequent itemset mining algorithm with Spark (Pufim). Pufim first expresses item uncertainty by considering both probability and weight, and calculates the maximum probability weight value of 1-items. Next, a distributed Pufim-tree structure, inspired by the FP-Tree, is designed to reduce the number of database scans. Each node of the Pufim-tree stores an item and its maximum probability weight value. Finally, experiments on publicly available UCI datasets demonstrate that Pufim achieves better results than other related approaches across various metrics. In addition, the empirical study shows that Pufim scales well.
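The abstract does not define the "maximum probability weight value" precisely. As a rough illustration only, the sketch below assumes it is the largest product of an item's existence probability and its weight over all transactions. The data layout (`transactions` as dicts of item to probability, `weights` as a dict of item to weight) and the function name are hypothetical, not taken from the paper, and the sketch runs on one machine rather than on Spark.

```python
from collections import defaultdict


def max_probability_weight(transactions, weights):
    """Return, per item, the largest probability * weight seen in any transaction.

    ASSUMPTION: this product is only one plausible reading of the abstract's
    "maximum probability weight value"; the paper may define it differently.
    transactions: list of dicts mapping item -> existence probability.
    weights: dict mapping item -> importance weight.
    """
    best = defaultdict(float)
    for transaction in transactions:
        for item, probability in transaction.items():
            score = probability * weights[item]
            if score > best[item]:
                best[item] = score
    return dict(best)


# Hypothetical uncertain database with three transactions.
transactions = [
    {"a": 0.9, "b": 0.4},
    {"a": 0.5, "c": 0.8},
    {"b": 0.7, "c": 0.2},
]
weights = {"a": 0.5, "b": 1.0, "c": 0.25}
result = max_probability_weight(transactions, weights)
```

Because each transaction is scanned independently and the per-item maximum is associative, the same computation would map naturally onto a distributed map step followed by a reduce-by-key step.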
ISBN:
(print) 9781728125312
Modern FPGAs (Field Programmable Gate Arrays) are becoming increasingly important in embedded system development. Within these FPGAs, soft-core processors are often used to solve a wide range of different tasks. Soft-core processors are a cost-effective and time-efficient way to realize embedded systems. The trend for soft-core processors, as for mainstream CPUs (central processing units), is toward multi-core architectures. Both the necessary memory architectures and the compilers play an important role in this process. This paper describes a novel method that aims to minimize the memory resources required on the FPGA while maximizing the processing speed of any given algorithm. In the first step, an application-specializable multi-soft-core processor architecture is presented that is capable of solving problems while adhering to hard real-time deadlines. Its special architecture and other necessary features are discussed. Furthermore, a method for generating optimized machine code for each processor core, as well as hard real-time compatible deadlock handling mechanisms, is presented. Selected algorithms are implemented to demonstrate the functionality and efficiency of the realized approach for different configurations of the multi-soft-core processor architecture.
Communication overheads in distributed systems constitute a large fraction of the total execution time and limit the scalability of applications running on these systems. We propose a DCT-based approximate communication scheme that takes advantage of the error resiliency of several widely used applications and improves communication efficiency by substantially reducing message lengths. Our scheme is implemented in the Message Passing Interface (MPI) library. When evaluated on several representative MPI applications on a real cluster system, the fraction of total execution time devoted to communication falls from 59% to 23%, even accounting for the computational overhead required for DCT encoding. For many communication-intensive applications, our approximate communication scheme reduces the total execution time without much loss in the quality of the result.
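The general idea of shortening a message by sending only low-frequency DCT coefficients can be sketched as follows. This is a minimal standard-library illustration, not the authors' MPI implementation: the `encode` and `decode` helpers, the choice of keeping the first `keep` coefficients, and the test signal are all assumptions made for the example.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a list of floats (direct O(n^2) evaluation)."""
    n = len(signal)
    out = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def idct(coeffs, n):
    """Inverse of dct(); coefficients beyond len(coeffs) are treated as zero."""
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def encode(signal, keep):
    """Hypothetical sender side: keep only the first `keep` coefficients."""
    return dct(signal)[:keep]


def decode(payload, n):
    """Hypothetical receiver side: rebuild an approximation of length n."""
    return idct(payload, n)


# A smooth 64-sample signal is sent as 16 coefficients (a 4x shorter message).
signal = [math.sin(2 * math.pi * i / 64) for i in range(64)]
payload = encode(signal, 16)
restored = decode(payload, len(signal))
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

The trade-off is the one the abstract describes: the payload shrinks, the endpoints pay extra computation for the transform, and the receiver gets an approximation whose error depends on how smooth the data is.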
Heterogeneous parallel architectures present many challenges to application developers. One of the most important ones is the decision where to execute a specific task. As today's systems are often dynamic in natu...
Computing all-pairs shortest paths in a graph is widely used but time-consuming. Commonly used conventional algorithms rely solely on the computing capability of the CPU, fail to meet the demands of real-time processing, and mostly do not scale well to larger data. In this paper, we propose the ex-FTCD (extending Full Transitive Closure with Dijkstra's) algorithm for finding all-pairs shortest paths by merging the greedy technique of Dijkstra's single-source shortest path method with the transitive closure property. Experiments show that the process improves computing speed and is more scalable. We redesigned the algorithm for parallel execution and implemented it in MapReduce on Hadoop, which supports conventional map/reduce jobs. This work also includes an implementation on Spark, which supports in-memory computation using Random Access Memory. The experiments show that the number of iterations is relatively small even for large networks.
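For orientation, the conventional CPU-only baseline that such work builds on can be sketched as one run of Dijkstra's algorithm per source vertex. This is not ex-FTCD itself, whose details the abstract does not give; the adjacency-list layout and the example graph are assumptions made for the illustration.

```python
import heapq


def dijkstra(graph, source):
    """Single-source shortest paths for non-negative edge weights.

    graph: dict mapping vertex -> list of (neighbor, weight) pairs.
    Returns a dict of reachable vertex -> distance from source.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def all_pairs_shortest_paths(graph):
    """Baseline all-pairs computation: one independent Dijkstra run per source."""
    return {source: dijkstra(graph, source) for source in graph}


# Hypothetical directed graph.
graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
apsp = all_pairs_shortest_paths(graph)
```

Each source is processed independently, which is what makes the problem a natural fit for map/reduce-style parallel execution.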
Identifying individuals at massive scale is a challenging problem in modern society. Finger-vein recognition in particular is an emerging biometric technique with several advantages, especially in terms of security against forgery. In this paper, we propose a hybrid parallel matching process for finger-vein recognition on two different parallel platforms, using the Local Line Binary Pattern descriptor and Hamming distance. Our proposal aims to reduce the computation time of the matching process for large-scale identification of individuals from finger-vein patterns. Extensive evaluation shows that our approach obtains a high speed-up on a multi-core platform and close to linear behavior on a multi-node platform. The results with our hybrid parallel system show that it is suitable for real-time identification of individuals, achieving a speed-up of up to 129.98x. To the best of our knowledge, our work is the first implementation of finger-vein recognition on a parallel platform, which is the main contribution of this paper.
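The matching step can be illustrated with a small sequential sketch: binary feature codes are compared by Hamming distance and the closest enrolled template wins. The 8-bit codes, the `gallery` dictionary, and the names in it are hypothetical stand-ins; extracting real Local Line Binary Pattern descriptors from vein images is outside the scope of this sketch.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two integer-encoded binary codes differ."""
    return bin(a ^ b).count("1")


def best_match(probe, gallery):
    """Return (identity, distance) of the enrolled code closest to the probe.

    gallery: dict mapping identity -> integer-encoded binary code (hypothetical).
    """
    identity = min(gallery, key=lambda name: hamming_distance(probe, gallery[name]))
    return identity, hamming_distance(probe, gallery[identity])


# Hypothetical 8-bit codes; real descriptors would be far longer.
gallery = {
    "alice": 0b10110010,
    "bob": 0b01001101,
    "carol": 0b11100001,
}
probe = 0b10110011
match = best_match(probe, gallery)
```

Since every probe-to-template comparison is independent, the gallery can be split across cores or nodes and the per-partition minima merged at the end, which is the kind of structure a hybrid parallel matcher can exploit.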
ISBN:
(print) 9781450358378
The proceedings contain 9 papers. The topics discussed include: EdgeEye - an edge service framework for real-time intelligent video analytics; the web as a distributed computing platform; enabling GPU-assisted antivirus protection on Android devices through edge offloading; a multi-cloudlet infrastructure for future smart cities: an empirical case study; profit-aware resource management for edge computing systems; Duvel: enabling context-driven, multi-profile apps on Android through storage sand-boxing; enabling edge devices that learn from each other: cross modal training for activity recognition; semi-edge: from edge caching to hierarchical caching in network fog; and voice enabling mobile applications with UIVoice.
ISBN:
(print) 9781450357760
The proceedings contain 7 papers. The topics discussed include: Relays: towards a link layer for robust and secure fog computing; distributing computations in fog architectures; scheduling at the edge for assisting cloud real-time systems; enabling exclusive shared access to cloud of things resources; digital epidemiology and beyond; a novel NFV schedule optimization approach with sensitivity to packets dropping positions; and GoEdge: a scalable and stateless local breakout method.
We present a parallel parameter optimization algorithm for reproducing future projections of certain model outputs in dynamic general equilibrium models. The optimization problem is reduced to a nonlinear system of equations. The Jacobian matrix for a Newton-type solver is generated in parallel. The parameter optimization algorithm is implemented for parallel systems with distributed memory by using MPI. To achieve better performance of the parallel algorithm, we use the parallel Fair-Taylor algorithm for computing an equilibrium path. Calculation of prices, input-output ratios and international trade for different time steps is carried out in parallel at each iteration of the method. The solution method is implemented for parallel systems with shared memory by using OpenMP. The effectiveness of the hybrid MPI+OpenMP parallel code for parameter optimization is demonstrated on the example of a global multi-sector energy economics model with scenarios that are used for studying climate change impacts on land use. (C) 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
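The reason a Jacobian can be generated in parallel is that each finite-difference column depends only on one perturbed input. The sketch below shows that structure on a toy problem. It is an assumption-laden illustration, not the paper's code: the two-equation `residual` is a hypothetical stand-in for the model equations, forward differences are assumed, and a Python thread pool merely stands in for the MPI and OpenMP workers (it shows the independence of the columns, not a real speed-up).

```python
from concurrent.futures import ThreadPoolExecutor


def residual(x):
    """Hypothetical 2-equation system standing in for the model equations."""
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]


def jacobian_column(f, x, j, h=1e-6):
    """Forward-difference approximation of column j of the Jacobian."""
    shifted = list(x)
    shifted[j] += h
    f0, f1 = f(x), f(shifted)
    return [(b - a) / h for a, b in zip(f0, f1)]


def jacobian(f, x):
    """Compute all columns concurrently; each column is independent of the others."""
    with ThreadPoolExecutor() as pool:
        columns = list(pool.map(lambda j: jacobian_column(f, x, j), range(len(x))))
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def newton(f, x0, tol=1e-10, max_iter=50):
    """Newton-type iteration: solve J(x) * step = -f(x), then update x."""
    x = list(x0)
    for _ in range(max_iter):
        fx = f(x)
        if max(abs(v) for v in fx) < tol:
            break
        step = solve_linear(jacobian(f, x), [-v for v in fx])
        x = [xi + si for xi, si in zip(x, step)]
    return x


root = newton(residual, [1.0, 2.0])
```

In a distributed-memory setting, each worker would evaluate the residual for its own subset of perturbed parameters and the columns would be gathered before the linear solve.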