Data replication techniques have been extensively used in distributed systems to achieve, among other goals and in the presence of node failures: (a) high data availability, (b) system reliability and (c) scalability. Due t...
ISBN: 9783030602390; 9783030602383 (print)
Automatic feature engineering aims to construct informative features automatically and reduce manual labor in machine learning applications. Most existing approaches are designed for tasks with a single data source, which makes them less applicable to real scenarios. In this paper, we present a distributed automatic feature engineering algorithm, DAFEE, that generates features across multiple large-scale relational datasets. Starting from the target table, the algorithm uses a breadth-first-search style traversal to find related tables and constructs advanced high-order features that are remarkably effective in practical applications. Moreover, DAFEE implements a feature selection method to reduce the computational cost and improve predictive performance. Furthermore, it is highly optimized to process massive volumes of data. Experimental results demonstrate that it can significantly improve predictive performance, by 7% compared to state-of-the-art algorithms.
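The breadth-first discovery of related tables described in this abstract can be illustrated with a minimal sketch. This is not DAFEE's implementation: the function name `related_tables_bfs`, the `links` adjacency map, the `max_depth` cutoff, and the example schema are all hypothetical and serve only to show how tables reachable from a target table might be enumerated level by level, together with the join path that leads to each one.

```python
from collections import deque


def related_tables_bfs(target, links, max_depth=2):
    """Return (table, depth, join_path) tuples reachable from `target`.

    `links` maps a table name to the tables it can be joined with.
    Tables are reported in breadth-first order, so closer tables come first.
    """
    visited = {target}
    order = []
    queue = deque([(target, 0, [target])])
    while queue:
        table, depth, path = queue.popleft()
        if depth > 0:
            order.append((table, depth, path))
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in links.get(table, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1, path + [neighbor]))
    return order


# Hypothetical relational schema, for illustration only.
links = {
    "loans": ["customers", "payments"],
    "customers": ["accounts"],
    "payments": [],
    "accounts": ["transactions"],
}

result = related_tables_bfs("loans", links, max_depth=2)
for table, depth, path in result:
    print(depth, table, " -> ".join(path))
```

Each join path of length two or more is a natural candidate for a higher-order feature, for example an aggregate over `accounts` carried back to `loans` through `customers`; how such features are actually built and selected is specific to the paper and is not reproduced here.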
With the help of recent developments in semiconductor design and process technologies, modern processors provide a great opportunity to increase the performance of multimedia data processing by exploiting task- and ...
Efficient mapping of logical processes to physical processes is one of the key technologies for accelerating parallel performance simulation. Aiming at minimizing the communications between SMP nodes and between host physica...
Following the current trend in IC design technology, modern GPUs integrate more and more processing cores, and the speed gap between the processor and the memory system grows even larger. As the number of cores continually increas...
ISBN: 9783642311277; 9783642311284 (print)
This paper presents a new technique for pointer analysis of distributed programs executed on parallel machines with hierarchical memories. One motivation for this research is the class of languages with a partitioned global address space. Examples of these languages are Fortress, X10, Titanium, Co-Array Fortran, UPC, and Chapel. These languages allow programmers to adjust thread and data layout and to write to and read from the memories of other threads. The techniques presented in this paper take the form of simply structured type systems. The proposed technique is demonstrated on a while language enriched with basic commands for pointer manipulation and with commands vital for the distributed execution of programs. The paper also introduces an abstraction analysis that, for a given statement, calculates the set of function abstractions to which the statement may evaluate. This abstraction analysis is needed by the proposed pointer analysis. The mathematical soundness of all techniques presented in this paper is discussed and is proved against a new operational semantics, also presented in this paper. Our work has two advantages over related work. First, each analysis result is associated with a correctness proof in the form of a type derivation. Second, the hierarchical memory model used in this paper is in line with the hierarchical character of concurrent parallel computers.
ISBN: 9781538635810 (print)
The explosive growth of data collection in business and science has created a need to analyze these data and mine the useful knowledge they contain. Data mining techniques appear indispensable for extracting useful and novel patterns and models from large datasets. In this context, frequent itemsets (patterns) play an essential role in many data mining tasks that try to find interesting patterns in datasets. However, conventional approaches to mining frequent itemsets face significant challenges in the Big Data era, when computing power and memory space are limited. This paper proposes an efficient distributed frequent itemset mining algorithm, called parallelCharMax, that is based on a powerful sequential algorithm, called Charm, and computes the maximal frequent itemsets, which are considered perfect summaries of the frequent ones. The proposed algorithm has been implemented using the MapReduce framework. Experiments show the efficiency and performance of the proposed algorithm compared with well-known algorithms such as MineWithRounds and HMBA.
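To make the notion of maximal frequent itemsets concrete, the following minimal sketch enumerates frequent itemsets by brute force and then keeps only those with no frequent proper superset. It is not Charm, not parallelCharMax, and does not use MapReduce; the function names, the toy transactions, and the absolute `min_support` threshold are assumptions chosen purely to illustrate the definition on a small input.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of itemsets contained in >= min_support transactions."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                frequent[cset] = support
                found = True
        if not found:
            # Support is anti-monotone: if no itemset of this size is
            # frequent, no larger itemset can be frequent either.
            break
    return frequent


def maximal_itemsets(frequent):
    """Keep the frequent itemsets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


# Toy transaction database, for illustration only.
transactions = [
    frozenset(t)
    for t in ({"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"})
]

frequent = frequent_itemsets(transactions, min_support=3)
maximal = maximal_itemsets(frequent)
print(sorted(sorted(s) for s in maximal))
```

In this toy database six itemsets are frequent but only three are maximal, and every frequent itemset is a subset of one of the maximal ones, which is the sense in which the maximal itemsets summarize the rest. The brute-force enumeration is exponential in the number of items and is suitable only for tiny examples.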
Several techniques for running Prolog programs on pipelined vector processors, such as the Hitachi S-820 or the Cray-2, are developed. This paper presents an automatic program transformation (vectorization) method of ...
ISBN: 3540606971 (print)
This paper presents the development of a real-time system for the recognition of textured objects. In contrast to current approaches, which mostly rely on specialized multiprocessor architectures for fast processing, we use a distributed network architecture to support parallelism and attain real-time performance. A new approach to image matching is proposed as the basis of object localization and positioning; it involves dynamic texture feature extraction and hierarchical image matching. A mask-based stochastic method is introduced to extract feature points for matching. Our experimental results demonstrate that the combination of texture feature extraction and interesting point detection provides a better solution to the search for the best match between two textured images. Furthermore, the algorithm is implemented on a low-cost heterogeneous PVM (Parallel Virtual Machine) network to speed up processing without specific hardware requirements.
ISBN: 9789898111135 (print)
This paper presents an H.264/AVC decoder implemented on a dual-core SoC (System-on-Chip) platform through carefully designed macroblock-level software partitioning. Furthermore, the procedures executed on each core and the data movement between the two cores are optimized using both software and hardware techniques. The evaluation results show that the implementation can decode video at D1 resolution (720x480 pixels) in real time, which provides valuable experience for similar designs.