Purpose: This work can be used as a building block in other settings such as GPU, Map-Reduce, Spark or any other. Also, DDPML can be deployed on other distributed systems such as P2P networks, clusters, cloud computing or other technologies. Design/methodology/approach: In the age of Big Data, all companies want to benefit from large amounts of data. These data can help them understand their internal and external environment and anticipate associated phenomena, as the data turn into knowledge that can be used for prediction later. Thus, this knowledge becomes a great asset in companies' hands. This is precisely the objective of data mining. But with the production of a large amount of data and knowledge at a faster pace, the authors now speak of Big Data mining. For this reason, the authors' proposed works mainly aim at solving the problems of volume, veracity, validity and velocity when classifying Big Data using distributed and parallel processing techniques. The problem raised in this work is how to make machine learning algorithms work in a distributed and parallel way at the same time without losing the accuracy of classification results. To solve this problem, the authors propose a system called Dynamic Distributed and Parallel Machine Learning (DDPML) algorithms. To build it, the authors divided their work into two parts. In the first, the authors propose a distributed architecture that is controlled by a Map-Reduce algorithm, which in turn depends on a random sampling technique. The distributed architecture that the authors designed is specifically intended to handle big data processing and operates in a coherent and efficient manner with the sampling strategy proposed in this work. This architecture also helps the authors to verify the classification results obtained using the representative learning base (RLB). In the second part, the authors have extracted the representative learning base by sampling at
With the wide penetration of smart robots in multifarious fields, the simultaneous localization and mapping (SLAM) technique in robotics has attracted growing attention in the community. Yet collaborative SLAM over multiple robots remains challenging because of the mismatch between the intensive graphics computation of SLAM and the limited computing capability of robots. While traditional solutions resort to powerful cloud servers acting as an external computation provider, we show by real-world measurements that the significant communication overhead in data offloading makes this approach impractical for real deployment. To tackle these challenges, this article brings the emerging edge-computing paradigm into multirobot SLAM and proposes RecSLAM, a multirobot laser SLAM system that focuses on accelerating the map construction process under the robot-edge-cloud architecture. In contrast to conventional multirobot SLAM, which generates graphic maps on robots and completely merges them on the cloud, RecSLAM develops a hierarchical map fusion technique that directs robots' raw data to edge servers for real-time fusion and then sends the results to the cloud for global merging. To optimize the overall pipeline, an efficient multirobot SLAM collaborative processing framework is introduced to adaptively optimize robot-to-edge offloading tailored to heterogeneous edge resource conditions, while ensuring workload balancing among the edge servers. Extensive evaluations show RecSLAM can achieve up to 39.31% processing latency reduction over the state of the art. In addition, a proof-of-concept prototype is developed and deployed in real scenes to demonstrate its effectiveness.
This article presents contributions in the field of path planning for industrial robots with 6 degrees of freedom. This work presents the results of our research in the last 4 years at the Institute for Process Control and Robotics at the University of Karlsruhe. The path planning approach we present works in an implicit and discretized C-space. Collisions are detected in the Cartesian workspace by a hierarchical distance computation. The method is based on the A* search algorithm and needs no essential off-line computation. A new optimal discretization method leads to smaller search spaces, thus speeding up the planning. For further acceleration, the search was parallelized. With a static load distribution, good speedups can be achieved. By extending the algorithm to a bidirectional search, the planner is able to automatically select the easier search direction. The new dynamic switching of start and goal finally leads to multi-goal path planning, which is able to compute a collision-free path between a set of goal poses (e.g., spot welding points) while minimizing the total path length. (C) 2001 John Wiley & Sons, Inc.
With advances in remote-sensing technology, the large volumes of data cannot be analyzed efficiently and rapidly, especially with the arrival of high-resolution images. The development of image-processing technology is an urgent and complex problem for computer and geo-science experts. It involves not only knowledge of remote sensing, but also of computing and networking. Remotely sensed images need to be processed rapidly and effectively in a distributed and parallel processing environment. Grid computing is a new form of distributed computing, providing an advanced computing and sharing model to solve large and computationally intensive problems. According to the basic principle of grid computing, we construct a distributed processing system for processing remotely sensed images. This paper focuses on the implementation of such a distributed computing and processing model based on the theory of grid computing. Firstly, problems in the field of remotely sensed image processing are analyzed. Then, the distributed (and parallel) computing model design, based on grid computing, is applied. Finally, implementation methods with middleware technology are discussed in detail. From a test analysis of our system, ***, the whole image-processing system is evaluated, and the results show the feasibility of the model design and the efficiency of the remotely sensed image distributed and parallel processing system. (C) 2006 Elsevier Inc. All rights reserved.
We present the Feature Tracking Kit (FTK), a framework that simplifies, scales, and delivers various feature-tracking algorithms for scientific data. The key to FTK is our simplicial spacetime meshing scheme that generalizes both regular and unstructured spatial meshes to spacetime while tessellating spacetime mesh elements into simplices. The benefits of using simplicial spacetime meshes include (1) reducing ambiguity cases for feature extraction and tracking, (2) simplifying the handling of degeneracies using symbolic perturbations, and (3) enabling scalable and parallel processing. The use of simplicial spacetime meshing simplifies and improves the implementation of several feature-tracking algorithms for critical points, quantum vortices, and isosurfaces. As a software framework, FTK provides end users with VTK/ParaView filters, Python bindings, a command line interface, and programming interfaces for feature-tracking applications. We demonstrate use cases as well as scalability studies through both synthetic data and scientific applications including tokamak, fluid dynamics, and superconductivity simulations. We also conduct end-to-end performance studies on the Summit supercomputer. FTK is open sourced under the MIT license: https://***/hguo/ftk.
This paper focuses upon the development of three new electronic architectures of inference engines as a part of a hardware expert system applied to very high-speed fault detection in industrial processes. The architecture of this expert system consists of an inference engine (a dedicated processor that is necessary due to the high-speed requirements and the repetitiveness of the operation), which uses a pattern-directed inference system; a fact base, which stores the status of the signals at each moment; and a static knowledge base, which contains the inference rules compiled from expert knowledge. A circuit for analyzing time is also presented. This allows time to be taken as another variable of the process and carries out a redundancy analysis simultaneously with the fault detection module.
The proliferation of current and next-generation mobile and sensing devices has increased at an alarming rate. With these state-of-the-art devices, the global positioning system (GPS) has made remote sensing and location tracking more viable. One query that arises in this setting is the All Nearest Neighbor (ANN) query, which extracts and returns all data objects that are in close vicinity to all query objects. An ANN query is a combination of k-nearest neighbor (kNN) and join queries. Hence, ANN is useful for applications in different domains such as transportation optimization, locating safe zones, and ride-sharing. An example of its applications is, "find the nearest gas station for each car parking lot". Because these applications are responsible for generating a massive number of query requests, a large amount of computation is required to answer these query requests. As a single machine cannot meet this demand, in this study we propose a distributed query processing framework to process ANN queries using the Apache Spark framework. In an empirical study, our proposed framework achieved superior query efficiency and scalability compared to other methods and design alternatives.
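To make the ANN query in this abstract concrete, here is a minimal single-machine sketch of its semantics, using the abstract's gas-station example. It is a brute-force illustration only, not the distributed Spark framework the authors propose; the function name `all_nearest_neighbors` and the sample coordinates are hypothetical.

```python
from math import dist


def all_nearest_neighbors(queries, data):
    """For each query point, return the closest data point (brute force).

    Hypothetical helper: compares every query object against every data
    object, so the cost is O(len(queries) * len(data)).
    """
    if not data:
        raise ValueError("data must not be empty")
    return {q: min(data, key=lambda p: dist(q, p)) for q in queries}


# Illustrative coordinates, invented for this example.
parking_lots = [(0.0, 0.0), (5.0, 5.0)]
gas_stations = [(1.0, 0.0), (4.0, 6.0), (10.0, 10.0)]

nearest = all_nearest_neighbors(parking_lots, gas_stations)
for lot, station in nearest.items():
    print(f"parking lot {lot} -> nearest gas station {station}")
```

The quadratic cost of this naive join is the kind of workload that motivates spreading the computation over several machines.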
A data structure is used to store materialized generalized transitive closure such that the evaluation of generalized transitive closure queries, deletions, and insertions of tuples can be performed efficiently in centralized and parallel environments. For transitive closure of a binary relation, in a single processor environment, it takes on the average O(m") to retrieve the ancestors/descendants of a given node, where m" is the number of ancestors/descendants of the given node, and it takes on the average O(m x m') to perform an insertion or a deletion of a tuple (a, b), where m is the number of ancestors of a plus 1 and m' is the number of descendants of b plus 1. The number of directed paths between two nodes can be determined in O(1) time. In multiprocessor/distributed environments, the time for retrieving the ancestors/descendants of a given node is O(m") plus two rounds of communications between the query processor and the other processors, and the times for insertion and deletion are O(m x d) plus two rounds of communications, where d is the smaller of m' or n/s, where n is the number of distinct attribute values, and s is the number of processors. The technique is generalized to mixed left and right linear recursive queries. The same performance results in terms of retrieval, deletions, and insertions in both centralized and parallel environments are obtained.
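One way to obtain costs of this shape is to materialize, for every reachable pair of nodes, the number of directed paths between them. The sketch below assumes an acyclic binary relation and is only an illustration of that idea, not necessarily the data structure used in the paper; the class `PathCountClosure` and its method names are hypothetical.

```python
from collections import defaultdict


class PathCountClosure:
    """Materialized transitive closure of a DAG, stored as path counts."""

    def __init__(self):
        self.edges = set()
        self.desc = defaultdict(dict)  # desc[x][y] = number of paths x -> y
        self.anc = defaultdict(dict)   # anc[y][x] = same count, keyed by target

    def _update(self, a, b, sign):
        # Sources: a itself (one trivial path) plus every ancestor of a.
        sources = {a: 1, **self.anc[a]}
        # Targets: b itself plus every descendant of b.
        targets = {b: 1, **self.desc[b]}
        # Touches (ancestors of a + 1) * (descendants of b + 1) pairs.
        for x, to_a in sources.items():
            for y, from_b in targets.items():
                count = self.desc[x].get(y, 0) + sign * to_a * from_b
                if count:
                    self.desc[x][y] = count
                    self.anc[y][x] = count
                else:
                    self.desc[x].pop(y, None)
                    self.anc[y].pop(x, None)

    def insert(self, a, b):
        """Insert tuple (a, b); assumes the relation stays acyclic."""
        if (a, b) not in self.edges:
            self.edges.add((a, b))
            self._update(a, b, +1)

    def delete(self, a, b):
        """Delete tuple (a, b); raises KeyError if it is not present."""
        self.edges.remove((a, b))
        self._update(a, b, -1)

    def descendants(self, x):
        return set(self.desc.get(x, {}))

    def path_count(self, x, y):
        # Average-case constant-time dictionary lookup.
        return self.desc.get(x, {}).get(y, 0)


closure = PathCountClosure()
for edge in [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]:
    closure.insert(*edge)
print(closure.path_count("a", "d"))
closure.delete("b", "c")
print(closure.descendants("b"))
```

Storing counts rather than a plain reachability flag is what lets a deletion be handled by subtraction: a pair drops out of the closure only when its count reaches zero.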
Distributed data management is a key technology to enable efficient massive data processing and analysis in cluster-computing environments. Specifically, in environments where the data volumes are beyond the system capabilities, big data files are required to be summarized by representative samples with the same statistical properties as the whole dataset. This paper proposes a big data management system (BDMS) based on distributed random sample data blocks. It presents a high-level architecture design of the BDMS which extends the current distributed file systems. This system offers certain functionalities for block-level management such as statistically-aware data partitioning, data block organization, and data block selection. This paper also presents a round-random partitioning scheme to represent a big dataset as a set of non-overlapping data blocks; each block is a random sample of the whole dataset. Based on the presented scheme, two algorithms are introduced as an implementation strategy to convert the HDFS blocks of a big file into a set of random sample data blocks which is also stored in HDFS. The experimental results show that the execution time of the partitioning operation is acceptable in real applications because this operation is only performed once on each input data file. (C) 2018 Elsevier Inc. All rights reserved.
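The idea of non-overlapping blocks that are each a random sample of the whole dataset can be illustrated with a small in-memory sketch. This is an assumption-laden toy, not the paper's two HDFS conversion algorithms: it simply shuffles the records and deals them out in rounds, and the function name `random_sample_partition` and the `seed` parameter are hypothetical.

```python
import random


def random_sample_partition(records, num_blocks, seed=0):
    """Split records into non-overlapping blocks, each a random sample.

    Hypothetical in-memory illustration: shuffle once, then deal the
    shuffled records to the blocks in round-robin order.
    """
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    # Block i receives shuffled records i, i + num_blocks, i + 2 * num_blocks, ...
    return [shuffled[i::num_blocks] for i in range(num_blocks)]


blocks = random_sample_partition(range(10), 3)
for index, block in enumerate(blocks):
    print(index, block)
```

Because the shuffle happens once up front, every block can later be analyzed on its own as an approximation of the full dataset, which matches the abstract's point that partitioning is a one-time cost per input file.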
The large data volume and high algorithm complexity of hyperspectral image (HSI) problems have posed big challenges for efficient classification of massive HSI data repositories. Recently, cloud computing architectures have become more relevant to address the big computational challenges introduced in the HSI field. This article proposes an acceleration method for HSI classification that relies on scheduling metaheuristics to automatically and optimally distribute the workload of HSI applications across multiple computing resources on a cloud platform. By analyzing the procedure of a representative classification method, we first develop its distributed and parallel implementation based on the MapReduce mechanism on Apache Spark. The subtasks of the processing flow that can be processed in a distributed way are identified as divisible tasks. The optimal execution of this application on Spark is further formulated as a divisible scheduling framework that takes into account both task execution precedences and task divisibility when allocating the divisible and indivisible subtasks onto computing nodes. The formulated scheduling framework is an optimization procedure that searches for optimized task assignments and partition counts for divisible tasks. Two metaheuristic algorithms are developed to solve this divisible scheduling problem. The scheduling results provide an optimized solution to the automatic processing of HSI big data on clouds, improving the computational efficiency of HSI classification by exploiting the parallelism in the parallel processing flow. Experimental results demonstrate that our scheduling-guided approach achieves remarkable speedups by facilitating the automatic processing of HSI classification on Spark, and is scalable to the increasing HSI data volume.