ISBN: 9781467347143 (print)
Nowadays, the volume of multimedia and unstructured data has grown rapidly. More and more three-dimensional (3D) models are created for an ever-increasing range of applications. New storage and processing technologies are needed to keep pace with the continuous growth of big data. Hadoop is an attractive open-source platform for large-scale data storage and analytics. Our previous research applied the Hadoop distributed file system to efficiently manage 3D data for a 3D model retrieval system. To take better advantage of Hadoop, in this paper we propose two parallel strategies to improve the performance of storing and accessing 3D models. The MapReduce paradigm is adopted to provide coarse-grained parallelism for data loading, and a lightweight multithreaded algorithm is presented for data access. We conduct an extensive performance study on a cluster, and the results show that the parallel techniques yield a significant performance increase.
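As a rough illustration of the multithreaded access idea mentioned in the abstract above, the following minimal sketch fetches several models concurrently with a thread pool. It is not the authors' algorithm: the function names `fetch_models` and `fake_fetch`, the `max_workers` parameter, and the in-memory stand-in for a file-system read are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_models(model_ids: Iterable[str],
                 fetch_one: Callable[[str], bytes],
                 max_workers: int = 8) -> Dict[str, bytes]:
    """Fetch several models concurrently and return {model_id: payload}.

    `fetch_one` is a caller-supplied (hypothetical) function that reads one
    model, for example from a distributed file system.
    """
    ids = list(model_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so each id lines up with its payload.
        payloads = list(pool.map(fetch_one, ids))
    return dict(zip(ids, payloads))


def fake_fetch(model_id: str) -> bytes:
    # Hypothetical stand-in for a real storage read; returns dummy bytes.
    return f"mesh-data-for-{model_id}".encode()


if __name__ == "__main__":
    models = fetch_models(["chair", "table", "lamp"], fake_fetch)
    print(sorted(models))
```

Threads suit this kind of workload because reads are I/O-bound; the pool size would need tuning against the actual storage back end.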
ISBN: 9789898565549 (print)
The proceedings contain 87 papers. The topics discussed include: dynamic transactional workflows in service-oriented environments; watermarking images in the frequency domain by exploiting self-inverting permutations; a dynamic load balancing strategy for a distributed biometric authentication system; towards a model-driven development of web applications; designing a click fraud detection algorithm - exposing suspect networks; model-based performance testing of web services using probabilistic timed automata; web forums change analysis; empowering collaborative business intelligence by the use of online social networks; semantic matching to achieve software component discovery and composition; cost-effective web-based media synchronization schemes for real-time distributed groupware; and making data citable - a web-based system for the registration of social and economics science data.
ISBN: 9783901882531 (print)
Network search makes operational data available in real time to management applications. In contrast to traditional monitoring, neither the data location nor the data format needs to be known to the invoking process, which simplifies application development but requires an efficient search plane inside the managed system. The search plane is realized as a network of search nodes that process search queries in a distributed fashion. This paper introduces matching and ranking for network search queries. We propose a semantics for matching and ranking that is configurable to support different types of management applications, from exact matching for database-style queries to loose, approximate matching, which is appropriate for exploratory purposes. We describe an echo protocol for efficient distributed query processing that supports matching and ranking. Further, we present the design of a search node, which maintains a real-time database of operational information and allows for parallel processing of search queries. A prototype implementation on a cloud testbed shows that the network search system, on a 9-node cluster of 24-core servers, executes 200 global search queries/sec with the 75th percentile latency below 100 milliseconds and with a CPU utilization below 5%. The performance measurements, together with our design, suggest that a system of 100,000 servers processing the same load would exhibit the same overhead per server and a query latency below 1 sec.
Parallel programming patterns provide enduring principles that serve as a conceptual framework to orient students when they set out to solve problems. Learning patterns enables students to quickly gain the intellectua...
ISBN: 9781467324809 (print)
In recent years, wireless networking has become very popular because it can satisfy user requests in terms of Quality of Service (QoS); when mobility is present, however, hand-over issues become relevant as hosts change coverage areas during their active sessions. It is very important to mitigate mobility effects by employing an appropriate bandwidth management policy. In our work, we propose two integrated schemes: the first is based on Markov theory and aims at predicting mobile hosts' movements (in terms of future cells), while the second is based on statistical theory and aims at minimizing the bandwidth wasted on passive reservations. The proposed Pattern Prediction and Passive Bandwidth Management Algorithm (3P-BMA) is thus the result of integrating the Markov predictor with the statistical bandwidth management scheme. 3P-BMA is completely independent of the considered technology, mobility model and vehicular environment: it does not matter whether coverage is provided by UMTS or WLAN technologies, whether hosts are pedestrians or mobile users, etc. Several simulation campaigns have been carried out to confirm the effectiveness of the proposed idea in terms of prediction accuracy, Call Dropping/Blocking probabilities and system utilization.
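To make the idea of Markov-based movement prediction concrete, here is a minimal sketch of a first-order Markov predictor over cell identifiers. It is an illustration only, not the 3P-BMA predictor itself: the class name `MarkovCellPredictor`, its methods, and the first-order assumption are all choices made for this example, and the abstract does not specify the order or structure of the authors' model.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Optional, Sequence


class MarkovCellPredictor:
    """First-order Markov model over cell IDs (hypothetical illustration)."""

    def __init__(self) -> None:
        # counts[current][next] = number of observed hand-overs current -> next
        self.counts: Dict[Hashable, Counter] = defaultdict(Counter)

    def train(self, path: Sequence[Hashable]) -> None:
        """Record every consecutive pair of cells in an observed path."""
        for current, nxt in zip(path, path[1:]):
            self.counts[current][nxt] += 1

    def transition_probs(self, cell: Hashable) -> Dict[Hashable, float]:
        """Estimated probability of each next cell, given the current cell."""
        row = self.counts.get(cell)
        if not row:
            return {}
        total = sum(row.values())
        return {nxt: n / total for nxt, n in row.items()}

    def predict(self, cell: Hashable) -> Optional[Hashable]:
        """Most likely next cell, or None if the cell was never observed."""
        probs = self.transition_probs(cell)
        if not probs:
            return None
        return max(probs, key=probs.get)


if __name__ == "__main__":
    predictor = MarkovCellPredictor()
    predictor.train(["A", "B", "C", "A", "B", "D", "A", "B", "C"])
    print(predictor.predict("B"), predictor.transition_probs("B"))
```

In a scheme of this kind, the estimated transition probabilities could in principle guide which neighbouring cells receive passive reservations, though how 3P-BMA combines the two parts is not described in the abstract.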
ISBN: 9781479924189 (print)
The training of an SVM can be viewed as a Convex Quadratic Programming (CQP) problem, which becomes difficult to solve when dealing with large-scale data sets. Traditional methods such as Sequential Minimal Optimization (SMO) solve a sequence of small-scale sub-problems, which costs a large amount of computation time and is hard to accelerate using the computational power of a GPU. Although an Interior Point Method (IPM) such as the primal-dual interior point method (PDIPM) can also address SVM training well and has favourable potential for parallelization on a GPU, it has comparatively high time complexity O(l^3) and space complexity O(l^2), where l is the number of training instances. Fortunately, by invoking low-rank approximation methods such as Incomplete Cholesky Factorization (ICF) and the Sherman-Morrison-Woodbury (SMW) formula, both the storage and the computation requirements of PDIPM can be reduced significantly. In this paper, a parallel PDIPM method (P-PDIPM) along with a parallel ICF method (P-ICF) is proposed to accelerate SVM training on a GPU. Experimental results indicate that the training speed of P-PDIPM on a GPU is almost 40x faster than that of the serial one (S-PDIPM) on a CPU. Moreover, without extensive optimization, P-PDIPM obtains about an 8x speedup over the state-of-the-art tool LIBSVM while maintaining high prediction accuracy.
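The reduction in cost that the abstract attributes to ICF and SMW can be sketched with the standard Sherman-Morrison-Woodbury identity. The symbols below are assumptions introduced for illustration and do not come from the abstract: H is an l x p low-rank factor (with p much smaller than l) such that the kernel matrix satisfies Q ≈ HH^T, and D is an l x l diagonal matrix of the kind that arises in each interior-point iteration.

```latex
% Low-rank approximation from ICF (assumed notation): Q \approx H H^{\top},
% with H \in \mathbb{R}^{l \times p} and p \ll l.
% Sherman-Morrison-Woodbury identity applied to the system matrix D + H H^{\top}:
\[
  \left( D + H H^{\top} \right)^{-1}
  = D^{-1} - D^{-1} H \left( I_p + H^{\top} D^{-1} H \right)^{-1} H^{\top} D^{-1}
\]
% Only the p x p matrix I_p + H^{\top} D^{-1} H has to be factorized, so
% applying the inverse costs O(l p^2) time and O(l p) storage,
% compared with O(l^3) time and O(l^2) storage for the dense l x l system.
```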
ISBN: 9783642360701 (print)
The proceedings contain 45 papers. The topics discussed include: theoretical distributed computing meets biology: a review; participatory sensing: crowdsourcing data from mobile smartphones in urban spaces; energy efficient distributed computing on mobile devices; data insertion and archiving in erasure-coding based large-scale storage systems; medical software - issues and best practices; improved interference in wireless sensor networks; trust based secure gateway discovery mechanism for integrated Internet and MANET; improving MapReduce performance through complexity and performance based data placement in heterogeneous Hadoop clusters; online recommendation of learning path for an e-learner under virtual university; a parallel 2-approximation NC-algorithm for range assignment problem in packet radio networks; and an efficient localization of nodes in a sensor network using conflict minimization with controlled initialization.
ISBN: 9781479932115 (print)
Combinatorial problems are NP-complete, which means that even an infinite number of CPUs takes polynomial time to search for an optimal solution. Therefore, approximate search algorithms such as Genetic Algorithms (GAs) are used. However, such an approximate search algorithm easily falls into a local optimum, and distributed/parallel processing alone seems inefficient. In this paper, this inefficiency is shown by simulation, using the TSP library as an example of optimal route scheduling. Then, an autonomous distributed GA is proposed for solving real-time combinatorial problems; it copes with this inefficiency by exchanging information about individuals (to calculate fitness/divergence/situation) among autonomous CPUs. Using the TSP library again, its effectiveness is shown by simulation experiments.
Information filtering systems constitute a critical component in modern information seeking applications. As the number of users grows and the available information becomes ever larger, it is imperative to employ scala...
ISBN: 9783642399572; 9783642399589 (print)
Optimisation of data-parallel scientific applications for modern HPC platforms is challenging in terms of efficient use of heterogeneous hardware and software. It requires partitioning the computations in proportion to the speeds of the computing devices. Implementing data partitioning algorithms based on computation performance models is not trivial. It requires accurate and efficient benchmarking of devices, which may share the same resources but execute different codes; appropriate interpolation methods to predict performance; and mathematical methods to solve the data partitioning problem. In this paper, we present a software framework that addresses these issues and automates the main steps of data partitioning. We demonstrate how it can be used to optimise data-parallel applications for modern heterogeneous HPC platforms.
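The simplest form of "partitioning the computations in proportion to the speeds of the computing devices" can be sketched as follows. This is a deliberately simplified illustration that assumes each device has a single constant speed; the framework described above relies on benchmarked performance models instead, and the function name `partition_proportional` and its parameters are hypothetical.

```python
from typing import List, Sequence


def partition_proportional(total_units: int, speeds: Sequence[float]) -> List[int]:
    """Split `total_units` work items among devices in proportion to `speeds`.

    Assumes a constant speed per device (a simplification). Fractional shares
    are rounded with the largest-remainder method so the parts sum to
    `total_units`.
    """
    if total_units < 0:
        raise ValueError("total_units must be non-negative")
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be a non-empty sequence of positive numbers")

    total_speed = sum(speeds)
    ideal = [total_units * s / total_speed for s in speeds]
    shares = [int(x) for x in ideal]  # floor of each ideal share
    leftover = total_units - sum(shares)

    # Hand the remaining units to the devices with the largest fractional parts.
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - shares[i],
                   reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    # Hypothetical example: one CPU and one GPU that is three times faster.
    print(partition_proportional(100, [1.0, 3.0]))
```

When device speed varies with problem size, as it typically does on heterogeneous hardware, a constant-speed split like this is only a starting point, which is why the abstract stresses benchmarking and interpolation.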