Distributed computing technology has been widely used to solve complex problems appearing in parallel processing systems. Job scheduling is very important in many distributed computing systems, like grid systems and h...
Floating-point computing ability is an important concern in high-performance scientific applications and engineering computing. Although a fundamental operation, floating-point division (or reciprocal) has long been...
ISBN: 9781479917976 (print)
Organizations have begun outsourcing management of their data to third-party cloud service providers since the introduction of the Database as a Service (DAS) model. A cloud database is a database that typically runs on a cloud computing platform, such as Amazon EC2, GoGrid, Salesforce and Rackspace. However, outsourcing the data raises concerns over privacy. A typical solution is to store databases in encrypted form on the remote server. Queried records are downloaded from the server and decrypted for further processing. Bucketization is one technique for executing queries over encrypted data on a DAS server. This paper extends work done by other researchers [1-4]. The Query Optimal Bucketization (QOB) algorithm [1-2] divides the server data into buckets subject to an optimality constraint. In an earlier paper [3], the authors proposed Binary Query Bucketization (BQB) to improve the search time for bucketized datasets and reduce the number of records that are processed by QOB. In this paper, we propose a parallel Binary Query Bucketization (PBQB) algorithm to query records located in the DAS. It integrates parallel search [4] and BQB. Parallel search divides the search workload into chunks, with each thread/processor working on one chunk. Simulation is used to assess the numerical performance of PBQB. It is shown that the proposed algorithm outperforms BQB.
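The general idea of querying bucketized data with a chunked parallel scan can be illustrated with a small sketch. It is not the PBQB, BQB, or QOB algorithm from the paper: the equi-width bucket layout, the function names (`bucket_id`, `search_chunk`, `parallel_bucket_query`), and the use of plaintext integers in place of encrypted records are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def bucket_id(value, lo, width):
    """Map a value to its equi-width bucket (assumed layout, not QOB)."""
    return (value - lo) // width


def search_chunk(chunk, wanted_buckets):
    """Scan one chunk of (bucket_id, record) pairs and keep the matches."""
    return [rec for bid, rec in chunk if bid in wanted_buckets]


def parallel_bucket_query(server_rows, q_lo, q_hi, lo, width, workers=4):
    """Answer the range query [q_lo, q_hi] over bucketized rows.

    server_rows holds (bucket_id, record) pairs. Records are plaintext
    integers here; in a real DAS setting they would be ciphertexts that
    the client decrypts before the final filtering step.
    """
    # Server side: only bucket ids are visible, so select every bucket
    # that overlaps the query range.
    first = bucket_id(q_lo, lo, width)
    last = bucket_id(q_hi, lo, width)
    wanted = set(range(first, last + 1))

    # Split the rows into roughly equal chunks, one per worker.
    size = max(1, -(-len(server_rows) // workers))  # ceiling division
    chunks = [server_rows[i:i + size] for i in range(0, len(server_rows), size)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(search_chunk, chunks, [wanted] * len(chunks))
        candidates = [rec for part in parts for rec in part]

    # Client side: drop false positives that share a bucket with the range.
    return [v for v in candidates if q_lo <= v <= q_hi]


values = [3, 17, 25, 42, 58, 61, 77, 90]
rows = [(bucket_id(v, 0, 10), v) for v in values]
result = parallel_bucket_query(rows, 20, 60, lo=0, width=10)
```

In this example the value 61 shares a bucket with part of the query range, so it is returned by the server-side scan and discarded only in the client-side filter. That kind of false positive is the cost that bucketization schemes try to keep small.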
ADAS (Advanced Driver Assistance Systems) algorithms increasingly use heavy image processing operations. To embed this type of algorithm, semiconductor companies offer many heterogeneous architectures. These SoCs (Systems on Chip) are composed of different processing units with different capabilities, and often include a massively parallel computing unit. Due to the complexity of these SoCs, predicting whether a given algorithm can be executed in real time on a given architecture is not trivial. In fact, it is not a simple task for automotive industry actors to choose the most suitable heterogeneous SoC for a given application. Moreover, embedding complex algorithms on these systems remains a difficult task: because of their heterogeneity, it is not easy to decide how to allocate parts of a given algorithm to the different computing units of a given SoC. In order to help the automotive industry embed algorithms on heterogeneous architectures, we propose a novel approach to predict the performance of image processing algorithms, applicable to different types of computing units. Our methodology is able to predict a more or less wide interval of execution time with a degree of confidence, using only a high-level description of the algorithms and a few characteristics of the computing units.
The Nyström method and low-rank linearized Support Vector Machines (SVMs) are two widely used methods for scaling up kernel SVMs, both of which need to sample a subset of the columns of the kernel matrix to reduce its size. ...
In this paper, we parallelize the collision detection of five-axis machining as an example to show how to execute CNC applications on a Graphics Processing Unit (GPU). We first design and implement an efficient collisi...
ISBN: 9783319265209; 9783319265193 (print)
We develop an efficient parallel algorithm for answering shortest-path queries in planar graphs and implement it on a multi-node CPU-GPU cluster. The algorithm uses a divide-and-conquer approach to decompose the input graph into small and roughly equal subgraphs, and constructs a distributed data structure containing shortest distances within each of those subgraphs and between their boundary vertices. For a planar graph with n vertices, that data structure needs O(n) storage per processor and allows queries to be answered in O(n^(1/4)) time.
Computation of optical flow is a fundamental step in computer vision applications. However, due to its high complexity, it is difficult to compute a high-accuracy optical flow field in real time. This paper proposes a...
ISBN: 9781479966585 (print)
This paper presents an optimized adder-based formulation for low-area and low-power implementation of the 1-D DWT using 5/3 and 9/7 filters. The formulation minimizes not only the number of adders but also the number of bit-shifts, which reduces the bit-width of intermediate results. Separate adder-based designs are derived using the proposed formulation for the 9/7 filter, the 5/3 filter, and a reconfigurable structure for both 9/7 and 5/3 filters. The proposed structure for the 9/7 filter requires 19 adders and 11 hardwired shifters (shifters are implemented by rewiring only) and computes two DWT components in every clock cycle. It requires only 8 registers for a two-stage pipeline implementation. The proposed reconfigurable structure requires only a small overhead over the proposed 9/7 structure: one adder, 2 MUXes, 2 registers, and 4 extra hardwired shifters. The proposed reconfigurable structure supports a higher usable frequency (without pipelining) and provides double the throughput per clock cycle compared with the best available similar structure, with marginally higher area complexity. ASIC synthesis results show that the proposed pipelined structure for 9/7 filters involves nearly 70% less ADP and 82% less EPO than the best of the DA-based structures. Further, it involves less than half the ADP and 47% less EPO than the corresponding recent multiplier-based structure. The proposed reconfigurable structure involves less than one-third the EPO and ADP of a similar existing structure. These results indicate the superiority of adder-based design over DA-based design as well as conventional multiplier-based design.
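To make concrete how a 5/3 DWT can be computed with additions and shifts alone, here is a minimal software sketch. It uses the standard reversible 5/3 lifting steps with symmetric boundary extension, which is an assumption made for illustration and not the adder-based formulation or hardware structure proposed in the paper. The function names `dwt53_forward` and `dwt53_inverse` are hypothetical.

```python
def dwt53_forward(x):
    """One level of the reversible 5/3 lifting transform.

    x is an even-length list of ints. Returns (s, d): the low-pass and
    high-pass coefficients. Only additions, subtractions and right
    shifts are used (Python's >> floors for negative ints).
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch handles even-length input only"
    half = n // 2
    even, odd = x[0::2], x[1::2]
    # Predict step: d[i] = odd[i] - floor((even[i] + even[i+1]) / 2),
    # mirroring the last even sample at the right edge.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
         for i in range(half)]
    # Update step: s[i] = even[i] + floor((d[i-1] + d[i] + 2) / 4),
    # mirroring the first detail coefficient at the left edge.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
         for i in range(half)]
    return s, d


def dwt53_inverse(s, d):
    """Undo dwt53_forward by running the lifting steps in reverse order."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
            for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


signal = [10, 12, 14, 13, 9, 7, 8, 20]
low, high = dwt53_forward(signal)
restored = dwt53_inverse(low, high)
```

Because every step is an integer addition or a shift, the transform is exactly invertible, and a shift by a fixed amount corresponds to the rewiring-only shifters the abstract mentions.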
Bloom filters are widely used in databases and network areas. These filters facilitate efficient membership checking with a low false positive ratio. It is a way to improve the throughput of bloom filter by parallel p...
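For readers unfamiliar with the data structure, a basic sequential Bloom filter can be sketched as follows. This shows only the membership check that the abstract refers to, not the parallel scheme of the paper (whose description is cut off above). The class name, the default sizes, and the choice of seeded SHA-256 as the hash family are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative sketch)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing a one-byte seed
        # to the item before hashing (assumed hash family).
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


bf = BloomFilter()
bf.add("alice")
bf.add("bob")
```

A lookup never misses an item that was added, but it may report an item that was never added if all of its bit positions happen to be set by other items. That probability is the false positive ratio mentioned in the abstract.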