We present scalable and parallel versions of Lipmaa's computationally-private information retrieval (CPIR) scheme [20], which provides log-squared communication complexity. In the proposed schemes, instead of the binary decision diagrams used in the original CPIR, we employ an octal-tree-based approach in which non-sink nodes have eight child nodes. Using octal trees offers two advantages: i) a serial software implementation of the proposed scheme is faster than the original scheme, and ii) its bandwidth usage is lower than that of the original scheme when the number of items in the data set is moderately high (e.g., 4,096 at the 80-bit security level using the Damgård-Jurik cryptosystem). In addition, we present a highly optimized parallel algorithm for shared-memory multi-core/processor architectures that minimizes the number of synchronization points between the cores. We show that the parallel implementation is about 50 times faster than the serial implementation for a data set with 4,096 items on an eight-core machine. Finally, we propose a hybrid algorithm that scales the CPIR scheme to larger data sets with a small overhead in bandwidth complexity. We demonstrate that the hybrid scheme based on octal trees can yield parallel implementations more than two orders of magnitude faster than serial implementations based on binary trees. Comparison with the original scheme as well as the other schemes in the literature reveals that ours requires the least bandwidth.
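The homomorphic layer of the protocol (Damgård-Jurik ciphertexts and oblivious selection) is beyond a short example, but the octal-tree indexing at the scheme's core is easy to illustrate. The following is a minimal Python sketch under our own naming (octal_digits, select_leaf, and build are illustrative, not from the paper): a database index decomposes into base-8 digits, and each tree level consumes one digit to select one of eight children.

```python
def octal_digits(index: int, depth: int) -> list:
    """Decompose a database index into base-8 digits, one per tree level."""
    digits = []
    for _ in range(depth):
        digits.append(index % 8)
        index //= 8
    return digits[::-1]  # most significant level first

def select_leaf(tree, index: int, depth: int):
    """Walk an octal tree: each level consumes one base-8 digit of the index.

    In the CPIR protocol the digits are never sent in the clear; each one is
    encoded as ciphertexts of a homomorphic scheme such as Damgard-Jurik and
    the server applies the selection obliviously.
    """
    node = tree
    for d in octal_digits(index, depth):
        node = node[d]  # one of eight children
    return node

def build(level_items):
    """Recursively pack a flat item list into an octal tree."""
    if len(level_items) == 1:
        return level_items[0]
    step = len(level_items) // 8
    return [build(level_items[i * step:(i + 1) * step]) for i in range(8)]

# toy example: 4,096 items = 8^4, so the tree has depth 4
tree = build(list(range(4096)))
assert select_leaf(tree, 1234, 4) == 1234
```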
Can we learn from the unknown? Logical data sets of the ternary kind are often found in information systems. They contain unknown as well as true/false values. An unknown value may represent a missing entry (lost or indeterminable) or carry meaning, like a "Don't Know" response in a questionnaire. In this paper, we introduce algorithms for reducing the dimensionality of logical data (categorical data in general) in the context of a new data mining challenge: Ternary Matrix Factorization (TMF). For a ternary data matrix, TMF exploits ternary logic to produce a basis matrix (which holds the major patterns in the data) and a usage matrix (which maps patterns to the original observations). Both matrices are interpretable, and their ternary matrix product approximates the original matrix. TMF has applications in (1) finding targeted structure in ternary data, (2) imputing values through pattern discovery in highly incomplete categorical data sets, and (3) solving instances of its encapsulated Binary Matrix Factorization problem. Our elegant algorithm FasTer (FASt TERnary Matrix Factorization) has linear run-time complexity with respect to the dimensions of the data set and is parameter-robust. A variant of FasTer that exploits useful results from combinatorics provides accuracy bounds for a core part of the algorithm in certain situations. Experiments on synthetic and real-world data sets show that our algorithms outperform state-of-the-art techniques in all three TMF applications with respect to run-time and effectiveness. Finally, convincing speedup and efficiency results for a parallel version of FasTer demonstrate its suitability for weak- and strong-scaling scenarios.
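The paper's exact ternary algebra is not spelled out in the abstract; the sketch below assumes Kleene three-valued logic (false < unknown < true, so AND is min and OR is max) purely to illustrate what a ternary matrix product and its reconstruction error look like. All variable names are illustrative.

```python
import numpy as np

# Kleene order: false = 0.0 < unknown = 0.5 < true = 1.0,
# under which AND is elementwise min and OR is elementwise max.
F, UNK, T = 0.0, 0.5, 1.0

def ternary_product(U, B):
    """(U o B)[i, j] = OR_k (U[i, k] AND B[k, j]) in three-valued logic."""
    return np.minimum(U[:, :, None], B[None, :, :]).max(axis=1)

def mismatch(X, U, B):
    """Fraction of entries where the factorization disagrees with the data."""
    return float(np.mean(ternary_product(U, B) != X))

B = np.array([[T, UNK, F, F],     # basis: 2 patterns over 4 attributes
              [F, F,   T, T]])
U = np.array([[T, F],             # usage: maps 3 observations to patterns
              [F, T],
              [T, T]])
X = ternary_product(U, B)         # the data this factorization reconstructs
print(X)
print(mismatch(X, U, B))          # 0.0 by construction
```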
We introduce a new strategy for coupling the parallel-in-time (parareal) iterative methodology with multiscale integrators. Following the parareal framework, the algorithm computes a low-cost approximation of all slow variables in the system using an appropriate multiscale integrator, which is then refined using parallel fine-scale integrations. Convergence is obtained using an alignment algorithm for fast phase-like variables. The method may be used either to enhance the accuracy and range of applicability of the multiscale method in approximating only the slow variables, or to resolve all the state variables. The numerical scheme does not require that the system be split into slow and fast coordinates. Moreover, the dynamics may involve hidden slow variables, for example, due to resonances. We propose an alignment algorithm for almost-periodic solutions, in which case convergence of the parareal iterations is proved. The applicability of the method is demonstrated in numerical examples.
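The abstract follows the standard parareal structure, which is compact enough to sketch. Below is a generic parareal loop (coarse predictor plus parallelizable fine corrections); the paper's multiscale coarse integrator and phase-alignment step are elided, and f_coarse/f_fine are user-supplied placeholders.

```python
import numpy as np

def parareal(f_coarse, f_fine, u0, t0, t1, n_windows, n_iters):
    """Generic parareal iteration (a sketch; the paper couples this with
    multiscale coarse integrators and an alignment step for fast phases).

    f_coarse(u, ta, tb): cheap propagator over [ta, tb]
    f_fine(u, ta, tb):   expensive, accurate propagator over [ta, tb]
    """
    ts = np.linspace(t0, t1, n_windows + 1)
    U = [u0]
    for n in range(n_windows):          # initial coarse-only sweep
        U.append(f_coarse(U[-1], ts[n], ts[n + 1]))
    for _ in range(n_iters):
        # fine propagations are independent: this is the parallel part
        F = [f_fine(U[n], ts[n], ts[n + 1]) for n in range(n_windows)]
        G_old = [f_coarse(U[n], ts[n], ts[n + 1]) for n in range(n_windows)]
        U_new = [u0]
        for n in range(n_windows):
            # predictor-corrector: U_{n+1} = G(new) + F(old) - G(old)
            U_new.append(f_coarse(U_new[-1], ts[n], ts[n + 1]) + F[n] - G_old[n])
        U = U_new
    return ts, U

# toy usage: u' = -u on [0, 10], Euler coarse vs. many-step Euler "fine"
coarse = lambda u, ta, tb: u + (tb - ta) * (-u)
def fine(u, ta, tb, m=100):
    dt = (tb - ta) / m
    for _ in range(m):
        u = u + dt * (-u)
    return u

ts, U = parareal(coarse, fine, 1.0, 0.0, 10.0, n_windows=10, n_iters=4)
print(U[-1], np.exp(-10.0))             # iterate vs. exact solution
```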
Background: Metagenomics is a genomics research discipline devoted to the study of microbial communities in environmental samples and in human and animal organs and tissues. Sequenced metagenomic samples usually comprise reads from a large number of different bacterial communities and hence tend to result in large file sizes, typically ranging between 1 and 10 GB. This leads to challenges in analyzing, transferring and storing metagenomic data. To overcome these data processing issues, we introduce MetaCRAM, the first de novo, parallelized software suite specialized for FASTA- and FASTQ-format metagenomic read processing and lossless compression. Results: MetaCRAM integrates algorithms for taxonomy identification and assembly, and introduces parallel execution methods; furthermore, it enables genome reference selection and CRAM-based compression. MetaCRAM also uses novel reference-based compression methods designed through extensive studies of integer compression techniques and through fitting of empirical distributions of metagenomic read-reference positions. MetaCRAM is a lossless method compatible with standard CRAM formats, and it allows for fast selection of relevant files in the compressed domain via maintenance of taxonomy information. The performance of MetaCRAM as a stand-alone compression platform was evaluated on various metagenomic samples from the NCBI Sequence Read Archive, showing 2- to 4-fold compression ratio improvements compared to gzip. On average, the compressed file sizes were 2 to 13 percent of the original raw metagenomic file sizes. Conclusions: We described the first architecture for reference-based, lossless compression of metagenomic data. The proposed compression scheme offers significantly improved compression ratios compared to off-the-shelf methods such as zip programs. Furthermore, it enables running different components in parallel, and it provides the user with taxonomic and assembly information generated during execution of the compression pipeline.
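As an illustration of the reference-based idea (not MetaCRAM's actual codes, which are chosen by fitting empirical read-position distributions), the sketch below delta-encodes sorted alignment positions against a reference and packs the resulting small gaps with a variable-length integer code.

```python
def delta_encode(positions):
    """Sorted alignment positions -> small gaps (first gap measured from 0)."""
    prev, gaps = 0, []
    for p in positions:
        gaps.append(p - prev)
        prev = p
    return gaps

def varint(n: int) -> bytes:
    """LEB128-style variable-length encoding: 7 payload bits per byte."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        out.append(byte | (0x80 if n else 0))
        if not n:
            return bytes(out)

positions = [105, 212, 240, 1031, 1032]   # read start positions on a reference
gaps = delta_encode(positions)            # [105, 107, 28, 791, 1]
blob = b"".join(varint(g) for g in gaps)
print(len(blob), "bytes vs", len(positions) * 4, "bytes for raw 32-bit ints")
```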
The objective of this paper is to develop a robust maximum likelihood estimation (MLE) for the stochastic state space model via the expectation maximisation algorithm, to cope with observation outliers. Two types of outliers and their influence are studied: namely, the additive outlier (AO) and the innovative outlier (IO). Due to the sensitivity of the MLE to AO and IO, we propose two techniques for robustifying the MLE: the weighted maximum likelihood estimation (WMLE) and the trimmed maximum likelihood estimation (TMLE). The WMLE is easy to implement, with weights estimated from the data; however, it remains sensitive to IO and to patches of AO outliers. The TMLE, on the other hand, reduces to a combinatorial optimisation problem that is hard to solve, but it is effective against both types of outliers considered here. To overcome this difficulty, we apply a parallel randomised algorithm that has a low computational cost. Monte Carlo simulation results show the efficiency of the proposed algorithms.
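The weighting idea behind a WMLE can be illustrated on a toy location problem; the sketch below uses Huber-type weights in an iteratively reweighted estimate of a Gaussian mean. The paper applies data-driven weights inside EM for state-space models, so this conveys only the flavour of the technique, with illustrative names and tuning constants.

```python
import numpy as np

def weighted_gaussian_mle(y, k=1.345, n_iters=10):
    """Toy weighted MLE for a Gaussian mean: Huber-type weights shrink the
    influence of observations with large standardized residuals. The scale
    is fixed at a robust initial estimate (MAD) for simplicity.
    """
    mu = np.median(y)
    sigma = np.median(np.abs(y - mu)) / 0.6745          # MAD scale estimate
    for _ in range(n_iters):
        r = (y - mu) / sigma                            # standardized residuals
        w = np.where(np.abs(r) <= k, 1.0, k / np.abs(r))  # Huber weights
        mu = np.sum(w * y) / np.sum(w)                  # weighted-likelihood max
    return mu

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(5.0, 1.0, 100), [50.0, 60.0]])  # two AO-style outliers
print(np.mean(y), weighted_gaussian_mle(y))             # plain MLE vs. WMLE
```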
In this paper, we present single- and multi-node optimizations of SU2, a widely used, open-source Computational Fluid Dynamics application, aimed at improving performance and scalability for implicit Reynolds-averaged Navier-Stokes calculations on unstructured grids. Typical industry-standard implementations are currently limited by unstructured memory accesses, variable degrees of parallelism, and the global synchronizations inherent in the traditionally used Krylov linear solvers. We therefore rely on aggressive single-node optimizations, such as hierarchical parallelism, dynamic threading, a compacted memory layout, and vectorization, along with a communication-friendly agglomeration (geometric) linear multigrid solver. Based on results with the well-known ONERA M6 geometry, our single-core and shared-memory optimizations yield a speedup of 2.6X on the latest 14-core Intel(R) Xeon(TM) E5-2697 v3 processor compared to the baseline SU2 implementation with 14 MPI ranks. In multi-node settings, the hybrid OpenMP+MPI multigrid implementation achieves 2X higher parallel efficiency on 256 nodes than conventional Krylov-based (GMRES) methods.
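SU2's agglomeration multigrid operates on unstructured grids, but the V-cycle it relies on has the same shape as the textbook geometric version. The sketch below runs a weighted-Jacobi V-cycle on a model 1-D Poisson problem; it is a generic illustration of the solver structure, not SU2 code.

```python
import numpy as np

def jacobi(u, f, h, sweeps=3, omega=2.0 / 3.0):
    """Weighted-Jacobi smoothing for -u'' = f with zero Dirichlet ends."""
    for _ in range(sweeps):
        u[1:-1] = (1 - omega) * u[1:-1] + omega * 0.5 * (u[:-2] + u[2:] + h * h * f[1:-1])
    return u

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
    return r

def v_cycle(u, f, h):
    """One V-cycle on a grid of size 2^k + 1."""
    if len(u) == 3:
        u[1] = 0.5 * h * h * f[1]        # coarsest level: solve exactly
        return u
    u = jacobi(u, f, h)                  # pre-smooth
    r = residual(u, f, h)
    nc = (len(u) + 1) // 2
    r_c = np.zeros(nc)                   # restrict by full weighting
    r_c[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    e_c = v_cycle(np.zeros(nc), r_c, 2 * h)
    e = np.zeros_like(u)
    e[::2] = e_c                         # prolong: inject coarse points...
    e[1::2] = 0.5 * (e_c[:-1] + e_c[1:]) # ...and interpolate midpoints
    return jacobi(u + e, f, h)           # correct, then post-smooth

n = 129
h = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
f = np.pi ** 2 * np.sin(np.pi * x)       # exact solution: sin(pi x)
u = np.zeros(n)
for _ in range(10):
    u = v_cycle(u, f, h)
print(np.max(np.abs(u - np.sin(np.pi * x))))  # down to discretization error
```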
Due to the increasing complexity of software systems, there is a growing need for automated and scalable software synthesis and analysis. In the last decade, active research in the formal methods community has brought interesting results and valuable tools. However, there are still challenges to face and hard problems that need to be solved. We briefly outline some recent trends, and review some of the latest achievements, introducing six papers selected from the 20th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2014).
Observing the interactions between the neurons of a network can reveal important information about how that network processes information. Such observation can be established by analysing the causality between the activities of the different neurons in the network, an analysis known as effective connectivity analysis. However, existing methods for such analysis are either too computationally heavy for daily use or too inaccurate to yield reliable analyses. The Cox method produces reliable analyses, but the computation takes hours on CPUs, making it slow for research use. In this paper, two algorithms are presented that speed up the Cox method by parallelizing the computation on a graphics processing unit (GPU) using the Compute Unified Device Architecture (CUDA) platform. Both algorithms are evaluated with respect to network size and recording duration. The main benefit of the GPU implementations is the reduction in computation time, but another important benefit is that such an implementation requires rethinking the algorithm in ways that differ from the sequential implementation. This rethinking itself opens new optimization possibilities, e.g. by employing OpenCL. Using this accelerated implementation, the Cox method is then applied to an experimental dataset from CRCNS on a personal computer. This should facilitate observations of biological neural network organization that can provide new insights to improve our understanding of memory, learning and intelligence.
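One reason this analysis parallelizes well is that the per-pair influence computations are independent. The sketch below illustrates only that structure: influence_score is a hypothetical placeholder (a windowed coincidence count, not the actual Cox statistic), and CPU worker processes stand in for the CUDA thread blocks used in the paper.

```python
import numpy as np
from itertools import permutations
from multiprocessing import Pool

def influence_score(args):
    """Placeholder for the per-pair statistic (hypothetical; the real method
    fits a Cox-type point-process model to the spike trains)."""
    (i, j), trains = args
    # toy proxy: how often neuron j fires within 5 ms after neuron i
    diffs = trains[j][None, :] - trains[i][:, None]
    return i, j, int(np.sum((diffs > 0) & (diffs < 0.005)))

def all_pairs_parallel(trains, workers=8):
    """Ordered pairs are independent, so they map cleanly onto parallel
    workers (CPU processes here; GPU thread blocks in the paper)."""
    pairs = [((i, j), trains) for i, j in permutations(range(len(trains)), 2)]
    with Pool(workers) as pool:
        return pool.map(influence_score, pairs)

rng = np.random.default_rng(1)
trains = [np.sort(rng.uniform(0, 10, 200)) for _ in range(16)]  # 16 neurons, 10 s

if __name__ == "__main__":
    print(all_pairs_parallel(trains)[:3])
```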
Finite state automata (FSA) are used by many network processing applications to match complex sets of regular expressions in network packets. To make FSA-based matching possible even at the ever-increasing speed of modern networks, multi-striding has been introduced. This technique increases input parallelism by transforming the classical FSA, which consumes input byte by byte, into an equivalent one that consumes input in larger units. However, the algorithms used today for this transformation are so complex that they often prove infeasible for large and complex rule sets. This paper presents a set of new algorithms that extend the applicability of multi-striding to complex rule sets. These algorithms can transform nondeterministic finite automata (NFA) into their multi-stride form with reduced memory and time requirements. Moreover, they exploit the massive parallelism of graphics processing units for NFA-based matching. The final result is a boost in the overall processing speed of typical regex-based packet processing applications, with a speedup of almost one order of magnitude over the current state-of-the-art algorithms.
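Multi-striding is easiest to see on a DFA (the paper works with NFAs and adds techniques to keep the transformed automata small): the 2-stride transition function simply composes two steps of the original one, squaring the effective alphabet. Odd-length inputs need a padding convention, which this sketch omits.

```python
def two_stride(delta, states, alphabet):
    """Turn a DFA transition function that consumes one symbol at a time
    into an equivalent one that consumes symbol pairs (2-striding).

    delta: dict mapping (state, symbol) -> state
    """
    delta2 = {}
    for s in states:
        for a in alphabet:
            for b in alphabet:
                # one 2-stride step = two 1-stride steps
                delta2[(s, (a, b))] = delta[(delta[(s, a)], b)]
    return delta2

# toy DFA over {0, 1} that tracks input-length parity
states, alphabet = [0, 1], [0, 1]
delta = {(s, c): 1 - s for s in states for c in alphabet}
delta2 = two_stride(delta, states, alphabet)
assert delta2[(0, (1, 0))] == 0   # two symbols flip parity twice
```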
In this paper, we analyze and extend mesh-free algorithms for three-dimensional data transfer problems in partitioned multiphysics simulations. We first provide a direct comparison between a mesh-based weighted residual method using the common-refinement scheme and two mesh-free algorithms leveraging compactly supported radial basis functions: one using spline interpolation and one using a moving least square reconstruction. Through this comparison we assess both the conservation and the accuracy of the data transfer obtained from each of the methods. We do so for a range of geometries, with and without curvature and sharp features, and for functions of varying smoothness and gradient. Our results show that the mesh-based and mesh-free algorithms are complementary, with cases where each performs better than the other. We then focus on the mesh-free methods by developing a set of algorithms to parallelize them based on sparse linear algebra techniques. This includes a discussion of fast parallel radius searching in point clouds and a restructuring of the interpolation algorithms to leverage data structures and linear algebra services designed for large distributed computing environments. The scalability of our new algorithms is demonstrated on a leadership-class computing facility using a set of basic scaling studies. These studies show that, for problems with reasonable load balance, our new algorithms for both spline interpolation and moving least square reconstruction achieve both strong and weak scalability using more than 100,000 MPI processes, with billions of degrees of freedom in the data transfer operation.
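The spline-interpolation transfer can be sketched serially with a compactly supported Wendland kernel; the paper's contribution is distributing the radius search and the solve with sparse linear algebra, which this dense toy version deliberately ignores. Names and the radius value here are illustrative.

```python
import numpy as np

def wendland_c2(r):
    """Wendland C2 compactly supported kernel: (1 - r)^4 (4r + 1) for r < 1."""
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

def rbf_transfer(src_pts, src_vals, dst_pts, radius):
    """Spline-interpolation data transfer between point clouds (serial sketch;
    the compact support makes the matrices sparse in a scalable version).
    """
    d_ss = np.linalg.norm(src_pts[:, None, :] - src_pts[None, :, :], axis=2)
    A = wendland_c2(d_ss / radius)            # SPD interpolation matrix
    w = np.linalg.solve(A, src_vals)          # fit RBF weights on the source
    d_ds = np.linalg.norm(dst_pts[:, None, :] - src_pts[None, :, :], axis=2)
    return wendland_c2(d_ds / radius) @ w     # evaluate at the targets

rng = np.random.default_rng(2)
src = rng.uniform(0, 1, (200, 3))             # source cloud (e.g., fluid side)
dst = rng.uniform(0, 1, (50, 3))              # target cloud (e.g., solid side)
f = lambda p: np.sin(np.pi * p[:, 0]) * p[:, 1]
vals = rbf_transfer(src, f(src), dst, radius=0.4)
print(np.max(np.abs(vals - f(dst))))          # pointwise transfer error
```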