While remotely operated unmanned vehicles are increasingly a part of everyday life, truly autonomous robots capable of independent operation in dynamic environments have yet to be realized - particularly in the case of ground robots required to interact with humans and their environment. We present a unified multiresolution vision model for this application, designed to provide the wide field of view required to maintain situational awareness and sufficient visual acuity to recognize elements of the environment, while permitting feasible implementations in real-time vision applications. The model features a kind of color-constant processing through single-opponent color channels and contrast-invariant oriented edge detection using a novel implementation of the Combination of Receptive Fields model. The model provides color and edge-based salience assessment, as well as a compressed color image representation suitable for subsequent object identification. We show that bottom-up visual saliency computed using this model is competitive with the current state of the art while allowing computation in a compressed domain and mimicking the human visual system, with nearly half (45%) of the computational effort focused within the fovea. This method reduces the storage requirement of the image pyramid to less than 5% of the full image, and computation in this domain reduces model complexity in terms of both computational costs and memory requirements accordingly. We also quantitatively evaluate the model for its application domain by pairing it with a camera/lens system with a 185° field of view capturing 3.5M-pixel color images and using a tuned salience model to predict human fixations. (C) 2015 Elsevier B.V. All rights reserved.
Background: Metagenomics is a genomics research discipline devoted to the study of microbial communities in environmental samples and human and animal organs and tissues. Sequenced metagenomic samples usually comprise reads from a large number of different bacterial communities and hence tend to result in large file sizes, typically ranging from 1 to 10 GB. This leads to challenges in analyzing, transferring and storing metagenomic data. In order to overcome these data processing issues, we introduce MetaCRAM, the first de novo, parallelized software suite specialized for FASTA and FASTQ format metagenomic read processing and lossless compression. Results: MetaCRAM integrates algorithms for taxonomy identification and assembly, and introduces parallel execution methods; furthermore, it enables genome reference selection and CRAM-based compression. MetaCRAM also uses novel reference-based compression methods designed through extensive studies of integer compression techniques and through fitting of empirical distributions of metagenomic read-reference positions. MetaCRAM is a lossless method compatible with standard CRAM formats, and it allows for fast selection of relevant files in the compressed domain via maintenance of taxonomy information. The performance of MetaCRAM as a stand-alone compression platform was evaluated on various metagenomic samples from the NCBI Sequence Read Archive, suggesting 2- to 4-fold compression ratio improvements compared to gzip. On average, the compressed file sizes were 2-13 percent of the original raw metagenomic file sizes. Conclusions: We described the first architecture for reference-based, lossless compression of metagenomic data. The proposed compression scheme offers significantly improved compression ratios compared to off-the-shelf methods such as zip programs. Furthermore, it enables running different components in parallel, and it provides the user with taxonomic and assembly information generated during execution of the
Efficiently parallelizable parameterized problems have been classified as being either in the class FPP (fixed-parameter parallelizable) or the class PNC (parameterized analog of NC), which contains FPP as a subclass....
Can we learn from the unknown? Logical data sets of the ternary kind are often found in information systems. They contain unknown as well as true/false values. An unknown value may represent a missing entry (lost or indeterminable) or have meaning, like a "Don't Know" response in a questionnaire. In this paper, we introduce algorithms for reducing the dimensionality of logical data (categorical data in general) in the context of a new data mining challenge: Ternary Matrix Factorization (TMF). For a ternary data matrix, TMF exploits ternary logic to produce a basis matrix (which holds the major patterns in the data) and a usage matrix (which maps patterns to original observations). Both matrices are interpretable, and their ternary matrix product approximates the original matrix. TMF has applications in (1) finding targeted structure in ternary data, (2) imputing values through pattern discovery in highly incomplete categorical data sets, and (3) solving instances of its encapsulated Binary Matrix Factorization problem. Our elegant algorithm FasTer (FASt TERnary Matrix Factorization) has linear run-time complexity with respect to the dimensions of the data set and is parameter-robust. A variant of FasTer that exploits useful results from combinatorics provides accuracy bounds for a core part of the algorithm in certain situations. Experiments on synthetic and real-world data sets show that our algorithms are able to outperform state-of-the-art techniques in all three TMF applications with respect to run-time and effectiveness. Finally, convincing speedup and efficiency results on a parallel version of FasTer demonstrate its suitability for weak- and strong-scaling scenarios.
Swarm is a parallel architecture that exploits ordered parallelism. It executes tasks speculatively and out of order and can scale to large core counts and speculation windows. The authors evaluate Swarm on graph analytics, simulation, and database benchmarks. At 64 cores, Swarm outperforms sequential implementations of these algorithms by 43 to 117 times and state-of-the-art software-only parallel algorithms by 3 to 18 times.
We propose a derivative-free algorithm for finding high-quality local minima for functions that require significant computational resources to evaluate. Our algorithm efficiently utilizes the computational resources allocated to it and also has strong theoretical results, almost surely starting a finite number of local optimization runs and identifying all local minima. We propose metrics for measuring how efficiently an algorithm finds local minima, and we benchmark our algorithm on synthetic problems (with known local minima) and two real-world applications.
We present scalable and parallel versions of Lipmaa's computationally-private information retrieval (CPIR) scheme [20], which provides log-squared communication complexity. In the proposed schemes, instead of the binary decision diagrams utilized in the original CPIR, we employ an octal tree based approach, in which non-sink nodes have eight child nodes. Using octal trees offers two advantages: i) a serial implementation of the proposed scheme in software is faster than the original scheme and ii) its bandwidth usage becomes lower than that of the original scheme when the number of items in the data set is moderately high (e.g., 4,096 for an 80-bit security level using the Damgård-Jurik cryptosystem). In addition, we present a highly optimized parallel algorithm for shared-memory multi-core/processor architectures, which minimizes the number of synchronization points between the cores. We show that the parallel implementation is about 50 times faster than the serial implementation for a data set with 4,096 items on an eight-core machine. Finally, we propose a hybrid algorithm that scales the CPIR scheme to larger data sets with small overhead in bandwidth complexity. We demonstrate that the hybrid scheme based on octal trees can lead to parallel implementations more than two orders of magnitude faster than serial implementations based on binary trees. Comparison with the original as well as the other schemes in the literature reveals that our scheme is the best in terms of bandwidth requirement.
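As a rough illustration of why the tree arity matters in the abstract above, the following sketch compares the depth of a binary and an octal tree over 4,096 items and derives the root-to-leaf path for one item. It involves no cryptography and is not the CPIR scheme itself; the helper names `min_depth` and `tree_path` are hypothetical and introduced only for this example.

```python
def min_depth(n_items: int, arity: int) -> int:
    """Smallest depth d such that a full tree with `arity` children per node has >= n_items leaves."""
    depth, capacity = 0, 1
    while capacity < n_items:
        capacity *= arity
        depth += 1
    return depth


def tree_path(index: int, arity: int, depth: int) -> list[int]:
    """Child choices, root to leaf, that reach leaf number `index` (its base-`arity` digits)."""
    if not 0 <= index < arity ** depth:
        raise ValueError("index out of range for this tree")
    digits = []
    for _ in range(depth):
        digits.append(index % arity)  # least significant digit first
        index //= arity
    return digits[::-1]               # reverse so the root choice comes first


n_items = 4096
binary_depth = min_depth(n_items, 2)  # levels needed with two children per node
octal_depth = min_depth(n_items, 8)   # levels needed with eight children per node
path = tree_path(2024, 8, octal_depth)

print(binary_depth, octal_depth, path)
```

For 4,096 items the octal tree needs a third as many levels as the binary one, which is the structural property the abstract builds on; how that translates into bandwidth and run time depends on the cryptosystem and is not modeled here.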
ISBN (print): 9781509063604
Wireless communications are expected to take place in increasingly complicated scenarios, such as dense urban areas, forests, tunnels and other significantly cluttered environments. An emerging key challenge is to understand the physics and characteristics of wireless channels in complex environments, which are critical for the analysis, design, and application of future mobile and wireless communication systems. The objective of this work is to investigate high-resolution, high-performance computational algorithms for extreme-scale channel modeling in real-world environments. The system-level large-scene analysis is enabled by novel, ultra-parallel algorithms on emerging exascale high-performance computing (HPC) platforms. The results lead to much greater channel model resolution than existing deterministic channel modeling technologies. All relevant propagation mechanisms are accounted for from first principles. Such a modeling framework will be critical to gaining a fundamental understanding of the physics of wireless propagation channels in real-world scenarios.
Tracing the paths of collections of particles through a flow field is a key step for many flow visualization and analysis methods. When a flow field is interpolated from the nodes of an unstructured mesh, the process of advecting a particle must first find which cell in the unstructured mesh contains the particle. Since the paths of nearby particles often diverge, the parallelization of particle advection quickly leads to incoherent memory accesses of the unstructured mesh. We have developed a new block advection GPU approach that reorganizes particles into spatially coherent bundles as they follow their advection paths, which greatly improves memory coherence and thus shared-memory GPU performance. This approach works best for flows that meet the CFL criterion on unstructured meshes of uniformly sized elements, small enough to fit at least two timesteps in GPU memory.
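The idea of regrouping particles into spatially coherent bundles can be sketched on the CPU as follows. This is only an illustration under strong assumptions: a uniform 2D grid stands in for the unstructured mesh, a forward-Euler step stands in for the integrator, and the names `cell_of`, `bundle_by_cell` and `advect` are hypothetical rather than taken from the work described above.

```python
from collections import defaultdict


def cell_of(position, cell_size):
    """Map a 2D position to a cell id on a uniform grid (assumed stand-in for mesh cell location)."""
    x, y = position
    return (int(x // cell_size), int(y // cell_size))


def bundle_by_cell(positions, cell_size):
    """Group particle indices so that every bundle reads data from a single cell."""
    bundles = defaultdict(list)
    for i, p in enumerate(positions):
        bundles[cell_of(p, cell_size)].append(i)
    return dict(bundles)


def advect(positions, velocity, dt):
    """One forward-Euler step; `velocity` is a callable (x, y) -> (u, v)."""
    moved = []
    for x, y in positions:
        u, v = velocity(x, y)
        moved.append((x + dt * u, y + dt * v))
    return moved


positions = [(0.2, 0.1), (1.7, 0.3), (0.4, 0.9), (1.1, 0.8)]
before = bundle_by_cell(positions, 1.0)

# Uniform flow to the right: every particle moves one cell over,
# so the bundles have to be rebuilt after the step.
positions = advect(positions, lambda x, y: (1.0, 0.0), 1.0)
after = bundle_by_cell(positions, 1.0)

print(before)
print(after)
```

Rebuilding the bundles after each step keeps particles that share a cell adjacent, which is the property that would let a GPU implementation read mesh data coherently.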
In this paper we develop algorithms to solve macroeconometric models with forward-looking variables, based on Newton's method for nonlinear systems of equations. The most difficult step in Newton methods is the solution of a large linear system at each iteration. Thus, we compare the performance obtained by solving this linear system using two iterative methods and the direct method. We also describe an implementation of the parallel versions of these algorithms using a software package. Our experiments confirm that the iterative methods have low computational complexity and storage requirements, but the parallel versions of the direct methods show a superior speedup.
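The structure described above, a Newton iteration whose cost is dominated by one linear solve per step, can be sketched as follows. This is a minimal serial illustration on a toy two-equation system, not the authors' macroeconometric model or software package; the direct solver shown is plain Gaussian elimination with partial pivoting, and the names `solve_linear` and `newton` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (a direct method)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                    # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def newton(f, jac, x0, tol=1e-10, max_iter=50):
    """Newton's method: solve J(x) * step = -F(x) at every iteration."""
    x = list(x0)
    for _ in range(max_iter):
        fx = f(x)
        if max(abs(v) for v in fx) < tol:
            return x
        step = solve_linear(jac(x), [-v for v in fx])  # the expensive step for large systems
        x = [xi + si for xi, si in zip(x, step)]
    raise RuntimeError("Newton iteration did not converge")


# Toy system (an assumption for illustration): x^2 + y^2 = 4 and x*y = 1.
f = lambda v: [v[0] ** 2 + v[1] ** 2 - 4.0, v[0] * v[1] - 1.0]
jac = lambda v: [[2 * v[0], 2 * v[1]], [v[1], v[0]]]
root = newton(f, jac, [2.0, 0.5])
print(root)
```

Swapping `solve_linear` for an iterative solver would change only the marked line, which is the comparison the abstract refers to.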