The U.S. Geological Survey has developed an open-standard data integration framework for working efficiently and effectively with large collections of climate and other geoscience data. A web interface accesses catalog datasets to find data services. Data resources can then be rendered for mapping and dataset metadata are derived directly from these web services. Algorithm configuration and information needed to retrieve data for processing are passed to a server where all large-volume data access and manipulation takes place. The data integration strategy described here was implemented by leveraging existing free and open source software. Details of the software used are omitted; rather, emphasis is placed on how open-standard web services and data encodings can be used in an architecture that integrates common geographic and atmospheric data.
The computing objectives of the UK Human Genome Mapping Project (HGMP) are to establish and make available a database of genes, genetic markers and map locations, and to develop new computing environments and methods for the acquisition and analysis of such data. In collaboration with other human genome research centres, a full client/server distributed computing environment is being developed to realize the concept of a human genome computing infrastructure, represented by a variety of nodes performing diverse functions. Registered HGMP Resource Centre users now have online access over wide area networks to the HGMP-RC computing facilities in Harrow and to selected national and international databases and centres of human genome excellence in the UK and abroad. Eighteen months of user experience in developing, operating and supporting such an environment are discussed, focussing on the intricacies of interoperability between nodes located on various wide area networks.
A set of asynchronous processes where each process executes a sequence of phases is considered. A process begins its next phase only upon completion of its previous phase. The analysis seeks to design a synchronization mechanism that guarantees that: 1. no process begins the next phase until all processes have completed the previous phase, and 2. no process will be permanently blocked from executing its next phase if all processes have completed their previous phase. Nothing may be assumed about the initial values of the shared variables. In the absence of this requirement, the following simple algorithm suffices: a counter variable is initially zero. The counter variable is incremented by one whenever a process completes a phase. A process begins its next phase only if the counter variable is equal to or greater than the number of the phase times the number of processes.
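The simple counter algorithm described in the abstract's last two sentences can be sketched as follows. This is a minimal illustration, valid only under the assumption the abstract's full problem disallows (that the shared counter is known to start at zero); the class and method names are illustrative, not from the paper, and a condition variable stands in for busy-waiting.

```python
import threading

class CounterBarrier:
    """Phase synchronization via a single shared counter, assuming the
    counter is known to start at zero (the simple case in the abstract)."""

    def __init__(self, n_procs):
        self.n = n_procs
        self.counter = 0                 # total phase completions, all processes
        self.cond = threading.Condition()

    def phase_done(self, phase):
        """Called by a process after finishing `phase` (1-indexed); blocks
        until every process has finished that phase."""
        with self.cond:
            self.counter += 1
            self.cond.notify_all()
            # begin the next phase only once counter >= phase * n,
            # i.e. all n processes have completed `phase`
            while self.counter < phase * self.n:
                self.cond.wait()

# demo: three workers, two phases; no phase-2 work starts
# before all phase-1 work is done
log, lock = [], threading.Lock()
barrier = CounterBarrier(3)

def worker(pid):
    for phase in (1, 2):
        with lock:
            log.append(phase)            # stand-in for the phase's work
        barrier.phase_done(phase)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert log == [1, 1, 1, 2, 2, 2]         # phases never interleave
```

Both guarantees hold here: no process passes `phase_done(p)` before all have completed phase p, and once all have, the final `notify_all` wakes every waiter, so none is permanently blocked.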
Given the unpredictable nature of the nodes in distributed computing systems, some tasks can be significantly delayed. Such delayed tasks are called stragglers. Straggler mitigation can be achieved by redundant computation. In the maximum distance separable (MDS) redundancy method, a task is divided into k subtasks which are encoded into n coded subtasks, such that the task is completed if any k out of the n coded subtasks are completed. Two important metrics of interest are task completion time, and server utilization, which is the aggregate work completed by all servers in this duration. We consider a proactive straggler mitigation strategy where n_0 out of the n coded subtasks are started at time 0, while the remaining n - n_0 coded subtasks are launched when l_0 <= min{n_0, k} of the initial ones finish. The coded subtasks are halted when k of them finish. For this flexible forking strategy with multiple parameters, we analyze the mean of the two performance metrics when the random service completion time at each server is independent and identically distributed (i.i.d.) according to a shifted exponential. From this study, we find a tradeoff between the metrics which provides insight into the parameter choices. Experiments on Intel DevCloud illustrate that the shifted exponential distribution adequately captures the random coded-subtask completion times, and that our derived insights continue to hold.
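The forking strategy above is simple to simulate. The sketch below (my own Monte Carlo construction, not the paper's analysis; the function name and defaults are assumptions) estimates the two metrics of interest under i.i.d. shifted-exponential service times.

```python
import random

def simulate_forking(n, k, n0, l0, shift=1.0, rate=1.0, trials=5000, seed=1):
    """Monte Carlo estimate of mean task completion time and mean server
    utilization for the flexible forking strategy: n0 coded subtasks start
    at time 0, the remaining n - n0 start once l0 of the initial ones
    finish, and everything halts when k subtasks are done.  Per-server
    service times are i.i.d. shifted exponential: shift + Exp(rate)."""
    assert l0 <= min(n0, k) and k <= n
    rng = random.Random(seed)
    t_sum = u_sum = 0.0
    for _ in range(trials):
        early = sorted(shift + rng.expovariate(rate) for _ in range(n0))
        t_fork = early[l0 - 1]                  # l0-th early finish: fork time
        late = [t_fork + shift + rng.expovariate(rate) for _ in range(n - n0)]
        t_done = sorted(early + late)[k - 1]    # k-th finish completes the task
        # utilization: work done by every server up to the halt at t_done
        work = sum(min(t, t_done) for t in early)
        work += sum(min(t, t_done) - t_fork for t in late)
        t_sum += t_done
        u_sum += work
    return t_sum / trials, u_sum / trials

t_mean, u_mean = simulate_forking(n=10, k=5, n0=6, l0=3)
```

Sweeping n_0 and l_0 with such a simulation exposes the completion-time/utilization tradeoff the abstract refers to: launching more subtasks earlier lowers mean completion time but raises the aggregate work spent.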
We present a method for autonomic intrusion detection and response to optimize processes of cybersecurity in large distributed systems. These environments are characterized by technology fragmentation and complex operations making them highly susceptible to attacks like hijacking, man-in-the-middle, denial-of-service, phishing, and others. The autonomic intrusion response system introduces models of operational analysis and reaction based on the combination of autonomic computing and big data. We implemented a proof-of-concept and executed experiments that demonstrate significant improvement in effectiveness and scalability of the method in complex environments.
Interactive coding allows two or more parties to carry out a distributed computation over a communication network that may be noisy. The ultimate goal is to develop efficient coding schemes that tolerate a high level of noise while increasing the communication by only a constant factor (i.e., constant rate). In this work (the second part) we provide computationally efficient, constant rate schemes that conduct any computation on arbitrary networks, and succeed with high probability in the presence of adversarial noise that can insert, delete, or alter communicated messages. Our schemes are non-fully-utilized and incur a polynomial (in the size of the network) blowup in the round complexity. Our first scheme resists an oblivious adversary that corrupts at most a fraction epsilon/m of the total communication, where m is the number of links in the network and epsilon is a small constant. In contrast to the first part of this work, the scheme in this part does not assume that the parties pre-share a long random string. Our second scheme resists an arbitrary (non-oblivious) adversary that corrupts at most a fraction epsilon/(m log m) of the communication. We further improve the resilience to epsilon/(m log log m) by assuming the parties pre-share a long common random string.
With the continued evolution of computing architectures towards many-core computing, algorithms that can effectively and efficiently use many cores are crucial. In this paper, we propose, as a proof of principle, a parallel space-time algorithm that layers time parallelization together with a parallel elliptic solver to solve time dependent partial differential equations (PDEs). The parallel elliptic solver utilizes domain decomposition to divide a spatial grid into subdomains, and applies a parallel Schwarz iteration to find consistent solutions. The high-order parallel time integrator employed belongs to the family of revisionist integral deferred correction methods (RIDC) introduced by Christlieb, Macdonald, and Ong [SIAM J. Sci. Comput., 32 (2010), pp. 818-835], which allows for the small scale parallelization of solutions to initial value problems. The two established algorithms are combined in this proposed space-time algorithm to add parallel scalability. As a proof of concept, we utilize a framework involving classical Schwarz matching conditions and RIDC integrators. It will be shown that the resulting Schwarz iterations can be analyzed using standard domain decomposition analysis, and that the required Schwarz iterations (at each time step) can be evaluated simultaneously in parallel, after initial start-up costs. Additionally, it will be shown that if the domain decomposition iteration converges for the prediction step, the domain decomposition iterations for the correction steps will converge at the same rate. Numerical experiments demonstrate that the RIDC-DD algorithms attain the designed order of accuracy. Several scaling studies are also performed.
With the advent of large-scale problems, feature selection has become a fundamental preprocessing step to reduce input dimensionality. The minimum-redundancy-maximum-relevance (mRMR) selector is considered one of the most relevant methods for dimensionality reduction due to its high accuracy. However, it is a computationally expensive technique, sharply affected by the number of features. This paper presents fast-mRMR, an extension of mRMR, which tries to overcome this computational burden. Associated with fast-mRMR, we include a package with three implementations of this algorithm in several platforms, namely, CPU for sequential execution, GPU (graphics processing units) for parallel computing, and Apache Spark for distributed computing using big data technologies. (C) 2016 Wiley Periodicals, Inc.
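The greedy selection rule underlying mRMR can be sketched briefly. The following is a plain sequential illustration for discrete features (my own minimal version, not the fast-mRMR package): at each step it adds the feature maximizing relevance to the target minus mean redundancy with the features already selected, both measured by mutual information.

```python
import numpy as np
from collections import Counter
from math import log

def mutual_info(a, b):
    """Plug-in estimate of mutual information (in nats) between two
    discrete sequences."""
    n = len(a)
    pa, pb = Counter(a), Counter(b)
    pab = Counter(zip(a, b))
    return sum((c / n) * log((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

def mrmr(X, y, m):
    """Greedy mRMR over discrete features: start with the most relevant
    feature, then repeatedly add the candidate maximizing
    relevance(candidate) - mean redundancy with selected features."""
    d = X.shape[1]
    cols = [list(X[:, j]) for j in range(d)]
    relevance = [mutual_info(cols[j], list(y)) for j in range(d)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < m:
        best, best_score = -1, -float("inf")
        for j in range(d):
            if j in selected:
                continue
            redundancy = sum(mutual_info(cols[j], cols[s])
                             for s in selected) / len(selected)
            if relevance[j] - redundancy > best_score:
                best, best_score = j, relevance[j] - redundancy
        selected.append(best)
    return selected

# demo: the third column duplicates the first, so after the most relevant
# feature is picked, redundancy steers the second pick to the
# independent feature rather than the duplicate
rng = np.random.default_rng(0)
x0 = rng.integers(0, 2, 500)
x1 = rng.integers(0, 2, 500)
y = x0 | x1
X = np.column_stack([x0, x1, x0])
assert set(mrmr(X, y, 2)) == {0, 1}
```

The cost structure the abstract mentions is visible here: each greedy step scans all remaining features and computes a mutual information against every selected one, which is exactly the burden that fast-mRMR's CPU, GPU, and Spark implementations target.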
Over 600 participants at the Database World and Client-Server World show were surveyed for their opinions on their client-server computing preferences. Over 1/3 of those surveyed do not currently use client-server applications, largely because they are reluctant to accept them. However, all but 1.5% of the attendees expected to be using such applications by 1995. Seventy-six percent of the sample of conference attendees said they used the PC as their primary desktop platform and Windows as their primary desktop operating system. In terms of features, graphical user-interfaces were the first priority, followed by desktop tools for generating reports, distributed database access, and client-server management tools. When asked to cite the major barrier to increased use of client-server computing, 1/3 of those in the survey named nontechnical issues. Organizations with more than 5,000 employees have the greatest investment and strongest interest in client-server solutions.
A computationally efficient modified Fano algorithm for decoding convolutional codes, recently proposed by the authors [1], has been further studied through simulation. The computational effort and BER performance of the original Fano and the modified Fano algorithms have been compared for (i) the BSC and (ii) the Rayleigh fading channel under a flat-fading assumption. For the simulation example, the performance of the modified Fano algorithm was found to be comparable to that of the Viterbi decoder.