Introduction: Computational chemistry dramatically accelerates the drug discovery process, and high-performance computing (HPC) can be used to speed up the most expensive calculations. Supporting a local HPC infrastructure is both costly and time-consuming; therefore, many research groups are moving from in-house solutions to remote distributed computing platforms. Areas covered: The authors focus on the use of distributed technologies, solutions, and infrastructures to gain access to HPC capabilities, software tools, and datasets to run the complex simulations required in computational drug discovery (CDD). Expert opinion: The use of computational tools can decrease the time to market of new drugs. HPC has a crucial role in handling the complex algorithms and large volumes of data required to achieve specificity and avoid undesirable side-effects. Distributed computing environments have clear advantages over in-house solutions in terms of cost and sustainability, and infrastructures relying on virtualization reduce set-up costs. Distributed computing resources can be difficult to access, although web-based solutions are becoming increasingly available, and there is a trade-off between cost-effectiveness and accessibility in using on-demand rather than free/academic resources. Graphics processing unit computing, with its outstanding parallel computing power, is becoming increasingly important.
Background: Microbial communities have become an important subject of research across multiple disciplines in recent years. These communities are often examined via shotgun metagenomic sequencing, a technology which can offer unique insights into the genomic content of a microbial community. Functional annotation of shotgun metagenomic data has become an increasingly popular method for identifying the aggregate functional capacities encoded by the community's constituent microbes. Currently available metagenomic functional annotation pipelines, however, suffer from several shortcomings, including limited pipeline customization options, lack of standard raw sequence data pre-processing, and insufficient capabilities for integration with distributed computing environments. Results: We introduce MetaLAFFA, a functional annotation pipeline designed to take unfiltered shotgun metagenomic data as input and generate functional profiles. MetaLAFFA is implemented as a Snakemake pipeline, which enables convenient integration with distributed computing clusters, allowing users to take full advantage of available computing resources. Default pipeline settings allow new users to run MetaLAFFA according to common practices, while a Python module-based configuration system provides advanced users with a flexible interface for pipeline customization. MetaLAFFA also generates summary statistics for each step in the pipeline so that users can better understand pre-processing and annotation outcomes. Conclusions: MetaLAFFA is a new end-to-end metagenomic functional annotation pipeline with distributed computing compatibility and flexible customization options. MetaLAFFA source code is available at https://***/borenstein-lab/MetaLAFFA and can be installed via Conda as described in the accompanying documentation.
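The "Python module-based configuration system" mentioned above can be pictured with a minimal sketch. The class, step names, and options below are illustrative assumptions, not MetaLAFFA's actual configuration interface.

```python
# Hypothetical sketch of a module-based pipeline configuration.
# Step names and fields are illustrative, not MetaLAFFA's real options.
class StepConfig:
    """Per-step settings a user could override in a config module."""
    def __init__(self, name, enabled=True, threads=1, extra_args=""):
        self.name = name
        self.enabled = enabled
        self.threads = threads
        self.extra_args = extra_args

# Defaults let a new user run the pipeline without touching anything.
DEFAULT_STEPS = [
    StepConfig("quality_filter"),
    StepConfig("host_read_removal", threads=4),
    StepConfig("functional_annotation", threads=8),
]

def active_steps(steps=DEFAULT_STEPS):
    """Return, in order, the names of the steps the pipeline would run."""
    return [s.name for s in steps if s.enabled]
```

An advanced user would customize the pipeline by editing such a module (e.g. disabling a step or raising its thread count) rather than a monolithic config file.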
ISBN (digital): 9798350388725
ISBN (print): 9798350388732
This paper proposes a scheduling algorithm based on fast edge resource placement to further improve bilateral resource integration in mergers and acquisitions (M&A). The random demand model is taken as the basic resource scheduling method, and the algorithm is optimized by introducing edge computing and user random demand, thereby further improving the resource scheduling effect for M&A. The simulation results show that, under different M&A resources, Zipf values, and types of resources, the scheduling algorithm based on fast edge resource placement has higher resource scheduling benefits and significantly reduced time complexity compared with traditional mean-based scheduling algorithms. It can therefore be concluded that the proposed algorithm has good performance and performs resource scheduling effectively. In addition, applying it in actual enterprise M&A scenarios can help enterprises manage resources better, which is feasible and has practical application value.
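The combination of a Zipf demand model with edge resource placement described above can be illustrated with a minimal sketch. The greedy placement rule and all parameter names below are hypothetical stand-ins, not the paper's actual algorithm.

```python
# Illustrative sketch: Zipf-distributed demand plus greedy edge placement.
# The placement rule here is a simple assumption, not the paper's method.

def zipf_weights(n, s=1.0):
    """Normalized Zipf popularity weights for n resources with exponent s."""
    raw = [1.0 / (rank ** s) for rank in range((1), n + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def place_by_popularity(weights, capacity):
    """Greedy placement: cache the `capacity` most popular resources at the
    edge node and report the fraction of demand served locally (hit rate)."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    placed = ranked[:capacity]
    hit_rate = sum(weights[i] for i in placed)
    return placed, hit_rate
```

Under a skewed Zipf profile, a small edge cache already absorbs most of the demand, which is why edge placement can cut scheduling cost relative to a mean-based baseline.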
Authors: Xu, Yinan; Liu, Hui; Long, Zhihao (Central South University, School of Traffic & Transportation Engineering, Institute of Artificial Intelligence & Robotics, Key Laboratory of Traffic Safety on Track, Ministry of Education, Changsha 410075, Hunan, People's Republic of China)
The randomness of wind speed leads to the intermittency of wind power, which is a challenge to realizing wind power as a reliable and renewable energy source. Prediction of wind speed time series can promote the use of wind energy. However, traditional stand-alone methods are unable to meet the requirements of wind speed big data environments. In this study, a hybrid distributed computing framework on Apache Spark is applied to wind speed big data forecasting. Using the distributed computing strategy, the framework divides the wind speed big data into RDD groups and operates on them in parallel. Within the framework, a modified wind speed extreme learning machine predictor is built using the distributed computing strategy, enhanced by data decomposition and result reconstruction components on Apache Spark. The experimental results indicate that the proposed framework can accurately forecast wind speed big data in multiple steps. The effectiveness of the different components in the framework is also verified, and the proposed framework is shown to have a faster computation speed than the stand-alone method when processing big data.
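The partition-then-predict pattern described above can be sketched in plain Python, assuming a plain list split stands in for Spark RDD partitioning and a toy AR(1) fit stands in for the paper's extreme learning machine predictor. Nothing here is the paper's actual model.

```python
# Sketch of the divide-and-forecast pattern: a plain map over partitions
# stands in for Spark RDD operations; AR(1) stands in for the ELM predictor.

def partition(series, n_parts):
    """Split a long wind-speed series into groups (stand-in for RDD groups)."""
    size = max(1, len(series) // n_parts)
    return [series[i:i + size] for i in range(0, len(series), size)]

def ar1_fit(x):
    """Least-squares AR(1) coefficient phi, so that x[t] ~ phi * x[t-1]."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x[:-1]) or 1.0
    return num / den

def forecast_partitions(series, n_parts, steps):
    """Fit one predictor per partition and roll each forward `steps` ahead
    (multi-step forecasting); in Spark this map would run in parallel."""
    results = []
    for part in partition(series, n_parts):
        phi, last = ar1_fit(part), part[-1]
        preds = []
        for _ in range(steps):
            last = phi * last
            preds.append(last)
        results.append(preds)
    return results
```

On Spark the same structure becomes `rdd.mapPartitions(fit_and_forecast)`, which is what lets the framework scale to big data where a stand-alone predictor cannot.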
ISBN (digital): 9798331541378
ISBN (print): 9798331541385
In recent years, interest in quantum computing has increased due to technological advances in quantum hardware and algorithms. Despite the promises of quantum advantage, the applicability of quantum devices has been limited to a few qubits on hardware that experiences decoherence due to noise. One proposed method to get around this challenge is distributed quantum computing (DQC). Like classical distributed computing, DQC aims to increase compute power by spreading the compute processes across many devices, with the goal of minimizing the noise and circuit depth required of each quantum device. In this paper, we cover the fundamental concepts of DQC and provide insight into where the field of DQC stands with respect to the field of chemistry, a field which can potentially be used to demonstrate quantum advantage on noisy intermediate-scale quantum devices.
This paper proposes an integrated healthcare monitoring solution for soldiers deployed in adverse environmental conditions, using the internet of things (IoT) with distributed computing. For these soldiers, the health parameters of every individual need to be monitored in real time, and subsequent analysis of the dataset must be made to initiate appropriate medical support with the lowest possible delay. In this paper, a three-layer service-oriented IoT architecture is proposed in which the computational functionalities are distributed among all the layers. The proposed distributed computing mechanism implements two levels of filtration of redundant information that belongs to safe soldiers. The first level of filtering is done at the end node using a fuzzy classification approach, and the second level is done at the intermediate node using a time-series pattern analysis approach. This layer-wise filtration reduces data flooding and the computational burden on the cloud, which improves system response time to suit emergency applications. A prototype has been developed to validate the effectiveness of the proposed solution. (C) 2020 Elsevier Inc. All rights reserved.
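The two-level filtration idea can be sketched as follows. The fuzzy membership thresholds, the single heart-rate parameter, and the moving-average pattern test are all illustrative assumptions standing in for the paper's classifiers.

```python
# Sketch of two-level filtering of "safe" (redundant) readings.
# Thresholds and the single vital-sign parameter are illustrative only.

def fuzzy_risk(heart_rate):
    """Triangular fuzzy membership in the 'at risk' class (assumed thresholds)."""
    if heart_rate <= 100:
        return 0.0
    if heart_rate >= 140:
        return 1.0
    return (heart_rate - 100) / 40.0

def end_node_filter(readings, threshold=0.3):
    """Level 1, at the sensor end node: drop readings classified as safe,
    so they never leave the node."""
    return [r for r in readings if fuzzy_risk(r) >= threshold]

def intermediate_filter(readings, window=3, tol=2.0):
    """Level 2, at the intermediate node: suppress readings that merely
    repeat the recent pattern (a stand-in for time-series pattern analysis)."""
    kept, recent = [], []
    for r in readings:
        if len(recent) < window or abs(r - sum(recent) / len(recent)) > tol:
            kept.append(r)
        recent = (recent + [r])[-window:]
    return kept
```

Because each layer discards redundant data before forwarding it, only anomalous readings reach the cloud, which is what keeps the end-to-end response time low.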
Motivated by mobile edge computing and wireless data centers, we study a wireless distributed computing framework in which the distributed nodes exchange information over a wireless interference network. Our framework follows the structure of MapReduce and consists of Map, Shuffle, and Reduce phases, where Map and Reduce are computation phases and Shuffle is a data transmission phase. In our setting, we assume that transmission is operated over a wireless interference network. We demonstrate that, by duplicating the computation work at a cluster of distributed nodes in the Map phase, one can reduce the transmission load required in the Shuffle phase. In this work, we characterize the fundamental tradeoff between computation load and communication load under the assumption of one-shot linear schemes. The proposed scheme is based on side-information cancellation and zero-forcing, and we prove that it is optimal in terms of the computation-communication tradeoff. It outperforms both the naive TDMA scheme, with single-node transmission at a time, and the coded TDMA scheme, which allows coding across data, in terms of the computation-communication tradeoff.
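The computation-communication tradeoff can be illustrated numerically. The formula below is the classic coded distributed computing tradeoff L(r) = (1/r)(1 - r/K) from the wired coded-MapReduce literature, used here only as an assumed reference curve; the exact one-shot linear tradeoff characterized in this paper over a wireless interference network may differ.

```python
# Reference curves for the shuffle load as Map work is replicated.
# comm_load uses the classic coded tradeoff as an assumed illustration,
# not this paper's exact wireless one-shot linear result.

def comm_load(r, K):
    """Coded shuffle load when each Map task is replicated at r of K nodes:
    L(r) = (1/r) * (1 - r/K)."""
    return (1.0 / r) * (1.0 - r / K)

def uncoded_load(r, K):
    """Uncoded (TDMA-style) shuffle load: replication only removes the
    fraction of data already available locally."""
    return 1.0 - r / K
```

The key qualitative point survives in either model: raising the computation load r shrinks the communication load multiplicatively for the coded scheme but only additively for the uncoded one, which is why Map duplication pays off.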
The paper studies three fundamental problems in graph analytics: computing connected components (CCs), biconnected components (BCCs), and 2-edge-connected components (ECCs) of a graph. With the recent advent of big data, developing efficient distributed algorithms for computing CCs, BCCs and ECCs of a big graph has received increasing interest. As with the existing research efforts, we focus on the Pregel programming model, although the techniques may be extended to other programming models including MapReduce and Spark. The state-of-the-art techniques for computing CCs and BCCs in Pregel incur O(m × #supersteps) total costs for both data communication and computation, where m is the number of edges in the graph and #supersteps is the number of supersteps. Since network communication is usually much slower than computation, communication costs dominate the total running time of the existing techniques. In this paper, we propose a new paradigm based on graph decomposition to compute CCs and BCCs with O(m) total communication cost. The total computation costs of our techniques are also smaller than those of the existing techniques in practice, though theoretically almost the same. Moreover, we also study the distributed computation of ECCs; we are the first to study this problem, and we propose an approach with O(m) total communication cost. Comprehensive empirical studies demonstrate that our approaches can outperform the existing techniques by an order of magnitude in total running time.
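For context, the O(m × #supersteps) baseline that the paper improves on is hash-min label propagation, sketched below in sequential form; each pass of the while loop stands in for one Pregel superstep in which every vertex exchanges labels with its neighbours. This is the standard baseline, not the paper's decomposition-based paradigm.

```python
# Hash-min label propagation for connected components (the standard
# Pregel-style baseline; one while-iteration models one superstep).

def connected_components(edges, vertices):
    """Every vertex repeatedly adopts the smallest label seen among itself
    and its neighbours; labels stabilize at the minimum vertex id per CC."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    label = {v: v for v in vertices}
    changed = True
    while changed:  # each iteration = one superstep of message exchange
        changed = False
        for v in vertices:
            best = min([label[v]] + [label[n] for n in adj[v]])
            if best < label[v]:
                label[v] = best
                changed = True
    return label
```

Every superstep touches O(m) edges, so the number of supersteps (up to the graph diameter) multiplies into the communication cost, which is precisely the term the paper's O(m) decomposition approach eliminates.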
We propose a strategy for computing estimators in some nonstandard M-estimation problems, where the data are distributed across different servers and the observations across servers, though independent, can come from heterogeneous sub-populations, thereby violating the identically distributed assumption. Our strategy fixes the super-efficiency phenomenon observed in prior work on distributed computing in (i) the isotonic regression framework, where averaging several isotonic estimates (each computed at a local server) on a central server produces super-efficient estimates that do not replicate the properties of the global isotonic estimator, i.e. the isotonic estimate that would be constructed by transferring all the data to a single server, and (ii) certain types of M-estimation problems involving optimization of discontinuous criterion functions where M-estimates converge at the cube-root rate. The new estimators proposed in this paper work by smoothing the data on each local server, communicating the smoothed summaries to the central server, and then solving a non-linear optimization problem at the central server. They are shown to replicate the asymptotic properties of the corresponding global estimators, and also overcome the super-efficiency phenomenon exhibited by existing estimators.
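The smooth-then-aggregate strategy can be pictured with a toy sketch: each server smooths its local data and ships only the smoothed summary; the central server then solves its optimization over the pooled summaries. Here a moving average stands in for the paper's kernel smoothing, and a pooled mean stands in for its non-linear optimization; neither is the paper's actual estimator.

```python
# Toy sketch of distributed smooth-then-aggregate estimation.
# Moving average and pooled mean are illustrative stand-ins only.

def local_smooth(data, bandwidth=3):
    """Per-server smoothing step: a centered moving average stands in
    for kernel smoothing of the raw observations."""
    out = []
    for i in range(len(data)):
        lo, hi = max(0, i - bandwidth), min(len(data), i + bandwidth + 1)
        window = data[lo:hi]
        out.append(sum(window) / len(window))
    return out

def central_estimate(summaries):
    """Central server: combine the communicated smoothed summaries by
    solving a (here trivial) optimization, namely the pooled mean."""
    pooled = [x for s in summaries for x in s]
    return sum(pooled) / len(pooled)
```

The point of smoothing before aggregation, per the abstract, is that naively averaging raw local M-estimates (e.g. isotonic fits) yields super-efficient estimators with the wrong asymptotics, whereas aggregating smoothed summaries replicates the global estimator's behavior.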
ISBN (print): 9781728181929
The significant growth in the number of electric vehicles indicates an increased demand on the power distribution system, specifically on the low-voltage residential network. Without a well-organized schedule for charging electric vehicles, users will typically begin charging immediately upon arrival at home. This may burden the system and damage power system equipment. To avoid this adverse effect, a process for scheduling electric vehicle charging should be established. This paper proposes a multi-agent-based distributed computing process for solving the electric vehicle charge scheduling problem in a secure way that benefits both the customer and the system. The process decomposes the problem into a global problem, which captures the system objective, and local problems, which capture the individual vehicle owners' objectives. In this work, the local problems are modeled as subgradient problems that can be solved simultaneously by the corresponding agents. The optimality of the subgradient solutions with respect to the global objective is ensured through information sharing between the agents at each iteration. The detailed modeling and implementation of the proposed method are presented, along with a numerical analysis demonstrating its effectiveness.
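The global/local decomposition with subgradient updates can be sketched as a dual-price scheme: each vehicle agent solves its local problem given a shared price, and the coordinator adjusts the price by the capacity violation. The benefit/demand model, step size, and capacity constraint below are hypothetical illustrations, not the paper's formulation.

```python
# Dual-subgradient sketch of multi-agent EV charge scheduling.
# The local model (charge fully if benefit beats price) is illustrative only.

def local_charge(benefit, demand, price):
    """A vehicle agent's local subproblem: charge its full demand when its
    marginal benefit exceeds the coordination price, else defer."""
    return demand if benefit > price else 0.0

def schedule(benefits, demands, capacity, step=0.1, iters=200):
    """Global coordination: a subgradient price update drives the aggregate
    charging load toward the feeder capacity (the shared information being
    only the price and the total load, not private user data)."""
    price = 0.0
    plan = [0.0] * len(benefits)
    for _ in range(iters):
        plan = [local_charge(b, d, price) for b, d in zip(benefits, demands)]
        total = sum(plan)
        price = max(0.0, price + step * (total - capacity))  # subgradient step
    return plan, price
```

Because agents only ever exchange the price and aggregate load, the scheme matches the abstract's emphasis on solving local problems simultaneously while ensuring the global objective through iterative information sharing.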