Scaffold filling is a critical step in DNA assembly. Its basic task is to fill the missing genes (fragments) into an incomplete genome (scaffold) to make it similar to the reference genome. There has been a lot of work under distinct similarity measures in the genome comparison literature. For genomes with gene duplications, common string partition reveals the similarity more precisely, since it constructs a one-to-one correspondence between the identical segments of the two genomes. In this paper, we adopt duo-preservation as the measure, which is complementary to common string partition: the number of duo-preservations plus the number of common strings is exactly the length of a genome. Towards a proper scaffold filling, we focus on the increased duo-preservations. This problem is called scaffold filling to maximize increased duo-preservations (abbr. SF-MIDP). We show that SF-MIDP is solvable in linear time for a simple version where all the genes of the scaffold are matched in a block-matching, but MAX SNP-complete for the general version, and cannot be approximated within 16263/16262. Moreover, we present a basic factor-2 approximation algorithm, through which the optimal solution can be described in a new way, and then improve the approximation factor to 12/7 via a greedy method.
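The complementarity stated above follows from a simple counting argument: a duo is an ordered pair of adjacent genes, and cutting a genome of length n into b common blocks destroys exactly b - 1 of its n - 1 duos per string end, leaving n - b preserved. A minimal sketch of that identity (the helper is hypothetical, not from the paper):

```python
# Illustrates: duo-preservations + number of common strings == genome length.
# A duo is an adjacent gene pair; it is preserved iff both genes fall in the
# same block of the partition.

def preserved_duos(blocks):
    """Count adjacent gene pairs (duos) kept inside some block."""
    return sum(len(block) - 1 for block in blocks)

# Partition of the genome "abcab" (length 5) into 2 common blocks.
partition = [["a", "b", "c"], ["a", "b"]]
n = sum(len(block) for block in partition)

assert preserved_duos(partition) + len(partition) == n  # 3 + 2 == 5
```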
Algorithms for approximating the nondominated set of multiobjective optimization problems are reviewed. The approaches are categorized into general methods that are applicable under mild assumptions and, thus, to a wide range of problems, and into algorithms that are specifically tailored to structured problems. All in all, this survey covers 52 articles published within the last 41 years, that is, between 1979 and 2020.
Motivated by practical applications in solving many important combinatorial optimization problems, this paper investigates the Budgeted k-Submodular Maximization problem, defined as follows: given a finite set V, a budget B, and a k-submodular function f : (k+1)^V → R+, the problem asks to find a solution s = (S_1, S_2, ..., S_k) ∈ (k+1)^V, in which an element e ∈ V incurs a cost c_i(e) when added to the i-th set S_i, with the total cost of s not exceeding B, so that f(s) is maximized. To address this problem, we propose two single-pass streaming algorithms with approximation guarantees: one for the case where an element e has a single cost value regardless of which set it is added to, and one for the general case with different values of c_i(e). We further investigate the performance of our algorithms in two applications of the problem, Influence Maximization with k topics and sensor placement with k types of measures. The experimental results indicate that our algorithms return competitive results while requiring fewer queries and less running time than the state-of-the-art methods.
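To make the problem shape concrete, here is a minimal single-pass sketch, not the paper's algorithm: each arriving element is greedily assigned to the position (set index) with the best marginal-gain-to-cost ratio that still fits the budget. The callables f and cost are assumed to be supplied by the application.

```python
# Toy one-pass greedy for budgeted k-submodular maximization (illustrative
# only; the paper's streaming algorithms carry approximation guarantees).

def stream_greedy(elements, k, B, f, cost):
    solution = [set() for _ in range(k)]   # s = (S_1, ..., S_k), kept disjoint
    spent = 0.0
    for e in elements:                     # single pass over the stream
        base = f(solution)
        best_i, best_ratio = None, 0.0
        for i in range(k):
            c = cost(e, i)                 # c_i(e), assumed positive
            if spent + c > B:
                continue                   # would exceed the budget
            solution[i].add(e)
            ratio = (f(solution) - base) / c
            solution[i].remove(e)
            if ratio > best_ratio:
                best_i, best_ratio = i, ratio
        if best_i is not None:             # commit the best feasible position
            solution[best_i].add(e)
            spent += cost(e, best_i)
    return solution
```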
We study clustering problems such as k-Median, k-Means, and Facility Location in graphs of low highway dimension, a graph parameter modeling transportation networks. It was previously shown that approximation schemes for these problems exist which either run in quasi-polynomial time (assuming constant highway dimension) (Feldmann et al., 2018) [8] or run in FPT time (parameterized by the number of clusters k, the highway dimension, and the approximation factor) (Becker et al., 2018; Braverman et al., 2021) [9,10]. In this paper we show that a polynomial-time approximation scheme (PTAS) exists (assuming constant highway dimension). We also show that the considered problems are NP-hard on graphs of highway dimension 1.
Given a set of vehicles that are allowed to move in a plane along a predefined directed rectilinear path, the collision-free routing problem seeks a maximum number of vehicles that can move without collision. This problem is known to be NP-hard (Ajaykumar et al., 2016). Here we study a variant of this problem called the constrained collision-free routing problem (in short, CCRP). In this problem, each vehicle is only allowed to move along a directed L-shaped path. First, we show that CCRP is NP-hard. Further, we prove that any β-approximation algorithm for the maximum independent set problem in B_1-VPG graphs yields a β-approximation for CCRP. Also, we propose a 2-approximation algorithm for the maximum independent set problem on the intersection graph of n unit-L frames (the union of a unit-length vertical and a unit-length horizontal line segment sharing an endpoint) with O(n) space and O(n^2) time.
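The intersection graph in question is easy to build explicitly. The sketch below assumes one fixed orientation (corner at (x, y), arms going right and up; the paper may allow other orientations) and performs the O(n^2) pairwise tests the abstract's time bound suggests:

```python
# Builds the intersection graph of unit-L frames; geometry is a hypothetical
# illustration, not the paper's construction.

def frames_intersect(a, b):
    """True iff unit-L frames with corners a=(ax,ay), b=(bx,by) touch."""
    (ax, ay), (bx, by) = a, b
    horiz_a_hits_vert_b = ax <= bx <= ax + 1 and by <= ay <= by + 1
    vert_a_hits_horiz_b = bx <= ax <= bx + 1 and ay <= by <= ay + 1
    horiz_overlap = ay == by and abs(ax - bx) <= 1  # collinear horizontal arms
    vert_overlap = ax == bx and abs(ay - by) <= 1   # collinear vertical arms
    return (horiz_a_hits_vert_b or vert_a_hits_horiz_b
            or horiz_overlap or vert_overlap)

def intersection_graph(corners):
    """Adjacency lists over frame indices via O(n^2) pairwise tests."""
    n = len(corners)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if frames_intersect(corners[i], corners[j]):
                adj[i].append(j)
                adj[j].append(i)
    return adj
```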
In a large cloud data center HPC system, a critical problem is how to allocate the submitted tasks to heterogeneous servers so as to maximize the system's gain, defined as the value of completed tasks minus the system operation costs. We consider this problem in the online setting where tasks arrive in batches, and propose a novel deep reinforcement learning (DRL)-enhanced greedy optimization algorithm that schedules in two interacting stages: task sequencing and task allocation. For task sequencing, we deploy a DRL module to predict the best allocation sequence for each arriving batch of tasks based on the knowledge (allocation strategies) learnt from previous batches. For task allocation, we propose a greedy strategy that allocates tasks to servers one by one online, following the allocation sequence, to maximize the total gain increase. We show that our greedy strategy has a performance guarantee of competitive ratio 1/(1+κ) relative to the optimal offline solution, which improves the existing result for the same problem, where κ is upper bounded by the maximum cost-to-gain ratio of a task. While our DRL module enhances the greedy algorithm by providing a likely-optimal allocation sequence for each batch of arriving tasks, our greedy strategy bounds the DRL module's prediction error within a proven worst-case performance guarantee for any allocation sequence. This enables a better solution quality than that obtainable from either DRL or greedy optimization alone. Extensive experimental results in both simulation and real application environments demonstrate the effectiveness and efficiency of our proposed algorithm. Compared with the state-of-the-art baselines, our algorithm increases the system gain by about 10% to 30%. Our algorithm provides an interesting example of combining machine learning (ML) and greedy optimization techniques to improve ML-based solutions with a worst-case performance guarantee for solving hard optimization problems.
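A hedged sketch of the allocation stage alone (the DRL sequencing stage is omitted): tasks are processed in the given order and each goes to the server where it yields the largest positive gain increase. The names demand, capacity, value, and cost are illustrative assumptions, not the paper's API.

```python
# Greedy per-task allocation following a given sequence; the guarantee quoted
# in the abstract holds for any input sequence, which is what lets the DRL
# module's prediction error stay bounded.

def greedy_allocate(sequence, servers, demand, capacity, value, cost):
    """Allocate tasks in order; each goes where its gain increase is largest."""
    load = {s: 0.0 for s in servers}
    assignment = {}
    for task in sequence:                       # follow the predicted order
        best_server, best_gain = None, 0.0
        for s in servers:
            if load[s] + demand[task] > capacity[s]:
                continue                        # server lacks remaining capacity
            gain = value(task, s) - cost(task, s)
            if gain > best_gain:                # accept only positive gains
                best_server, best_gain = s, gain
        if best_server is not None:
            assignment[task] = best_server
            load[best_server] += demand[task]
    return assignment
```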
Conventional wireless sensor networks (WSNs) consist of sensors with a continuous transmission range, which depends on the relative positions of the transmitter and the receiver. However, sensors with different discrete transmission ranges are preferred for future-generation low-power sensor networks because of certain functional advantages. The discrete transmission ranges introduce connectivity constraints in transmitting the sensor data. The proliferation of low-power WSNs has led to an explosion of the volume of data to be processed. As data transmission is the major cause of the energy depletion of sensors, which critically affects the network lifetime, energy-efficient aggregation of the sensor data is an important networking problem. In this work, we address the data aggregation problem in networks with sensors of discrete transmission ranges. We model the problem as an integer linear program that can be solved exactly. However, this method applies only to small networks because of the hardness of the program. To solve the problem in large networks, we introduce a graphical framework that captures the characteristics of networks with sensors of discrete transmission ranges, and design a polynomial-time approximation technique to find a solution. Furthermore, we embed compression techniques based on compressed sensing (CS), which are established to yield high data compaction for temporally and spatially correlated distributed sensor data streams, and evaluate the performance of the proposed methods.
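For readers unfamiliar with the CS ingredient, here is a toy illustration (not the paper's scheme): a sparse sensor signal x observed through m < n random measurements y = Ax is recovered by basis pursuit, min ||x||_1 subject to Ax = y, cast as a linear program via the split x = u - v with u, v ≥ 0.

```python
# Basis pursuit recovery of a sparse signal via scipy's LP solver.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, k = 50, 20, 3                     # signal length, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)
y = A @ x_true

c = np.ones(2 * n)                      # objective: sum(u) + sum(v) = ||x||_1
A_eq = np.hstack([A, -A])               # encodes A(u - v) = y
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
x_hat = res.x[:n] - res.x[n:]

print("recovery error:", np.linalg.norm(x_hat - x_true))
```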
In recent years, edge computing, as an extension of cloud computing, has emerged as a promising paradigm for powering a variety of applications demanding low latency, e.g., virtual or augmented reality, interactive gaming, and real-time navigation. In the edge computing environment, edge servers are deployed at base stations to offer highly accessible computing capacities, such as CPU, RAM, and storage, to nearby end-users. From a service provider's perspective, caching app data on edge servers can ensure low latency in its users' data retrieval. Given the constrained cache spaces on edge servers due to their physical sizes, the optimal data caching strategy must minimize overall user latency. In this article, we formulate this Constrained Edge Data Caching (CEDC) problem as a constrained optimization problem from the service provider's perspective and prove its NP-hardness. We propose an optimal approach named CEDC-IP to solve the CEDC problem with the integer programming technique. We also provide an approximation algorithm named CEDC-A for finding approximate solutions to large-scale CEDC problems efficiently, and prove its approximation ratio. CEDC-IP and CEDC-A are evaluated on a real-world data set. The results demonstrate that they significantly outperform four representative approaches.
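A simplified sketch in the spirit of an IP formulation like CEDC-IP (the paper's exact model may differ): unit-size data items, each user requests one item, and each request is served either by an edge server caching that item or by the cloud at a higher latency.

```python
# Toy integer program: cache placement x, request routing y/z, capacity
# constraints per server, objective = total user latency.
import pulp

servers, items, users = ["s1", "s2"], ["d1", "d2", "d3"], ["u1", "u2", "u3"]
req = {"u1": "d1", "u2": "d2", "u3": "d1"}          # item requested per user
cap = {"s1": 2, "s2": 1}                            # cache slots per server
lat = {(u, s): 1 for u in users for s in servers}   # edge latency (toy values)
CLOUD = 10                                          # cloud retrieval latency

prob = pulp.LpProblem("CEDC_sketch", pulp.LpMinimize)
x = pulp.LpVariable.dicts("cache", (servers, items), cat="Binary")
y = pulp.LpVariable.dicts("serve", (users, servers), cat="Binary")
z = pulp.LpVariable.dicts("cloud", users, cat="Binary")

prob += pulp.lpSum(lat[u, s] * y[u][s] for u in users for s in servers) \
      + pulp.lpSum(CLOUD * z[u] for u in users)

for u in users:
    prob += pulp.lpSum(y[u][s] for s in servers) + z[u] == 1  # served once
    for s in servers:
        prob += y[u][s] <= x[s][req[u]]   # only from a server caching the item
for s in servers:
    prob += pulp.lpSum(x[s][d] for d in items) <= cap[s]      # cache capacity

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status], pulp.value(prob.objective))
```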
Spherical k-means clustering, a known NP-hard variant of the k-means problem, has broad applications in data mining. In contrast to k-means, it aims to partition a collection of given data distributed on a spherical surface into k sets so as to minimize the within-cluster sum of cosine dissimilarity. In this paper, we introduce spherical k-means clustering with penalties and give a 2·max{2, M}(1+M)(ln k + 2)-approximation algorithm. Moreover, we prove that, for spherical k-means clustering with penalties on separable instances, our algorithm achieves an approximation ratio of 2·max{3, M+1} with high probability, where M is the ratio of the maximal to the minimal penalty cost of the given data set.
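To fix the objective in mind, here is a small sketch (not the paper's algorithm): each unit-norm point either joins its closest center under cosine dissimilarity 1 - ⟨x, c⟩, or opts out and pays its penalty.

```python
# Evaluates the penalized spherical k-means objective for given centers.
import numpy as np

def penalized_cost(X, centers, penalties):
    """X, centers: rows are unit vectors; penalties: per-point opt-out cost."""
    dissim = 1.0 - X @ centers.T          # cosine dissimilarity to each center
    per_point = np.minimum(dissim.min(axis=1), penalties)
    return per_point.sum()

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
X /= np.linalg.norm(X, axis=1, keepdims=True)    # project data onto the sphere
C = X[rng.choice(100, size=3, replace=False)]    # 3 centers drawn from the data
print(penalized_cost(X, C, penalties=np.full(100, 0.5)))
```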
We are given a set of parallel jobs that have to be executed on a set of speed-scalable processors that can vary their speeds dynamically. Running a job at a slower speed is more energy-efficient; however, it takes longer and affects the performance. Every job is characterized by its processing volume and by the number or the set of required processors. Our objective is to minimize the maximum completion time subject to the constraint that the energy consumption does not exceed a given energy budget. For various particular cases, we propose polynomial-time approximation algorithms consisting of two stages. At the first stage, we formulate an auxiliary convex program. By solving this problem, we obtain the processing times of the jobs and a lower bound on the makespan. At the second stage, we transform our problem into the corresponding scheduling problem with constant processor speeds and construct a feasible schedule. We also obtain an "almost exact" solution for the preemptive setting based on a configuration linear program.
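A simplified sketch of an auxiliary convex program of this kind, for a single processor under the standard speed-scaling power model (power = speed^α, α > 1); the paper's program for parallel jobs on multiple processors is more involved. Running volume V_j for time t_j uses speed V_j/t_j and thus energy V_j^α · t_j^(1-α), which is convex in t_j, so the time allocation is a convex program.

```python
# Stage 1 on one processor: choose processing times minimizing total time
# (the single-processor makespan) under the energy budget.
import cvxpy as cp
import numpy as np

V = np.array([4.0, 2.0, 6.0])           # processing volumes of the jobs
alpha, E = 3.0, 30.0                    # power exponent and energy budget

t = cp.Variable(len(V), pos=True)       # processing time of each job
energy = cp.sum(cp.multiply(V ** alpha, cp.power(t, 1 - alpha)))
prob = cp.Problem(cp.Minimize(cp.sum(t)), [energy <= E])
prob.solve()                            # gives a makespan lower bound

print("makespan lower bound:", prob.value, "times:", t.value)
```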