As businesses increasingly rely on cloud-based big data analytics services to drive insights, reducing the cost of storing and analyzing large volumes of data in the cloud has become a major concern. During the execution of big data analysis jobs, some of the generated data can be reused by subsequent jobs. By storing such intermediate data, businesses using cloud services can greatly reduce the cost of running big data jobs. An important challenge is how to determine which data should be stored in order to save costs. Existing storage strategies do not differentiate between data with different usage frequencies, resulting in significant storage costs in practical applications. To address this challenge, we propose two online algorithms, one deterministic and the other randomized, which dynamically determine whether to store the data with the aim of saving cost. We show that our proposed deterministic (resp., randomized) algorithm incurs a cost within a factor of 2 - alpha' (resp., 2/1+alpha') of the minimum cost obtained by an optimal offline algorithm, which is assumed to know the exact future a priori. Finally, through extensive experiments with real-world workloads of big data jobs in the Alibaba Cloud environment, we demonstrate that our proposed online algorithms can achieve significant cost savings under common cloud pricing schemes.
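The store-or-recompute decision described above resembles the classic ski-rental problem. The sketch below is not the algorithm proposed in the paper; it is the textbook break-even rule under a simplified, assumed cost model in which each reuse of unstored data costs `recompute_cost` and storing is a one-time `store_cost` that covers the current and all later reuses. The function names and parameters are hypothetical.

```python
def break_even_cost(num_reuses, recompute_cost, store_cost):
    """Total cost paid by the break-even rule for one intermediate dataset.

    Assumed model: keep recomputing until the next recomputation would bring
    the money spent on recomputation up to the storage price, then store.
    """
    paid = 0.0
    spent_on_recompute = 0.0
    for _ in range(num_reuses):
        if spent_on_recompute + recompute_cost >= store_cost:
            # Break-even point reached: pay for storage once and stop paying.
            return paid + store_cost
        paid += recompute_cost
        spent_on_recompute += recompute_cost
    return paid


def offline_optimal_cost(num_reuses, recompute_cost, store_cost):
    """Cost of an offline algorithm that knows num_reuses in advance."""
    return min(num_reuses * recompute_cost, store_cost)


if __name__ == "__main__":
    for reuses in (2, 4, 10):
        online = break_even_cost(reuses, 1, 4)
        offline = offline_optimal_cost(reuses, 1, 4)
        print(reuses, online, offline)
```

Under this assumed model the online cost never exceeds twice the offline optimum, which is the kind of competitive-ratio guarantee the abstract refers to.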
Given a sequence of n real numbers and an integer parameter k, the problem studied in this paper is to compute k subsequences of consecutive elements with the sums of their elements being the largest, the second largest, ..., and the kth largest among all possible range sums of the input sequence. For any value of k, 1 <= k <= n(n + 1)/2, we design a fast algorithm that takes O(n + k log n) time in the worst case to compute and rank all such subsequences. We also prove that our algorithm is optimal for k = O(n) by providing a matching lower bound. Moreover, our algorithm is an improvement over the previous results on the maximum sum subsequences problem (where only the subsequences are requested and no ordering with respect to their relative sums is determined). Furthermore, given that we have computed the l largest sums, our algorithm retrieves the (l + 1)th largest sum in O(log n) time, after O(n) time of preprocessing. (c) 2007 Elsevier B.V. All rights reserved.
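To make the problem statement concrete, the following is a brute-force reference sketch, not the O(n + k log n) algorithm of the paper. It enumerates all n(n + 1)/2 range sums with prefix sums and keeps the k largest, so it is quadratic and only suitable for checking small inputs. The function name is hypothetical.

```python
import heapq
from itertools import accumulate


def k_largest_range_sums(seq, k):
    """Return the k largest range sums as (sum, start, end), largest first.

    start and end are inclusive 0-based indices into seq.
    """
    prefix = [0] + list(accumulate(seq))  # prefix[j] = seq[0] + ... + seq[j-1]
    n = len(seq)
    candidates = (
        (prefix[j] - prefix[i], i, j - 1)
        for i in range(n)
        for j in range(i + 1, n + 1)
    )
    # heapq.nlargest returns its result sorted from largest to smallest.
    return heapq.nlargest(k, candidates, key=lambda item: item[0])


if __name__ == "__main__":
    for total, start, end in k_largest_range_sums([1, -2, 3, 4], 3):
        print(total, start, end)
```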
Channel assignment is a very important topic in wireless networks. In this paper, we study FDMA channel assignment in a noncooperative wireless network, where devices are selfish. Existing work on this problem has considered Nash Equilibrium (NE), which is not a very strong solution concept and may not guarantee good system performance. In contrast, in this work, we introduce a payment formula to ensure the existence of a Strongly Dominant Strategy Equilibrium (SDSE), a different solution concept that gives participants much stronger incentives. We show that, when the system converges to an SDSE, it also achieves global optimality in terms of system throughput. Furthermore, we extend our work to the case in which some radios have limited tunability. We show that in such a case it is generally impossible to have a similar SDSE solution; nevertheless, with additional assumptions on the numbers of radios and the types of channels, etc., we can again achieve an SDSE solution that guarantees optimal system throughput. Besides this extension, we also consider other extensions of our strategic game to achieve throughput fairness and to deal with possibly inconsistent information caused by players joining and leaving. Finally, we evaluate our design with simulated experiments. Numerical results verify that the system does converge to the globally optimal channel assignment with the proposed payment formula, and that the system throughput is significantly higher than that achievable with the random-based and NE-based channel assignment schemes.
With the development of the room rental market, many room rental websites have been created, e.g., SpareRoom and EasyRoommate. On these websites, people find not only rooms for rent but also suitable roommates. Inspired by the rental mode in practice, a benchmark room allocation model was introduced by Chan et al., in which 2n agents must be allocated to n rooms that have the same capacity and each agent can be allocated to any room. However, in practice, rooms may differ in terms of capacity, e.g., college dorms or apartments may contain both two-bed rooms and four-bed rooms. Moreover, an agent can only be allocated to a room whose rent does not exceed the agent's budget. In this scenario, we must consider not only the agents' preferences but also the capacity diversity of the rooms and the budget constraints while allocating the rooms. Therefore, this paper investigates the room allocation problem with capacity diversity and budget constraints. We mainly focus on finding an allocation that maximizes social welfare. First, this paper demonstrates that finding an allocation that maximizes the social welfare is NP-hard (i.e., non-deterministic polynomial-time hard), even if only one room's capacity is larger than 1 and the other rooms' capacities are all 1. Second, this paper presents a ((c* + 2)/2 + epsilon)-factor approximation algorithm (with epsilon > 0) for the case in which the capacity of each room does not exceed a constant c*. Third, this paper proposes a heuristic algorithm based on local search for the general case in which the capacity of each room is not bounded by a constant. The experimental results demonstrate that the proposed algorithm can produce near-optimal solutions. Finally, this paper investigates how to find a roommate stable or room envy-free allocation with a social welfare guarantee.
Energy-constrained sensor networks have been widely deployed for environmental monitoring and security surveillance purposes. Since sensors are usually powered by energy-limited batteries, in order to prolong the network lifetime, most existing research focuses on constructing a load-balanced routing tree rooted at the base station for data gathering. However, this may result in a long routing path from some sensors to the base station. Motivated by the needs of some mission-critical applications that require all sensed data to be received by the base station with minimal delay, this paper aims to construct a routing tree such that the network lifetime is maximized while keeping the routing path from each sensor to the base station minimized. This paper shows that finding such a tree is NP-hard. Thus a novel heuristic called the top-down algorithm is presented, which constructs the routing tree layer by layer such that each layer is optimally extended, using a network flow model. A distributed refinement algorithm is then devised that dramatically improves the load balance of the routing tree produced by the top-down algorithm. Finally, extensive simulations are conducted. The experimental results show that the top-down algorithm with balance-refinement delivers a shortest routing tree whose network lifetime achieves around 85% of the optimum. (C) 2012 Elsevier B.V. All rights reserved.
In Part 1, we developed a simulation algorithm design strategy based on event simulation, conditional importance sampling, and mean translation biasing of Gaussian noise distributions. Here, we extend this strategy to systems with trellis-coded modulation using the error event simulation technique introduced in [6]. A fundamental principle exploited here is that the trellis code's distance spectrum can be used to design an efficient sampling distribution.
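Mean translation biasing can be illustrated on a much simpler problem than trellis-coded modulation. The sketch below is an assumed scalar example, not the error event simulation of the paper: it estimates the tail probability P(X > t) for standard Gaussian noise by sampling from a Gaussian whose mean is shifted to the threshold t and reweighting each sample by the likelihood ratio exp(-t x + t^2/2). The function names are hypothetical.

```python
import math
import random


def tail_prob_mean_translation(threshold, num_samples, seed=0):
    """Importance-sampling estimate of P(X > threshold) for X ~ N(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        # Biased draw from N(threshold, 1) instead of N(0, 1).
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(threshold, 1) at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / num_samples


def exact_tail_prob(threshold):
    """Closed-form Gaussian tail probability, used only as a check."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    print(tail_prob_mean_translation(4.0, 50_000), exact_tail_prob(4.0))
```

With plain Monte Carlo, an event of probability around 3e-5 would rarely be observed in 50,000 draws; shifting the mean makes roughly half of the biased samples land in the rare region.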
Previous works have proposed various approaches to implement service chaining by routing traffic through the desired middleboxes according to pre-defined policies. However, no matter what routing scheme is used, the performance of service chaining depends on where these middleboxes are placed. Thus, in this paper, we study the middlebox placement problem, i.e., given network information and policy specifications, we attempt to determine the optimal locations to place the middleboxes so that the performance is optimized. The performance metrics studied in this paper include the end-to-end delay and the bandwidth consumption, which cover both users' and network providers' interests. We first formulate it as a 0-1 programming problem and prove that it is NP-hard. We then propose two heuristic algorithms to obtain sub-optimal solutions. The first algorithm is a greedy algorithm, and the second algorithm is based on simulated annealing. Through extensive simulations, we show that, in comparison with a baseline algorithm, the proposed algorithms can reduce end-to-end delay by 22 percent and bandwidth consumption by 38 percent on average. The formulation and proposed algorithms make no special assumption on network topology or policy specifications; therefore, they have a broad range of applications in various types of networks such as enterprise, data center and broadband access networks.
A data warehouse collects and maintains a large amount of data from several distributed and heterogeneous data sources. Often the data is stored in the form of materialized views in order to provide fast access to the integrated data, regardless of the availability of the data sources. In this paper we focus on the following problem: for a given set of materialized select-project-join (SPJ) views, how can we find and minimize the auxiliary data stored in a data warehouse in order to make all materialized views in the data warehouse self-maintainable? For this problem we first devise an algorithm for finding such an auxiliary view set by exploiting information sharing among the auxiliary views and materialized views themselves to reduce the total size of auxiliary views. We then consider how to make the data warehouse still self-maintainable by minor modifications when there is a view addition to or deletion from it by giving an algorithm for this incremental maintenance purpose. (C) 1999 Elsevier Science B.V. All rights reserved.
Chordal graphs are graphs with the property that each cycle of length greater than 3 has two non-consecutive vertices that are joined by an edge. An important subclass of chordal graphs is the class of strongly chordal graphs (Farber, 1983). Chordal graphs appear, for example, in the design of acyclic database schemes (Beeri et al., 1983). In this paper we study the computational complexity (both sequential and parallel) of the maximum matching problem for chordal and strongly chordal graphs. We show that there is a linear-time greedy algorithm for a maximum matching in a strongly chordal graph provided a strongly perfect elimination ordering is known. This algorithm can also be turned into a parallel algorithm. The technique used can also be extended to the perfect multidimensional matching for chordal and strongly chordal graphs, yielding the first polynomial time algorithms for these classes of graphs (the multidimensional matching is NP-complete in general). (C) 1998 Elsevier Science B.V. All rights reserved.
A key challenge in defense and security systems is to implement functionality within a power budget. We show how data bandwidth redundancy and the need to change performance are exploited to achieve power-efficient field programmable gate array realizations with improved sampling rates. A unified methodology is given for the implementation of a key function, the fast Fourier transform, for a radar-based digital receiver. Locality of data and temporal and spatial resource usage are examined from first principles, leading to an algorithmic approach that demonstrates substantial industrial benefits in terms of power, performance and resource usage. A power saving of 18% is achieved over a Cooley-Tukey design with a 100% speed improvement; the work is extended to other cyclical fast algorithms.
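For readers unfamiliar with the Cooley-Tukey baseline mentioned above, the following is a minimal software sketch of the textbook recursive radix-2 decimation-in-time FFT. It is an assumption-level illustration of the baseline algorithm only; it says nothing about the FPGA architecture or the power-optimized design of the paper, and the function names are hypothetical.

```python
import cmath


def fft_radix2(samples):
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft_radix2(samples[0::2])  # transform of even-indexed samples
    odd = fft_radix2(samples[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dft_naive(samples):
    """Direct O(n^2) discrete Fourier transform, used only as a check."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 0, -1, -2, -3]
    for fast, slow in zip(fft_radix2(data), dft_naive(data)):
        print(fast, slow)
```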