Longitudinal data, measurements taken from the same subjects over time, appear routinely in many scientific fields, such as biomedical science, public health, ecology and environmental sciences. With the rapid development of information technology, modern longitudinal data are becoming massive in volume and high dimensional, hence often require distributed analysis in real-world applications. Standard divide-and-conquer techniques do not apply directly to longitudinal big data due to within-subject dependence. In this paper, we focus on developing a distributed algorithm to support quantile regression (QR) analysis of longitudinal big data, which currently remains an open and challenging issue. We employ weighted quantile regression (WQR) to accommodate the correlation in longitudinal big data, and parallelize the WQR estimation process with a two-stage algorithm to support distributed computing. Based on weights estimated in the first stage by the Newton-Raphson algorithm, the second stage solves the WQR problem using the multi-block alternating direction method of multipliers (ADMM). Simulation studies show that, compared to traditional non-distributed algorithms, our proposed method has favorable estimation accuracy and is computationally more efficient in both non-distributed and distributed environments. Further, we also analyze an air quality data set to illustrate the practical performance of this method.
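As a rough illustration of the objective that weighted quantile regression minimizes, the sketch below evaluates a weighted check loss for a linear model. It is not the paper's two-stage algorithm: the per-observation `weights` are simply passed in (the abstract estimates them in a first stage by Newton-Raphson), and the names `check_loss` and `weighted_qr_objective` are hypothetical.

```python
def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_qr_objective(beta, X, y, weights, tau):
    """Weighted sum of check losses for a linear model y ~ X @ beta.

    Assumption: one weight per observation, supplied by the caller.
    """
    total = 0.0
    for x_i, y_i, w_i in zip(X, y, weights):
        fitted = sum(b * x for b, x in zip(beta, x_i))
        total += w_i * check_loss(y_i - fitted, tau)
    return total


# Intercept-only model with two observations and unit weights.
X = [[1.0], [1.0]]
y = [1.0, 3.0]
objective = weighted_qr_objective([2.0], X, y, [1.0, 1.0], tau=0.5)
```

A distributed solver such as ADMM would minimize this objective over `beta` by splitting the sum across machines; that step is omitted here.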
The support vector machine (SVM) is a powerful binary classification tool, but the growing size of modern data poses challenges for it. First, the non-smoothness of the hinge loss makes large-scale computation difficult. Second, existing large-scale distributed algorithms rely heavily on uniformity and randomness conditions, which are frequently violated in practice. To address these issues, we first construct a convolution-smoothed SVM, which has a smooth and convex objective function. We then develop a distributed SVM in which the estimator can be computed conveniently by minimizing a pilot-sample-based distributed surrogate loss. In particular, it adapts to settings where the uniformity or randomness condition is violated. Theoretical results and numerical experiments on both synthetic and real data support the proposed methods.
With the rapid development of wireless communication, mobile computing, and GPS technologies, drivers' route decisions now rely increasingly on navigation services such as Google or Waze. However, these navigation services do not always improve traffic conditions. Individual drivers often make independent and selfish route decisions that are not favorable for the system as a whole and thus often result in severe congestion. This study aims to alleviate such problems by exploiting the information gaps between individuals and the central planner (CP). Specifically, we develop a correlated equilibrium routing mechanism (CeRM) for the CP, which drives a group of vehicles' route choices to an equilibrium with a system-optimal traffic condition while still satisfying individuals' selfish nature. Participating drivers would only be better off by following the suggested routing guidance than by navigating on their best responses to real-time traffic information. The CeRM is modeled as a nonconvex and nonlinear program involving a large number of users. A distributed Augmented Lagrangian algorithm (D-AL) is developed to solve the CeRM efficiently and provide online real-time navigation service, taking advantage of the onboard computation resources of individual vehicles. Because the D-AL relies on wireless communication between vehicles and the CP, we prove the convergence robustness of the D-AL against random communication failures and derive an upper bound on the convergence rate as a function of the communication failure probability. We observe that the convergence rate of the D-AL degrades dramatically as the communication failure probability increases, which hampers the applicability of the CeRM in practice. To improve the resilience of the solution algorithm's computational performance, we further design and prove an acceleration-scheme-aided D-AL (aD-AL) that expedites convergence under a high likelihood of communication failures. Numeric
Unmanned aerial vehicles (UAVs) are considered excellent candidates for airborne relays or base stations in 3-D ad hoc networks. They can be promptly deployed to serve large bursts of communication traffic in cellular networks or provide timely network support in wireless sensor networks (WSNs). It is desirable but challenging to dynamically find the optimal deployment strategy of UAVs in the air to provide a better Quality of Service (QoS) for a UAV-assisted wireless network. In this article, we propose a novel Gibbs-sampling distributed algorithm (GSDA) to dynamically optimize the UAVs' locations when they serve as airborne base stations for ground users. In the proposed GSDA, channel capacity is adopted as the objective function and a distributed approach is employed such that each UAV is able to optimize its location independently and asynchronously. Furthermore, we propose a polynomial-regression-based predictor that uses users' moving trajectories and their predicted future locations to expedite the convergence of the GSDA. We also compare the proposed GSDA with the existing distributed genetic algorithm. The asynchrony of UAV location updates and the location errors are also investigated to evaluate the robustness of the GSDA. Simulation results demonstrate that the proposed GSDA is robust and superior to the existing distributed genetic algorithm.
This paper proposes a distributed collaborative complete coverage path planning (CCPP) algorithm based on a heuristic method to solve a CCPP problem for multiple agents in an unknown environment. Based on the relationship between path length, energy consumption and number of turns, the algorithm instructs the agents to autonomously plan their respective paths in real time by giving the priority of directions. Simulation experiments show that the proposed CCPP algorithm can guarantee efficient collision-free complete coverage compared with related approaches.
ISBN: 9781450385480 (print)
This brief announcement presents an algorithm for $(1+\epsilon)$-approximate single-source shortest paths for directed graphs with non-negative real edge weights in the CONGEST model that runs in $\tilde{O}\big((n^{1/2} + D + n^{2/5+o(1)} D^{2/5}) \log W / \epsilon^{2}\big)$ rounds, where $W$ is the ratio between the largest and smallest non-zero edge weights.
This paper considers a distributed resource allocation problem over time-varying networks. The objective of each agent in the network is to optimize the sum of separable convex functions subject to resource constraints by observing its local objective function and the information exchanged with its adjacent neighbors. Thus, the problem lies in a distributed framework. In the existing literature dealing with similar problems, measurements of the gradients or subgradients of the objective functions have been used in the algorithm design. In this paper, by adding stochastic dithers to the local objective functions and constructing randomized differences, we propose a distributed gradient-free algorithm for solving the problem, and show that the algorithm is strongly convergent; that is, the estimates generated by each agent converge almost surely to the optimal resource allocation solution of the network. Finally, the effectiveness of the algorithm is validated by numerical experiments.
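To make the idea of a gradient-free update concrete, here is a minimal sketch of a two-point randomized-difference gradient estimate in the style of simultaneous-perturbation methods. It is an assumption about the general technique, not the paper's construction: the function name, the ±1 perturbation distribution, and the step size `c` are hypothetical, and the stochastic dithers, resource constraints, and network communication are all omitted.

```python
import random


def randomized_difference_gradient(f, x, c, rng):
    """Estimate the gradient of f at x from two function evaluations.

    A random direction delta with +/-1 entries is drawn, f is evaluated
    at x + c*delta and x - c*delta, and the difference is divided by
    2*c*delta_i for each coordinate i.
    """
    delta = [rng.choice((-1.0, 1.0)) for _ in x]
    x_plus = [xi + c * di for xi, di in zip(x, delta)]
    x_minus = [xi - c * di for xi, di in zip(x, delta)]
    diff = f(x_plus) - f(x_minus)
    return [diff / (2.0 * c * di) for di in delta]


def square(v):
    """Toy one-dimensional objective f(x) = x^2, whose gradient is 2x."""
    return v[0] ** 2


rng = random.Random(0)
estimate = randomized_difference_gradient(square, [3.0], 0.1, rng)
```

In one dimension the estimate for a quadratic is exact up to rounding; in higher dimensions each coordinate is a noisy estimate whose cross terms average out over repeated draws.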
Active queue management (AQM) is a key technology for the information infrastructure and the Internet. AQM for multiple-bottleneck networks is more challenging than for single-bottleneck networks because the stability analysis of multiple-bottleneck networks is more complex. In this paper, the active queue management problem for multiple-bottleneck networks is considered. A novel random early detection algorithm is designed based on distributed average tracking, which is used to estimate the global queue length and improve the network's performance. Sufficient conditions for global asymptotic stability of the closed-loop active queue management system are presented, and simulation results with NS-2 are given. Compared with the RED algorithm, the presented algorithm shows smaller oscillations at routers in simulation, which highlights its effectiveness.
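For context, the classic RED baseline that the abstract compares against can be sketched as follows. This is the textbook single-router rule (an exponentially weighted average queue length mapped to a drop probability), not the paper's distributed-average-tracking variant; the function names and the threshold values in the example are illustrative assumptions.

```python
def ewma_queue(avg, queue_len, weight):
    """Update the exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def red_drop_probability(avg, min_th, max_th, max_p):
    """Classic RED: no drops below min_th, certain drop at or above max_th,
    and a linear ramp from 0 up to max_p in between."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)


# Average queue halfway between the thresholds gives half of max_p.
p_mid = red_drop_probability(10.0, 5.0, 15.0, 0.1)
avg_next = ewma_queue(10.0, 20.0, 0.5)
```

A multi-bottleneck variant along the lines of the abstract would presumably feed an estimate of the global queue length into the same kind of mapping instead of the local average.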
Sensor nodes are now deployed in hostile environments for various military and commercial applications. Faulty sensor nodes adversely affect the network if they are not diagnosed and their fault status is not reported to other nodes. Fault diagnosis is difficult when nodes behave faultily at some times and provide good data at others. The intermittent disturbances may be random or may take the form of spikes at regular or irregular intervals. In the literature, fault diagnosis algorithms are based on statistical methods using repeated testing or on machine learning. To avoid complex and time-consuming repeated tests and computationally expensive machine learning methods, we propose a one-shot likelihood ratio test (LRT) to determine the fault status of a sensor node. The proposed method measures the statistics of the received data over a certain period of time and then compares the likelihood ratio with a threshold value associated with a certain tolerance limit. Simulation results on a real-time data set show that the new method provides better detection accuracy (DA) with lower false positive rate (FPR) and false alarm rate (FAR) than the modified three-sigma test. The LRT-based hybrid fault diagnosis method detects the fault status of a sensor node in a wireless sensor network (WSN) for real-time measured data with 100% DA, 0% FAR and 0% FPR if the probability of the data from a faulty node exceeds 25%.
Analysis of large volumes of data is complex, not only because of high skewness and heteroscedasticity but also because of the difficulty of data storage. Expectile regression is a common alternative method for analyzing heterogeneous data. Distributed storage can effectively reduce the storage burden on a single machine. In this paper, we consider fitting a linear expectile regression model to estimate conditional expectiles from large-scale data. We store the data in a distributed manner and construct a gradient-enhanced loss (GEL) function as a proxy for the global loss function. A distributed algorithm is proposed for the optimization of the GEL function. The asymptotic properties of the proposed estimator are established. Simulation studies are conducted to assess the finite-sample performance of the proposed estimator. An application to the National Health Interview Survey data set demonstrates the practicability of the proposed method.
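The loss underlying expectile regression is an asymmetrically weighted squared error. The sketch below shows that loss and a simple fixed-point iteration for the sample expectile of a single variable on one machine. It illustrates the standard definition only; it is not the GEL function or the distributed algorithm from the abstract, and the function names, iteration cap, and tolerance are assumptions.

```python
def expectile_loss(u, tau):
    """Asymmetric squared loss: tau * u^2 if u >= 0, else (1 - tau) * u^2."""
    weight = tau if u >= 0 else 1.0 - tau
    return weight * u * u


def sample_expectile(values, tau, iterations=100, tol=1e-12):
    """Compute the tau-expectile (0 < tau < 1) by iterated weighted means.

    Points above the current estimate get weight tau, the others
    get weight 1 - tau; the estimate is the resulting weighted mean.
    """
    m = sum(values) / len(values)
    for _ in range(iterations):
        weights = [tau if v > m else 1.0 - tau for v in values]
        new_m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


data = [1.0, 2.0, 3.0, 4.0]
centre = sample_expectile(data, 0.5)   # tau = 0.5 recovers the mean
upper = sample_expectile(data, 0.9)    # higher tau shifts the estimate upward
```

With `tau = 0.5` the loss is symmetric and the expectile coincides with the ordinary mean, which is a convenient sanity check.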