The Internet-based virtual computing environment (iVCE) has been proposed to combine data centers and other kinds of computing resources on the Internet to provide efficient and economical services. Virtual machines (VMs) have been widely used in iVCE to isolate different users/jobs and ensure trustworthiness, but VMs traditionally take a long time to boot, which cannot meet the requirements of iVCE's large-scale and highly dynamic applications. To address this problem, in this paper we design and implement VirtMan, a fast booting system for a large number of virtual machines in iVCE. VirtMan uses the Linux Small Computer System Interface (SCSI) target to remotely mount the source image in a scalable hierarchy, and leverages the homogeneity of a set of VMs to transfer only the necessary image data at runtime. We have implemented VirtMan both as a standalone system and for OpenStack. In our 100-server testbed, VirtMan boots up 1000 VMs (with a 15 GB image of Windows Server 2008) on 100 physical servers in less than 120 s, which is three orders of magnitude lower than current public clouds.
The Gutenberg–Richter (GR) relation is an exponential law widely used for describing earthquakes’ statistical magnitude distributions. Using statistical physics approaches, we present robust models based on the Tsallis q- and Kaniadakis κ-entropies, aiming to capture the influence of irregular fragments occupying space between two tectonic plates with irregular surfaces. The proposed models are called q-GR and κ-GR laws, respectively. Using Bayesian statistical analysis, we examined a large dataset of over 450,000 seismic events recorded along the San Andreas Fault between 2000 and 2023. Our findings reveal that the q-GR and κ-GR models outperform the classical GR law. The results show the κ-GR model exhibits particularly strong empirical support, with optimal performance occurring when κ ≈ 1.
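For orientation, the classical GR law mentioned above is usually written in the following standard form. The symbols N, M, a, and b are the conventional textbook notation and do not appear in the abstract itself; the q-GR and κ-GR generalizations are not reproduced here because the abstract does not state them.

```latex
% Classical Gutenberg-Richter relation (standard form; symbols assumed, not from the abstract):
% N(\geq M) is the number of events with magnitude at least M,
% a and b are empirically fitted constants.
\log_{10} N(\geq M) = a - b\,M
\quad\Longleftrightarrow\quad
N(\geq M) = 10^{\,a - b\,M}
```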
The pull-based development model, widely used by distributed software teams in open source communities, can efficiently gather the wisdom of crowds. Instead of sharing access to a central repository, contributors create a fork, update it locally, and request to have their changes merged back, i.e., submit a pull-request. On the one hand, this model lowers the barrier to entry for potential contributors, since anyone can submit pull-requests to any repository; on the other hand, it increases the burden on integrators, who are responsible for assessing the proposed patches and integrating suitable changes into the central repository. The role of integrators in pull-based development is crucial. They must not only ensure that pull-requests meet the project’s quality standards before being accepted, but also finish the evaluations in a timely manner. To keep up with the volume of incoming pull-requests, continuous integration (CI) is widely adopted to automatically build and test every pull-request at the time of submission. CI provides extra evidence about the quality of pull-requests, which helps integrators make the final decision (i.e., accept or reject). In this paper, we present a quantitative study that tries to discover which factors affect the process of the pull-based development model, including acceptance and latency, in the context of CI. Using regression modeling on data extracted from a sample of GitHub projects deploying the Travis-CI service, we find that the evaluation process is a complex issue, requiring many independent variables to explain adequately. In particular, CI is a dominant factor in the process: it not only has a great influence on the evaluation process per se, but also changes the effects of some traditional predictors.
High-dimensional data arising from diverse scientific research fields and industrial development have led to increased interest in sparse learning due to model parsimony and computational advantage. With the assumption of sparsity, many computational problems can be handled efficiently in practice. Structured sparse learning encodes the structural information of the variables and has been quite successful in numerous research fields. As various types of structures have been discovered, a variety of structured regularizations have been proposed. These regularizations have greatly improved the efficacy of sparse learning algorithms through the use of specific structural information. In this article, we present a systematic review of structured sparse learning including ideas, formulations, algorithms, and applications. We present these algorithms in the unified framework of minimizing the sum of loss and penalty functions, summarize publicly accessible software implementations, and compare the computational complexity of typical optimization methods to solve structured sparse learning problems. In experiments, we present applications in unsupervised learning, for structured signal recovery and hierarchical image reconstruction, and in supervised learning in the context of a novel graph-guided logistic regression.
Similarity of nominal data plays a fundamental role in numerous fields of both machine learning and data mining. Unlike the similarity of numerical data, that of nominal data is much more difficult to describe, and little effort has been devoted to it. Although existing nominal similarity measures can reveal some data properties, they suffer from low accuracy because they ignore value relationships or integrate multi-view relationships inappropriately. In this paper, we propose a novel hierarchical measure for nominal data similarity (HNS). The HNS leverages the intrinsic data characteristics by considering low-level information both within and between attributes, and hierarchically captures the value distributions, attribute interactions, and attribute-to-object contributions. Meanwhile, it aggregates multi-view relationships through a bottom-up framework, retaining consistency as well as complementary details. We theoretically analyze this measure, and experiments on six UCI data sets demonstrate that the HNS outperforms the state-of-the-art nominal similarity measures in terms of target alignment and clustering accuracy.
Ontologies have proven to be useful for capturing and organizing knowledge as a hierarchical set of terms and their relationships. However, curating gene ontology data by hand requires specialized knowledge of certain...
The bundle method for regularized risk minimization (BMRM) is a variant of the Cutting Plane Method (CPM). It efficiently solves a convex minimization problem that is a core part of a plethora of machine learning applications. Nonetheless, when exposed to the challenge of large-scale learning, the synchronous parallel implementation of BMRM easily encounters the straggler problem, owing to differences in capability among heterogeneous worker nodes and unevenness in the inherent data distribution. In this paper, we propose a novel asynchronous distributed BMRM implementation, which employs an asynchronous computing window to fully exploit the fast nodes' computational capabilities while preserving the good convergence of BMRM. Extensive experiments show that the asynchronous BMRM algorithm significantly improves performance over its synchronous counterpart and is able to solve large-scale problems efficiently.
In this paper, we propose an indoor robot autonomous navigation system. The robot first explores an unknown environment, and then navigates autonomously using the explored map. The robot is equipped with a 2D laser scanner as the main sensor. The laser scanner is used for path planning and frontier-based exploration. A 2D global occupancy map is built for path planning, frontier-based exploration, and multi-objective autonomous navigation. Laser scans are fed into the Simultaneous Localization and Mapping (SLAM) process in the exploration phase. In indoor environments, the exploration efficiency is improved by incorporating a heuristic algorithm. By using multi-threading technology and a 3D perception approach proposed in this paper, the robot, equipped with a low-cost RGBD sensor, can detect all kinds of obstacles to achieve highly reliable navigation in complicated 3D environments. Meanwhile, we develop a multi-objective navigation application to make human-robot interaction more convenient and to support multi-task deployment. Our approaches are demonstrated by experimental results.
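As a rough illustration of the frontier-based exploration idea mentioned above, the sketch below finds frontier cells in a 2D occupancy grid. It uses the common textbook definition of a frontier (a free cell with at least one unknown 4-neighbour) and is not the paper's implementation; the cell encoding (`FREE`, `OCCUPIED`, `UNKNOWN`) and the function name `find_frontiers` are assumptions made for this example, and the paper's heuristic is not modelled.

```python
from typing import List, Tuple

# Hypothetical cell encoding for this sketch (not taken from the paper).
FREE, OCCUPIED, UNKNOWN = 0, 1, -1


def find_frontiers(grid: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (row, col) of every free cell that touches an unknown cell.

    Uses 4-connectivity; cells are reported in row-major order.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    frontiers: List[Tuple[int, int]] = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue  # only free cells can be frontiers
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break  # one unknown neighbour is enough
    return frontiers


if __name__ == "__main__":
    demo = [
        [FREE, FREE, UNKNOWN],
        [FREE, OCCUPIED, UNKNOWN],
        [FREE, FREE, FREE],
    ]
    print(find_frontiers(demo))
```

An exploring robot would typically pick one of the returned cells as its next goal and repeat until no frontiers remain; how goals are ranked is where a heuristic such as the one the abstract mentions would come in.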
Network-on-chip system plays an important role to improve the performance of chip multiprocessor systems. As the complexity of the network increases, congestion problem has become the major performance bottleneck and ...