Cluster analysis aims to categorize data objects into cohesive groups based on their intrinsic characteristics, often modeled by probability distributions. This paper presents a novel Mathematical Programming-Dynamic Programming (MP-DP) clustering method developed by the authors, applied to datasets characterized by exponential, right-triangular, and uniform distributions. The MP-DP technique optimizes cluster partitions by leveraging the probability distributions inherent in the data. We conducted a comparative evaluation to assess the performance of MP-DP against four established clustering methodologies: K-Means, Fuzzy C-Means, expectation-maximization, and Genie++ hierarchical clustering. Results from extensive simulations and real-world datasets consistently demonstrate the superior efficacy of MP-DP in achieving optimal clustering outcomes. Specifically, MP-DP excels in handling diverse data distributions and effectively mitigating the effects of noise and uncertainty, thereby enhancing clustering accuracy and reliability. This study highlights the significant advancement offered by MP-DP in clustering research. It underscores the method's potential for applications across various domains, such as healthcare, environmental monitoring, and manufacturing, where robust and efficient data clustering is essential for insightful data analysis and decision-making.
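The abstract does not spell out the MP-DP formulation, but the dynamic-programming side of such a method can be illustrated on one-dimensional data, where DP over sorted points finds the globally optimal partition into k contiguous clusters. The function name and the O(k·n²) recurrence below are illustrative assumptions, not taken from the paper:

```python
def dp_cluster_1d(xs, k):
    """Optimal 1-D partition into k contiguous clusters minimizing the
    within-cluster sum of squared deviations (O(k * n^2) DP)."""
    xs = sorted(xs)
    n = len(xs)
    # prefix sums give O(1) cost for any segment xs[i:j]
    ps = [0.0] * (n + 1)
    ps2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        ps[i + 1] = ps[i] + x
        ps2[i + 1] = ps2[i] + x * x

    def cost(i, j):  # sum of squared deviations of xs[i:j]
        s, s2, m = ps[j] - ps[i], ps2[j] - ps2[i], j - i
        return s2 - s * s / m

    INF = float("inf")
    # best[c][j] = min cost of splitting xs[:j] into c clusters
    best = [[INF] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                v = best[c - 1][i] + cost(i, j)
                if v < best[c][j]:
                    best[c][j], cut[c][j] = v, i
    # backtrack the cluster boundaries
    bounds, j = [], n
    for c in range(k, 0, -1):
        bounds.append((cut[c][j], j))
        j = cut[c][j]
    return [xs[a:b] for a, b in reversed(bounds)]
```

Unlike K-Means, which can converge to a local optimum, this DP is exact for 1-D data, which is one reason DP-based clustering is attractive for distribution-shaped data.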
Ensemble clustering can utilize the complementary information among multiple base clusterings and obtain a clustering model with better performance and more robustness. Despite its great success, two problems remain in current ensemble clustering methods. First, most ensemble clustering methods treat all base clusterings equally. Second, the final ensemble clustering result often relies on k-means or other discretization procedures to uncover the clustering indicators, thus yielding unsatisfactory results. To address these issues, we propose a novel ensemble clustering method based on structured graph learning, which can directly extract clustering indicators from the obtained similarity matrix. Moreover, our method takes full account of the correlations among the base clusterings and can effectively reduce the redundancy among them. Extensive experiments on artificial and real-world datasets demonstrate the efficiency and effectiveness of our method.
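The structured-graph formulation itself is not reproduced in the abstract; as a rough sketch of the general idea of reading cluster indicators directly off a similarity matrix built from base clusterings (rather than running k-means on it), one can build a co-association matrix and take the connected components of its thresholded graph. The names and the threshold rule are assumptions for illustration:

```python
import numpy as np

def coassociation(base_labels):
    """Average co-association matrix: S[i, j] is the fraction of base
    clusterings that put samples i and j in the same cluster."""
    base_labels = np.asarray(base_labels)        # shape (m, n)
    m, n = base_labels.shape
    S = np.zeros((n, n))
    for labels in base_labels:
        S += (labels[:, None] == labels[None, :])
    return S / m

def indicators_from_graph(S, tau=0.5):
    """Read cluster indicators straight off the similarity graph:
    connected components of the graph with edges where S[i, j] > tau."""
    n = len(S)
    labels = [-1] * n
    comp = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        stack = [seed]
        labels[seed] = comp
        while stack:                              # depth-first flood fill
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and S[u, v] > tau:
                    labels[v] = comp
                    stack.append(v)
        comp += 1
    return labels
```

The point of contrast with the k-means post-processing step criticized in the abstract is that no extra clustering run is needed once the similarity graph has the right structure.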
Tuning the precision of floating-point numbers is a critical task in high-performance computing. Many scientific applications rely on floating-point arithmetic, but excessive precision can lead to unnecessary computational overhead, while reducing precision may introduce unacceptable errors. Addressing this trade-off is essential for optimizing performance while ensuring numerical accuracy. In this paper, we present a genetic algorithm-based approach for tuning the precision of floating-point computations. Our method leverages algorithmic differentiation and first-order Taylor series approximation to assess the impact of precision variations efficiently. We employ stochastic partitioning algorithms to generate multiple precision combinations that meet the error requirements. Moreover, we present a genetic heuristic algorithm to determine the maximum number of variables that can sustain precision alterations without compromising the desired error threshold. The proposed approach is evaluated across various benchmark programs, analyzing the effects of precision tuning under increasing error thresholds. Our findings reveal that, for a majority of these programs, reducing precision through partitioning leads to significant performance enhancements, with improvements of up to 15%.
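A minimal sketch of the first-order Taylor idea described above: the error introduced by storing variable x_i at lower precision is bounded to first order by |∂f/∂x_i|·|x_i|·ε_i, and a heuristic can demote variables while the summed bound stays under the error threshold. The greedy order and function names are illustrative assumptions, not the paper's actual genetic algorithm:

```python
# Unit roundoffs for IEEE 754 binary formats (assumed precision levels)
EPS = {"double": 2.0 ** -53, "single": 2.0 ** -24, "half": 2.0 ** -11}

def error_bound(grads, values, precisions):
    """First-order Taylor estimate of the output error when each input
    x_i is stored at the given precision:
        |df| <= sum_i |df/dx_i| * |x_i| * eps_i
    """
    return sum(abs(g) * abs(x) * EPS[p]
               for g, x, p in zip(grads, values, precisions))

def greedy_downcast(grads, values, threshold):
    """Greedily demote variables (least sensitive first) from double to
    single precision while the Taylor bound stays under the threshold."""
    n = len(values)
    prec = ["double"] * n
    order = sorted(range(n), key=lambda i: abs(grads[i] * values[i]))
    for i in order:
        trial = prec[:]
        trial[i] = "single"
        if error_bound(grads, values, trial) <= threshold:
            prec = trial
    return prec
```

The gradients would come from algorithmic differentiation in practice; here they are passed in directly.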
ISBN (print): 9783981537024
High-level programming languages have transformed graphics processing units (GPUs) from domain-restricted devices into powerful compute platforms. Yet many "general-purpose GPU" (GPGPU) applications fail to fully utilize the GPU resources. Executing multiple applications simultaneously on different regions of the GPU (spatial multitasking) thus improves system performance. However, within-die process variations lead to significantly different maximum operating frequencies (F-max) of the streaming multiprocessors (SMs) within a GPU. As the chip size and number of SMs per chip increase, the frequency variation is also expected to increase, exacerbating the problem. The increased number of SMs also provides a unique opportunity: we can allocate resources to concurrently-executing applications based on how those applications are affected by the different available F-max values. In this paper, we study the effects of per-SM clocking on spatial multitasking-capable GPUs. We demonstrate two factors that affect the performance of simultaneously-running applications: (i) the SM partitioning algorithm that decides how many resources to assign to each application, and (ii) the assignment of SMs to applications based on the operating frequencies of those SMs and the applications' characteristics. Our experimental results show that spatial multitasking that partitions SMs based on application characteristics, when combined with per-SM clocking, can greatly improve application performance, by up to 46% on average compared to cooperative multitasking with global clocking.
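The paper's allocation policy is not detailed in the abstract; a simplified sketch of frequency-aware SM assignment is to give the application most sensitive to clock frequency the SMs with the highest F-max, within fixed partition sizes. The function names and sensitivity scores below are assumptions for illustration:

```python
def assign_sms(sm_fmax, partitions, sensitivity):
    """Assign SMs to co-running applications: the app whose performance
    scales most with frequency (highest sensitivity score) receives the
    fastest SMs, under a fixed partition size per app.

    sm_fmax:     list of per-SM maximum frequencies (MHz)
    partitions:  {app: number of SMs allotted}
    sensitivity: {app: frequency-sensitivity score}
    """
    # SM indices, fastest first
    sms = sorted(range(len(sm_fmax)), key=lambda i: sm_fmax[i], reverse=True)
    # apps, most frequency-sensitive first
    apps = sorted(partitions, key=lambda a: sensitivity[a], reverse=True)
    out, pos = {}, 0
    for app in apps:
        k = partitions[app]
        out[app] = sorted(sms[pos:pos + k])
        pos += k
    return out
```

A memory-bound co-runner, being less sensitive to F-max, would thus absorb the slower SMs without a proportional performance loss.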
Multi-Agent Path Finding (MAPF) seeks collision-free paths for multiple agents from start to goal locations. This paper considers a generalization of MAPF called Multi-Agent Combinatorial Path Finding (MCPF), where agents must collectively visit a set of intermediate target locations before reaching their goals. MCPF is challenging as it involves both planning collision-free paths for multiple agents and target sequencing, i.e., assigning targets to and computing the visiting order for each agent. A recent method, Conflict-Based Steiner Search (CBSS), was developed to solve MCPF to optimality, which, however, does not scale well when the number of agents or targets is large (e.g., 50 targets). While research has developed methods to plan bounded sub-optimal paths for many agents, it remains unknown how to find bounded sub-optimal solutions in the presence of many targets. This paper fills this gap by developing a target-sequencing method (A for Approximation and K* for K-best), which leverages approximation algorithms for traveling salesman problems. It is motivated by MCPF but is a standalone method that can solve K-best routing problems in general. We prove that it has worst-case polynomial runtime complexity and finds bounded sub-optimal solutions. Building on this method, we develop two variants that find bounded sub-optimal paths for MCPF. Our results verify the fast running speeds of our methods with up to 200 targets.
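The approximation machinery is not given in the abstract; as a minimal stand-in for the target-sequencing step, a nearest-neighbor heuristic orders one agent's targets between its start and goal. This is purely illustrative: the paper relies on TSP approximation algorithms with bounded sub-optimality guarantees, which nearest-neighbor does not provide.

```python
import math

def nearest_neighbor_tour(start, targets, goal):
    """Greedy target sequencing for a single agent: repeatedly visit
    the nearest unvisited target, then head to the goal. Returns the
    visiting order and the total Euclidean path cost."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    order, cur, left = [], start, list(targets)
    while left:
        nxt = min(left, key=lambda t: dist(cur, t))
        order.append(nxt)
        left.remove(nxt)
        cur = nxt
    cost = dist(start, order[0]) if order else 0.0
    for a, b in zip(order, order[1:]):
        cost += dist(a, b)
    cost += dist(cur, goal)
    return order, cost
```

In a full MCPF pipeline the sequencing step would feed each agent's target order to a collision-aware path planner such as the conflict-based search layer.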
Pinning control provides an effective approach to controlling large-scale networks and conserving control resources. This article presents a solution to pinning synchronization in directed networks with a precise index that measures the pinning synchronization capability of directed networks, capturing full topological information about the networks. Building upon this index, the article utilizes matrix analysis tools, such as non-negative matrix theory and strongly connected component decomposition, to analyze the impact of network structures and controller parameters on the network synchronizability. Specifically, the study investigates the influence of the in-degree of unpinned nodes, the difference between in-degrees and out-degrees of nodes, strong connectivity components, and the linear feedback control gains on the network synchronizability. Moreover, the article addresses the challenge of optimally selecting pinned nodes by using a graph partitioning algorithm and a greedy node selection algorithm, which can be applied to effectively select pinned nodes in a large-scale network. Extensive simulations on a range of real-world directed networks validate the efficiency of the proposed algorithms and demonstrate their superiority over seven baseline algorithms.
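One widely used index of pinning synchronizability is the smallest eigenvalue of the grounded Laplacian, i.e., the Laplacian with the pinned nodes' rows and columns deleted; whether this matches the precise index the article proposes is not stated in the abstract. A greedy selection loop over such an index might look like the following sketch:

```python
import numpy as np

def grounded_index(L, pinned):
    """Synchronizability proxy: smallest real eigenvalue of the
    Laplacian with the pinned nodes' rows and columns removed
    (grounded Laplacian). Larger is better for pinning control."""
    keep = [i for i in range(len(L)) if i not in pinned]
    Lg = L[np.ix_(keep, keep)]
    return min(np.linalg.eigvals(Lg).real)

def greedy_pin(L, k):
    """Greedy node selection: at each step pin the node whose grounding
    most increases the synchronizability index."""
    pinned = set()
    for _ in range(k):
        best = max((i for i in range(len(L)) if i not in pinned),
                   key=lambda i: grounded_index(L, pinned | {i}))
        pinned.add(best)
    return sorted(pinned)
```

On a star network this greedy rule pins the hub first, matching the intuition that high-influence nodes are the most valuable pinning targets.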
Regular path queries (RPQs) in graph databases are bottlenecked by the memory wall. Emerging processing-in-memory (PIM) technologies offer a promising solution: dispatching and executing path matching tasks in parallel within PIM modules. We present an efficient PIM-based data management system tailored for RPQs and graph updates. Our solution, called PimBeam, facilitates efficient batch RPQs and graph updates by implementing a PIM-friendly dynamic graph partitioning algorithm. This algorithm effectively addresses graph skewness issues while maintaining graph locality with low overhead for handling RPQs. PimBeam streamlines label filtering queries by adding a filtering module on the PIM side and leveraging the parallelism of PIM. For graph updates, PimBeam enhances processing efficiency by amortizing the host CPU's update overhead across PIM modules. Evaluation results indicate that PimBeam achieves a 3.59x speedup for RPQs and a 29.33x speedup for graph updates on average over a state-of-the-art traditional graph database.
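The abstract does not describe PimBeam's partitioner beyond its goals; a common baseline for skew-aware dynamic partitioning is a streaming greedy rule, in the spirit of linear deterministic greedy, that favors the partition holding a vertex's already-placed neighbors while penalizing overloaded partitions. The scoring formula and names below are assumptions for illustration:

```python
def greedy_partition(edges, n_vertices, k):
    """Streaming greedy partitioner: place each vertex in the partition
    holding most of its already-placed neighbors, scaled down by the
    partition's load so that skewed (high-degree) regions do not pile
    into one partition. Ties go to the least-loaded partition."""
    adj = [[] for _ in range(n_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    part = [-1] * n_vertices
    load = [0] * k
    cap = n_vertices / k + 1          # soft capacity per partition
    for v in range(n_vertices):
        scores = []
        for p in range(k):
            neigh = sum(1 for u in adj[v] if part[u] == p)
            scores.append(neigh * (1 - load[p] / cap))
        best = max(range(k), key=lambda p: (scores[p], -load[p]))
        part[v] = best
        load[best] += 1
    return part
```

Keeping neighbors co-located preserves graph locality for path matching, while the load penalty caps the imbalance that skewed degree distributions would otherwise cause.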
Many existing surrogate-assisted optimization algorithms are limited to designing antennas with continuous variables only. However, numerous challenges emerge when tackling antenna optimization problems that involve both continuous and binary design variables. This article proposes an efficient surrogate-assisted mixed continuous/binary particle swarm optimization (SAMPSO) algorithm to address these mixed-variable antenna optimization problems. The SAMPSO tightly integrates machine learning (ML) models with PSO in two key aspects: an ML-guided swarm updating method and an ML-assisted prescreening strategy. In addition, a novel local ML model training method is developed to reduce the algorithm time complexity. To verify its effectiveness, the SAMPSO is compared with two existing algorithms in solving benchmark functions and designing antennas. The results demonstrate that SAMPSO can achieve design objectives with a faster convergence speed.
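The abstract describes the PSO update only at a high level; a common way to mix continuous and binary variables in PSO, which may or may not match SAMPSO's exact rule, is to keep the standard velocity update everywhere and map the velocity of binary dimensions through a sigmoid to a flip probability (the binary-PSO rule). The parameter values are illustrative defaults:

```python
import math
import random

def update_particle(x, v, pbest, gbest, binary_mask,
                    w=0.7, c1=1.5, c2=1.5, rng=random):
    """One mixed continuous/binary PSO step. Continuous dimensions use
    the standard velocity/position update; binary dimensions map the
    velocity through a sigmoid to a probability of the bit being 1."""
    nx, nv = [], []
    for i in range(len(x)):
        vi = (w * v[i]
              + c1 * rng.random() * (pbest[i] - x[i])
              + c2 * rng.random() * (gbest[i] - x[i]))
        if binary_mask[i]:
            p = 1.0 / (1.0 + math.exp(-vi))     # sigmoid(v) = P(bit = 1)
            xi = 1.0 if rng.random() < p else 0.0
        else:
            xi = x[i] + vi
        nx.append(xi)
        nv.append(vi)
    return nx, nv
```

In a surrogate-assisted loop, many such candidate positions would be generated per iteration and only the most promising ones, as ranked by the ML model, would be passed to the expensive electromagnetic simulation.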
Complete coverage of a large map by multiple robots is an important collaborative planning task, widely used in disaster search and rescue, forest fire prevention, resource exploration, and other fields. It generally aims at completing coverage with fewer robots (mostly drones) and higher time efficiency. However, as the map becomes larger, most existing works fail to remain near-optimal due to escalating complexity. In this letter, we formulate the problem as two levels of the traveling salesman problem (TSP) to reduce the complexity, where the lower level is single-robot local coverage planning with preset TSP solutions, and the higher level is multiple-TSP (mTSP) global planning executed by multiple robots. To better adapt to dynamic scenarios, we apply a distributed multi-agent reinforcement learning (MARL) framework that allows efficient online computation, and a dense segmented Siamese network (DSSN) to achieve efficient and effective solutions. We show that, compared to existing advanced methods, our strategy can effectively solve the problem, with coverage costs decreased by over 30% on average and execution time reduced to seconds on large maps. Furthermore, our network DSSN achieves an additional improvement over the previous mTSP architecture under randomized settings. We also discuss the influence of different robot densities and separated block sizes on the results, and demonstrate adaptability to irregular and large-scale obstacles.
Edge intelligence (EI) has been widely applied recently. Splitting the model between device, edge server, and cloud can significantly improve the performance of EI. Model segmentation without user mobility has been investigated in detail in previous studies. However, in most EI use cases the end devices are mobile, and few studies have considered this setting; those that do still have many issues, such as ignoring the energy consumption of the mobile device, making inappropriate network assumptions, and adapting poorly to user mobility. Therefore, to address the disadvantages of model segmentation and resource allocation in previous studies, we propose a mobility- and cost-aware model segmentation and resource allocation algorithm for accelerating inference at the edge (MCSA). Specifically, in the scenario without user mobility, we provide the loop-iteration gradient descent (Li-GD) algorithm. When a mobile user has a large model inference task to compute, it takes the energy consumption of the mobile user, the communication and computing resource renting cost, and the inference delay into account to find the optimal model segmentation and resource allocation strategy. In the scenario with user mobility, we propose the mobility-aware Li-GD (MLi-GD) algorithm to calculate the optimal strategy. We then investigate the properties of the proposed algorithms, including convergence, complexity, and approximation ratio. The experimental results demonstrate the effectiveness of the proposed algorithms.