The 21st century is a time of information, data, and knowledge. Information technology is changing every aspect of human society. From the perspective of machine learning, clustering, an important branch of data minin...
详细信息
"Incremental Learning (IL)" is the niche area of "Machine Learning." It is of utmost essential to keep learning incremental for ever-increasing data from all domains for effectual decisions, predic...
详细信息
In this paper we revisit the task of spell checking focusing on target word recovery. We propose a new approach that relies on phonetic information to improve the accuracy of clustering algorithms in identifying missp...
详细信息
Image coding technologies are essential to use communication channels effectively. We have been studied vector quantization, because it does not cause deterioration in image quality in a high compression region and al...
详细信息
The major evolution of the semantic web has become exchanging data between applications in all domains of activities. Based on this vision, different applications in recent days, e.g. in the fields of community web po...
详细信息
Pattern Recognition and Data Mining pose several problems in which, by their inherent nature, it is considered that an object can belong to more than one class;that is, clusters can overlap each other. OClustR and DCl...
详细信息
Internet of Things (IoT) is becoming a new socioeconomic revolution in which data and immediacy are the main ingredients. IoT generates large datasets on a daily basis but it is currently considered as "dark data...
详细信息
Internet of Things (IoT) is becoming a new socioeconomic revolution in which data and immediacy are the main ingredients. IoT generates large datasets on a daily basis but it is currently considered as "dark data", i.e., data generated but never analyzed. The efficient analysis of this data is mandatory to create intelligent applications for the next generation of IoT applications that benefits society. Artificial Intelligence (AI) techniques are very well suited to identifying hidden patterns and correlations in this data deluge. In particular, clustering algorithms are of the utmost importance for performing exploratory data analysis to identify a set (a.k.a., cluster) of similar objects. clustering algorithms are computationally heavy workloads and require to be executed on high-performance computing clusters, especially to deal with large datasets. This execution on HPC infrastructures is an energy hungry procedure with additional issues, such as high-latency communications or privacy. Edge computing is a paradigm to enable light-weight computations at the edge of the network that has been proposed recently to solve these issues. In this paper, we provide an in-depth analysis of emergent edge computing architectures that include low-power Graphics Processing Units (GPUs) to speed-up these workloads. Our analysis includes performance and power consumption figures of the latest Nvidia's AGX Xavier to compare the energy-performance ratio of these low-cost platforms with a high-performance cloud-based counterpart version. Three different clustering algorithms (i.e., k-means, Fuzzy Minimals (FM), and Fuzzy C-Means (FCM)) are designed to be optimally executed on edge and cloud platforms, showing a speed-up factor of up to 11x for the GPU code compared to sequential counterpart versions in the edge platforms and energy savings of up to 150% between the edge computing and HPC platforms.
In order to overcome the premature convergence in the particle swarm optimization algorithm, dynamically chaotic perturbation is introduced to form a dynamically chaotic PSO, briefly denoted as DCPSO. To get rid of th...
详细信息
clustering is an essential tool in data mining research and applications. It is the subject of active research in many fields of study, such as computer science, data science, statistics, pattern recognition, artifici...
详细信息
clustering is an essential tool in data mining research and applications. It is the subject of active research in many fields of study, such as computer science, data science, statistics, pattern recognition, artificial intelligence, and machine learning. Several clustering techniques have been proposed and implemented, and most of them successfully find excellent quality or optimal clustering results in the domains mentioned earlier. However, there has been a gradual shift in the choice of clustering methods among domain experts and practitioners alike, which is precipitated by the fact that most traditional clustering algorithms still depend on the number of clusters provided a priori. These conventional clustering algorithms cannot effectively handle real-world data clustering analysis problems where the number of clusters in data objects cannot be easily identified. Also, they cannot effectively manage problems where the optimal number of clusters for a high-dimensional dataset cannot be easily determined. Therefore, there is a need for improved, flexible, and efficient clustering techniques. Recently, a variety of efficient clustering algorithms have been proposed in the literature, and these algorithms produced good results when evaluated on real-world clustering problems. This study presents an up-to-date systematic and comprehensive review of traditional and state-of-the-art clustering techniques for different domains. This survey considers clustering from a more practical perspective. It shows the outstanding role of clustering in various disciplines, such as education, marketing, medicine, biology, and bioinformatics. It also discusses the application of clustering to different fields attracting intensive efforts among the scientific community, such as big data, artificial intelligence, and robotics. This survey paper will be beneficial for both practitioners and researchers. It will serve as a good reference point for researchers and practitioners to desi
This paper studies the multivehicle task assignment problem where several dispersed vehicles need to visit a set of target locations in a time-invariant drift field while trying to minimize the total travel time. Usin...
详细信息
This paper studies the multivehicle task assignment problem where several dispersed vehicles need to visit a set of target locations in a time-invariant drift field while trying to minimize the total travel time. Using optimal control theory, we first design a path planning algorithm to minimize the time for each vehicle to travel between two given locations in the drift field. The path planning algorithm provides the cost matrix for the target assignment, and generates routes once the target locations are assigned to a vehicle. Then, we propose several clustering strategies to assign the targets, and we use two metrics to determine the visiting sequence of the targets clustered to each vehicle. Mainly used to specify the minimum time for a vehicle to travel between any two target locations, the cost matrix is obtained using the path planning algorithm, and is in general asymmetric due to time-invariant currents of the drift field. We show that one of the clustering strategies can obtain a min-cost arborescence of the asymmetric target-vehicle graph where the weight of a directed edge between two vertices is the minimum travel time from one vertex to the other respecting the orientation. Using tools from graph theory, a lower bound on the optimal solution is found, which can be used to measure the proximity of a solution from the optimal. Furthermore, by integrating the target clustering strategies with the target visiting metrics, we obtain several task assignment algorithms. Among them, two algorithms guarantee that all the target locations will be visited within a computable maximal travel time, which is at most twice of the optimal when the cost matrix is symmetric. Finally, numerical simulations show that the algorithms can quickly lead to a solution that is close to the optimal.
暂无评论