An intrusion detection system is an intelligent system developed to identify and counteract intrusive efforts. Clustering algorithms are used in intrusion detection systems for separating normal activities from abnormal ...
Spatial data mining is the process of identifying or extracting efficient, novel, potentially useful and ultimately understandable patterns from a spatial data set. Spatial clustering analysis is one of the most...
University course timetabling is an NP-hard problem that must be solved every semester. In this paper, we use a two-step algorithm for timetabling lecturers who are shared among departments (common lecturers). In the first step, we use a fuzzy multi-criteria decision-making comparison together with local search algorithms with seven neighborhood structures and random iteration. That is, the fuzzy multi-criteria comparison algorithm resolves the ambiguities in the preferences and soft constraints of common lecturers among departments, while the local search algorithm with seven neighborhood structures avoids becoming trapped in local optima and improves the fuzzy multi-criteria comparison over the preferences and soft constraints of lecturers. In the second step, the common lecturers' timetable generated in the first step is clustered by a clustering approach (k-means, fuzzy c-means and funnel shape) based on the preferences and soft constraints of common lecturers among departments. The common lecturers grouped by the clustering algorithms are then mapped to the traversed free resources according to the paper's aims: (1) descending satisfaction of preferences and soft constraints of common lecturers among departments and (2) minimizing the loss of extra resources of each faculty, so that an optimal instance of the common lecturers' timetable is generated among departments. The datasets used reflect real-world scheduling requirements for multiple departments of the Islamic Azad University, Ahar branch.
The reliability of the smart grid is adversely affected by system uncertainties. In addition, the steadily growing deployment of renewable distributed generation (DG) units increases the uncertainties of smart grids. Hence, it is essential to account for these uncertainties in the reliability evaluation of smart grids. Although Monte Carlo simulation (MCS) has received considerable attention in the literature, there is a research gap in using clustering algorithms to assess smart grids' reliability. This article aims to fill that gap by proposing a new reliability assessment method that uses various clustering algorithms. The accuracy and fast computation of the proposed method are especially beneficial in studies of optimal operation, optimal short-term planning, and other repetitive problems. In this paper, the performance and accuracy of various classic (k-means, fuzzy c-means, and k-medoids) and metaheuristic (genetic algorithm, particle swarm optimization, differential evolution, harmony search, and artificial bee colony) clustering algorithms are studied. Comparing different scenario reduction algorithms within the proposed reliability evaluation method is one of the main contributions. The proposed method is applied to two realistic test systems. Test results indicate that the proposed method is adequately precise, while the required computation time is less than that of MCS-based approaches. Test results for both test systems imply that the expected energy not supplied (EENS) can be estimated with an error of less than 2.1% by applying the proposed method. The fuzzy c-means clustering algorithm gives the best accuracy among the studied classic and nonclassic (metaheuristic) algorithms.
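The idea of replacing many sampled scenarios with a few weighted cluster centroids can be illustrated with a minimal sketch. It is not the method of the paper: the scenario model (a load value and a renewable output per scenario), the fixed `capacity` parameter, and all function names are assumptions made for illustration, and plain Lloyd's k-means stands in for the algorithms compared in the abstract.

```python
import random


def _nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])),
    )


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, weights), where each weight
    is the share of points assigned to that centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[_nearest(p, centroids)].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    counts = [0] * k
    for p in points:
        counts[_nearest(p, centroids)] += 1
    return centroids, [c / len(points) for c in counts]


def shortfall(load, renewable, capacity):
    """Unserved power in one scenario (hypothetical single-bus model)."""
    return max(0.0, load - renewable - capacity)


def sampled_estimate(scenarios, capacity):
    """Monte Carlo style estimate: average shortfall over every scenario."""
    return sum(shortfall(l, r, capacity) for l, r in scenarios) / len(scenarios)


def reduced_estimate(scenarios, k, capacity):
    """Scenario-reduction estimate: probability-weighted shortfall evaluated
    only at the k cluster centroids."""
    centroids, weights = kmeans(scenarios, k)
    return sum(w * shortfall(l, r, capacity) for (l, r), w in zip(centroids, weights))
```

The reduced estimate evaluates the system model only k times instead of once per sampled scenario, which is where the computational saving comes from. Because the shortfall function is nonlinear, the two estimates generally differ, and the size of that gap depends on k and on the clustering algorithm used.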
Data mining techniques are a powerful means of extracting information from large databases. Among these techniques, clustering and projection of data from high-dimensional spaces play a major role, since they allow hidden structures in the data set to be discovered. Following this approach, this paper presents a data analysis method designed to help the management and investigation of occupational accident databases. The purpose is to discover the most common sequences of events leading to accidents so that preventive actions can be devised. To this end, we developed a two-level approach based on the joint use of Kohonen's Self-Organizing Map and the k-means clustering algorithm. This approach not only groups the accidents into different classes but also visualizes them in a way the analyst can understand. The method has been applied with satisfactory results to a large database of occupational accidents that occurred in the Italian wood processing industry. A comparison with the hierarchical clustering method confirmed the effectiveness of the proposed approach. (C) 2011 Elsevier Ltd. All rights reserved.
A comprehensive analysis of the COVID-19 pandemic is necessary to prepare for future healthcare challenges. In this regard, the large number of datasets collected during the pandemic has allowed various studies on disease behavior and characteristics. For example, collected datasets can be used to extract knowledge about the symptomatic behavior of the disease. In this work, we analyze the relationships between the different symptoms of the disease across various dimensions, such as countries, variants of COVID-19, and age groups. To this end, we treat the co-occurrence of symptoms as the fundamental element. More precisely, we implemented clustering techniques to discover symptomatic patterns across the various dimensions. For instance, in analyzing the dominant patterns, we observe that the symptom "congestion or runny nose" almost always appears together with the symptom "muscle pain" across many dimensions. Hence, information on symptom patterns can be helpful in decision-making processes to detect and combat COVID-19 and similar diseases.
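As a rough illustration of what treating co-occurrence as the basic element can mean in practice, the sketch below counts how often symptoms appear together in patient records and turns the counts into a Jaccard similarity that a clustering step could consume. The record layout, the function names, and the choice of Jaccard similarity are assumptions for illustration; the abstract does not state which measure or clustering technique the authors used.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records):
    """Count single symptoms and unordered symptom pairs.

    records: iterable of symptom lists, one list per (hypothetical) patient.
    Returns (single_counts, pair_counts); pair keys are sorted tuples."""
    single = Counter()
    pair = Counter()
    for symptoms in records:
        unique = sorted(set(symptoms))  # ignore repeats within one record
        single.update(unique)
        pair.update(combinations(unique, 2))
    return single, pair


def jaccard(single, pair, a, b):
    """Share of records containing a or b that contain both."""
    both = pair[tuple(sorted((a, b)))]
    either = single[a] + single[b] - both
    return both / either if either else 0.0
```

A similarity close to 1 for a pair such as "congestion or runny nose" and "muscle pain" would correspond to the kind of dominant pattern the abstract describes. Computing the counts separately per country, variant, or age group gives one similarity matrix per dimension.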
Many datasets, including social media data and bibliographic data, can be modeled as graphs. Clustering such graphs can provide useful insights into the structure of the data. To improve the quality of clustering, node attributes can be taken into account, resulting in attributed graphs. Existing attributed graph clustering methods generally consider attribute similarity and structural similarity separately. In this paper, we represent attributed graphs as star-schema heterogeneous graphs, where attributes are modeled as different types of graph nodes. This enables the use of personalized PageRank (PPR) as a unified distance measure that captures both structural and attribute similarities. We employ DBSCAN for clustering, and we update edge weights iteratively to balance the importance of different attributes. The rapidly growing volume of data challenges traditional clustering algorithms, so a distributed method is required. Hence, we adopt Blogel, a popular distributed graph computing system, and on top of it develop four exact and approximate approaches that enable efficient PPR score computation when edge weights are updated. To improve the effectiveness of the clustering, we propose a simple yet effective edge weight update strategy based on entropy. In addition, we present a game-theory-based method that enables trading efficiency for result quality. Extensive experiments on real-life datasets offer insights into the effectiveness and efficiency of our proposals.
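To show how attribute nodes let one measure capture both kinds of similarity, here is a minimal single-machine sketch of personalized PageRank by power iteration. It is not the distributed Blogel-based computation described in the abstract; the dictionary graph format, the function name, and the restart probability `alpha=0.15` are assumptions chosen for illustration.

```python
def personalized_pagerank(graph, seed, alpha=0.15, iters=100):
    """Personalized PageRank scores with all restart mass on `seed`.

    graph: dict mapping every node to a dict {neighbor: edge_weight}.
    alpha: probability of jumping back to the seed at each step."""
    nodes = list(graph)
    scores = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            out = graph[u]
            total = sum(out.values())
            if total == 0:
                # Dangling node: send its walking mass back to the seed.
                nxt[seed] += (1.0 - alpha) * scores[u]
                continue
            for v, w in out.items():
                nxt[v] += (1.0 - alpha) * scores[u] * w / total
        nxt[seed] += alpha  # restart mass; scores always sum to 1
        scores = nxt
    return scores


# Hypothetical star-schema graph: "a", "b", "c" are entity nodes and
# "topic:db" is an attribute node shared by "a" and "b".
example = {
    "a": {"b": 1.0, "c": 1.0, "topic:db": 1.0},
    "b": {"a": 1.0, "topic:db": 1.0},
    "c": {"a": 1.0},
    "topic:db": {"a": 1.0, "b": 1.0},
}
ppr_from_a = personalized_pagerank(example, "a")
```

In this toy graph "b" and "c" are both direct neighbors of "a", but "b" also shares an attribute node with it, so "b" receives a higher score. Raising or lowering the weight of the edges to "topic:db" changes how much that attribute counts, which is the role the iterative edge weight updates play in the abstract.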
In this article, a general approach for directed graph clustering and two new density-based clustering objectives are presented. First, using an equivalence between the clustering objective functions and a trace maximization expression, the directed graph clustering objectives are converted into the corresponding weighted kernel k-means problems. Then, a nonspectral algorithm, which covers both the direction and weight information of the directed graphs, is proposed. Next, with Rayleigh's quotient, the upper and lower bounds of the clustering objectives are obtained. After that, we introduce a new definition of weak links to characterize the effectiveness of clustering. Finally, illustrative examples are given to demonstrate the effectiveness of the results. This article provides a glance at the potential connection between density-based and pattern-based clustering. Compared with other approaches for directed graph clustering, the method proposed in this article naturally avoids the loss of the nonsymmetric edge data because no additional symmetrization is needed.
The large amount of data available for analysis and management raises the need for defining, determining, and extracting meaningful information from the data. Hence, in scientific, engineering, and economics studies, the practice of clustering data arises naturally when sets of data have to be divided into subgroups with the aim of deducing common features for data belonging to the same subgroup. For instance, the innovation scoreboard [1] (see Figure 1) allows for the classification of countries into four main clusters corresponding to the level of innovation, defining the “leaders,” the “followers,” the “trailing,” and the “catching up” countries. Many other disciplines may require or take advantage of a clustering of data, from market research [2] to gene expression analysis [3], from biology to image processing [4][7]. Therefore, several clustering techniques have been developed (for details see “Review of clustering algorithms”).
The creation of stable, scalable and adaptive clusters with good performance, a fast convergence rate and minimal overhead is a challenging task in mobile ad hoc networks (MANETs). This study proposes two clustering techniques for MANETs, which are (k, r)-dominating set-based, weighted and adaptive to changes in the network topology. The set of dominating nodes functions as the clusterheads. The scenario-based clustering algorithm for MANETs (SCAM) is a greedy approximation algorithm, whereas the distributed-SCAM (DSCAM) selects the (k, r)-dominating set through a distributed election mechanism. These algorithms achieve a variable degree of clusterhead redundancy through the parameter k, which contributes to reliability. Similarly, flexibility in creating variable-diameter clusters is achieved with the parameter r. To improve the stability of the created clusters, the affiliation of other nodes with a clusterhead is decided based on the quality of the clusterhead, which is a function of connectivity, stability, residual battery power and transmission range. Mechanisms are available for accounting for group mobility and load balancing. The performance of these algorithms is evaluated through simulation, and the results show that they create stable, scalable and load-balanced clusters with relatively less control overhead than existing popular algorithms.
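The roles of the parameters k and r can be made concrete with a small sketch of a greedy (k, r)-dominating set construction. It assumes the common definition in which every node outside the set must have at least k set members within r hops. It is an unweighted simplification and not SCAM itself: the quality function from the abstract (connectivity, stability, battery power, transmission range) is left out, and the function names and adjacency-list format are illustrative.

```python
from collections import deque


def r_hop_ball(adj, source, r):
    """Set of nodes within r hops of source (including source), via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not expand beyond r hops
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def greedy_kr_dominating_set(adj, k, r):
    """Greedily pick clusterheads until every non-head node has at least
    k heads within r hops. adj: dict node -> list of neighbors (undirected)."""
    balls = {v: r_hop_ball(adj, v, r) for v in adj}
    heads = set()
    cover = {v: 0 for v in adj}  # heads within r hops of v, excluding v itself

    def unmet(v):
        return v not in heads and cover[v] < k

    while any(unmet(v) for v in adj):
        # Choose the non-head node whose r-hop ball holds the most unmet nodes.
        best = max(
            (v for v in sorted(adj) if v not in heads),
            key=lambda c: sum(1 for u in balls[c] if unmet(u)),
        )
        heads.add(best)
        for u in balls[best]:
            if u != best:
                cover[u] += 1
    return heads


def is_kr_dominating(adj, heads, k, r):
    """Check the (k, r)-domination condition for every non-head node."""
    return all(
        v in heads or len(r_hop_ball(adj, v, r) & heads) >= k for v in adj
    )
```

A larger k forces more clusterheads near each node, giving redundancy, while a larger r lets one clusterhead serve nodes further away, giving wider clusters. The loop always terminates, because any node that cannot collect k heads within r hops can itself be chosen as a head.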