Because malware poses a growing threat to computer systems and networks, traditional malware detection and recognition technologies face difficulties and limitations. Exploring new methods to improve the accuracy and efficiency of malware identification has therefore become an urgent need. This study introduces the ant colony algorithm to optimize traditional clustering algorithms and their parameters. The experimental results showed that the improved algorithm improved accuracy, recall, and false alarm rate by 0.253, 0.115, and 0.056, respectively. The accuracy on the training and validation sets continued to increase while the loss curve continued to decrease. In addition, the improved algorithm modeled data feature relationships and temporal information more effectively, which greatly helps in recognizing virus and worm software. The improved algorithm also occupied fewer computing resources than other algorithms while still monitoring device operation effectively. Compared with traditional methods, this method identifies malicious software more accurately and effectively picks out malicious samples from large-scale datasets. This is of great significance for protecting computer systems and network security.
Container port congestion threatens the effectiveness and sustainability of the global supply chain because it stagnates cargo flows and triggers ripple effects across connected, multimodal freight transport networks. This study aims to develop a novel and tangible method to measure port congestion by investigating ship behaviors between different zones in port waters. Different port zones have varying ship densities because ships moor in the anchorage area randomly but dock at berths in an orderly and close fashion. This observation leads us to apply the density-based clustering method for port zone identification and differentiation. In order to ensure the method is globally applicable and accurate, we develop a new clustering algorithm, an iterative, multi-attribute DBSCAN (IMA-DBSCAN), which incorporates an iterative process, together with both spatial information and domain knowledge. The necessary input data for the algorithm is extracted from the Automatic Identification System (AIS), a satellite-based tracking system with real-time ship positioning and sailing data. An illustrative case suggests that our algorithm can rapidly and precisely identify anchorage areas and individual berths (even in a port with complicated geographic features), while other methods cannot. The algorithm is applied to measure congestion at 20 major container ports in the world. The results show a significant increase in congestion at the Port of Los Angeles from August to December 2020, which matches the realistic statistics and proves the efficiency and practical applicability of the proposed algorithm.
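The density argument in the abstract above can be illustrated with plain DBSCAN. The following is a minimal sketch of standard DBSCAN only, not the authors' IMA-DBSCAN; the coordinates, `eps`, and `min_pts` values are made-up assumptions chosen to mimic tightly packed berth positions and loosely scattered anchorage positions.

```python
import math

def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN: one label per point, 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE              # may later be claimed as a border point
            continue
        cluster += 1                       # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster        # border point, not expanded further
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels

# Hypothetical ship positions in arbitrary planar units (assumption, not AIS data).
berth = [(0, 0), (1, 0), (2, 0), (3, 0)]              # docked close together
anchorage = [(50, 50), (58, 53), (52, 59), (60, 60)]  # moored far apart
ships = berth + anchorage

tight = dbscan(ships, eps=1.5, min_pts=3)   # only the dense berth forms a cluster
loose = dbscan(ships, eps=10.0, min_pts=3)  # the sparse anchorage now clusters too
```

With the small radius only the berth positions form a cluster, while the larger radius also groups the anchorage positions, which is the density contrast the abstract relies on.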
Economic growth has boosted tourism, but shortcomings in intelligent travel planning services restrict long-term, stable tourism development. Travel planning typically requires substantial time and cost. Moreover, most tourist attraction recommendations pay little attention to user preferences, which also results in low efficiency. In this paper, the K-means algorithm is first introduced for clustering analysis of user behavior or interests, so as to better understand user preferences. Gaussian kernel density estimation and similarity measurement are also adopted to improve the traditional K-means algorithm, which provides the foundation for a tourist attraction recommendation model. Then, to further improve transportation route planning, the study introduces the ant colony algorithm, an adaptive crossover strategy, and a local search algorithm to enhance the traditional genetic algorithm, yielding an optimized travel path planning model. The results show that the improved clustering algorithm achieves the highest accuracy, 0.96 and 0.78 on the Iris and Glass datasets respectively, along with sums of squared errors of 96.73 and 476.48 respectively. The shortest running time on the Yeast dataset is 1.22 s. The improved clustering algorithm with 50 nearest neighbors has a mean absolute error of 0.749, and its longest running time does not exceed 1 s. In summary, the model developed in this study is highly applicable to personalized recommendation services and efficient travel routes.
Author:
Ni, Kan; Gunma Univ
Grad Sch Sci & Technol, 1-5-1 Tenjincho, Kiryu 3768515, Japan
Current clustering algorithms suffer from sensitivity to initial selection, poor global search ability, and low clustering efficiency, which not only weakens their segmentation ability but also makes it difficult to meet the needs of practical applications. In response, the Differential Evolution (DE) algorithm was used to improve the Artificial Bee Colony (ABC) algorithm, and for practical clustering applications the improved algorithm was combined with Fuzzy C-Means (FCM) to form the DEABC-FC algorithm, which was experimentally analyzed. The experimental results showed that the mutation strategy results of DEABC-FC were around 60.574 and 1.541e+02 on the Iris and Glass datasets, and around 1.228e-10 and 6.003e-09 on the CMC and Wine datasets, all lower than those of the Artificial Bee Colony-Fuzzy clustering (ABC-FC) and FCM algorithms. In addition, after 15 iterations on the 4 datasets, the DEABC-FC algorithm showed relatively good clustering performance and balanced global and local search effectively. When the mutation factor and crossover factor were varied separately, increasing the mutation factor improved the diversity of the population and the stability of the algorithm, but also had a certain impact on its convergence. Increasing the crossover factor gave the DEABC-FC algorithm faster convergence and better global optimization ability. Overall, DEABC-FC has stronger global optimization ability and faster convergence in clustering than the ABC-FC and FCM methods, which can effectively offset the shortcomings of clustering algorithms and is significant for practical clustering applications.
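For readers unfamiliar with the FCM component mentioned above, the following is a minimal sketch of the standard Fuzzy C-Means alternating updates and objective. It does not reproduce the DE or ABC search described in the abstract; the sample points, initial centers, and fuzzifier `m = 2` are illustrative assumptions. A population-based optimizer would presumably score candidate center sets with an objective like `objective` below.

```python
import math

def memberships(points, centers, m=2.0):
    """Standard FCM membership update; u[j][i] is the degree of point j in cluster i."""
    u = []
    for x in points:
        d = [math.dist(x, c) for c in centers]
        if 0.0 in d:  # point sits exactly on a center: assign crisp membership
            hits = [1.0 if di == 0.0 else 0.0 for di in d]
            u.append([h / sum(hits) for h in hits])
            continue
        inv = [di ** (-2.0 / (m - 1.0)) for di in d]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u

def update_centers(points, u, m=2.0):
    """Standard FCM center update: mean of the points weighted by u ** m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append(tuple(
            sum(wj * x[d] for wj, x in zip(w, points)) / total for d in range(dim)
        ))
    return centers

def objective(points, centers, u, m=2.0):
    """FCM objective: sum over points and clusters of u ** m times squared distance."""
    return sum(u[j][i] ** m * math.dist(x, c) ** 2
               for j, x in enumerate(points) for i, c in enumerate(centers))

def fcm(points, centers, m=2.0, iters=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iters):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)

# Made-up 2-D data with two obvious groups, and arbitrary starting centers.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
start = [(1.0, 1.0), (9.0, 9.0)]
final_centers, final_u = fcm(data, start)
start_cost = objective(data, start, memberships(data, start))
final_cost = objective(data, final_centers, final_u)
```

Because the result depends on `start`, plain FCM is sensitive to initialization, which is the weakness the abstract says the DE-improved ABC search is meant to address.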
Determining accurate PM2.5 pollution concentrations and understanding their dynamic patterns are crucial for scientifically informed air pollution control strategies. Traditional reliance on linear correlation coefficients for ascertaining PM2.5-related factors only uncovers superficial relationships. Moreover, the invariance of conventional prediction models restricts their accuracy. To enhance the precision of PM2.5 concentration prediction, this study introduces a novel integrated model that leverages feature selection and a clustering algorithm. The model comprises three components: feature selection, clustering, and integrated prediction. It first employs the non-dominated sorting genetic algorithm (NSGA-III) to identify the features that most strongly affect PM2.5 concentration among air pollutants and meteorological factors. This step offers more valuable feature data for subsequent modules. The model then adopts a two-layer clustering method (SOM+K-means) to analyze the multifaceted irregularity within the dataset. Finally, the model establishes an Extreme Learning Machine (ELM) weak learner for each class and integrates the multiple weak learners using the AdaBoost algorithm to obtain a comprehensive prediction model. Through feature correlation enhancement, data irregularity exploration, and model adaptability improvement, the proposed model significantly enhances the overall prediction performance. Data from 12 Beijing-based monitoring sites in 2016 were used for an empirical study, and the model's results were compared with five other predictive models. The outcomes demonstrate that the proposed model significantly improves prediction accuracy, offering useful insights and potential for broader application to multifactor correlation concentration prediction for other pollutants.
In marketing, customer segmentation is a very critical element. This paper focuses on clustering algorithms. First, the commonly used K-means algorithm was introduced, and then, it was optimized using the improved Lion Swarm Optimization (ILSO) algorithm and the Calinski-Harabasz (CH) index. The results of the experiment for the UCI dataset showed that the CH indicator obtained an accurate number of clusters, and the clustering accuracy of the ILSO-K-means algorithm was higher, both above 90%. Then, in customer segmentation, the customers of an enterprise were divided into four groups using the ILSO-K-means algorithm, and different marketing suggestions were given. The experimental analysis proves the usability of the ILSO-K-means algorithm in customer segmentation, which can be further applied in practice.
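The Calinski-Harabasz index used above to choose the number of clusters is the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom: CH = [B / (k - 1)] / [W / (n - k)]. The following minimal sketch computes it for a given labeling; the toy points and the two candidate labelings are assumptions for illustration, and the ILSO optimization itself is not shown.

```python
def calinski_harabasz(points, labels):
    """CH index for a labeling; needs 2 <= k < n and nonzero within-cluster scatter."""
    n, dim = len(points), len(points[0])
    clusters = {}
    for x, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(x)
    k = len(clusters)

    def mean(ps):
        return tuple(sum(p[d] for p in ps) / len(ps) for d in range(dim))

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    overall = mean(points)
    # B: size-weighted squared distance of each cluster mean from the overall mean.
    between = sum(len(ps) * sq_dist(mean(ps), overall) for ps in clusters.values())
    # W: squared distance of each point from its own cluster mean.
    within = sum(sq_dist(x, mean(ps)) for ps in clusters.values() for x in ps)
    return (between / (k - 1)) / (within / (n - k))

# Toy data with three visible groups (assumed, not the UCI data from the paper).
pts = [(0, 0), (0, 2), (10, 0), (10, 2), (20, 0), (20, 2)]
ch_two = calinski_harabasz(pts, [0, 0, 0, 0, 1, 1])    # forced into 2 clusters
ch_three = calinski_harabasz(pts, [0, 0, 1, 1, 2, 2])  # natural 3 clusters
```

The labeling with the larger index is preferred, so here the three-cluster labeling wins over the two-cluster one.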
A distance measure is an effective tool for describing the difference between two vectors. Many scholars have proposed distance measures between intuitionistic fuzzy sets. However, there are few works on the interval-valued intuitionistic multiplicative (IVIM) distance measure, and the existing results are not sufficient to deal with problems involving the distance between two interval-valued intuitionistic multiplicative sets (IVIMSs). Thus, shortcomings remain in fully describing the difference between two IVIMSs. In this paper, we first propose an improved distance measure, the projection-based distance measure, which reflects the difference between two objects with IVIM information more accurately. After that, a new method is introduced to determine the experts' weights based on the projection-based distance measure. Then, to handle the group decision making problem in which the weights of experts are unknown, we use the proposed projection-based distance measure to construct the similarity matrix in the Boole clustering method. Finally, the clustering method is applied to a customer classification problem to test the reliability of the method.
A new clustering algorithm for spatio-temporal data is developed. The proposed method leverages a weighted combination of a spatial haversine distance matrix and a spectral density based temporal distance matrix between the locations. Concepts from the partitioning around medoids algorithm and the gap statistic are used to develop the algorithm and to determine the optimal number of clusters. Such a non-parametric algorithm is novel as it incorporates both spatial and temporal distances of the units and it can work for time series of possibly different lengths. A theoretical guarantee of consistency of the proposed method is provided. An elaborate simulation study is also given to demonstrate the efficacy of the algorithm. As an interesting real-life application, the proposed algorithm is used to analyze the spatio-temporal dynamics of the time series of coronavirus (COVID-19) incidence rates observed at county level in the United States of America. The results are demonstrated on datasets of different sizes: the entire country, the Midwest region, and the state of California. Special emphasis is placed on the last two cases to display how the clustering results offer interesting insights into the epidemic progression in these areas. In particular, the analysis sheds light on whether state-mandated restrictions impacted the entire state similarly or whether there are interesting local behaviors in the COVID-19 spread. © 2023 Elsevier B.V. All rights reserved.
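The combined distance described above can be sketched as follows. The haversine formula is standard; the rescaling of each matrix by its maximum, the convex weighting with `weight`, the Earth radius constant, and the sample coordinates and temporal distances are all assumptions made for illustration, since the abstract does not give the exact form of the combination. The spectral density based temporal distance is taken as a precomputed input and is not derived here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch

def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def spatial_matrix(locations):
    """Pairwise haversine distances between all locations."""
    return [[haversine_km(p, q) for q in locations] for p in locations]

def rescale(matrix):
    """Divide by the largest entry so both matrices share a 0..1 scale (assumption)."""
    top = max(max(row) for row in matrix)
    return [[v / top for v in row] for row in matrix] if top > 0 else matrix

def combine(spatial, temporal, weight):
    """Convex combination: weight on space, (1 - weight) on time (assumed form)."""
    s, t = rescale(spatial), rescale(temporal)
    return [[weight * sv + (1 - weight) * tv for sv, tv in zip(sr, tr)]
            for sr, tr in zip(s, t)]

# Hypothetical locations and a made-up symmetric temporal distance matrix.
places = [(0.0, 0.0), (0.0, 90.0), (45.0, 45.0)]
temporal = [[0.0, 1.0, 4.0], [1.0, 0.0, 2.0], [4.0, 2.0, 0.0]]
quarter_circle = haversine_km(places[0], places[1])
mixed = combine(spatial_matrix(places), temporal, weight=0.5)
```

The resulting matrix `mixed` could then be passed to any medoid-based clustering routine that accepts a precomputed dissimilarity matrix.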
Underwater acoustic sensor networks (UASNs), which are popular in various application fields, including marine resources development, environmental exploration, and seismic monitoring, have made great progress in recent years. To maintain good scheduling performance, clustering algorithms and multiple access control (MAC) protocols have been widely used in sensor networks to improve network efficiency. However, the existing algorithms and protocols still have many shortcomings. For example, many clustering algorithms pay little attention to delay performance, the cluster structures are not always fully utilized by MAC protocols, and cluster maintenance strategies are not considered. This article is devoted to solving those problems. By taking node traffic and distances into account simultaneously, we design a reasonable cluster structure. Based on this structure, we plan a conflict-free handshake protocol with minimal idle time gaps. We also design a joining-cluster strategy that lets free nodes join the network without causing interference. Simulation results show that our work performs well in network uniformity and end-to-end delay.
A large number of hits on the vertex detector of the International Linear Collider (ILC) are generated by the beam background, which increases the data flow of the detector system. Charged particles coming from the beam background have low momentum, resulting in the generation of elongated clusters. The CMOS pixel sensor (CPS), which integrates pre-processing functions and on-chip artificial neural networks (ANNs), could remove these elongated clusters. Clustering is the first step of data pre-processing and is used to collect clusters from raw data. In this article, a pixel-level clustering algorithm with a 5 × 5 window executed in real time is proposed. The algorithm is tested using 4500 frames (500 frames for each angle of incidence) of raw data (12 bits/pixel) from a MIMOSA-18 sensor and compared to conventional clustering algorithms. The clustering implementation for an example array of 5 × 5 pixels is synthesized for different frequencies (100 and 200 MHz) and analog-to-digital converter (ADC) resolutions (4 and 8 bits). The power dissipation and occupied area of the different implementations are analyzed. The hardware implementation of the algorithm makes it possible to integrate the clustering function into the CPS.
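To make the idea of window-based pixel clustering concrete, here is a minimal software sketch. It is a hypothetical seed-and-window scheme, not the algorithm proposed in the article: the seed rule (a pixel at or above `seed_threshold` that is the maximum of its 5 × 5 neighborhood), the summed window charge, and the sample frame are all assumptions.

```python
def find_clusters(frame, seed_threshold, window=5):
    """Return one cluster per seed pixel: its position and the summed window charge."""
    half = window // 2
    rows, cols = len(frame), len(frame[0])
    clusters = []
    for r in range(rows):
        for c in range(cols):
            value = frame[r][c]
            if value < seed_threshold:
                continue
            # Window cells around (r, c), clipped at the frame edges.
            cells = [(rr, cc)
                     for rr in range(max(0, r - half), min(rows, r + half + 1))
                     for cc in range(max(0, c - half), min(cols, c + half + 1))]
            # Skip unless (r, c) is the window maximum; ties go to the earlier pixel.
            if any(frame[rr][cc] > value
                   or (frame[rr][cc] == value and (rr, cc) < (r, c))
                   for rr, cc in cells):
                continue
            charge = sum(frame[rr][cc] for rr, cc in cells)
            clusters.append({"seed": (r, c), "charge": charge})
    return clusters

# Made-up 7 x 7 frame of pixel values with a single hit centered at (3, 3).
frame = [[0] * 7 for _ in range(7)]
frame[3][3], frame[3][4], frame[2][3] = 100, 40, 30
hits = find_clusters(frame, seed_threshold=50)
```

A hardware implementation would evaluate the window for each pixel in parallel rather than scanning the frame sequentially as this sketch does.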