Tropical cyclones (TCs) are a natural hazard that poses a major risk to human populations worldwide. This threat is expected to grow in a warming climate as the frequency of severe TCs is projected to increase. This study investigates how different monthly sea surface temperature (SST) patterns influence the location and frequency of tropical cyclone genesis (TCG) in the Southwest Pacific (SWP) region. Using principal component analysis and k-means clustering of monthly SST between 1970 and 2019, nine statistically different SST patterns are identified. Our findings show that the more prominent ENSO patterns, such as the Modoki El Niño (i.e., Modoki I and Modoki II) and the Eastern Pacific (EP) El Niño, significantly affect the frequency and location of TCG. Our results improve the overall understanding of TCG variability and of the relationship between TCG and SST configurations in the SWP region. The results may support early warning systems in the SWP by improving seasonal outlooks and the quantification of TC-related risk for vulnerable Pacific Island communities.
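The combination of principal component analysis and k-means described in this abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the study's pipeline: the array `sst`, the three artificial patterns, the number of retained components, and the helper names `pca_scores` and `kmeans` are all assumptions made for the example (the study itself reports nine patterns).

```python
import numpy as np

def pca_scores(X, n_components):
    """Project rows of X (months x grid points) onto the leading principal components."""
    Xc = X - X.mean(axis=0)                      # remove the time mean at each grid point
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on the rows of Z; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # squared distance from every row to every center
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in for 50 years of monthly SST fields: 600 months x 50 grid points,
# built from three made-up spatial patterns plus noise (hypothetical data).
rng = np.random.default_rng(42)
patterns = rng.normal(size=(3, 50))
regime = rng.integers(0, 3, size=600)
sst = 5.0 * patterns[regime] + rng.normal(scale=0.5, size=(600, 50))

scores = pca_scores(sst, n_components=5)   # reduce each month to 5 PC scores
labels, centers = kmeans(scores, k=3)      # group months by SST pattern
```

Clustering the PC scores rather than the raw grid keeps the distance computation cheap and suppresses small-scale noise; each resulting label marks the SST pattern a given month belongs to.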
We propose a new unsupervised learning method for clustering a large number of time series based on a latent factor structure. Each cluster is characterized by its own cluster-specific factors, in addition to some common factors that affect all the time series concerned. Our setting also allows for time series that do not belong to any cluster. Consistency with explicit convergence rates is established for the estimation of the common factors, the cluster-specific factors, and the latent clusters. Numerical illustrations with both simulated data and a real data example are also reported. As a spin-off, the proposed approach also significantly advances statistical inference for the factor model of Lam and Yao. Supplementary materials for this article are available online.
To reduce computational cost and improve the accuracy of flow field prediction, an adaptive proper orthogonal decomposition (APOD) surrogate model based on the k-means clustering algorithm is proposed to reconstruct the flow field of an impeller. The experimental samples were designed by perturbing blade control parameters such as the blade wrap angle and the blade outlet angle. The k-means clustering algorithm was used to classify the sample blade shapes and to find the cluster containing the objective blade. The snapshot set, which consists of the blade shapes and the impeller flow field data, can be described as a linear combination of orthogonal basis vectors by the POD method. A radial basis function (RBF) was used to fit the orthogonal basis coefficients of the objective blade, and the flow field of the objective impeller was then reconstructed. The traditional fixed-sample POD (FPOD) method and the proposed APOD method were both used to reconstruct the impeller flow field, and their prediction results were compared and analyzed. The results show that the proposed APOD method can quickly and accurately reconstruct the objective flow field. The flow field prediction accuracy of the APOD method is significantly higher than that of the FPOD method, and the computation time for the flow field prediction is less than 1/360 of that of CFD.
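The POD-plus-RBF reconstruction step in this abstract can be sketched as below. This is only an illustration under stated assumptions: the one-parameter toy "flow field", the Gaussian kernel, the shape parameter `eps`, and the function names are hypothetical, and the adaptive step (restricting the snapshots to the k-means cluster of the objective blade) is omitted.

```python
import numpy as np

def pod_basis(snapshots, n_modes):
    """snapshots: (n_samples, n_points). Returns mean field, POD modes, and coefficients."""
    mean = snapshots.mean(axis=0)
    U, s, Vt = np.linalg.svd(snapshots - mean, full_matrices=False)
    modes = Vt[:n_modes]                       # orthonormal basis, (n_modes, n_points)
    coeffs = (snapshots - mean) @ modes.T      # (n_samples, n_modes)
    return mean, modes, coeffs

def rbf_fit(params, values, eps):
    """Gaussian RBF interpolation weights mapping design parameters to POD coefficients."""
    d2 = ((params[:, None, :] - params[None, :, :]) ** 2).sum(axis=2)
    return np.linalg.solve(np.exp(-eps * d2), values)

def rbf_predict(params, weights, query, eps):
    """Evaluate the fitted RBF interpolant at one query parameter vector."""
    d2 = ((query[None, :] - params) ** 2).sum(axis=1)
    return np.exp(-eps * d2) @ weights

# Toy data: 8 designs described by one parameter, each with a 40-point field (made up).
x = np.linspace(0.0, 1.0, 40)
params = np.linspace(0.0, 1.0, 8)[:, None]
snapshots = params * np.sin(2 * np.pi * x) + params**2 * np.cos(2 * np.pi * x)

mean, modes, coeffs = pod_basis(snapshots, n_modes=2)
weights = rbf_fit(params, coeffs, eps=50.0)

# Predict the field for an unseen design without running a new simulation.
query = np.array([0.37])
predicted = mean + rbf_predict(params, weights, query, eps=50.0) @ modes
```

Because the surrogate only evaluates a small kernel sum and one matrix product per query, it is far cheaper than a new simulation, which is the kind of speed-up the abstract reports relative to CFD.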
In observational studies, imbalance in observed covariates between treatment groups often biases the estimation of treatment effects. The generalized propensity score (GPS) has recently been proposed to overcome this problem; however, a practical technique for applying the GPS is lacking. This study demonstrates how clustering algorithms can be used to group similar subjects based on the transformed GPS. We compare four popular clustering algorithms: k-means clustering (kMC), model-based clustering, fuzzy c-means clustering, and partitioning around medoids. The comparison uses three criteria (average dissimilarity between subjects within clusters, average Dunn index, and average silhouette width) under four different covariate scenarios. Simulation studies show that the kMC algorithm performs better overall than the other three clustering algorithms. We therefore recommend using the kMC algorithm to group similar subjects based on the transformed GPS.
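Two of the criteria named in this abstract, the within-cluster average dissimilarity and the Dunn index, can be computed as in the sketch below. The Euclidean distance, the one-dimensional toy scores, and the function names are assumptions for illustration; the study's exact definitions may differ.

```python
from itertools import combinations
from math import dist

def avg_within_dissimilarity(clusters):
    """Mean pairwise distance between subjects that share a cluster (smaller is better)."""
    pairs = [dist(p, q) for c in clusters for p, q in combinations(c, 2)]
    return sum(pairs) / len(pairs)

def dunn_index(clusters):
    """Smallest between-cluster distance over largest cluster diameter (larger is better)."""
    min_between = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a for q in b
    )
    max_diameter = max(
        dist(p, q) for c in clusters for p, q in combinations(c, 2)
    )
    return min_between / max_diameter

# Hypothetical one-dimensional transformed scores for four subjects.
good = [[(0.0,), (0.1,)], [(1.0,), (1.1,)]]   # similar subjects grouped together
bad = [[(0.0,), (1.0,)], [(0.1,), (1.1,)]]    # dissimilar subjects grouped together

good_scores = (avg_within_dissimilarity(good), dunn_index(good))
bad_scores = (avg_within_dissimilarity(bad), dunn_index(bad))
```

A grouping that places similar subjects together scores a lower average dissimilarity and a higher Dunn index, so the two criteria point in the same direction on this toy example.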
A rock mass discontinuity is a fundamental element of rock mass structure that governs its mechanical and hydrogeological properties and has significant engineering implications. In this research, we present a method for extracting discontinuities and performing efficient cluster analysis based on 3D point cloud data of rock outcrops. First, a k-d tree is used to organize the point cloud data so that normal vectors and curvature can be calculated quickly. Discontinuities are then extracted using a multirule region growing algorithm, and their dip directions and dip angles are calculated. Next, an improved farthest point sampling algorithm and the elbow method are used to optimize the k-means algorithm and to determine automatically the main discontinuity sets and the average orientation of each set for the rock mass. The approach is tested on two real cases and compared with the methods of international researchers. The proposed method shows good accuracy, with an average deviation of less than 5 degrees in dip direction and dip angle. Comparative tests on many point cloud data sets show that the method can extract discontinuities from massive-scale rock outcrop point cloud data and perform cluster analysis with high efficiency. The proposed method gives geologists and geological engineers a new tool for quickly and efficiently understanding rock outcrop discontinuities.
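The idea of seeding k-means with farthest point sampling and choosing the number of sets with the elbow method can be sketched as follows. This uses plain (not the paper's improved) farthest point sampling, a simple second-difference rule for the elbow, and Euclidean distance on hypothetical (dip direction, dip angle) pairs; all of these are simplifying assumptions, and orientation data would normally need a distance defined on normal vectors.

```python
from math import dist

def farthest_point_sampling(points, k, start=0):
    """Pick k well-separated seeds: each new seed is the point farthest from those chosen."""
    chosen = [start]
    min_d = [dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: min_d[i])
        chosen.append(nxt)
        min_d = [min(d, dist(p, points[nxt])) for d, p in zip(min_d, points)]
    return [points[i] for i in chosen]

def kmeans(points, centers, n_iter=50):
    """Lloyd's k-means from the given seeds; returns (centers, within-cluster sum of squares)."""
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            groups[j].append(p)
        new_centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    wcss = sum(min(dist(p, c) for c in centers) ** 2 for p in points)
    return centers, wcss

def elbow_k(points, k_max):
    """Return the k (2 <= k < k_max) where the WCSS curve bends most sharply, plus the curve."""
    wcss = [kmeans(points, farthest_point_sampling(points, k))[1]
            for k in range(1, k_max + 1)]
    bends = [wcss[i - 1] - 2 * wcss[i] + wcss[i + 1] for i in range(1, k_max - 1)]
    return bends.index(max(bends)) + 2, wcss

# Hypothetical orientations: three discontinuity sets, four planes each.
set_means = [(40, 30), (130, 35), (85, 85)]
offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
points = [(m[0] + o[0], m[1] + o[1]) for m in set_means for o in offsets]

best_k, wcss = elbow_k(points, k_max=6)
```

Deterministic, well-separated seeds make the result repeatable from run to run, which matters when the number of sets is to be determined automatically.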
The k-means algorithm is one of the basic clustering techniques used in many data mining applications. In this paper we present a novel pattern-based clustering algorithm that extends the k-means algorithm to cluster moving object trajectory data. The proposed algorithm uses a key feature of moving object trajectories, namely their direction, as a heuristic to determine the number of clusters for the k-means algorithm. In addition, we use the silhouette coefficient to measure the quality of the proposed approach. Finally, we present experimental results on both real and synthetic data that show the performance and accuracy of the proposed technique. © 2011 Faculty of Computers and Information, Cairo University. Production and hosting by Elsevier B.V. All rights reserved.
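The silhouette coefficient used here as a quality measure can be computed as in the following sketch. It uses the standard definition with Euclidean distance on hypothetical 2-D points; how the paper represents trajectories is not specified in the abstract, so the data and function name are assumptions.

```python
from math import dist

def silhouette(points, labels):
    """Mean silhouette value over all points; ranges from -1 (poor) to 1 (well separated)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical feature points, with one sensible and one poor labeling.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
good = silhouette(points, [0, 0, 1, 1])
bad = silhouette(points, [0, 1, 0, 1])
```

Because the score needs no ground-truth labels, it can be compared across different candidate numbers of clusters on the same data.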
ISBN: (print) 9781538617199
The Internet of Things (IoT) is emerging as the next stage of the Internet, linking data-generating smart objects through Internet connections. The IoT is understood as a seamless interconnection of sensor objects with the computational world. Clustering these objects is very helpful for data analytics because it groups objects with similar characteristics into a common structure. Clustering is therefore very useful in the IoT for improving Quality of Service (QoS) and organizing resources. In this work, a resource allocation model for the IoT based on k-means clustering is discussed. Performance is evaluated primarily in terms of the response time for transmitting a message from source to destination. A simulation study demonstrates the effectiveness of the model.
ISBN: (digital) 9783030875718
ISBN: (print) 9783030875718; 9783030875701
The extended belief rule base (EBRB) system has been successfully applied to classification problems in various fields. However, the existing EBRB generation method converts all data into extended belief rules, which produces a large rule base and reduces the efficiency and accuracy of subsequent inference. To address this, this paper proposes an EBRB rule reduction method based on an adaptive k-means clustering algorithm (RC-EBRB). In the rule generation process, the k-means clustering algorithm is applied to obtain the rule cluster centers, which are used to generate new rules. These new rules then form a reduced EBRB. To determine the initial cluster centers, the idea of k-means++ is introduced; to determine the number of clusters, a reduction granularity adjustment algorithm with a threshold is proposed. Finally, four commonly used classification datasets from UCI are used to verify the performance of the proposed method. The experimental results are compared with existing EBRB methods and traditional machine learning methods, and demonstrate the effectiveness of the method.
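The k-means++ idea mentioned for choosing initial cluster centers can be sketched as follows. This shows only the standard seeding rule on hypothetical 2-D points; the paper's threshold-based granularity adjustment is not reproduced, and the function name and data are assumptions. The sketch also assumes `k` does not exceed the number of distinct points.

```python
import random
from math import dist

def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: sample each new center with probability proportional
    to its squared distance from the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(dist(p, c) for c in centers) ** 2 for p in points]
        # points that are already centers have weight 0 and cannot be drawn again
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

# Hypothetical rule antecedents reduced to 2-D points for illustration.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0), (21, 0)]
centers = kmeans_pp_init(points, k=3, rng=random.Random(0))
```

Seeding in this way tends to spread the initial centers across the data, so the subsequent k-means iterations are less likely to merge two natural groups into one reduced rule.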
ISBN: (print) 9781479941261
Cluster analysis of load curves based on big data from electricity information systems is an important foundation for analyzing the load characteristics and electricity consumption habits of large users. Because the traditional k-means clustering algorithm is slow on big data, a parallel k-means clustering algorithm is proposed to speed up the clustering procedure. First, all load curves are de-noised by wavelet decomposition to reduce the influence of small fluctuations. Second, a k-means clustering algorithm based on multi-core parallel technology is applied to load curve clustering. Third, more than 40,000 load curves are clustered with this algorithm. Test results show that the proposed parallel k-means clustering algorithm effectively speeds up the clustering procedure.
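One common way to parallelize k-means, which may or may not match the paper's implementation, is to split the curves into chunks, let each worker assign its chunk and return per-cluster partial sums, and then merge those sums into new centers. The sketch below shows that structure on toy data. It uses a thread pool so that it runs anywhere; threads do not give a multi-core speed-up for pure-Python arithmetic, so a process pool (`concurrent.futures.ProcessPoolExecutor`) would be the drop-in choice for real parallelism. The function names and the toy curves are assumptions, and the wavelet de-noising step is omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist

def partial_stats(chunk, centers):
    """Assign each curve in the chunk to its nearest center; return per-center sums and counts."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for p in chunk:
        j = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
        counts[j] += 1
        for d, x in enumerate(p):
            sums[j][d] += x
    return sums, counts

def parallel_kmeans(points, centers, n_workers=4, n_iter=20):
    """k-means where the assignment step is mapped over chunks and the update step merges them."""
    chunks = [points[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(n_iter):
            results = list(pool.map(partial_stats, chunks, [centers] * n_workers))
            new_centers = []
            for j, old in enumerate(centers):
                n = sum(counts[j] for _, counts in results)
                if n == 0:
                    new_centers.append(old)   # keep an empty cluster's center unchanged
                    continue
                new_centers.append(tuple(
                    sum(sums[j][d] for sums, _ in results) / n
                    for d in range(len(old))
                ))
            if new_centers == centers:
                break
            centers = new_centers
    return centers

# Toy "load curves" with three readings each: a low-demand and a high-demand group.
curves = [(0, 0, 0), (2, 2, 2), (9, 9, 9), (11, 11, 11)] * 10
final = parallel_kmeans(curves, centers=[(0, 0, 0), (11, 11, 11)])
```

Only the small per-cluster sums and counts travel between workers and the coordinator, which is what keeps this scheme practical for tens of thousands of curves.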
ISBN: (print) 9798350339345
To address the wide distribution and difficult scheduling of distributed photovoltaic power stations, a dynamic cluster partitioning strategy based on distributed photovoltaic output prediction and an improved clustering algorithm is proposed. First, the output data of the photovoltaic power stations are analyzed for correlation, and new sample data are constructed and fed to a deep recurrent neural network to obtain reliable output predictions. Then, the grey wolf optimization algorithm is used to improve the k-means clustering algorithm, which is applied to a data set containing the output values and environmental parameters of the photovoltaic power plants to obtain the distributed photovoltaic plant clusters with the best dynamic supply and demand balance. Finally, the proposed strategy is tested and analyzed on the IEEE 33-node system. The experimental results show that the modularity and the dynamic supply and demand balance value of the cluster division are 0.783 and 0.819, respectively, and that the photovoltaic output prediction performs well.