Cluster analysis is a common application of maritime traffic pattern recognition. However, most current methods are unsupervised clustering algorithms that rely solely on the features of the trajectories themselves, which makes them susceptible to uneven density distributions. To address this issue, we propose a semisupervised clustering algorithm that uses geospatial data to segment trajectories and generate clustering constraints. First, shoreline and anchorage information is used for trajectory segmentation to make local trajectory features more consistent. Second, cannot-link constraints derived from shoreline data are used to modify the distance between two trajectories, which prevents trajectories separated by nonnavigable waters from being grouped into a single cluster. Last, the method is verified experimentally on automatic identification system (AIS) data, and the results are compared with those of existing algorithms. The results show more concentrated trajectory segmentation points, easier selection of clustering parameters, and significantly improved scores on external clustering validity indices.
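The abstract does not state how the cannot-link constraints change the trajectory distance. As a minimal sketch, assuming the constraint simply inflates the pairwise distance to a large penalty, the idea could look like the following. The function names, the penalty rule, and the toy distance matrix are all hypothetical and are not taken from the paper.

```python
def apply_cannot_link(dist, cannot_link, penalty=None):
    """Return a copy of a symmetric distance matrix in which every
    cannot-link pair is pushed apart to at least `penalty`."""
    if penalty is None:
        # Assumption: ten times the largest observed distance is "far enough".
        penalty = 10.0 * max(max(row) for row in dist)
    out = [row[:] for row in dist]  # leave the input matrix untouched
    for i, j in cannot_link:
        out[i][j] = out[j][i] = max(out[i][j], penalty)
    return out


def neighbors_within(dist, i, eps):
    """Indices of items within distance eps of item i (density-style query)."""
    return [j for j in range(len(dist)) if j != i and dist[i][j] <= eps]


# Toy example: trajectories 0 and 2 are geometrically close, but we assume
# shoreline data says land lies between them.
dist = [
    [0.0, 1.0, 1.2],
    [1.0, 0.0, 5.0],
    [1.2, 5.0, 0.0],
]
adjusted = apply_cannot_link(dist, cannot_link=[(0, 2)])
```

With the adjusted matrix, a neighborhood query around trajectory 0 no longer returns trajectory 2, so a density-based clusterer working on these distances cannot join the pair directly.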
Training deep learning models requires a large number of labeled samples, but acquiring labeled samples in the medical field is time-consuming and laborious. To solve this problem, a semisupervised clustering algorithm combined with a 3D convolutional neural network model is proposed to improve the classification of benign and malignant pulmonary nodules. The approach has three steps. First, a multiresolution 3D dual path squeeze excitation deep learning network model is constructed. Then, the feature extractor in the network model is used to extract high-level image features, and semisupervised clustering is applied to the extracted features. This yields pseudolabels for the unlabeled samples, so that their categories can be determined and used in training. Finally, an oversampling algorithm is used to balance the sample categories, and the benign and malignant pulmonary nodules are classified by a classifier built from a 3D dual path squeeze excitation network. The experimental results show that the proposed semisupervised clustering algorithm can label the categories of unlabeled samples, and that the proposed network model can learn more characteristics related to pulmonary nodules and effectively improve their classification. The proposed network model was tested on the Lung Image Database Consortium (LIDC-IDRI) dataset and obtained an accuracy of 94.4% and an AUC of 0.931. Compared with some existing classification models, the proposed method classifies pulmonary nodules more accurately.
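The abstract does not specify which semisupervised clustering algorithm produces the pseudolabels. One plausible stand-in is a seeded k-means: labeled feature vectors initialise and anchor one centroid per class, and unlabeled vectors take the label of the nearest centroid. The sketch below uses that substitute technique on made-up 2D features; the function names and data are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def seeded_kmeans_pseudolabels(labeled, unlabeled, iterations=10):
    """labeled: list of (feature_vector, label); unlabeled: list of vectors.
    Returns one pseudolabel per unlabeled vector."""
    classes = sorted({label for _, label in labeled})
    # Initialise each class centroid from its labeled seeds.
    centers = {c: centroid([x for x, y in labeled if y == c]) for c in classes}
    pseudo = [None] * len(unlabeled)
    for _ in range(iterations):
        new_pseudo = [
            min(classes, key=lambda c: math.dist(x, centers[c]))
            for x in unlabeled
        ]
        if new_pseudo == pseudo:  # assignments stable: converged
            break
        pseudo = new_pseudo
        for c in classes:
            # Labeled seeds always stay in their own class.
            members = [x for x, y in labeled if y == c]
            members += [x for x, p in zip(unlabeled, pseudo) if p == c]
            centers[c] = centroid(members)
    return pseudo


# Toy features standing in for the network's high-level image features.
labeled = [
    ([0.0, 0.0], "benign"),
    ([0.2, 0.1], "benign"),
    ([5.0, 5.0], "malignant"),
    ([5.1, 4.8], "malignant"),
]
unlabeled = [[0.1, 0.3], [4.7, 5.2], [0.4, 0.0]]
pseudolabels = seeded_kmeans_pseudolabels(labeled, unlabeled)
```

In a pipeline like the one described, the pseudolabeled samples would then join the labeled set before class balancing and classifier training.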
ISBN: 9781424496365 (print)
Clustering algorithms depend strongly on the dissimilarity used to evaluate sample proximities. In real applications, several dissimilarities are available that may come from different object representations or data sources. Each dissimilarity usually provides complementary information about the problem, so they should be integrated in order to reflect the object proximities accurately. In many applications, user feedback or a priori knowledge about the problem provides pairs of similar and dissimilar examples. In this paper, we address the problem of learning a linear combination of dissimilarities using side information in the form of equivalence constraints. The error function is minimized with a quadratic optimization algorithm. A smoothing term is included that penalizes the complexity of the family of distances and avoids overfitting. The experimental results suggest that the proposed method outperforms a standard metric learning algorithm and improves on classification and clustering results based on a single dissimilarity and data source.
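The abstract does not give the error function itself. As a rough illustration, assume a squared-error objective that drives the combined dissimilarity toward 0 for similar pairs and 1 for dissimilar pairs, adds an L2 smoothing term, and keeps the weights nonnegative. The sketch below minimises that assumed objective by projected gradient descent, not by the paper's quadratic optimization algorithm. All names, targets, and hyperparameters are hypothetical.

```python
def learn_weights(pairs, lam=0.01, step=0.1, iterations=2000):
    """pairs: list of (dissimilarities, target), where `dissimilarities`
    holds one value per base dissimilarity for an object pair and `target`
    is 0.0 for a similar pair or 1.0 for a dissimilar pair.
    Minimises sum((w . d - target)^2) + lam * ||w||^2 subject to w >= 0."""
    k = len(pairs[0][0])
    w = [1.0 / k] * k  # start from a uniform combination
    for _ in range(iterations):
        grad = [2.0 * lam * wk for wk in w]  # gradient of the smoothing term
        for dissims, target in pairs:
            residual = sum(wk * dk for wk, dk in zip(w, dissims)) - target
            for idx, dk in enumerate(dissims):
                grad[idx] += 2.0 * residual * dk
        # Gradient step followed by projection onto the nonnegative orthant.
        w = [max(0.0, wk - step * gk) for wk, gk in zip(w, grad)]
    return w


# Toy side information: the first dissimilarity separates similar from
# dissimilar pairs, the second does not.
pairs = [
    ([0.1, 0.4], 0.0),  # similar pair
    ([0.1, 0.6], 0.0),  # similar pair
    ([0.9, 0.4], 1.0),  # dissimilar pair
    ([0.9, 0.6], 1.0),  # dissimilar pair
]
weights = learn_weights(pairs)
```

On this toy data the learned weights favour the first, informative dissimilarity and shrink the second to zero, which is the qualitative behaviour the abstract describes.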