As consumer preferences and behaviors grow more complex, businesses struggle to capture the dynamic nature of online consumer behavior, highlighting the need for advanced approaches. This study aims to enhance customer segmentation in e-marketing by analyzing and comparing various machine learning-based clustering methods, with a particular focus on unsupervised clustering techniques for predicting Customer Lifetime Value (CLV). While prior research has utilized unsupervised clustering for customer segmentation, this study uniquely integrates K-Means++ with other clustering techniques to enhance segmentation accuracy and gain deeper insights into consumer behavior. This study adopts a structured, unsupervised clustering approach, enabling natural customer groupings without predefined labels, which is particularly suitable for customer segmentation in scenarios with limited labeled data. Several clustering techniques are investigated, including K-Means, K-Medoids, Agglomerative clustering, DBSCAN, Fuzzy C-Means, K-Means++, Mini Batch K-Means, Mean Shift, and Gaussian Mixture Models (GMM). Performance is evaluated using key metrics such as the Silhouette Score and Davies-Bouldin Index, and K-Means++ demonstrated superior segmentation accuracy, outperforming the other techniques under various conditions. Utilizing Kaggle datasets, the analysis follows a comprehensive preprocessing protocol comprising RFM (Recency, Frequency, Monetary) analysis, outlier removal, and data normalization to ensure data integrity and facilitate systematic identification of distinct consumer segments. This research highlights the potential and significance of machine learning in refining customer segmentation processes within e-marketing, ultimately aiding businesses in optimizing their marketing effectiveness and strategic planning. While focusing primarily on a limited selection of clustering methods, the study underscores the necessity for ongoing explor
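The normalization and K-Means++ seeding steps named in this abstract can be sketched as follows. This is a minimal illustration in plain Python, not the study's pipeline: the function names `min_max_scale` and `kmeans_pp_init`, the choice of min-max scaling, and the toy RFM rows are all assumptions made for the example.

```python
import random


def min_max_scale(rows):
    """Rescale every column of `rows` to [0, 1] (assumed normalization choice)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        tuple((x - l) / (h - l) if h > l else 0.0 for x, l, h in zip(row, lo, hi))
        for row in rows
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centers with K-Means++ seeding.

    The first center is drawn uniformly; each later center is drawn with
    probability proportional to its squared distance from the nearest
    center chosen so far. Assumes `points` holds at least k distinct items.
    """
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


# Hypothetical (recency in days, frequency, monetary value) rows.
rfm = [(5, 20, 900.0), (7, 18, 850.0), (200, 1, 30.0),
       (180, 2, 45.0), (60, 8, 300.0), (65, 7, 280.0)]
scaled = min_max_scale(rfm)
centers = kmeans_pp_init(scaled, k=3, seed=42)
```

The seeds would then be handed to ordinary K-Means iterations. Because an already chosen point has zero weight, it cannot be drawn a second time.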
This paper investigates the long-term average age of information (AoI)-minimal problem in an unmanned aerial vehicle (UAV)-assisted wireless-powered communication network (WPCN), which consists of a static hybrid access point (HAP), a mobile UAV, and many static sensor nodes (SNs) randomly distributed on multiple islands. The UAV is first fully charged by the HAP and then flies to each island to charge the SNs and receive data from them. Before its battery runs out, the UAV flies back to the HAP to offload the received data and recharge fully. Because the UAV's battery capacity is finite, it cannot traverse all the islands and collect all the data from the SNs in a single flight. We therefore divide the islands into multiple clusters so that the UAV can traverse all the islands in each cluster, and formulate the long-term average AoI-minimal problem by jointly optimizing the transmit power of the SNs, the clustering of islands, and the UAV's flight trajectory. To tackle this NP-hard problem, we decouple it into two subproblems: the power allocation subproblem for the SNs, and the joint subproblem of island clustering and UAV flight trajectory design. To solve the first subproblem, we propose a hybrid TDMA and NOMA (HTN) protocol that takes advantage of both protocols. To solve the second subproblem, we propose a clustering-based dynamic adjustment of the shortest path (C-DASP) algorithm. Simulations verify the effectiveness and superiority of the proposed HTN protocol and C-DASP algorithm.
To address the optimization of expensive multi-modal multi-objective black-box functions, this paper proposes a systematic optimization method that uses a back propagation neural network as a proxy model, combined with a clustering algorithm to cluster and differentiate multiple modes within the model. The method utilizes normalization and weighted summation of numerical experimental data to optimize the hyperparameters of the optimization algorithm's decision-making process. This approach resolves the multiple constraints of traditional optimization methods. Finally, the method is applied to a hollow-core anti-resonant fiber with gap angle constraints that keep its topological properties unchanged; the model optimizes three categories totaling six modes, with birefringence and loss as the objectives. The top five optimal models achieve a minimum loss of 3.54 × 10⁻³ dB/m, maximum birefringence of 1.25 × 10⁻⁴, higher order modulation enhanced receiver of 50, and bandwidth of 1.425 μm.
The aim of this paper is to investigate the multiple criteria group decision-making problem in which the evaluation values provided by experts form hesitant fuzzy numbers. We first analyze the characteristics of the hesitant fuzzy number (HFN) and define the concept of the feature vector of an HFN. Then, using the feature vectors of HFNs, we present some new similarity measures of HFNs, which do not need to add elements to the HFN with fewer elements during calculation. A fuzzy similarity matrix is then constructed and used to obtain a fuzzy equivalent matrix via the transitive closure method. By applying the fuzzy equivalent matrix, a novel hesitant fuzzy clustering algorithm is given. Furthermore, a new multiple criteria group decision making (MCGDM) algorithm is developed on the basis of the hesitant fuzzy clustering algorithm and the idea of the ideal solution in multiple criteria decision making theory. To illustrate the effectiveness and feasibility of the developed MCGDM method, a numerical example is given and analyzed in detail. The results illustrate that the proposed method provides more reasonable and credible rankings than existing methods because it keeps the original data during computation.
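The step from a fuzzy similarity matrix to a fuzzy equivalent matrix, and from there to clusters, can be illustrated with the standard max-min transitive closure and a lambda-cut. The sketch below assumes a reflexive, symmetric similarity matrix is already available; the HFN similarity measures themselves are not reproduced, and the function names and the 3 × 3 matrix are hypothetical.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(r):
    """Square a reflexive, symmetric fuzzy relation until it stops changing."""
    while True:
        r2 = max_min_compose(r, r)
        if r2 == r:  # exact comparison is safe: max/min only copy entries
            return r
        r = r2


def lambda_cut_clusters(r_star, lam):
    """Group indices whose equivalence degree is at least `lam`."""
    clusters, assigned = [], set()
    for i in range(len(r_star)):
        if i in assigned:
            continue
        group = [j for j in range(len(r_star)) if r_star[i][j] >= lam]
        assigned.update(group)
        clusters.append(group)
    return clusters


# Hypothetical similarity matrix for three alternatives.
similarity = [[1.0, 0.8, 0.3],
              [0.8, 1.0, 0.5],
              [0.3, 0.5, 1.0]]
equivalent = transitive_closure(similarity)
print(lambda_cut_clusters(equivalent, 0.8))  # [[0, 1], [2]]
print(lambda_cut_clusters(equivalent, 0.5))  # [[0, 1, 2]]
```

Lowering the threshold merges clusters, so sweeping lambda from 1 down to 0 yields a hierarchy of partitions.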
There is a demand for effective clustering methods, especially given the increasing complexity of data scenarios in modern applications. Motivated by the limitations of traditional clustering algorithms, particularly in handling noise, imbalanced data, and large-scale datasets, this work introduces TRUNC, a Transfer Learning Unsupervised Network for Data clustering. TRUNC leverages a bio-inspired approach, utilizing a single-layer feed-forward winner-takes-all neural network enhanced with transfer learning to optimize clustering performance. The proposed algorithm is evaluated across a range of synthetic and real-world datasets, demonstrating its robustness and performance over conventional and state-of-the-art clustering methods. Key contributions include the integration of transfer learning into the clustering process, a detailed sensitivity analysis of the TL parameter, and extensive experimentation that confirms the efficiency and generalization capability of TRUNC. Results indicate that TRUNC improves clustering quality and convergence speed, offering a competitive and scalable solution for various clustering tasks. This proposal opens new avenues for the integration of TRUNC's principles with other bio-inspired algorithms and its application across diverse domains.
To plan bus public transportation systems properly, especially as electric buses are introduced into fleets, it is essential to characterize the routes, traffic patterns, speed, constraints, and presence of steep slopes. GPS (Global Positioning System) receivers are now available in fleets worldwide. However, they often produce datasets of poor quality, with low data rates, loss of information, noisy samples, and occasional paths that do not belong to regular bus routes. Extracting useful information from such poor data is therefore a challenging task. This paper proposes a novel method based on an unsupervised competitive density clustering algorithm to obtain hot spot clusters of any density. The clusters result from their competition for the GPS samples. Each cluster attracts GPS samples up to a maximum radius from its centroid and then moves toward the densest areas. The winning clusters are sorted using a novel distance metric with the support of a visual interface, forming a sequence of points that outlines the bus trajectory. Finally, indicators are correlated with the clusters, characterizing the trajectory and allowing extensive assessments. In real case studies, the method performs well with noisy GPS samples and loss of information. The method's parameters are largely fixed, giving fair performance on most GPS datasets without custom adjustment. The paper also proposes a framework for preparing the input GPS dataset, clustering, sorting the clusters to outline the trajectory, and characterizing the trajectory.
In image clustering applications, deep feature clustering, which employs deep neural networks to learn features that favor clustering, has recently demonstrated impressive performance. In this context, density-based methods have emerged as the preferred choice for the clustering mechanism within the framework of deep feature clustering. However, as these clustering algorithms are primarily effective on low-dimensional feature data, deep feature learning models play a crucial role here. With far infrared (FIR) thermal imaging systems working in real-world scenarios, the captured images are largely affected by blurred edges, background noise, thermal irregularities, few details, etc. In this work, we demonstrate the effectiveness of granular computing-based techniques in such scenarios, where the input data contains indiscernible image regions and vague boundary regions. We propose a novel adaptive non-homogeneous granulation (ANHG) technique that adaptively selects the smallest possible granule size within a purview of unequally-sized granulation, based on a segmentation assessment index. The proposed ANHG, in combination with deep feature learning, helps extract complex, indiscernible information from the image data and capture the local intensity variation of the data. Experimental results show a significant performance improvement of the density-based deep feature clustering method after the incorporation of the proposed granulation scheme.
Real-world multi-view data may manifest as point-clouds, but their meaningful structure often resides on a lower dimensional manifold embedded in the higher dimensional space. Consequently, existing graph based multi-view algorithms focus primarily on extracting low-rank subspaces for clustering. However, simultaneously optimizing the individual graph structures, their contributions/weights, and the clustering subspaces is likely to give a more comprehensive picture of the clusters present in the data set. In this regard, the paper proposes a Riemannian manifold optimization algorithm that harnesses the geometry and structure preserving properties of the symmetric positive definite (SPD) manifold and the Grassmann manifold for efficient multi-view clustering. The SPD manifold is used to optimize the graph Laplacians corresponding to the individual views, while preserving their symmetry, positive definiteness, and related properties. The Grassmann manifold, on the other hand, is used to minimize the disagreement between the joint and individual clustering subspaces. Optimization over the Grassmann manifold additionally enforces the clustering solutions to be basis invariant, such that all cluster indicator matrices whose columns span the same subspace map to the same clustering solution. A gradient based line-search algorithm, which alternates between the two manifolds, is proposed to optimize the multi-view graph Laplacians and their associated subspaces. Matrix perturbation theory is used to theoretically bound the disagreement between the clustering subspaces. Extensive experiments on several multi-view benchmark and multi-omics cancer data sets establish the effectiveness of the proposed method. The proposed work demonstrates how incorporating Riemannian optimization into the graph learning framework can boost multi-view clustering performance.
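The basis invariance mentioned in this abstract can be made concrete with a short identity. The notation is assumed for illustration: U and V are n × k matrices with orthonormal columns and Q is any k × k orthogonal matrix. The projection distance shown is one standard way to measure disagreement between two subspaces, not necessarily the objective used in the paper.

```latex
% Changing the basis of a subspace leaves its projector unchanged:
(UQ)(UQ)^{\top} = U Q Q^{\top} U^{\top} = U U^{\top}.

% A disagreement measure built from projectors is therefore basis invariant:
d(U, V)^{2}
  = \tfrac{1}{2}\,\bigl\lVert U U^{\top} - V V^{\top} \bigr\rVert_{F}^{2}
  = k - \bigl\lVert U^{\top} V \bigr\rVert_{F}^{2}.
```

The second equality follows from expanding the Frobenius norm with tr(UUᵀ) = tr(VVᵀ) = k and tr(UUᵀVVᵀ) = ‖UᵀV‖²_F.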
Small-target detection in sea clutter has always been difficult due to the low signal-to-clutter ratio (SCR). Feature-based detection methods have emerged as a significant area of research in recent years, where pattern recognition methods involve the use of similarity measures. The Gaussian kernel function used in traditional spectral clustering algorithms performs well on nonlinear data; however, it may not effectively capture the relationships between sparsely distributed samples. To address this issue, this article proposes a novel method for small-target detection in sea clutter: spectral clustering based on a neighborhood density similarity measure (SC-ND). It incorporates neighborhood density into the Gaussian kernel function and then uses label information to guide the construction of the similarity matrix, which makes the false alarm rate controllable. The similarity matrix is then spectrally decomposed to obtain a new representation of the samples, which are subsequently classified using an improved k-means algorithm. Validated on the IPIX 1993 radar dataset, the proposed method achieves a detection probability of 71.1% with an observation time of 1.024 s and a false alarm rate of 0.001. These results are better than those of the fractal-based detector, the tri-feature-based detector, and the feature-based detector using three TF features.
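The abstract does not give the SC-ND similarity formula, so the sketch below substitutes a different, well-known density-adaptive kernel: the local scaling affinity from self-tuning spectral clustering. It is only meant to show why a per-sample bandwidth helps with sparsely distributed samples. The function name, the neighbour count `k`, and the toy points are assumptions.

```python
import math


def local_scaling_affinity(points, k=1):
    """Gaussian affinity with a per-sample bandwidth (local scaling).

    sigma_i is the distance from sample i to its k-th nearest neighbour,
    and A[i][j] = exp(-d(i, j)^2 / (sigma_i * sigma_j)).
    Assumes more than k samples and no duplicate points.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # sorted(row)[0] is the zero self-distance, so index k is the k-th neighbour.
    sigma = [sorted(row)[k] for row in dist]
    return [
        [math.exp(-dist[i][j] ** 2 / (sigma[i] * sigma[j])) for j in range(n)]
        for i in range(n)
    ]


# A tight pair (spacing 1) and a sparse pair (spacing 3), far from each other.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 3.0)]
affinity = local_scaling_affinity(samples, k=1)
```

With a single global bandwidth the sparse pair would receive a much lower similarity than the tight pair. With local scaling both pairs receive the same value, while similarities across the two groups stay close to zero.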
The core goal of software defect prediction (SDP) is to identify modules with a high likelihood of defects, thereby enabling prioritization of quality assurance activities with low inspection effort. Many supervised defect prediction models have been studied extensively. However, these methods require labeled data to obtain enough training modules, which consumes considerable human resources. Cross-project defect prediction primarily reuses models trained on other projects with enough historical data. However, this strategy is often hindered by large distribution differences across projects and by data privacy concerns. Unsupervised learning is an alternative for unlabeled data, but it mainly focuses on single-view prediction by concatenating all the software metrics, which ignores the diversity and complementarity of different types of metrics. This study proposes a novel approach, namely, multiview unsupervised software defect prediction (MUSDP). It aims to collaboratively learn the diversity and complementarity of different views to build a robust and reliable defect prediction model. Extensive experiments on 28 releases from eight software projects indicate that MUSDP exhibits superior or comparable results regarding G-mean, AUC, P-opt, and Recall@20% compared to competing supervised and unsupervised methods. In terms of interpretability, the numbers of added and deleted lines significantly influence MUSDP's predictions.
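Two of the reported metrics can be sketched in a few lines. G-mean is the geometric mean of recall and specificity. Recall@20% is read here as the share of defective modules found when inspecting the highest-ranked modules until 20% of the total lines of code is spent. That reading is the common effort-aware definition and is an assumption, since the abstract does not spell it out. Function names and data are hypothetical.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of recall (defective class) and specificity (clean class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(recall * specificity)


def recall_at_effort(scores, loc, y_true, effort=0.2):
    """Fraction of defective modules found within an inspection budget.

    Modules are inspected in descending score order; inspection stops
    before the first module that would exceed `effort` of the total LOC.
    """
    budget = effort * sum(loc)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    spent, found = 0, 0
    for i in order:
        if spent + loc[i] > budget:
            break
        spent += loc[i]
        found += y_true[i]
    total = sum(y_true)
    return found / total if total else 0.0


# Hypothetical predictions for five modules.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.1, 0.2, 0.3]
sizes = [10, 5, 30, 30, 25]
print(recall_at_effort(scores, sizes, labels))  # 0.5
```

In the toy data only the two highest-scored modules fit in the 20-line budget, and one of the two defective modules is among them.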