ISBN: 9781665438599 (print)
Few clustering methods perform well on multivariate time series (MTS) data. Traditional methods rely too heavily on similarity measures and perform poorly on MTS data with complex structures. This paper proposes an MTS clustering algorithm based on graph embedding, called MTSC-GE, to improve the performance of MTS clustering. MTSC-GE maps MTS samples to feature representations in a low-dimensional space and then clusters them. While mining the information in the samples themselves, MTSC-GE builds the whole time series dataset into a graph, attending to the connections between samples from a global perspective and discovering the local structural features of MTS data. MTSC-GE consists of three stages. The first stage builds a graph from the original dataset, where each MTS sample is regarded as a node. The second stage uses a graph embedding technique to obtain a new representation of each node. Finally, MTSC-GE applies the K-Means algorithm to cluster the samples based on the newly obtained representations. We compare MTSC-GE with six state-of-the-art benchmark methods on five public datasets; the experimental results show that MTSC-GE achieves good performance.
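The three stages described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the k-nearest-neighbour graph with Euclidean distance, the neighbour-averaging step used as a stand-in for a real graph embedding technique, the farthest-point initialisation, and all function names are hypothetical choices made here.

```python
import math

def flatten(sample):
    """Flatten one MTS sample (a list of time steps, each a list of variables)."""
    return [value for step in sample for value in step]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_knn_graph(vectors, k):
    """Stage 1 (assumed kNN rule): each sample is a node linked to its k nearest samples."""
    n = len(vectors)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: euclidean(vectors[i], vectors[j]))
        for j in ranked[:k]:
            neighbors[i].add(j)
            neighbors[j].add(i)  # keep the graph undirected
    return neighbors

def embed_nodes(vectors, neighbors):
    """Stage 2 stand-in: average each node's vector with those of its neighbours.

    A real graph embedding method would replace this function.
    """
    embedded = []
    for i, vec in enumerate(vectors):
        group = [vec] + [vectors[j] for j in sorted(neighbors[i])]
        embedded.append([sum(col) / len(group) for col in zip(*group)])
    return embedded

def kmeans(points, k, iters=20):
    """Stage 3: Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(euclidean(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: euclidean(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centre where it is
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_mts(samples, n_clusters, k):
    """Run the three stages in order and return one cluster label per sample."""
    vectors = [flatten(s) for s in samples]
    graph = build_knn_graph(vectors, k)
    return kmeans(embed_nodes(vectors, graph), n_clusters)
```

The point of the sketch is the data flow: samples become nodes, nodes receive new representations that depend on their neighbours, and only those representations are passed to K-Means.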
Data-driven decision-making in the big data era is becoming ubiquitous in the electric grid. In particular, daily collected power consumption records enable workload-aware device clustering, which is crucial for critical domain ap...
In practical applications, high-quality labeled data is critical for short text classification, but labeled information is often expensive and time-consuming to obtain. Semi-supervised short text classification is therefore attracting more attention. However, because short texts are sparse, the performance of existing short text classification models still leaves room for improvement. In this paper, we propose a semi-supervised Short text classification method based on Dual-Channel data Augmentation, called SDCA. More specifically, to address the sparsity of short texts, the model first adopts a multi-stage, word-level attention based on a Temporal Convolutional Network (TCN) to enhance semantic features, and a one-dimensional convolution-based attention mechanism to augment the relevance of surrounding short texts. Second, the unlabeled data are augmented by word embedding weighted augmentation and word replacement augmentation, so that the model can make full use of the unlabeled short texts and further enhance network training. Finally, extensive experiments on four benchmark datasets demonstrate the effectiveness of the proposed model for semi-supervised short text classification.
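As a rough illustration of the word replacement channel only, the sketch below swaps words for alternatives drawn from a lookup table. The abstract does not say how replacement candidates are chosen, so the `synonyms` table, the `replace_prob` parameter, and the function name are assumptions made here; the word embedding weighted channel is not covered.

```python
import random

def word_replacement_augment(text, synonyms, replace_prob=0.3, seed=None):
    """Return a copy of `text` with some words swapped for listed alternatives.

    `synonyms` maps a lowercase word to a list of candidate replacements
    (a hypothetical resource, not one named in the abstract).
    """
    rng = random.Random(seed)
    augmented = []
    for token in text.split():
        options = synonyms.get(token.lower())
        if options and rng.random() < replace_prob:
            augmented.append(rng.choice(options))
        else:
            augmented.append(token)  # keep words with no candidates or not sampled
    return " ".join(augmented)
```

Calling the function several times with different seeds yields several variants of one unlabeled short text, which is the usual way such augmentation feeds semi-supervised training.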
Web sites have become the main targets of many attackers. Signature-based detection needs to maintain a large signature database, and honeypot-based methods are not efficient. Because attackers try to make malicious code in Web pages hard for browser users to notice, their concealment methods can be classified into distinct fingerprints. Various malicious code samples were analyzed to identify six types of fingerprints. The system uses a spider integrated with script interpretation to fetch target Web pages, extracts specific tags by HTML parsing, and matches them against the fingerprints to detect malicious code. This method needs fewer fingerprints than traditional detection methods and is more efficient. Results for 60 websites show that the system has a false negative rate of 2.63% and a false positive rate of 1.99%.
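The tag extraction and matching step can be sketched as below. The abstract does not list the six fingerprint types, so the two rules here (a zero-sized iframe and an `eval(unescape(...))` script) are hypothetical examples chosen for illustration, as are the class and function names. Fetching pages and interpreting scripts are out of scope for this sketch.

```python
import re
from html.parser import HTMLParser

# Hypothetical fingerprint: script text that evaluates an unescaped string.
OBFUSCATED_SCRIPT = re.compile(r"eval\s*\(\s*unescape\s*\(")

class TagCollector(HTMLParser):
    """Collect the attributes of <iframe> tags and the text of <script> tags."""

    def __init__(self):
        super().__init__()
        self.iframes = []
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self.iframes.append(dict(attrs))
        elif tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data

def match_fingerprints(html):
    """Return the set of (hypothetical) fingerprint names found in `html`."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    hits = set()
    for attrs in collector.iframes:
        # Hypothetical fingerprint: an iframe rendered with zero width or height.
        if attrs.get("width") == "0" or attrs.get("height") == "0":
            hits.add("hidden_iframe")
    for body in collector.scripts:
        if OBFUSCATED_SCRIPT.search(body):
            hits.add("obfuscated_script")
    return hits
```

Because only a few tag types are inspected, the rule set stays small, which is consistent with the abstract's claim that fewer fingerprints are needed than in signature-based detection.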
Magnetic resonance imaging (MRI) is an imaging modality that offers clearer images of soft tissues than computed tomography (CT), which makes it especially suitable for brain disease detection. Detecting diseases automatically and accurately is beneficial. We propose a pathological brain detection method based on brain MR images and an online sequential extreme learning machine. First, seven wavelet entropies (WE) were extracted from each brain MR image to form the feature vector. Then, an online sequential extreme learning machine (OS-ELM) was trained to differentiate pathological brains from healthy ones. The experimental results over 132 brain MRIs showed that the proposed approach achieved a sensitivity of 93.51%, a specificity of 92.22%, and an overall accuracy of 93.33%, which suggests that the method is effective.
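For readers unfamiliar with the three reported measures, the sketch below shows how they are conventionally computed from confusion-matrix counts. It assumes the pathological brain is the positive class; the function name and the counts in the usage example are illustrative and are not taken from the paper.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: pathological cases detected / missed (positive class, assumed).
    tn/fp: healthy cases correctly passed / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)            # share of pathological cases detected
    specificity = tn / (tn + fp)            # share of healthy cases correctly passed
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Illustrative counts only
sens, spec, acc = diagnostic_metrics(tp=9, fn=1, tn=8, fp=2)
```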
Speech emotion recognition is an important technique for human-computer interface applications. Because they contain rich emotional information, spectral features are widely used for emotion recognition. However, the rec...
ISBN: 9781665424288 (print)
Current structure learning for streaming features needs improvement in processing nonlinear continuous data and in dynamically acquiring causal structures. In this paper, we propose CANSF, a causal structure learning algorithm for streaming features based on additive noise models. We make three contributions. First, using the information carried by the noise of nonlinear continuous data, we propose a real correlation identification method based on log-likelihood, which can identify the real correlations and redundant features of target features and dynamically select parent and child nodes for each feature. Second, based on regression analysis, we propose a method to determine the causal direction, which can be used for dynamic orientation. Third, we propose a causal structure learning method based on streaming features, which can obtain the causal structure diagram directly and dynamically.
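For orientation, an additive noise model is commonly written as follows. This is the standard general formulation rather than notation taken from the paper, and the symbols $X$, $Y$, $f$, and $N$ are chosen here for illustration.

```latex
Y = f(X) + N, \qquad N \perp\!\!\!\perp X
```

Here $X$ is the candidate cause, $Y$ the effect, $f$ a possibly nonlinear function, and $N$ a noise term independent of $X$. A regression-based orientation step typically fits the model in both directions and prefers the direction whose residuals look independent of the input; whether CANSF does exactly this is not stated in the abstract.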
This paper integrates event study with machine learning to analyze the spillover effects of the "Binance Incident" in the cryptocurrency market on financial markets. Utilizing the Lasso model, this paper pre...
This study presents a novel heuristic algorithm called the "Minimal Positive Negative Product Strategy" to guide the CDCL algorithm in solving the Boolean satisfiability problem. It provides a mathematical e...
Based on analysis of relationships between solution qualities and number of initial dynamic points in elastic net algorithm, we propose an improved elastic network algorithm (IENA) introduced in a heuristic cloning st...