When building a domain knowledge graph, the quality of the graph depends heavily on the results of relation extraction between entities. We therefore propose a clustering method based on reinforcement learning for distantly supervised relation extraction. For extracting relations from accident information in the aviation domain knowledge graph, we combine distant supervision with a clustering method based on local density and global dissimilarity, which yields a large amount of low-noise labeled data and reduces some of the wrong and missing labels caused by the highly specialized nature of the aviation domain. Reinforcement learning is then introduced to remove negative-instance noise from the positive sample data. Finally, we propose a dual-attention piecewise (DAPCNN) relation extraction model to mine the deep semantics of sentences. The experimental results show that on the civil aviation relation extraction corpus constructed in this paper, the Micro_R, Micro_P and Micro_F1 values of the proposed method reach 83.41 %, 84.16 % and 83.96 %, respectively. On the open relation extraction dataset DuIE, the Micro_R, Micro_P and Micro_F1 of the proposed method reach 83.41 %, 93.58 % and 94.02 %, respectively. Compared with current advanced multi-instance, multi-label models, the proposed method extracts the relations between aviation accident entities more accurately. It also performs well on the open dataset, which suggests a degree of generality.
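The Micro_P, Micro_R and Micro_F1 figures reported above are micro-averaged metrics. The abstract does not say how they were computed, so the following is only a minimal sketch of one common convention. The function name `micro_prf` and the `"NA"` "no relation" label are assumptions made for illustration.

```python
def micro_prf(gold, pred, negative_label="NA"):
    """Micro-averaged precision, recall and F1 over relation labels.

    gold, pred: equal-length lists with one relation label per instance.
    negative_label: hypothetical "no relation" label; predicting it never
    counts as a true or false positive.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != negative_label and p == g:
            tp += 1  # correct relation predicted
        else:
            if p != negative_label:
                fp += 1  # a relation was predicted, but it is wrong
            if g != negative_label:
                fn += 1  # a gold relation was missed or mislabeled
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

Because the counts are pooled over all relation types before dividing, frequent relations influence the score more than rare ones.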
Objective: Understanding country-level nutrition intake is crucial to global nutritional policies that aim to reduce disparities and relevant disease burdens. Still, few studies have used clustering techniques to analyse the recent Global Dietary Database (GDD). This study aims to extend an existing multivariate time series (MTS) clustering algorithm to allow for greater customisability and to provide the first cluster analysis of the GDD, exploring temporal trends in country-level nutrition profiles (1990-2018). Design: Trends in sugar-sweetened beverage intake and nutritional deficiency were explored using the newly developed programme 'MTSclust'. Time series clustering algorithms differ from simple clustering approaches in their ability to appreciate temporal ***: Nutritional and demographic data from 176 countries were analysed from the ***: Population representative samples of the 176 in the ***: In a three-class test specific to the domain, the MTSclust programme achieved a mean accuracy of 71.5 % (adjusted Rand Index [ARI] = 0.381), while the mean accuracy of a popular algorithm, DTWclust, was 58 % (ARI = 0.224). The clustering of nutritional deficiency and sugar-sweetened beverage intake identified several common trends among countries and found that these did not change by demographics. MTS clustering demonstrated a global convergence towards a Western ***: While global nutrition trends are associated with geography, demographic variables such as sex and age are less influential on the trends of certain nutrition intake. The literature could be further supplemented by applying outcome-guided methods to explore how these trends link to disease burdens.
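The adjusted Rand Index (ARI) quoted above compares a clustering with reference classes while correcting for chance agreement. The sketch below uses the standard pair-counting formula. It is not taken from MTSclust, and the function name `adjusted_rand_index` is an assumption.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("labelings must have the same length")
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Contingency counts: items sharing a (true class, predicted cluster) pair.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster on both sides
    return (sum_cells - expected) / (max_index - expected)
```

An ARI of 1 means the two partitions are identical up to relabeling, and values near 0 mean agreement no better than chance.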
ISBN: 9781467391672 (print)
With the emergence of big data and cloud computing, data streams arrive rapidly, continuously and at large scale, and real-time clustering analysis has become a hot topic in data stream mining. Some existing data stream clustering algorithms cannot deal effectively with high-dimensional data streams, cannot find clusters of arbitrary shape in real time, and cannot remove noise points in a timely manner. To address these issues, this paper proposes PGDC-Stream, an algorithm based on grid and density for clustering data streams in a parallel distributed environment [4]. The algorithm adopts a density threshold function to handle noise points, inspecting and removing them periodically. It can also find clusters of arbitrary shape in large-scale data streams in real time. The MapReduce framework is used for parallel cluster analysis of the data streams.
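The grid-and-density idea described above can be illustrated with a small single-process sketch. This is not PGDC-Stream itself: the class name `GridDensityStream`, the exponential decay, the constant `min_density` threshold and the neighbor rule are all assumptions for illustration, and the MapReduce parallelization is omitted.

```python
from itertools import product
from math import floor


class GridDensityStream:
    """Toy grid/density stream clusterer (illustrative, not PGDC-Stream)."""

    def __init__(self, cell_size=1.0, decay=0.98, min_density=2.0):
        self.cell_size = cell_size
        self.decay = decay              # per-time-unit decay factor (assumed)
        self.min_density = min_density  # constant threshold (assumed)
        self.cells = {}                 # cell key -> (density, last update time)

    def _current(self, key, now):
        density, last = self.cells[key]
        return density * self.decay ** (now - last)

    def insert(self, point, now):
        """Map a point to its grid cell and add 1 to the decayed density."""
        key = tuple(floor(x / self.cell_size) for x in point)
        old = self._current(key, now) if key in self.cells else 0.0
        self.cells[key] = (old + 1.0, now)

    def prune(self, now):
        """Periodic inspection: drop cells whose density fell below threshold."""
        sparse = [k for k in self.cells if self._current(k, now) < self.min_density]
        for k in sparse:
            del self.cells[k]

    def clusters(self, now):
        """Group adjacent dense cells into clusters of arbitrary shape."""
        dense = {k for k in self.cells if self._current(k, now) >= self.min_density}
        seen, result = set(), []
        for start in dense:
            if start in seen:
                continue
            seen.add(start)
            stack, component = [start], []
            while stack:
                cell = stack.pop()
                component.append(cell)
                # Neighbors differ by at most one grid step per dimension.
                for offset in product((-1, 0, 1), repeat=len(cell)):
                    nb = tuple(c + o for c, o in zip(cell, offset))
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
            result.append(sorted(component))
        return result
```

Enumerating all 3^d neighbor offsets is only practical in low dimensions, so a high-dimensional variant would need a different adjacency test.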
ISBN: 9781467377249 (print)
With the continuous and rapid development of online questionnaire surveys, the low response rate has plagued operating *** solve this problem, this paper proposes an effective user invitation model based on our improved clustering algorithm, which analyzes large-scale historical user behavior data, including users' quality data, users' preference data and users' similarity *** experiments with large-scale data from an online survey company have been conducted to validate the feasibility and effectiveness of our proposed *** results demonstrate that the questionnaire response rate is increased and that our approach can be easily deployed in real-world online survey applications for effective personalized survey recommendation.
The goals of wireless sensor networks (WSNs) are to sense and collect data and to transmit the information to a sink. Because the sensor nodes are typically battery powered, the main challenges in WSNs are to optimise energy consumption and to prolong the network lifetime. This paper proposes a centralised clustering algorithm, termed the minimum distance clustering algorithm based on an improved differential evolution (MD-IDE). The new algorithm combines the advantages of simulated annealing and differential evolution to determine the cluster heads (CHs) that minimise the communication distance of the WSN. Simulation results demonstrate that MD-IDE outperforms other well-known protocols, including the low-energy adaptive clustering hierarchy (LEACH) and LEACH-C algorithms, in reducing the communication distance of the WSN and thereby its energy consumption.
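To make the optimisation target concrete, here is a minimal sketch of plain differential evolution (DE/rand/1/bin) searching for cluster-head positions that minimise total communication distance. It is not MD-IDE: the simulated-annealing component is left out, CHs are treated as free coordinates rather than chosen among nodes, and the cost function and every name and parameter (`total_distance`, `differential_evolution`, `f`, `cr`) are assumptions.

```python
import math
import random


def total_distance(candidate, nodes, sink):
    """Assumed cost: node-to-nearest-CH distances plus CH-to-sink distances."""
    heads = [(candidate[i], candidate[i + 1]) for i in range(0, len(candidate), 2)]
    to_heads = sum(min(math.dist(n, h) for h in heads) for n in nodes)
    to_sink = sum(math.dist(h, sink) for h in heads)
    return to_heads + to_sink


def differential_evolution(nodes, sink, k=2, pop_size=20, generations=100,
                           f=0.5, cr=0.9, seed=0):
    """DE/rand/1/bin over the flattened (x, y) coordinates of k cluster heads."""
    rng = random.Random(seed)
    xs = [n[0] for n in nodes]
    ys = [n[1] for n in nodes]
    low = [min(xs), min(ys)] * k
    high = [max(xs), max(ys)] * k
    dim = 2 * k
    pop = [[rng.uniform(low[d], high[d]) for d in range(dim)]
           for _ in range(pop_size)]
    cost = [total_distance(ind, nodes, sink) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == forced:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    v = min(max(v, low[d]), high[d])  # clamp to deployment area
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = total_distance(trial, nodes, sink)
            if trial_cost <= cost[i]:  # greedy one-to-one selection
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]
```

In a real deployment the continuous positions would still have to be mapped to actual sensor nodes, for example by picking the node nearest to each optimised position.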
As a mainstream research direction in the field of image segmentation, medical image segmentation plays a key role in the quantification of lesions, three-dimensional reconstruction, region-of-interest extraction and so *** with natural images, medical images have a variety of ***, the emphasis of the information conveyed by images of different modes is quite *** it is time-consuming and inefficient to manually segment medical images only by professional and experienced ***, a large number of automated medical image segmentation methods have been ***, until now, researchers have not developed a universal method for all types of medical image *** paper reviews the literature on segmentation techniques that have produced major breakthroughs in recent *** the large number of medical image segmentation methods, this paper mainly discusses two categories of medical image segmentation *** is the improved strategies based on traditional clustering *** other is the research progress of the improved image segmentation network structure model based on *** power of technology proves that the performance of the deep learning-based method is significantly better than that of the traditional *** paper discusses both the advantages and disadvantages of different algorithms and details how these methods can be used for the segmentation of lesions or other organs and tissues, as well as possible technical trends for future work.
Deep embedded clustering (DEC) is a representative clustering algorithm that leverages deep-learning frameworks. DEC jointly learns low-dimensional feature representations and optimizes the clustering objective, but it only works with numerical data. In practice, however, real-world data to be clustered include not only numerical features but also categorical features that DEC cannot handle. In addition, if the difference between the soft assignment and the target values is large, DEC applications may suffer from convergence problems. In this study, to overcome these limitations, we propose a deep embedded clustering framework that can handle mixed data and that increases convergence stability by using soft-target updates, a concept borrowed from an improved deep Q-learning algorithm used in reinforcement learning. To evaluate the performance of the framework, we used various benchmark datasets composed of mixed data and empirically demonstrated that our approach outperforms existing clustering algorithms on most standard metrics. To the best of our knowledge, our work achieves state-of-the-art performance among its contemporaries in this field.
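The soft assignment and target values mentioned above can be made concrete with a short sketch. The first two functions follow the original DEC formulation (a Student's t kernel for the soft assignment and a squared, frequency-normalized target). The abstract does not give the soft-target rule, so `soft_update` and its `tau` parameter are assumptions modeled on the slowly moving target networks used in reinforcement learning.

```python
def soft_assignment(embeddings, centroids, alpha=1.0):
    """DEC soft assignment q_ij using a Student's t kernel."""
    q = []
    for z in embeddings:
        weights = []
        for mu in centroids:
            sq_dist = sum((a - b) ** 2 for a, b in zip(z, mu))
            weights.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(weights)
        q.append([w / total for w in weights])
    return q


def target_distribution(q):
    """DEC target p_ij: square q_ij, divide by cluster frequency, renormalize."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def soft_update(old_target, new_target, tau=0.1):
    """Assumed soft-target rule: move the target a fraction tau toward the
    freshly computed one instead of replacing it outright."""
    return [[(1.0 - tau) * o + tau * n for o, n in zip(old_row, new_row)]
            for old_row, new_row in zip(old_target, new_target)]
```

Since each row of both targets sums to 1, the blended target is still a valid distribution, and a small `tau` limits how far the target can move between updates.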
In order to ensure the reliable transmission of important service information and to meet the requirement of access at any time, this paper puts forward the improved adaptive weighted clustering *** simulation and verification, this algorithm can be applied to the maneuvering communication network, where it provides an effective supplement and a reasonable extension to the existing network.
ISBN: 9798400708909 (print)
In recent years, evolutionary algorithms (EAs) have gained attention among scholars and have been applied to optimization engineering with varying degrees of success. Concurrently, machine learning methods have developed rapidly in the field of artificial intelligence and have been increasingly integrated with other domains. This paper introduces a novel multi-population differential evolution algorithm called DE-FR, based on the proposed DBSCAN-FR clustering algorithm. It contributes to the improvement of the differential evolution algorithm in the following respects. Firstly, it presents an enhanced clustering algorithm, DBSCAN-FR, which incorporates a forward distance filtering mechanism to divide the population into several groups in high-dimensional space. Secondly, it introduces a novel differential evolution algorithm named DE-FR, which builds upon the DBSCAN-FR clustering algorithm and aims to solve complex single-objective optimization problems. Lastly, the proposed algorithm is compared with other classical differential evolution variants on the CEC2014 benchmarks, and the experimental results demonstrate its competitive performance.
Traditional point-of-interest (POI) data are collected by professional surveying and mapping organizations and are distributed in electronic maps. With the growth of the Internet and the development of crowdsourcing, POI data in various formats are now also published by Internet companies and non-profit organizations. Because POI data come from multiple sources in diverse formats, problems arise in the data fusion process, such as differences in conceptual definitions, inconsistent classification, inefficient fusion algorithms and inaccurate fusion results. To overcome the challenges of multi-source POI data fusion, this paper proposes a standardized POI data model and an ontology-based POI category system. Furthermore, a fusion framework and a fusion algorithm based on a two-stage clustering approach are proposed. The proposed method is compared with existing algorithms using datasets of different sizes, including POI surveying and mapping data from Kunming, China, Weibo check-in POI data, and real estate POI data. The experimental results demonstrate that the proposed algorithm is superior to existing algorithms in terms of both the evaluation indexes for fusion quality and operational efficiency.