ISBN: 9781509030125 (print)
Billions of contributions are made every day across online communities and social media websites in the form of social messages, blogs and online discussions. The aim of this paper is to identify comments and posts that are racist and malicious in nature so that they can be effectively banned and removed. The article uses a set of documents containing racist comments as a text corpus, to which an appropriate machine learning algorithm is applied to detect racist comments or meaning. Detecting antisocial content requires measuring the similarity between a source text message and a set of classified terms that are antisocial or discriminatory. The approach devised in this article to detect antisocial behavior is a content classification technique based on term frequency.
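The similarity step described above can be illustrated with a minimal sketch. It assumes cosine similarity over raw term counts and a fixed threshold; the abstract does not name the similarity measure, so `cosine_similarity`, `is_flagged` and the threshold value are hypothetical choices rather than the paper's method.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(tf_a, tf_b):
    """Cosine similarity between two term-frequency vectors stored as Counters."""
    dot = sum(count * tf_b[term] for term, count in tf_a.items())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_flagged(message, flagged_terms, threshold=0.2):
    """Return (flag, score); the threshold is an assumed tuning parameter."""
    reference = term_frequencies(" ".join(flagged_terms))
    score = cosine_similarity(term_frequencies(message), reference)
    return score >= threshold, score
```

In practice the list of classified terms and the threshold would be derived from the labelled corpus rather than fixed by hand.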
This paper proposes a learning-based adaptive imputation method (LAI) for imputing missing power data in an energy system. The method estimates missing power data from the patterns that appear in the collected data. To capture these patterns, we construct a new feature vector from past data and its variations. The proposed LAI then learns the optimal length of the feature vector and the optimal historical length, which are significant hyperparameters of the method, by using intentionally missing data. Based on a weighted distance between the feature vector representing the missing situation and those representing past situations, missing power data are estimated from the k most similar past situations within the optimal historical length. We further extend the proposed LAI to alleviate the effect of unexpected variation in power data and refer to this approach as the extended LAI method (eLAI). The eLAI selects between linear interpolation (LI) and the proposed LAI to improve accuracy under unexpected variations. Finally, in a simulation under various energy consumption profiles, we verify that the proposed eLAI reduces the average imputation error in an energy system by about 74% compared to existing imputation methods.
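The nearest-situation estimate described above can be sketched roughly as follows. This is an illustration under assumptions, not the authors' LAI: the feature layout (the last `m` values followed by their differences), the uniform default weights and the plain average over the k neighbours are all hypothetical, and the hyperparameter learning step is omitted.

```python
import math


def feature(series, t, m):
    """Feature for position t: the m values before t, then their successive differences."""
    past = list(series[t - m:t])
    diffs = [b - a for a, b in zip(past, past[1:])]
    return past + diffs


def weighted_distance(u, v, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(u, v, weights)))


def impute(series, t, m=3, history=50, k=3, weights=None):
    """Estimate the missing series[t] from the k most similar past situations.

    Assumes every value before index t is known.
    """
    if t <= m:
        raise ValueError("not enough past data to build a feature vector")
    target = feature(series, t, m)
    weights = weights or [1.0] * len(target)
    start = max(m, t - history)  # restrict the search to the historical window
    candidates = [
        (weighted_distance(target, feature(series, s, m), weights), series[s])
        for s in range(start, t)
    ]
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)
```

Here `m` plays the role of the feature-vector length and `history` the historical length, the two hyperparameters the abstract says are learned from intentionally missing data.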
The product feature sets that current extraction methods obtain from online reviews cover only a low proportion of the review information. To solve this problem, this paper proposes a product feature extraction method based on the kNN algorithm. We first establish a classification system for the product feature set. We then manually extract a subset of product features as a training set, and the product features of all reviews are quickly classified and extracted according to the similarity between words and the classification system. Finally, the PMI algorithm is used to filter and supplement the feature set, improving both its accuracy and its coverage of review information. Using online clothing review data from the Taobao platform, we show that the method effectively improves the review information coverage rate of the product feature set.
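Assuming PMI here denotes pointwise mutual information, the filtering score can be sketched as below. The review-level co-occurrence counts and whitespace tokenisation are simplifying assumptions (Chinese review text would need a word segmenter), and the function name `pmi` is illustrative.

```python
import math


def pmi(reviews, word_a, word_b):
    """Pointwise mutual information of two words from review-level co-occurrence.

    PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) ), with probabilities estimated
    as the fraction of reviews containing the word(s).
    """
    n = len(reviews)
    token_sets = [set(review.lower().split()) for review in reviews]
    count_a = sum(word_a in tokens for tokens in token_sets)
    count_b = sum(word_b in tokens for tokens in token_sets)
    count_ab = sum(word_a in tokens and word_b in tokens for tokens in token_sets)
    if count_ab == 0:
        return float("-inf")  # the words never co-occur
    return math.log2((count_ab / n) / ((count_a / n) * (count_b / n)))
```

A candidate feature word with high PMI against a known feature could be kept or added, and one with low PMI filtered out; the actual cut-off used in the paper is not stated in the abstract.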
ISBN: 9781467399043 (print)
Location-based services have penetrated all aspects of life and provide a convenient and efficient service experience. Outdoor positioning technology is relatively mature and widely used. For indoor positioning, by contrast, although many technologies are popular, most have shortcomings that make them hard to deploy widely. How to make indoor positioning more widely adopted while improving positioning accuracy has therefore become a hot research topic. This paper analyzes several typical fingerprint localization algorithms, including NN, kNN and WkNN, and then proposes an improved algorithm that introduces a signal propagation model and finds and narrows the K-gon.
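For orientation, a baseline weighted kNN (WkNN) fingerprint estimate can be sketched as follows. This shows only the conventional algorithm the paper starts from, with inverse-distance weights as an assumed weighting scheme; the paper's improvement (signal propagation model and K-gon narrowing) is not reproduced, and `wknn_locate` is a hypothetical name.

```python
import math


def wknn_locate(fingerprints, observed, k=3, eps=1e-6):
    """Estimate a 2-D position from received signal strengths.

    fingerprints: list of (rss_vector, (x, y)) pairs from the offline survey.
    observed: rss_vector measured at the unknown position.
    """
    # Rank reference points by Euclidean distance in signal space.
    scored = sorted(
        ((math.dist(rss, observed), position) for rss, position in fingerprints),
        key=lambda item: item[0],
    )
    nearest = scored[:k]
    # Closer fingerprints get larger weights; eps avoids division by zero.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return x, y
```

Setting `k=1` reduces this to the NN method, and replacing the weights with equal values gives plain kNN.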
ISBN: 9781509044993 (print)
With the rapid development of information technology, the concept of big data is applied to information collection in many domains, especially text classification. This paper proposes an improved kNN algorithm based on clustering for the automatic classification of Web text. In addition, we introduce a new method for determining which texts in the same category belong to the same cluster. Finally, we classify Web text automatically and test the results using both the existing and the improved kNN algorithm. Simulation results show that the improved algorithm significantly raises the accuracy of automatic classification.
The kNN classification algorithm is one of the most commonly used algorithms in the AI field. This paper proposes two improved algorithms, knnTS and knnTS-PK+. Both are based on the knnPK+ algorithm, which uses the PK-Means++ algorithm to select the centers of spherical regions and sets a radius for each region, forming spheres that divide the data set in space. The knnPK+ algorithm improves classification accuracy while keeping the classification efficiency of the kNN algorithm stable. To improve the classification efficiency of kNN while keeping its accuracy unchanged, the knnTS algorithm is proposed. It uses a tabu search algorithm to select the radius of the spherical regions and divides the data set in space into spherical regions of equal radius. On the basis of the first two improved algorithms, the knnTS-PK+ algorithm combines them to divide the data set in space. Experiments are carried out on a new data set, and the results show that the two improved algorithms effectively improve classification accuracy and efficiency after the data samples are cut reasonably. (C) 2021 THE AUTHORS. Published by Elsevier BV on behalf of Faculty of Engineering, Alexandria University.
A disadvantage of the k-nearest neighbor (kNN) algorithm is its large amount of computation. A tree index structure can reduce the computation, but it creates an index page buffer management problem when main memory capacity is limited. Traditional page replacement policies are not designed for the characteristics of tree-based high-dimensional index structures, so their page buffer hit ratio is low. We therefore analyzed the characteristics of tree index structures and designed a replacement policy based on the access probability distribution. Experimental results show that the replacement policy is effective.
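One way to picture a probability-based replacement policy is the sketch below. It is a guess at the general idea only: the abstract does not say how access probabilities are derived or how ties are broken, so the `ProbabilityBuffer` class, the externally supplied probability table and the evict-the-least-likely rule are all assumptions.

```python
class ProbabilityBuffer:
    """Fixed-size page buffer that evicts the resident page least likely to be accessed."""

    def __init__(self, capacity, access_probability):
        self.capacity = capacity  # assumed to be at least 1
        # page id -> estimated access probability (assumed to be given, e.g.
        # higher for pages near the root of the index tree)
        self.access_probability = access_probability
        self.resident = set()
        self.hits = 0
        self.misses = 0

    def access(self, page_id):
        """Return True on a buffer hit, False on a miss (the page is then loaded)."""
        if page_id in self.resident:
            self.hits += 1
            return True
        self.misses += 1
        if len(self.resident) >= self.capacity:
            victim = min(
                self.resident,
                key=lambda page: self.access_probability.get(page, 0.0),
            )
            self.resident.remove(victim)
        self.resident.add(page_id)
        return False
```

Unlike LRU, this keeps a frequently needed page such as the root resident even when it has not been touched recently, which is the kind of behaviour a tree-aware policy would aim for.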
For the elderly, falls can be fatal. However, because of physical decline, falls are difficult to avoid. To lessen the harm that falls inflict on the elderly as far as feasible, and so that falls can be detected as soon as they occur, this study proposes a fall monitoring system based on wearable devices that uses an improved K-nearest neighbor algorithm. An improved fuzzy K-nearest neighbor algorithm combined with a support vector machine is applied to improve efficiency and accuracy and to reduce the false positive and false negative rates as much as possible. In the simulation experiment, the proposed model's average precision is 97.5%, its specificity is 97.6% and its sensitivity is 97.5%. Convergence is also good: the optimum is reached in 24 iterations. In the actual experiment, the average accuracy reached 98.7%, the false alarm rate was only 0.7% and the negative rate was 2.5%, outperforming the other two algorithms. This shows that the proposed method achieves excellent accuracy, false positive rate and false negative rate in practical application, which is of great significance for the health and safety of the elderly.
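The evaluation measures quoted above follow the standard confusion-matrix definitions, which can be written out as a short sketch. The function name and the counts used with it are illustrative; they are not the paper's data.

```python
def fall_detection_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics, with 'fall' as the positive class.

    tp: falls detected, fn: falls missed,
    fp: non-falls reported as falls (false alarms), tn: non-falls correctly ignored.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),            # also called recall
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),    # equals 1 - specificity
        "false_negative_rate": fn / (fn + tp),    # equals 1 - sensitivity
    }
```

For a fall monitor the false negative rate is usually the more critical figure, since a missed fall delays help, while false positives mainly cost convenience.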
Intrabody communication (IBC) establishes a wireless connection between devices in a Wireless Body Area Network (WBAN) by using the human body as a transmission medium. The characteristics of the IBC channel are significantly influenced by the geometric and biological features of the human body and its tissues. This paper analyzes a dataset of experimental signal-loss measurements from real subjects in a galvanic IBC channel, models IBC identification using the K-Nearest Neighbors (kNN) algorithm, and proposes a novel IBC WBAN architecture incorporating an identification function. The analysis of the dataset revealed that the IBC channel gain varies widely with individual body characteristics such as height, weight, body mass index, and body composition. Consequently, biometric identification can be leveraged within the IBC WBAN paradigm. By modeling IBC identification on cleaned and labeled data, we demonstrated an identification accuracy of 99.9%. The proposed IBC WBAN architecture with an integrated identification function is anticipated to broaden the application scope and accelerate the development of IBC WBANs.
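The identification step can be pictured as a plain kNN majority vote over channel-gain measurements. The sketch below assumes each sample is a short vector of gains in dB and uses Euclidean distance; the feature layout, the value of `k` and the name `knn_identify` are hypothetical, since the abstract does not give these details.

```python
import math
from collections import Counter


def knn_identify(training, sample, k=3):
    """Return the subject label chosen by majority vote among the k nearest samples.

    training: list of (gain_vector, subject_label) pairs from labelled data.
    sample: gain_vector measured for the person to be identified.
    """
    nearest = sorted(training, key=lambda item: math.dist(item[0], sample))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Because channel gain depends on body characteristics, measurements from the same person should cluster together, which is what makes a nearest-neighbour vote a plausible identifier.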