In recent years, energy demand has increased rapidly in developing countries. Energy demand estimation (EDE) plays an important role for policy makers and related organizations. Generally, energy demand can be mathema...
Background: With the surge in the volume of collected data, deduplication will undoubtedly become one of the problems faced by researchers. Deduplicating coarse-grained redundant data offers significant advantages: it reduces storage and network bandwidth requirements and improves system scalability. Conventional methods of deleting duplicate data, including hash comparison and binary differential increments, lead to several bottlenecks when processing large-scale data. Moreover, the traditional Simhash similarity method gives little consideration to the natural similarity of text in specific fields and cannot run efficiently in parallel on large-scale text data. This paper examines several of the most important patents in the area of data detection and then focuses on large-scale data deduplication based on MapReduce and HDFS. Methods: We propose a duplicate data detection approach based on MapReduce and HDFS, which uses the Simhash similarity computing algorithm and the SNN algorithm, and we explain our distributed duplicate detection workflow. The important technical advantages of the invention include generating a checksum for each processed record and comparing the generated checksums to detect duplicate records. The approach produces fingerprints of short texts with the Simhash similarity algorithm and clusters the fingerprints using the Shared Nearest Neighbor (SNN) algorithm. The whole parallel process is implemented using the MapReduce programming model. Results: From the experimental results, we conclude that the proposed approach obtains MapReduce job schedules with significantly shorter execution time, making it suitable for processing large-scale datasets in real applications. The experimental results show that the proposed approach has better performance and efficiency. Conclusion: In this patent, we propose a duplicate data detection approach based on MapReduce and HDFS, which uses the Simhash similarity computing algorithm and the SNN algorithm. The results
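The fingerprinting step described in this abstract can be illustrated with a minimal single-machine sketch of Simhash. It is not the authors' implementation: whitespace tokenization, equal token weights, the MD5-based token hash, the 64-bit width, and the names `token_hash`, `simhash`, and `hamming_distance` are all assumptions made for illustration, and the MapReduce distribution and SNN clustering steps are omitted.

```python
import hashlib

BITS = 64  # fingerprint width; an assumption, not taken from the abstract


def token_hash(token: str) -> int:
    """Hash one token to a 64-bit integer (MD5 is an illustrative choice)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash(text: str) -> int:
    """Return a 64-bit Simhash fingerprint of whitespace-separated tokens."""
    weights = [0] * BITS
    for token in text.lower().split():
        h = token_hash(token)
        for i in range(BITS):
            # Each token votes +1 for a set bit and -1 for a clear bit.
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:  # a bit is set only when the positive votes win
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    a = simhash("duplicate data detection based on MapReduce and HDFS")
    b = simhash("duplicate data detection based on HDFS and MapReduce")
    # Same tokens in a different order give identical fingerprints.
    print(hamming_distance(a, b))
```

Texts whose fingerprints differ in only a few bits are treated as near-duplicate candidates. The distance threshold, and how candidates are then grouped, are design choices that the abstract does not specify.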
In recent years, mobile devices have taken a significant role in improving the quality of people’s life. In order to enhance the usability of those devices, more and more sensors have been built in. Furthermore, the ...
ISBN: (digital) 9781728116754
ISBN: (print) 9781728116761
In high-voltage applications such as power grids and rail transportation, IGBT devices generally operate under very harsh conditions, which places high demands on device reliability. In this paper, the electric field distribution of the silicon chip termination area and the press-pack IGBT package structure under high-voltage reverse-bias experimental conditions is simulated with FEM (finite element method) software. The results indicate the areas of the package model where electrical insulation failure may occur, and corresponding optimization measures are proposed. To some extent, these measures alleviate the problem of breakdown discharge in IGBT devices during operation.
Pairwise ranking methods are the basis of many widely used discriminative training approaches for structure prediction problems in natural language processing (NLP). Decomposing the problem of ranking hypotheses into ...
CNNs, RNNs, GCNs, and CapsNets have shown significant insights in representation learning and are widely used in various text mining tasks such as large-scale multi-label text classification. However, most existing de...
Recently, Web services have generated great interests in both vendors and researchers. Web services, based on existing Internet protocols and open standards, can provide a flexible solution to the problem of applicati...
In adaptive radiotherapy planning, contouring the treatment target area remains the most difficult problem. This work is usually done by a professional radiation oncologist and is very time-consuming and labor-intensive. To address this problem, this paper proposes an automatic contouring method based on a support vector machine. Experiments were conducted on a lower-abdomen MR data set of eight patients. In the experiments, the simulated data were used only to train the classifier, and the treatment-day images were used to evaluate its performance. DSI was used to compare the manual and automatic contours of the kidney. The experiments showed that the automatic contouring method based on a support vector machine has better classification performance than most classification algorithms. Among the eight sets of results, the DSI value of six sets is 1, and the smallest DSI value is also greater than 9.423.
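The contour comparison in this abstract can be sketched as follows, assuming that DSI denotes the Dice similarity index, DSI = 2|A ∩ B| / (|A| + |B|), computed over the sets of pixels enclosed by the manual contour A and the automatic contour B. The abstract does not define DSI, so this reading, the function name `dice_similarity`, and the representation of a contour as a set of pixel coordinates are assumptions.

```python
from typing import Hashable, Iterable


def dice_similarity(manual: Iterable[Hashable], automatic: Iterable[Hashable]) -> float:
    """Dice similarity index between two segmentations given as pixel sets."""
    a, b = set(manual), set(automatic)
    if not a and not b:
        return 1.0  # convention: two empty segmentations agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Hypothetical pixel coordinates, not data from the paper.
    manual = {(0, 0), (0, 1), (1, 0), (1, 1)}
    automatic = {(0, 1), (1, 0), (1, 1), (2, 1)}
    print(dice_similarity(manual, automatic))
```

Under this definition the index lies between 0 (no overlap) and 1 (identical segmentations).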
Mobile user authentication and key agreement for wireless networks is an important security priority. In recent years, several user authentication and key agreement protocols with smart cards for wireless communications have been proposed. In 2011, Xu et al. proposed an efficient mutual authentication and key agreement protocol with an anonymity property. Although the protocol of Xu et al. has many benefits, we find that it still suffers from several weaknesses that have previously been overlooked. In this paper, we propose a secure and efficient mutual authentication and key agreement protocol. Its main contributions are confidentiality of the session key and efficient password updating. Finally, evaluations of the proposed protocol show that it can withstand various known types of attacks and also satisfies essential functionality requirements. Additionally, efficiency analyses show that the protocol is simple and cost-efficient.