As one part of preprocessing, automatic word segmentation is a key issue in Chinese information retrieval. Since integral words are put wholly together to compose into the more meaningful words and more express users...
The main problem of existing static vulnerability detection methods based on source code analysis is their high false positive and false negative rates. One main reason is the lack of accurate and effective identification and analysis of security-related program elements, e.g. data validation checks and tainted data sources. A static vulnerability detection method based on data security state tracing and checking is proposed. In this method, the state space of the state machine model is extended; the security state of a variable is identified by a vector that may correspond to multiple security-related properties rather than by a single property; fine-grained state transitions are provided to support accurate recognition of security-related program behaviors; the recognition of validation checks is introduced into the vulnerability state machine to reduce false positives; and a systematic discrimination mechanism for tainted data is constructed to prevent false negatives that result from neglecting tainted data sources. The experimental results of a prototype system show that this method can effectively detect buffer overflows and other types of vulnerabilities in software systems, with a clearly lower false positive rate than existing mainstream static detection methods, and that it avoids some serious false negatives of these methods.
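The idea of a security state vector can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two properties (`tainted`, `validated`), the four statement kinds, and the function name `analyze` are all assumptions made for illustration, and only straight-line code is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecState:
    """Hypothetical security state vector: one flag per tracked property."""
    tainted: bool = False
    validated: bool = False


def analyze(statements):
    """Walk a straight-line list of (op, *args) tuples.

    Returns the indices of 'sink' statements whose argument is tainted
    and has not passed a validation check.
    """
    state = {}      # variable name -> SecState
    warnings = []
    for i, (op, *args) in enumerate(statements):
        if op == "source":        # variable receives external (tainted) data
            state[args[0]] = SecState(tainted=True)
        elif op == "assign":      # dst = src copies the whole vector
            dst, src = args
            state[dst] = state.get(src, SecState())
        elif op == "check":       # a validation check on the variable
            old = state.get(args[0], SecState())
            state[args[0]] = SecState(tainted=old.tainted, validated=True)
        elif op == "sink":        # security-sensitive use, e.g. a buffer copy
            s = state.get(args[0], SecState())
            if s.tainted and not s.validated:
                warnings.append(i)
    return warnings
```

Because the vector keeps `tainted` and `validated` as separate properties, a sink reached after a check is not reported, while the same sink reached before the check is.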
Sensor networks are widely used in many applications to collaboratively collect information from the physical environment. In these applications, the exploration of the relationship and linkage of sensing data within multiple regions can be naturally expressed by joining tuples in these regions. However, the highly distributed and resource-constrained nature of the network makes the join a challenging query. In this paper, we address the problem of processing join queries among different regions progressively and energy-efficiently in sensor networks. The proposed algorithm PEJA (Progressive Energy-efficient Join Algorithm) adopts an event-driven strategy to output the join results as soon as possible, and alleviates the storage shortage problem in the in-network nodes. It also installs filters in the joining regions to prune unmatchable tuples in the early processing phase, saving many unnecessary transmissions. Extensive experiments on both synthetic and real-world data sets indicate that the PEJA scheme outperforms other join algorithms, and that it is effective in reducing the number of transmissions and the delay of query results during join processing.
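The effect of installing filters in the joining regions can be sketched as follows. This is not PEJA itself: the abstract does not describe the filter structure, so the sketch assumes an exact set of join-key values per region, ignores the event-driven progressive output, and uses the hypothetical name `filtered_join`. It only shows how pruning unmatchable tuples reduces the number of tuples that must be transmitted.

```python
def filtered_join(region_a, region_b, key):
    """Equi-join two lists of dict tuples on `key`, pruning before 'sending'.

    Returns (join_results, number_of_tuples_transmitted).
    """
    # Assumed filter: the set of join-key values present in each region.
    keys_a = {t[key] for t in region_a}
    keys_b = {t[key] for t in region_b}

    # Each region drops tuples that cannot match anything in the other region.
    send_a = [t for t in region_a if t[key] in keys_b]
    send_b = [t for t in region_b if t[key] in keys_a]

    # Hash join over the surviving tuples.
    index = {}
    for t in send_b:
        index.setdefault(t[key], []).append(t)
    results = [(a, b) for a in send_a for b in index.get(a[key], [])]

    return results, len(send_a) + len(send_b)
```

With two regions of two tuples each that share a single key value, only two of the four tuples are transmitted.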
In keyword search over relational databases (KSORD), the results retrieved for a user's initial query are often unsatisfying. The user has to reformulate the query and execute the new query, which costs much time and effort. In this paper, a method of automatically reformulating user queries by relevance feedback, named VSM-RF, is introduced. Aimed at the results of KSORD systems, VSM-RF adopts a ranking method based on the vector space model to rank KSORD results. After the first retrieval, using either user feedback or pseudo feedback as the user prefers, VSM-RF computes expansion terms based on probability and reformulates the new query using query expansion. After the KSORD system executes the new query, more relevant results appear in the result list and are presented to the user. Experimental results verify the method's effectiveness.
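The two steps named above, vector space ranking and probability-based query expansion, can be sketched in Python. The abstract gives no formulas, so the details here are assumptions: results are plain token lists, ranking uses cosine similarity over raw term counts, and the "probability" of an expansion term is simply its relative frequency among non-query terms in the feedback results. The function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, results):
    """Order results (token lists) by similarity to the query, best first."""
    q = Counter(query)
    return sorted(results, key=lambda r: cosine(q, Counter(r)), reverse=True)


def expand(query, feedback, k=1):
    """Append the k most probable non-query terms from the feedback results."""
    counts = Counter(t for r in feedback for t in r if t not in query)
    total = sum(counts.values())
    # Sort by estimated probability (descending), then alphabetically for ties.
    ranked = sorted(counts, key=lambda t: (-counts[t] / total, t))
    return list(query) + ranked[:k]
```

For pseudo feedback, the top-ranked results from `rank` are passed to `expand` as if the user had marked them relevant; for user feedback, the results the user actually selected are passed instead.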
Keyword search over relational databases (KSORD) enables casual users to use keyword queries (a set of keywords) to search relational databases just like searching the Web, without any knowledge of the database schema or any need to write SQL queries. In KSORD, the results retrieved for a user's initial query are often unsatisfying. The user has to reformulate the query and execute the new query, which costs much time and effort. This paper introduces a method, named UFBP (user feedback based on probability), that automatically reformulates user queries by applying user feedback to the results of KSORD. After the first retrieval, UFBP uses the user's feedback to compute, based on probability, the terms to be added to the expanded query, and reformulates the new query using query expansion. After the KSORD system automatically executes the new query, more relevant results are presented to the user. Experimental results verify its effectiveness.
Compared with traditional magnetic disks, flash memory has many advantages and has been used as an external storage medium for a wide spectrum of electronic devices (such as PDAs, MP3 players, digital cameras and mobile phones) in r...
Closed frequent itemset (CFI) mining uses less memory to store the complete information of frequent itemsets and is thus well suited to stream mining. In this paper, we discuss recent CFI mining methods over streams and p...
Sequential pattern mining is an important problem in the mining of continuous, fast, dynamic and unbounded streams. Recently proposed approximate mining algorithms consume too many system resources and can only obtain...
ISBN: (print) 9783540881919
Both content analysis and link analysis have their advantages in measuring relationships among documents. In this paper, we propose a new method that combines the two to compute the similarity of research papers so that these papers can be clustered more accurately. To improve the efficiency of similarity calculation, we develop a strategy to deal with the relationship graph separately, without affecting the accuracy. We also design an approach that assigns different weights to different links to the papers, which can enhance the accuracy of similarity calculation. Experimental results on the ACM data set show that our new algorithm, S-SimRank, outperforms other algorithms.
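For orientation, the link-analysis side can be illustrated with the basic SimRank iteration, in which two papers are similar if the papers citing them are similar. This sketch is plain SimRank only: the content-analysis component, the link weighting, and the graph-separation strategy of S-SimRank are not described in enough detail in the abstract to reproduce, so none of them is modelled. The decay factor `c=0.8`, the iteration count, and the `in_links` input format are assumptions.

```python
def simrank(in_links, c=0.8, iterations=10):
    """Basic SimRank over a citation graph.

    in_links maps each paper to the list of papers that cite it; every
    paper that appears in a list must also be a key of in_links.
    Returns a nested dict sim[a][b] of similarity scores in [0, 1].
    """
    nodes = list(in_links)
    # Start from the identity: each paper is fully similar only to itself.
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}

    for _ in range(iterations):
        new = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[a][b] = 1.0
                elif not in_links[a] or not in_links[b]:
                    # A paper with no citing papers is similar to nothing else.
                    new[a][b] = 0.0
                else:
                    # Average similarity over all pairs of citing papers.
                    total = sum(sim[i][j]
                                for i in in_links[a]
                                for j in in_links[b])
                    new[a][b] = c * total / (len(in_links[a]) * len(in_links[b]))
        sim = new
    return sim
```

In a three-paper graph where `p1` cites both `p2` and `p3`, the two cited papers receive a similarity equal to the decay factor, since their only citing paper is identical.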