This report describes the work of BUPT (PRIS) in the Entity Track of TREC 2009. Both the task and the data are new this year. We propose an improved two-stage retrieval model tailored to the task. The first stage is document retrieval, which computes the similarity between the query and the documents. The second stage finds the relationship between documents and entities. We also focus on entity extraction in the second stage and on the final ranking.
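The two-stage idea can be illustrated with a minimal sketch. The abstract does not say how the two stages are combined, so the sum-of-products scoring below is an assumption, and the function name `rank_entities`, the document ids, the entity names, and all scores are hypothetical.

```python
from collections import defaultdict

def rank_entities(doc_scores, doc_entities):
    """Rank entities by summing (query-document similarity) x (document-entity strength).

    doc_scores:   {doc_id: similarity of the document to the query}   (stage 1)
    doc_entities: {doc_id: {entity: association strength}}            (stage 2)
    The sum-of-products combination is an assumed rule, not the authors' formula.
    """
    totals = defaultdict(float)
    for doc_id, sim in doc_scores.items():
        for entity, strength in doc_entities.get(doc_id, {}).items():
            totals[entity] += sim * strength
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical toy data.
doc_scores = {"d1": 0.9, "d2": 0.4}
doc_entities = {
    "d1": {"Acme Corp": 1.0, "Bolt Ltd": 0.5},
    "d2": {"Bolt Ltd": 1.0},
}
ranking = rank_entities(doc_scores, doc_entities)
```

In this toy data, an entity mentioned in a highly relevant document outranks one mentioned more often in weaker documents.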
The user recommendation problem is important for mobile operators when they offer a new service to users. Traditional methods yield a low success rate. In this paper, we present a novel user selection method for advertising recommendation based on the theory of maximal frequent itemset discovery. The experimental results demonstrate that our method can improve the success rate dramatically and reduce the number of unwanted advertisements.
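As a rough illustration of maximal frequent itemset discovery, the sketch below enumerates frequent itemsets by brute force and keeps those with no frequent proper superset. It is not the authors' algorithm: the service names, the `min_support` threshold, and the function names are hypothetical, and the exhaustive enumeration is only practical for tiny inputs.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return all itemsets contained in at least `min_support` transactions."""
    items = sorted({item for t in transactions for item in t})
    frequent = []
    for size in range(1, len(items) + 1):
        level = [
            frozenset(c)
            for c in combinations(items, size)
            if sum(1 for t in transactions if set(c) <= t) >= min_support
        ]
        if not level:
            # No frequent set of this size implies none of any larger size.
            break
        frequent.extend(level)
    return frequent

def maximal_frequent_itemsets(transactions, min_support):
    """Keep only frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    return [s for s in frequent if not any(s < other for other in frequent)]

# Hypothetical usage records: each set lists the services one user subscribes to.
transactions = [
    {"sms", "music", "news"},
    {"sms", "music"},
    {"sms", "news"},
    {"music", "news", "sms"},
    {"game"},
]
maximal = maximal_frequent_itemsets(transactions, min_support=2)
```

One plausible use, again an assumption, is to target a new service at users whose subscriptions contain a maximal itemset that is common among existing adopters.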
Collaborative filtering is a very important technology in e-commerce. Unfortunately, as the numbers of users and commodities grow, the user rating data becomes extremely sparse, which makes collaborative filtering recommendation systems inefficient. To address this issue, an optimized item-based collaborative filtering recommendation algorithm is proposed. When calculating the similarity of two items, we obtain the ratio of users who rated both items to those who rated each of them, and this ratio is taken into account in the similarity. The experimental results show that the proposed algorithm can improve the quality of collaborative filtering.
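A minimal sketch of an item similarity that takes the co-rating ratio into account is given below. The abstract does not give the exact formula, so two choices here are assumptions: the ratio is computed as users who rated both items divided by users who rated either, and it is multiplied into a cosine similarity over co-rating users. All item names, user ids, and ratings are hypothetical.

```python
from math import sqrt

def co_rating_ratio(ratings, i, j):
    """Users who rated both items divided by users who rated either (assumed form)."""
    users_i = set(ratings.get(i, {}))
    users_j = set(ratings.get(j, {}))
    union = users_i | users_j
    return len(users_i & users_j) / len(union) if union else 0.0

def weighted_item_similarity(ratings, i, j):
    """Cosine similarity over co-rating users, scaled by the co-rating ratio."""
    common = set(ratings.get(i, {})) & set(ratings.get(j, {}))
    if not common:
        return 0.0
    dot = sum(ratings[i][u] * ratings[j][u] for u in common)
    norm_i = sqrt(sum(ratings[i][u] ** 2 for u in common))
    norm_j = sqrt(sum(ratings[j][u] ** 2 for u in common))
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return (dot / (norm_i * norm_j)) * co_rating_ratio(ratings, i, j)

# Hypothetical ratings: {item: {user: rating}}.
ratings = {
    "A": {"u1": 5, "u2": 3, "u3": 4},
    "B": {"u1": 5, "u2": 3},
    "C": {"u3": 4, "u4": 2},
}
sim_ab = weighted_item_similarity(ratings, "A", "B")
sim_ac = weighted_item_similarity(ratings, "A", "C")
```

Items A and C agree perfectly on their single shared user, yet the ratio pulls their similarity well below that of A and B, which share most of their raters. This is the effect that matters when rating data is sparse.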
Blog distillation is the process of finding a blog with a principal and recurring interest. In this paper, two baselines are used to validate the results of our experiments. A set of features for each individual feed is first constructed by a decision tree to represent the similarity distribution of the feed against a certain interest. Features are then selected by computing their centroid distances to the standard centroids of relevant and irrelevant feeds. Next, an SVM classifier is used to predict and re-rank the top 250 results of the two baselines. The results show that our algorithm can effectively represent the feeds' similarity distribution and re-rank the feeds into a new order with a much better MAP.
Semi-structured Chinese document analysis is a most difficult task because of the complex document structure and Chinese semantics. Based on the generic characteristics of semi-structured documents and the specific characteristics of resume documents, this paper studies resume document block analysis based on pattern matching, and also proposes multi-level information identification and feedback control algorithms. Based on this research, a resume parser system was implemented for ChinaHR, which is the biggest recruitment website. It can read, analyze, retrieve and store the information automatically. According to the experimental results, the accuracy and efficiency of the system generally satisfy practical requirements. As research on the processing of semi-structured documents, this work can serve not only as a guide for further research on resume analysis but also as a reference for other forms of semi-structured documents.
Fault location has received extensive attention in the field of telecom network management. Data mining approaches have been introduced to extract clues from telecom alarm data for fault location. Targeting the key problems in telecom data mining, we make a comprehensive analysis of the telecom network, its data, and fault propagation. Some important characteristics are discovered, and a fault-location-oriented network model is built to improve the traditional approaches to data transformation and data mining. An enhanced data mining algorithm is proposed that introduces real-world constraints into the data mining procedure. A data mining tool (PRISMiner) is implemented to benchmark the new algorithm, and our experiments show that the new algorithm is quite effective in improving the accuracy and efficiency of the PreFixSpan mining algorithm.
ISBN: (print) 9781424448999; 9781424448982
NETCONF is a new XML-based network management protocol that was proposed by the IETF in 2006. It aims to overcome the shortcomings of SNMP, which is predominantly used for configuration tasks at present. The NETCONF protocol defines a subtree filtering mechanism and an XPath capability, both of which allow a client to select particular XML subtrees of the server's data model. In this paper, we describe how we implement the two mechanisms and evaluate them in terms of functionality and performance. We find that subtree filtering is not as flexible as the XPath capability but has better performance.
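The difference between the two selection mechanisms can be sketched with Python's standard `xml.etree.ElementTree`, which supports a limited subset of XPath. The `subtree_filter` function is a heavily simplified illustration of subtree filtering (no namespaces, no attribute matching) and is not the paper's implementation; the interface data model and the filter document are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

def _text(node):
    return (node.text or "").strip()

def _is_content_match(node):
    # A leaf filter node carrying text constrains which sibling sets are selected.
    return len(node) == 0 and _text(node) != ""

def subtree_filter(data, flt):
    """Return a filtered copy of `data`, or None if it does not match `flt`."""
    if data.tag != flt.tag:
        return None
    if len(flt) == 0:
        if _is_content_match(flt) and _text(data) != _text(flt):
            return None
        return copy.deepcopy(data)  # selection node: keep the whole subtree
    matches = [f for f in flt if _is_content_match(f)]
    others = [f for f in flt if not _is_content_match(f)]
    # Every content-match node must be satisfied by some child of `data`.
    for m in matches:
        if not any(d.tag == m.tag and _text(d) == _text(m) for d in data):
            return None
    if not others:
        return copy.deepcopy(data)  # only content matches: keep everything
    result = ET.Element(data.tag)
    for d in data:
        for f in flt:
            sub = subtree_filter(d, f)
            if sub is not None:
                result.append(sub)
                break
    return result

# Hypothetical data model and filter.
DATA = """
<interfaces>
  <interface><name>eth0</name><mtu>1500</mtu><enabled>true</enabled></interface>
  <interface><name>eth1</name><mtu>9000</mtu><enabled>false</enabled></interface>
</interfaces>
"""
FILTER = "<interfaces><interface><name>eth0</name><mtu/></interface></interfaces>"

root = ET.fromstring(DATA)
filtered = subtree_filter(root, ET.fromstring(FILTER))

# The same selection as a path expression (ElementTree's XPath subset).
xpath_hits = root.findall("./interface[name='eth0']/mtu")
```

The filter is itself an XML template that mirrors the data model, whereas the path expression states the same selection in one line.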
In 2006, the IETF released its latest effort, NETCONF, a brand-new network management protocol based on XML encoding. NETCONF is thought to meet the requirements of configuration management, which SNMP fails to handle well. It also performs better in other respects, such as efficiency and flexibility of operations. As a new protocol, however, NETCONF is not perfect and has shortcomings in several aspects. In this paper, we review some common problems and challenges in network management that NETCONF has not solved efficiently, and offer a few suggestions.
Wireless sensor networks (WSNs) promise a new monitoring and control model for distributed computing environments. In general, these networks consist of a large number of sensor nodes densely distributed over the region of interest to collect information or to monitor and track specific phenomena in the physical environment. Network management in WSNs is vital to keep the whole network and its applications working properly. So far, no substantial, commonly agreed network management solution for WSNs has emerged. This paper summarizes the unique requirements of network management protocols for WSNs, gives an overview and analysis of several existing protocols to identify their advantages and disadvantages, and suggests some directions for future research on network management protocols in WSNs.