ISBN: 9781450302470 (print)
Modern search engines display a summary for each ranked document that is returned in response to a query. These summaries typically include a snippet - a collection of text fragments from the underlying document - that has some relation to the query that is being answered. In this study we investigate how 10 humans construct snippets: participants first generate their own natural language snippet, and then separately extract a snippet by choosing text fragments, for four queries related to two documents. By mapping their generated snippets back to text fragments in the source document using eye tracking data, we observe that participants extract these same pieces of text around 73% of the time when creating their extractive snippets. In comparison, we notice that automated approaches for extracting snippets only use these same fragments 22% of the time. However, when the automated methods are evaluated using a position-independent bag-of-words approach, as typically used in the research literature for evaluating snippets, they appear to be much more competitive, with only a 24 point difference in coverage compared to the human extractive snippets, whereas there is a 51 point difference when word position is taken into account. In addition to demonstrating this large scope for improvement in snippet generation algorithms with our novel methodology, we also offer a series of observations on the behaviour of participants as they constructed their snippets. Copyright 2010 ACM.
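The gap between position-aware and bag-of-words evaluation can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy sentence, and the representation of a snippet as a list of token positions are all assumptions made for illustration.

```python
from collections import Counter


def position_coverage(reference, candidate):
    """Fraction of reference token positions that the candidate also selects."""
    ref, cand = set(reference), set(candidate)
    return len(ref & cand) / len(ref) if ref else 0.0


def bag_of_words_coverage(tokens, reference, candidate):
    """Fraction of reference words matched by the candidate, ignoring position."""
    ref_bag = Counter(tokens[i] for i in reference)
    cand_bag = Counter(tokens[i] for i in candidate)
    overlap = sum((ref_bag & cand_bag).values())  # multiset intersection
    total = sum(ref_bag.values())
    return overlap / total if total else 0.0


# Hypothetical document; a snippet is a list of token positions within it.
tokens = ("search engines rank documents and search engines "
          "show snippets for ranked documents").split()
human = [5, 6, 7, 8]      # "search engines show snippets"
automated = [0, 1, 2, 3]  # "search engines rank documents"

pos_score = position_coverage(human, automated)
bow_score = bag_of_words_coverage(tokens, human, automated)
print(pos_score, bow_score)
```

In this toy case the automated snippet shares no positions with the human one, yet it still matches half of its words, which is the kind of inflation a position-independent measure can produce.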
Reversible logic, which transforms logic signals in a way that allows the original input signals to be recovered from the produced outputs, has attracted great attention because of its applications in many areas. Traditional s...
The paper discusses the concept of innovation experiment systems in the context of long-lived embedded systems. These systems need to evolve continuously to stay competitive and provide value to the customer and end-u...
Sending messages, retransmitting identical messages, and sensing information in the monitored environment can quickly deplete the energy of a sensor node in a wireless sensor network. Due to repe...
Description Logics (DLs) are gaining more popularity as the foundation of ontology languages for the Semantic Web. As most information in real life is imperfect, there has been an increasing interest recently in exten...
Recent advances in neuroscience have shown that neuropathological disorders are closely related to diseases such as Alzheimer's. Such damage is particularly associated with the intermediate visual percept...
Localization in indoor spaces has to rely on sensing devices (e.g., Radio Frequency Identification (RFID) readers, WiFi routers, Bluetooth beacons) rather than GPS devices. On the other hand, we could build a smart in...
ISBN: 9780889868205 (print)
The widening performance gap between processor and disk demands innovations. Several optimization techniques (e.g., read caching, write buffering, prefetching) have been proposed to narrow this gap. Read caching is the least effective of these techniques under resource-poor conditions. In this paper, we propose a novel technique which splits the disk cache into metadata and data caches to improve the read miss ratio. The proposal is based on the observation that metadata is smaller in size and more frequently accessed than its data. Split caches can reduce the interference between metadata and data. Our study shows that by splitting the disk cache the effective read miss ratio can be improved by 20%, which in turn would bring about a 16% improvement in response time. Furthermore, the split caches need a total size of only about 35%-97% of the unified cache to reach the same read miss ratio.
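The interference argument can be sketched with a small simulation. The abstract does not state the replacement policy, cache sizes, or workload, so the LRU policy, the `LRUCache` class, the helper functions, and the toy trace below are all assumptions chosen for illustration, not the paper's method.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache with least-recently-used eviction (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, insert key and evict the LRU block."""
        if key in self.blocks:
            self.blocks.move_to_end(key)
            return True
        if self.capacity > 0:
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)
            self.blocks[key] = True
        return False


def unified_miss_ratio(trace, capacity):
    """Read miss ratio when metadata and data share one cache."""
    cache = LRUCache(capacity)
    misses = sum(not cache.access(block) for block in trace)
    return misses / len(trace)


def split_miss_ratio(trace, meta_capacity, data_capacity):
    """Read miss ratio when metadata and data each get their own cache."""
    caches = {"meta": LRUCache(meta_capacity), "data": LRUCache(data_capacity)}
    misses = sum(not caches[kind].access((kind, block_id))
                 for kind, block_id in trace)
    return misses / len(trace)


# Hypothetical trace: one hot metadata block, re-read between pairs of
# data blocks that are each touched only once.
trace = []
for i in range(4):
    trace += [("meta", 0), ("data", 2 * i), ("data", 2 * i + 1)]

unified = unified_miss_ratio(trace, capacity=2)
split = split_miss_ratio(trace, meta_capacity=1, data_capacity=1)
print(unified, split)
```

With the same total capacity of two blocks, the streaming data evicts the hot metadata block from the unified cache before it is reused, whereas the split configuration keeps it resident. The toy numbers are not comparable to the percentages reported in the abstract.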
Business and design decisions regarding software development should be based on data, not opinions among developers, domain experts or managers. The company running the most and fastest experiments among the customer ...
With the growing number of sophisticated deep learning algorithms and fake video generation applications, it is now possible to create highly realistic deepfake videos. Faceswap is the most commonly employed deepfakes...