Existing de-duplication solutions in cloud backup environments either obtain high compression ratios at the cost of heavy de-duplication overheads, in terms of increased latency and reduced throughput, or maintain small de-duplication overheads at the cost of low compression ratios, which cause high data transmission costs. Either way, the result is a large backup window. In this paper, we present SAM, a Semantic-Aware Multitiered source de-duplication framework that first combines global file-level de-duplication with local chunk-level de-duplication, and further exploits file semantics in each stage of the framework, to obtain an optimal tradeoff between de-duplication efficiency and de-duplication overhead and ultimately achieve a shorter backup window than existing approaches. Our experimental results with real-world datasets show that SAM not only has a higher de-duplication efficiency/overhead ratio than existing solutions, but also shortens the backup window by an average of 38.7%.
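The combination of global file-level and local chunk-level de-duplication described in the abstract can be illustrated with a minimal sketch. This is not SAM's implementation: the fixed chunk size, the SHA-1 fingerprints, the in-memory index sets, and the function name `backup` are all assumptions made for illustration, and the semantic-aware steps are not modeled.

```python
import hashlib

CHUNK_SIZE = 4  # assumed fixed chunk size in bytes, tiny for illustration


def fingerprint(data: bytes) -> str:
    """Content fingerprint used as the de-duplication key (assumed: SHA-1)."""
    return hashlib.sha1(data).hexdigest()


def backup(files, global_file_index, local_chunk_index):
    """Return how many bytes would actually be transmitted.

    Tier 1: skip whole files whose fingerprint is already in the global index.
    Tier 2: for new files, skip chunks already in the local chunk index.
    """
    sent = 0
    for data in files:
        file_id = fingerprint(data)
        if file_id in global_file_index:
            continue  # duplicate file: nothing to send
        global_file_index.add(file_id)
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            chunk_id = fingerprint(chunk)
            if chunk_id not in local_chunk_index:
                local_chunk_index.add(chunk_id)
                sent += len(chunk)  # only unique chunks are sent
    return sent
```

With three 8-byte files of which one is an exact duplicate and another shares one chunk, only the unique chunks count toward the transmitted size.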
Most current information retrieval systems are based on full-text keyword matching or topic-based classification; they often return a large amount of irrelevant information and fail to meet the user's request. Ontology-based semantic retrieval is an active topic in current research. In this paper, a corn plant ontology is constructed using a Formal Concept Analysis based approach, in which the concept lattice is built from a terminology-file relationship table and then reduced. Based on the corn plant ontology, we propose a semantic annotation method in which feature words are selected by an improved weight calculation method and RDF triples are generated by a syntactic parser. Finally, a semantic retrieval system for corn plants is developed. In a comparative experiment on a dataset of one hundred documents, the semantic retrieval system introduced in this paper outperforms the keyword-based retrieval method in both precision and recall.
Implementing runtime integrity measurement in an acceptable way is a big challenge. We tackle this challenge by developing a framework called Patos. This paper discusses the design and implementation concepts of our o...
ISBN: (print) 9781424473014
Nowadays, key strategic issues concerning resources and the environment, such as global warming, water shortages, and food security, are more complex and multidisciplinary than ever before. In scientific research, the trend toward model integration is all but irreversible. Focusing on model integration, we develop a Workflow-based Spatial Modeling Environment (WF-SME) that enables scientists and modeling users to design and create complex system models easily, either through drag-and-drop operations or by editing mathematical formulas in the model constructor. WF-SME is a visual integrated modeling environment used mainly for rapid dynamic integration of models: existing models can be composed and connected, and new models generated, through graphical operations. The paper first introduces the model integration framework designed for WF-SME and its supporting theory, and then explains in detail how the modules of WF-SME have been realized. WF-SME is a platform for spatial modeling designed in four parts: the Spatial Workflow Designer, the Model Builder, the Spatial Calculation Engine, and Visualization of Modeling Output. The Spatial Workflow Designer, which is built into WF-SME and ships with toolsets, is used to create spatial workflows and to reuse and aggregate models. The Model Builder, which consists of an equation editor and a data binding tool, enables modelers to update formulas on demand. The equation editor, which relies on MathML parsing, is powerful and flexible enough to express the required mathematical calculations. The component-based Spatial Calculation Engine executes mathematical calculations through a map algebra language. The multi-form visualization module provides visual representations of the intermediate data and results of the domain models. Finally, we illustrate how a typical practical application has been modeled in our modeling environment.
Search engines and web crawlers cannot access the Deep Web directly. The workable way to access the hidden databases is through query interfaces. Automatically extracting attributes from query interfaces and translating q...
In this paper, a hybrid algorithm named DPSO-SA is proposed to find near-optimal elimination orderings in Bayesian networks. DPSO-SA is a discrete particle swarm optimization method enhanced by simulated annealing. Computational tests show that this hybrid method is very effective and robust for the elimination ordering problem.
To find an optimal elimination ordering for Bayesian networks, a multi-heuristic-based ant colony system named MHC-HS-ACS is proposed. MHC-HS-ACS uses a set of heuristics to guide the ants in their search for solutions. The heuristic set evolves adaptively as the search proceeds. MHC-HS-ACS also uses a heuristic-based local search to accelerate its convergence. Computational experiments show that MHC-HS-ACS can find very high-quality solutions.
Based on the characteristics of the optimal elimination ordering problem in Bayesian networks, a heuristic-based genetic algorithm, a cooperative coevolutionary genetic framework, and five grouping schemes are proposed. Building on these, six cooperative coevolutionary genetic algorithms are constructed. Numerical experiments show that these algorithms are more robust than other existing swarm intelligence methods when solving the elimination ordering problem.
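For readers unfamiliar with the elimination ordering problem, the sketch below shows how a candidate ordering might be scored. The abstracts do not state which objective the algorithms optimize; the induced width used here (the largest neighbor set met while eliminating) is one common choice and is an assumption, as are the function name `induced_width` and the adjacency-dictionary representation of the undirected (moralized) graph.

```python
def induced_width(adjacency, ordering):
    """Score an elimination ordering on an undirected graph.

    adjacency: dict mapping each node to the set of its neighbors.
    ordering:  list containing every node exactly once.
    Returns the largest number of neighbors a node has at the moment
    it is eliminated (smaller is better).
    """
    # Work on a copy so the caller's graph is left untouched.
    graph = {node: set(nbrs) for node, nbrs in adjacency.items()}
    width = 0
    for node in ordering:
        nbrs = graph.pop(node)
        width = max(width, len(nbrs))
        # Eliminating a node connects all of its remaining neighbors.
        for a in nbrs:
            graph[a].discard(node)
            graph[a] |= nbrs - {a}
    return width


# A star graph: eliminating the center first is costly, leaves first is cheap.
star = {"c": {"a", "b", "d"}, "a": {"c"}, "b": {"c"}, "d": {"c"}}
center_first = induced_width(star, ["c", "a", "b", "d"])
leaves_first = induced_width(star, ["a", "b", "d", "c"])
```

A metaheuristic such as a particle swarm, ant colony, or genetic algorithm would search the space of permutations for an ordering that minimizes a score of this kind.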
Improving the energy efficiency of mass storage systems has become an important and pressing research issue in large HPC centers and data centers. New energy conservation techniques for storage systems appear constantly; however, there is no systematic and uniform way of accurately evaluating energy-efficient storage systems and objectively comparing a wide range of energy-saving techniques. This research presents a new integrated scheme, called TRACER, for evaluating the energy efficiency of mass storage systems and judging energy-saving techniques. The TRACER scheme consists of a toolkit used to measure the energy efficiency of storage systems, together with performance and energy metrics. In addition, TRACER contains a novel and accurate workload-control module that captures how power varies with workload mode and I/O load intensity. The workload generator in TRACER supports a block-level trace replay mechanism. The main goal of the workload-control module is to select a certain percentage (e.g., anywhere from 10% to 100%) of trace entries uniformly from a real-world I/O trace file and to replay the filtered trace entries to reach any level of I/O load intensity. TRACER is experimentally validated on a general RAID5 enterprise disk array. Our experiments demonstrate that energy-efficient mass storage systems can be accurately evaluated at full scale by TRACER. We applied TRACER to investigate the impact of workload modes and load intensity on the energy efficiency of storage devices. This work shows that TRACER enables storage system developers to evaluate energy-efficient designs for storage systems.
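The workload-control idea of uniformly selecting a percentage of trace entries can be sketched as follows. The abstract does not say how TRACER performs the selection; evenly spaced sampling is assumed here, and the function name `select_uniform` and the list-of-entries trace representation are hypothetical.

```python
def select_uniform(entries, percent):
    """Pick roughly `percent` % of trace entries, evenly spaced, in order.

    entries: list of trace records (any type), in replay order.
    percent: load intensity in the range (0, 100].
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    total = len(entries)
    keep = round(total * percent / 100)
    if keep == 0:
        return []
    # Take the entry at position floor(i * total / keep) for each kept slot,
    # which spreads the selected entries evenly across the whole trace.
    return [entries[i * total // keep] for i in range(keep)]


trace = list(range(10))          # stand-in for ten I/O trace records
half_load = select_uniform(trace, 50)
full_load = select_uniform(trace, 100)
```

Replaying `half_load` instead of `trace` would issue about half of the original requests while preserving their relative order.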
Poor hardware quality and harsh conditions can produce faulty and outlying values in the data sampled by sensor nodes, so a median query is needed to reflect the average level of the monitored region. First, we put forward the HMA algorithm. Second, we extend HMA and put forward the HFMA algorithm. In HFMA, we only need to collect the data inside the filter and aggregate the influence coefficient during each sampling period. The base station can then compute the median from the sample data inside the filter and the aggregated influence coefficient. Experimental results show that HFMA outperforms both the naive algorithm and HMA, and can prolong the lifetime of the sensor network.