Search Results - Inner Mongolia University Library

2011 International Conference on Computers, Communications, Control and Automation

Authors: Xiaomin Jia, Ping Huang, Tianlei Zhao, Shubo Qi, Guitao Fu, Minxuan Zhang; National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology

The conventional LRU (Least Recently Used) replacement policy can significantly degrade the overall performance of the shared cache of Chip Multi-Processors (CMPs) when the aggregate working set of multiple co-scheduled applications cannot fit in the cache. Different applications have different inherent cache access behavior characteristics, and a replacement policy should take this into account to derive more performance benefit. This paper proposes the application cache Behavior Identification based Insertion Policy (BIIP), a replacement policy for managing shared caches in CMPs. BIIP uses the cache access behavior characteristics of each co-scheduled application to choose the replacement policy. Our evaluation using a full-system CMP simulator shows that BIIP improves the overall throughput by 14.8%, 11.2%, 5.6% and 7.2% on average over the baseline LRU policy, the prevailing cache partitioning scheme UCP, and two other shared cache replacement policies, PIPP and TADIP, respectively, on a 4-core CMP with 16 SPEC CPU2006 workloads. Moreover, BIIP requires a total storage overhead of no more than several counters per core and does not require changes to the current cache structure.

Keywords: chip multi-processors (CMPs), application cache behaviors, last-level caches (LLC), insertion, replacement
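The abstract above argues that plain LRU thrashes when the working set exceeds the cache and that the insertion decision matters. The sketch below is not BIIP (the listing gives no algorithmic detail); it only illustrates the general, well-known effect of the insertion position on a single fully associative recency stack. The function name `simulate`, the trace, and the capacity are all hypothetical choices made for illustration.

```python
def simulate(trace, capacity, insert_at_mru):
    """Count hits for one fully associative cache managed as a recency stack.

    stack[0] is the MRU position; stack[-1] is the LRU position (the victim).
    """
    stack = []
    hits = 0
    for block in trace:
        if block in stack:
            hits += 1
            stack.remove(block)
            stack.insert(0, block)      # promote to MRU on a hit
        else:
            if len(stack) == capacity:
                stack.pop()             # evict the block in the LRU position
            if insert_at_mru:
                stack.insert(0, block)  # classic LRU: new block enters at MRU
            else:
                stack.append(block)     # alternative: new block enters at LRU
    return hits


# Illustrative trace: a cyclic working set of 5 blocks on a 4-entry cache.
trace = [0, 1, 2, 3, 4] * 10
lru_hits = simulate(trace, 4, insert_at_mru=True)
lip_hits = simulate(trace, 4, insert_at_mru=False)
print(lru_hits, lip_hits)
```

With MRU insertion every access in this cyclic trace misses, because each block is evicted just before it is reused; inserting at the LRU position keeps part of the working set resident. A policy that picks the insertion behavior per application, as the abstract describes, would need some way to tell these access patterns apart.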

Research on a Large-Scale Context Stream Processing Framework Based on Mobile Cloud Computing

The 9th Annual Academic Conference of the China Institute of Communications

Authors: XIAO You (肖友), SHI Dian-xi (史殿习); National Key Laboratory for Parallel and Distributed Processing, School of Computer Science, National University of Defense Technology, Changsha 410073

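With the rise of mobile cloud computing, research on processing context information from large numbers of mobile devices has become a hot topic. The currently popular solution is the Hadoop framework, which is based on the MapReduce computing model; although it addresses the throughput problem of large-scale data processing, it does not handle the timeliness of data processing well. To address this problem, this paper proposes a solution that combines mobile cloud computing with stream processing technology, and builds and implements a stream processing framework for processing large-scale context in real time. Finally, the framework is used to validate the research presented in this paper, and a real-time road condition monitoring example demonstrates the framework's feasibility and effectiveness.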

Keywords: mobile cloud computing, context information, stream processing framework, feasibility analysis

Generalized Itinerary Recommendation Based on Physical Trajectory Data and Social Networks

The 6th China Conference on Sensor Networks (CWSN 2012)

Authors: MENG Xiang-Xu (孟祥旭), WANG Xiao-Dong (王晓东), ZHOU Xing-Ming (周兴铭); National Key Laboratory of Parallel and Distributed Processing, College of Computer Science, National University of Defense Technology, Changsha, Hunan 410073

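Planning an itinerary for human activities usually starts from broad initial intentions, which are then refined by jointly considering various constraints. Current itinerary planning methods based on looking up location names do not let users submit several broad, temporally ordered travel intentions at once, nor can they provide detailed optimal driving plans for multiple geographic locations simultaneously. Based on location-based social network information and historical vehicle trajectory data, this paper explores a generalized itinerary recommendation framework that accepts multiple fuzzy user intentions as input. The main work includes: (1) modeling the generalized itinerary recommendation problem; (2) designing and implementing a classification-tree-based query strategy and algorithm for geographic points of interest (POIs); (3) proposing a Voronoi-diagram-based GPS trajectory analysis model and implementing a method for computing the optimal driving route between any two locations; (4) combining the social network with a semantic traffic information graph to recommend itineraries using an ant colony algorithm, and implementing a prototype system. Experimental and questionnaire results show that user satisfaction with the recommendations reaches 80%.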

Keywords: itinerary planning, generalized model, optimization design, physical trajectory data, social networks

Filtering error log as time series in complex service-based storage systems

International Conference on Networked Computing and Advanced Information Management (NCM)

Authors: Xiang Rao, Gang Yin, Huaimin Wang, Dianxi Shi, Yanxu Zhu; National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China

Mining log patterns to analyze faults in large-scale distributed systems is hampered by redundant and ambiguous noisy error logs. While existing works try to compress logs at a coarse granularity from temporal and spatial views to remove the redundancy, they fail to preserve those ambiguous logs that might truly relate to a fault, which misleads the fault characterization result. By modeling error logs as time series and examining the similarity between the trash error log template and the target error log, the ambiguous error logs are kept and the affected patterns can be effectively removed. Experiments in a practical complex service-based storage system show that up to 92% of the affected patterns can be filtered.

Keywords: Time series analysis, Approximation methods, Computer crashes, Libraries, Matched filters, Transforms

Cache Miss Analysis for GPU Programs Based on Stack Distance Profile

International Conference on Distributed Computing Systems

Authors: Tao Tang, Xuejun Yang, Yisong Lin; National Laboratory of Parallel and Distributed Processing, National University of Defense Technology, Changsha, China

Using the graphics processing unit (GPU) to accelerate general-purpose computation has attracted much attention from both academia and industry due to the GPU's powerful computing capacity. Thus, optimization of GPU programs has become a popular research direction. To support general-purpose computing more efficiently, GPUs have integrated a general data cache to replace the existing software-managed on-chip memory. Consequently, improving the usage of the data cache is of vital importance to the performance of GPU programs. The foundation of cache locality optimizations is efficient analysis and prediction of cache behavior. Unfortunately, existing cache miss analysis models are based on sequential programs and thus cannot be used to analyze GPU programs directly. In this paper, based on a deep analysis of the GPU's execution model, we propose, for the first time, a cache miss analysis model for GPU programs. We divide the problem into two subproblems: stack distance profile analysis of a single thread block and cache contention analysis of multiple thread blocks. The experimental results from nine typical application kernels in the scientific computing field illustrate that our method is efficient and can be used to guide cache locality optimizations for GPU programs.

Keywords: Graphics processing unit, Instruction sets, Kernel, Analytical models, Computational modeling, Silicon, Optimization
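The abstract above builds on stack distance profiles. As a minimal sketch of that underlying notion only, the code below computes LRU stack distances for a single sequential access trace and derives the miss count of a fully associative LRU cache from them. It does not reproduce the paper's GPU model (thread blocks and cache contention are not modeled); the function names and the trace are hypothetical.

```python
def stack_distances(trace):
    """Return the LRU stack distance of each access (None for a first touch).

    The distance is the number of distinct other addresses accessed since
    the previous access to the same address.
    """
    stack = []       # most recently used address first
    distances = []
    for addr in trace:
        if addr in stack:
            depth = stack.index(addr)
            distances.append(depth)
            stack.pop(depth)
        else:
            distances.append(None)   # cold (compulsory) miss
        stack.insert(0, addr)        # the accessed address becomes MRU
    return distances


def misses_for_capacity(distances, capacity):
    """Misses of a fully associative LRU cache that holds `capacity` blocks."""
    return sum(1 for d in distances if d is None or d >= capacity)


# Illustrative trace of block addresses.
trace = ["a", "b", "c", "a", "b", "d", "a"]
dists = stack_distances(trace)
print(dists)
print(misses_for_capacity(dists, 3))
```

One profile answers the miss-count question for every cache capacity at once, which is why stack distance is a common basis for analytical cache models.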

Fault Recovery Based on Parallel Recomputing in Transactional Memory System

The 2011 International Conference on Electric and Electronics (EEIC 2011)

Authors: Wei Song, Jia Jia; National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology

This paper addresses the issue of fault recovery in transactional memory, and proposes a method of fault recovery based on parallel recomputing in transactional memory *** method utilizes the data versioning mechanism of transactional memory system to avoid the extra cost of state saving, rolls back a single transaction to avoid wasting the computing time of the fault-free transactions, and adopts the parallel recomputing method to reduce the cost of fault *** paper applies this method to Open TM programs, and proposes the implementation method of parallel recomputing in Open *** last, this paper tests the performance of this method through a test *** experimental results show that, compared with the fault recovery method of rolling back a single transaction, the parallel recomputing method in transactional memory system can execute the fault recovery quickly and accurately, and the method has good scalability.

Keywords: Fault tolerance, Parallel recomputing, Fault recovery, Transactional memory

Special-purposed VLIW architecture for IEEE-754 quadruple precision elementary functions on FPGA

IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD)

Authors: Yuanwu Lei, Yong Dou, Li Shen, Jie Zhou, Song Guo; National Laboratory of Parallel & Distributed Processing, National University of Defense Technology, Changsha, China

This work explores the feasibility of implementing IEEE-754-2008 standard quadruple precision (Quad) elementary functions on recent FPGAs with plenty of embedded memories and DSP blocks. First, we analyze the implementation algorithm of Quad elementary functions in detail. Then, we present a special-purpose Very Long Instruction Word (VLIW) architecture for Quad elementary functions (QE-Processor). The proposed processor uses a unified hardware structure, equipped with multiple basic arithmetic units, to implement various Quad algebraic and transcendental functions, in which several tradeoffs between latency and resource usage are carefully planned to avoid unbalanced resource utilization. The performance is improved through the explicitly parallel technology of custom VLIW instructions. Finally, we prototype the QE-Processor on Xilinx Virtex-5 and Virtex-6 FPGA chips. The experimental results show that our design can guarantee that the percentage of correct rounding is more than 99.9%. Moreover, the FPGA implementation on a Virtex-6 XC6VLX760-2FF1760 FPGA, running at 220 MHz, outperforms the parallel software approach based on OpenMP running on an Intel Xeon E5620 CPU at 2.40 GHz by a factor of 13X-20X for special function applications in the Boost library.

Keywords: Strontium, Lead, Random access memory, System-on-a-chip, Artificial intelligence

Approximating Quantified SMT-Solving with SAT

International Conference on Secure Software Integration and Reliability Improvement Companion (SSIRI-C)

Authors: Xianjin Fu, Wanwei Liu, Jing Li; National Laboratory for Parallel & Distributed Processing, National University of Defense Technology, Changsha, China

Satisfiability Modulo Theories (SMT) is an extension of SAT towards first-order logic (FOL). SMT solvers have proven highly scalable and efficient for problems based on some ground theories. However, SMT problems involving quantifiers and combinations of theories remain a long-standing challenge, which has been a major bottleneck for the practical application of SMT solvers in some fields. We reveal a decidable fragment of FOL involving quantifiers that cannot be solved by SMT solvers such as Z3 and CVC3, and show how to convert such problems into model checking problems.

Keywords: Connectors, Semantics, Labeling, Benchmark testing, Cognition, Encoding, Computational modeling

The prediction model based on RBF network in achieving elastic cloud

Advances in Information Sciences and Service Sciences, 2011, Vol. 3, No. 11, pp. 67-78

Authors: Shi, Peichang; Wang, Huaimin; Ding, Bo; Liu, Ran; Wang, Tianzuo; National Laboratory for Parallel and Distributed Processing, School of Computer and Science, National University of Defense Technology, China; Department of Electronics and Information Engineering, Huazhong University of Science and Technology, China

A cloud needs a rapid and elastic resource supply capability because of the fluctuating resource demand of end users. Multi-scale elastic resource binding is an important method for providing cloud services with rapid and elastic service capability. The most challenging problem in multi-scale elastic resource binding is how to predict the dynamic resource demand of end users, and then decide when and to what extent multi-scale resources need elastic binding based on the prediction. In this paper, we present a prediction model based on an RBF (Radial Basis Function) network, which is used to predict end users' resource demand in advance. Compared with current prediction methods, it has faster prediction speed and higher prediction accuracy. We then use trace data (the bandwidth demand of Web-type cloud services) collected from a real-world cloud provider, ChinaCache, as the training and testing data set to validate the method. Finally, we evaluate the predicted results using general prediction accuracy metrics. The results show that the prediction model based on the RBF network is able to resolve the decision problem in multi-scale elastic resource binding.

Keywords: Radial basis function networks

A Task-Mapping Algorithm for Performance Simulation Based on Simulated Annealing

International Conference on Computational and Information Sciences (ICCIS)

Authors: Xiaowei Guo, Yufei Lin, Xinhai Xu, Xin Zhang; National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, Hunan, China

The scale of parallel computer systems keeps increasing, and simulation technology has become an important tool for performance prediction in the system development process. The task mapping approach is an important factor affecting simulation performance. In this paper, to solve the task mapping problem in performance simulation, we propose a task mapping algorithm based on simulated annealing and verify its correctness and effectiveness by experiments. Experimental results show that the algorithm is highly efficient and can solve large-scale problems with lower time cost.

Keywords: Simulated annealing, Algorithm design and analysis, Computational modeling, Load modeling, Approximation algorithms, Analytical models
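The abstract above names simulated annealing as the basis of its task mapping algorithm but gives no details here. The following is a generic simulated annealing sketch for mapping tasks to nodes, not the authors' algorithm. The objective (load of the busiest node), the geometric cooling schedule, the single-task move, and every name and parameter value are assumptions made for illustration.

```python
import math
import random


def makespan(mapping, costs, n_nodes):
    """Load of the busiest node under a task -> node mapping (assumed objective)."""
    loads = [0.0] * n_nodes
    for task, node in enumerate(mapping):
        loads[node] += costs[task]
    return max(loads)


def anneal_mapping(costs, n_nodes, steps=5000, t_start=10.0, t_end=0.01, seed=0):
    """Search for a low-makespan mapping with simulated annealing."""
    rng = random.Random(seed)
    mapping = [rng.randrange(n_nodes) for _ in costs]
    current = makespan(mapping, costs, n_nodes)
    best, best_cost = list(mapping), current
    for step in range(steps):
        # Geometric cooling from t_start towards t_end (assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / steps)
        # Neighbor move: reassign one randomly chosen task to a random node.
        task = rng.randrange(len(costs))
        old_node = mapping[task]
        mapping[task] = rng.randrange(n_nodes)
        candidate = makespan(mapping, costs, n_nodes)
        delta = candidate - current
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
            if current < best_cost:
                best, best_cost = list(mapping), current
        else:
            mapping[task] = old_node   # reject the move and restore the mapping
    return best, best_cost


# Illustrative task costs mapped onto 3 nodes.
costs = [4, 7, 2, 9, 5, 3, 8, 6]
best, best_cost = anneal_mapping(costs, n_nodes=3)
print(best, best_cost)
```

Accepting some worsening moves at high temperature lets the search escape poor local mappings; as the temperature falls the procedure behaves more and more like greedy improvement.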
