With the integration of physical space and cyberspace, distributing large-scale data to diverse, geographically dispersed terminals has become a major challenge. When the data volume exceeds what traditional techniques can handle and resources become limited, guaranteeing user quality of service while using system resources efficiently becomes an important concern. This paper presents a data-driven mechanism for large-scale data distribution that consists of four core parts: data production, data collection and pre-processing, a data analysis engine, and data consumption. The mechanism aims to mine valuable information in order to improve resource-usage efficiency and enable accurate fault location in a large-scale data distribution system. The paper further studies data-driven resource scheduling optimization based on analysis of system behavior, as well as data-driven fault location, and demonstrates the effectiveness of data-driven operation in optimizing a large-scale data distribution system.
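The abstract names four core stages but gives no implementation details; the following is a minimal, hypothetical C sketch of such a production, collection/pre-processing, analysis, and consumption pipeline. All function names and the record type are illustrative assumptions, not the paper's actual components.

```c
/* Hypothetical four-stage pipeline sketch; names and record type are illustrative only. */
#include <stdio.h>

typedef struct { int terminal_id; double bytes_sent; } record_t;

static record_t produce(int id)           { record_t r = { id, id * 1024.0 }; return r; }
static record_t preprocess(record_t r)    { r.bytes_sent /= 1024.0; return r; }   /* e.g. normalize units   */
static double   analyze(record_t r)       { return r.bytes_sent; }                /* e.g. derive a metric   */
static void     consume(int id, double m) { printf("terminal %d: %.1f KiB\n", id, m); }

int main(void) {
    for (int id = 0; id < 3; id++) {              /* data production              */
        record_t r = preprocess(produce(id));     /* collection + pre-processing  */
        double metric = analyze(r);               /* data analysis engine         */
        consume(id, metric);                      /* data consumption             */
    }
    return 0;
}
```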
In this paper, we present the Tianhe-2 interconnect network and message passing services. We describe the architecture of the router and network interface chips, and highlight a set of hardware and software features that effectively support high-performance communication, including remote direct memory access, collective optimization, hardware-enabled reliable end-to-end communication, and user-level message passing services. Measured hardware performance results are also presented.
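The abstract mentions remote direct memory access and user-level message passing; as a hedged illustration of that style of communication, the sketch below uses standard MPI one-sided operations rather than Tianhe-2's own low-level interface, which is not described here.

```c
/* Minimal, hypothetical sketch of RDMA-style one-sided communication via MPI_Put. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double buf = 0.0;
    MPI_Win win;
    /* Expose one double per process for remote access. */
    MPI_Win_create(&buf, sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (rank == 0 && size > 1) {
        double val = 42.0;
        /* Write directly into rank 1's window without involving rank 1's CPU. */
        MPI_Put(&val, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
    }
    MPI_Win_fence(0, win);

    if (rank == 1) printf("rank 1 received %f via MPI_Put\n", buf);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```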
Fingerprints have been widely used in a variety of biometric identification systems. However, with the rapid development of fingerprint identification systems, the amount of fingerprint information stored in systems ha...
ISBN (print): 9781509053827
We develop a simulator for 3D tissue of the human cardiac ventricle with a physiologically realistic cell model and deploy it on the supercomputer Tianhe-2. In order to attain the full performance of the heterogeneous CPU-Xeon Phi design, we use carefully optimized codes for both devices and combine them to obtain suitable load balancing. Using a large number of nodes, we are able to perform tissue-scale simulations of the electrical activity and calcium handling in millions of cells, at a level of detail that tracks the states of trillions of ryanodine receptors. We can thus simulate arrhythmogenic spiral waves and other complex arrhythmogenic patterns which arise from calcium handling deficiencies in human cardiac ventricle tissue. Due to extensive code tuning and parallelization via OpenMP, MPI, and SCIF/COI, large-scale simulations of 10 heartbeats can be performed in a matter of hours. Test results indicate excellent scalability, thus paving the way for detailed whole-heart simulations in future generations of leadership class supercomputers.
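As a hedged sketch of the hybrid parallelization pattern the abstract describes (MPI across nodes plus OpenMP within a node), the toy C program below updates a block of cells per rank in an OpenMP-parallel loop. The cell model, time step, and constants are placeholders, not the simulator's actual physiology, and the Xeon Phi offload via SCIF/COI is not reproduced.

```c
/* Hypothetical MPI+OpenMP cell-update skeleton; physiology is a placeholder. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define CELLS_PER_RANK 100000
#define DT 0.01   /* assumed time step (ms) */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *v = malloc(CELLS_PER_RANK * sizeof(double)); /* membrane potential per cell */
    for (int i = 0; i < CELLS_PER_RANK; i++) v[i] = -85.0; /* resting value */

    for (int step = 0; step < 1000; step++) {
        /* Update every cell independently; a real cell model would evaluate
         * many ionic currents and ryanodine-receptor states here. */
        #pragma omp parallel for
        for (int i = 0; i < CELLS_PER_RANK; i++)
            v[i] += DT * (-0.1 * (v[i] + 85.0)); /* placeholder relaxation term */

        /* Halo exchange between neighbouring ranks (e.g. MPI_Sendrecv) would go here. */
    }

    if (rank == 0) printf("v[0] = %f after 1000 steps\n", v[0]);
    free(v);
    MPI_Finalize();
    return 0;
}
```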
OpenCL is an open heterogeneous programming framework. Although OpenCL programs are functionally portable, they do not provide performance portability, so code transformation often plays an irreplaceable role. When adapting GPU-specific OpenCL kernels to run on multi-core/many-core CPUs, coarsening the thread granularity is necessary and thus has been extensively used. However, locality concerns exposed in GPU-specific OpenCL code are usually inherited without analysis, which may have side effects on CPU performance. Typically, the use of OpenCL's local memory on multi-core/many-core CPUs may lead to an opposite performance effect, because local-memory arrays no longer match well with the hardware and the associated synchronizations are costly. To solve this dilemma, we actively analyze the memory access patterns using array-access descriptors derived from GPU-specific kernels, which can thus be adapted for CPUs by (1) removing all the unwanted local-memory arrays together with the obsolete barrier statements and (2) optimizing the coalesced kernel code with vectorization and locality re-exploitation. Moreover, we have developed an automated tool chain that performs this transformation of GPU-specific OpenCL kernels into a CPU-friendly form, accompanied by a scheduler that forms a new OpenCL runtime. Experiments show that the automated transformation can improve OpenCL kernel performance on a multi-core CPU by an average factor of 3.24. Satisfactory performance improvements are also achieved on Intel's many-integrated-core coprocessor. The resultant performance on both architectures is better than or comparable with the corresponding OpenMP performance.
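A hypothetical before/after illustration of the transformation described above: a GPU-oriented kernel that stages data through local memory and synchronizes with a barrier, followed by a thread-coarsened CPU-friendly variant with the local array and barrier removed. These are toy kernels, not output of the paper's tool chain.

```c
/* GPU-oriented kernel: stages data in __local memory and uses a barrier,
 * which is cheap on GPUs but costly when emulated on multi-core CPUs. */
__kernel void scale_gpu(__global const float *in, __global float *out,
                        __local float *tile) {
    int lid = get_local_id(0);
    int gid = get_global_id(0);
    tile[lid] = in[gid];               /* stage through local memory */
    barrier(CLK_LOCAL_MEM_FENCE);      /* work-group synchronization */
    out[gid] = 2.0f * tile[lid];
}

/* CPU-friendly, thread-coarsened kernel: no local memory, no barrier;
 * one work-item handles CHUNK consecutive elements, which also helps the
 * compiler vectorize the inner loop. */
#define CHUNK 16
__kernel void scale_cpu(__global const float *in, __global float *out) {
    int base = get_global_id(0) * CHUNK;
    for (int i = 0; i < CHUNK; i++)
        out[base + i] = 2.0f * in[base + i];
}
```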
Cybercrime caused by malware has become a persistent and damaging threat, making trusted security solutions urgently needed, especially for resource-constrained end devices. The existing industry and academic approache...
Cooperation between the CPU and hardware accelerators on an SoC FPGA to accomplish computationally intensive tasks provides significant advantages in performance and energy efficiency. However, current operating systems provide lit...
As the de facto Internet inter-domain routing protocol, BGP has a number of vulnerabilities and weaknesses. Monitoring BGP is an effective way to improve the security of inter-domain routing. This paper present...
The volume of malware is growing at an exponential rate nowadays. This huge growth makes it extremely hard to analyse malware manually. Most existing signature-extraction methods are based on string signatures, and...
In data center networks, resource allocation based on workload is an effective way to allocate infrastructure resources to diverse cloud applications and satisfy the quality of service for the users; it refers to mapping a large number of workloads provided by cloud users/tenants onto the substrate network provided by cloud providers. Although the existing heuristic approaches are able to find a feasible solution, the quality of the solution is not guaranteed. To address this issue, this paper models the resource allocation problem as a distributed constraint optimization problem based on the minimum mapping cost. An efficient approach is then proposed to solve the resource allocation problem, aiming to find a feasible solution while ensuring its optimality. Finally, theoretical analysis and extensive experiments demonstrate the effectiveness and efficiency of the proposed approach.
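As a hedged sketch of what a minimum-mapping-cost formulation of this kind typically looks like (the notation below is assumed for illustration and is not taken from the paper): each workload node is placed on a substrate node, and the objective combines node placement costs and virtual-link routing costs subject to substrate capacity constraints.

```latex
% Illustrative minimum-cost mapping objective (notation assumed, not the paper's):
%   x_i        -- substrate node hosting workload node i
%   c(i, x_i)  -- cost of placing workload node i on substrate node x_i
%   d(x_i,x_j) -- cost of routing the virtual link (i,j) over the substrate
%   r_i, C_u   -- resource demand of node i, capacity of substrate node u
\min_{x} \;\; \sum_{i \in V_w} c(i, x_i) \;+\; \sum_{(i,j) \in E_w} b_{ij}\, d(x_i, x_j)
\quad \text{s.t.} \quad \sum_{i:\, x_i = u} r_i \le C_u \;\; \forall u \in V_s
```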