In recent WLAN standards (such as IEEE 802.11n), MIMO (Multiple Input Multiple Output) is deployed to provide high data transmission rates. It is, however, challenging to efficiently share the channel resources among dif...
ISBN (digital): 9798331509712
ISBN (print): 9798331509729
The self-attention mechanism is the core component of the Transformer, providing a powerful ability to understand sequence context. However, self-attention also incurs a large amount of redundant computation. Model sparsification can effectively reduce the computational load, but the irregularity of the non-zeros introduced by sparsification significantly decreases hardware efficiency. This paper proposes Funnel, an accelerator that dynamically predicts sparse attention patterns and efficiently processes unstructured sparse data. Firstly, we adopt a fast quantization method based on a lookup table to minimize the cost of sparse pattern prediction. Secondly, we propose the Funnel Computing Unit (FCU), a hardware architecture that efficiently handles sparse attention through multi-dataflow fusion. Sampled Dense-Dense Matrix Multiplication (SDDMM) and Sparse-Dense Matrix Multiplication (SpMM) are the core operations of the sparse attention mechanism. The FCU unifies the inner-product and row-wise-product matrix computation patterns to support both SDDMM and SpMM, which greatly reduces the storage and movement overhead of intermediate results. Lastly, we devise a lightweight buffer and data tiling strategy tailored to the proposed accelerator, aimed at enhancing data reuse. Experiments demonstrate that our accelerator achieves 0.10-0.25 sparsity with small accuracy loss. When computing the self-attention layer, it attains hardware efficiency ranging from 60% to 85%. Compared to a CPU and a GPU, it achieves 5.60x and 8.20x speedup, respectively. Compared to the state-of-the-art attention accelerators A3, SpAtten, FTRANS, and Sanger, it achieves 7.37x, 4.52x, 9.58x, and 3.08x speedup, respectively.
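The SDDMM-then-SpMM structure of sparse attention described in the abstract can be sketched in NumPy. This is a minimal illustrative sketch, not Funnel's actual dataflow: the mask here stands in for the predicted sparse pattern, and the function name is hypothetical.

```python
import numpy as np

def sparse_attention(Q, K, V, mask):
    """Sparse attention as SDDMM followed by SpMM (illustrative).

    mask: boolean (L, L) array, True where the predicted sparse
    pattern keeps an attention score; each row must keep at least one.
    """
    d = Q.shape[-1]
    # SDDMM: scores Q @ K^T are only meaningful where the mask is set;
    # masked-out positions are driven to -inf so softmax zeroes them.
    scores = np.where(mask, Q @ K.T / np.sqrt(d), -np.inf)
    # Numerically stable row-wise softmax over the surviving entries.
    scores = scores - scores.max(axis=-1, keepdims=True)
    probs = np.exp(scores)
    probs /= probs.sum(axis=-1, keepdims=True)
    # SpMM: multiply the (conceptually sparse) probabilities by V.
    return probs @ V
```

With a diagonal mask each query attends only to itself, so the output reduces to V, which makes the masking behavior easy to check.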
Time series data are pervasive in varied real-world applications, and accurately identifying anomalies in time series is of great importance. Many current methods are insufficient to model long-term dependence, whereas some anomalies can only be identified through long temporal contextual information. This may ultimately lead to disastrous outcomes due to false negatives on these anomalies. Prior art employs Transformers (i.e., a neural network architecture with powerful capability in modeling long-term dependence and global association) to alleviate this problem; however, Transformers are insensitive to local context, which may cause subtle anomalies to be neglected. Therefore, in this paper, we propose a local-adaptive Transformer based on cross-correlation for time series anomaly detection, which unifies global and local information to capture comprehensive time series patterns. Specifically, we devise a cross-correlation mechanism that employs causal convolution to adaptively capture local pattern variation, injecting diverse local information into the long-term temporal learning process. Furthermore, a novel optimization objective jointly optimizes reconstruction of the entire time series and of the matrix derived from the cross-correlation mechanism, which prevents the cross-correlation from becoming trivial during training. The generated cross-correlation matrix reveals underlying interactions between the dimensions of multivariate time series, providing valuable insights for anomaly diagnosis. Extensive experiments on six real-world datasets demonstrate that our model outperforms state-of-the-art competing methods, achieving a 6.8%-27.5% $F_{1}$ score improvement. Our method also offers good anomaly interpretability and is effective for anomaly diagnosis.
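The causal convolution that the abstract uses to extract local context can be sketched as follows. This is a minimal fixed-kernel version for illustration; the paper's actual layer is learned, and the function name is hypothetical.

```python
import numpy as np

def causal_conv1d(x, kernel):
    """Causal 1-D convolution: output[t] depends only on x[t-k+1 .. t],
    never on future samples (cross-correlation form, as is conventional
    in deep learning)."""
    k = len(kernel)
    # Left-pad with zeros so the output has the same length as x
    # and no position sees the future.
    padded = np.concatenate([np.zeros(k - 1), x])
    return np.array([padded[t:t + k] @ kernel for t in range(len(x))])
```

Because of the left-only padding, anomaly scores at time t are built strictly from past and present values, which is what makes the local-context extraction causal.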
As deep learning grows rapidly, model training relies heavily on parallel methods, and numerous cluster configurations exist. However, current work on parallel training focuses on data centers, overlooking the financial constraints faced by most researchers. To attain the best performance within a cost limitation, we introduce a throughput-cost metric to accurately characterize a cluster's cost-effectiveness. Based on this metric, we design a cost-effective cluster built around the NVIDIA RTX 3090 with NVLink. Experimental results demonstrate that our cluster achieves remarkable cost-effectiveness across various distributed model training schemes.
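A throughput-cost metric of the kind the abstract describes can be written as a simple ratio. The exact definition in the paper may differ; the function name, units, and sample figures below are illustrative assumptions, not the paper's numbers.

```python
def cost_effectiveness(throughput_samples_per_s, cluster_price_usd):
    """Throughput per unit of hardware cost (samples/s per dollar).
    Higher is better; two clusters with different prices and speeds
    become directly comparable under this single number."""
    return throughput_samples_per_s / cluster_price_usd
```

Under such a metric, a cheaper cluster with moderately lower throughput can dominate an expensive data-center node, which is the motivation the abstract gives for the 3090-based design.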
Encryption technology has become an important mechanism for securing data stored in outsourced databases. However, it is difficult to query encrypted data efficiently, and many researchers take it into conside...
Multidimensional parallel training has been widely applied to train large-scale deep learning models such as GPT-3. The efficiency of parameter communication among training devices/processes is often the performance bottleneck of large-model training. Analyzing parameter communication modes and traffic provides an important reference for interconnection network design and computing-task scheduling aimed at improving training performance. In this paper, we analyze the parameter communication modes in typical 3D parallel training (data parallelism, pipeline parallelism, and tensor parallelism) and model the traffic of each communication mode. Finally, taking GPT-3 as an example, we characterize the communication in its 3D parallel training.
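One ingredient of such traffic modeling is the communication volume of the gradient all-reduce used by data parallelism. The sketch below uses the standard ring all-reduce volume formula (each of N devices sends and receives 2(N-1)/N times the model size); it is a generic textbook model, not necessarily the exact model the paper derives.

```python
def ring_allreduce_traffic(model_bytes, n_devices):
    """Per-device bytes sent in one ring all-reduce of a gradient
    buffer of size model_bytes across n_devices: the reduce-scatter
    phase moves (N-1)/N * M and the all-gather phase moves the same,
    for a total of 2*(N-1)/N * M per device."""
    return 2.0 * (n_devices - 1) / n_devices * model_bytes
```

The formula shows why data-parallel traffic per device is nearly constant in cluster size: as N grows, 2(N-1)/N approaches 2, so each device sends close to twice the model size per step regardless of scale.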
In this paper, we introduce a generic model to deal with the event matching problem of content-based publish/subscribe systems over structured P2P overlays. In this model, we claim that there are three methods (event...
The deep neural named entity recognition model automatically learns and extracts the features of entities and solves the problem of the traditional model relying heavily on complex feature engineering and obscure prof...
Neural Radiance Fields (NeRF) have received widespread attention for their photo-realistic novel view synthesis quality. Current methods mainly represent the scene based on point sampling along cast rays, ignoring how the observed area changes with distance. In addition, current sampling strategies focus on the distribution of sample points along a ray, without considering how the rays themselves are sampled. We find that the prevailing ray sampling strategy severely reduces convergence speed for scenes captured with a forward-moving camera. In this work, we extend the point representation to an area representation using relative positional encoding, and propose a ray sampling strategy suited to forward-moving camera trajectories. We validate the effectiveness of our method on multiple public datasets.
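For context, the standard NeRF frequency encoding that such area/relative variants build on maps each coordinate through sines and cosines at geometrically spaced frequencies. The sketch below is the vanilla encoding only; the paper's relative, distance-aware modification is not reproduced here.

```python
import numpy as np

def positional_encoding(x, num_freqs):
    """Vanilla NeRF positional encoding:
    gamma(x) = (sin(2^0 pi x), cos(2^0 pi x), ...,
                sin(2^(L-1) pi x), cos(2^(L-1) pi x)),
    applied elementwise to a coordinate vector x."""
    out = []
    for k in range(num_freqs):
        out.append(np.sin(2.0 ** k * np.pi * x))
        out.append(np.cos(2.0 ** k * np.pi * x))
    return np.concatenate(out, axis=-1)
```

A 3-D point with L frequency bands expands to 2 * 3 * L features, which is what lets the MLP represent high-frequency scene detail from low-dimensional coordinates.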
Constant-degree peer-to-peer (P2P) systems are becoming a promising hotspot in the P2P domain because constant-degree digraphs have good properties. However, it is often hard to convert a standard constant-degree digraph into a DHT scheme. Thus, most research focuses on DHT construction and maintenance, leaving optimization and support for complex queries behind. The underlying topology strongly affects the characteristics of the upper layers. For constant-degree P2P topologies, their inherent properties make a system built with classical techniques poor in data locality and unfit for efficient, low-cost complex queries. To address this shortcoming, a general-purpose construction technique for efficient complex queries is proposed, which adds an embedding transformation layer between the data layer and the DHT overlay. In this way, adjacent data are stored in adjacent peers of the overlay and data locality is improved, so that the number of peers involved in complex queries can be minimized with limited time overhead. To validate this technique, FissionE, the first constant-degree P2P system based on the Kautz digraph, is reconstructed as a typical example, including resource re-allocation, query algorithms, and locality maintenance strategies. Experimental results show that this construction technique ensures data locality, reduces query cost, and improves system efficiency without changing the underlying DHT layer.