The perception module of self-driving vehicles relies on a multi-sensor system to understand its environment. Recent advancements in deep learning have led to the rapid development of approaches that integrate multi-s...
ISBN (digital): 9798331509712
ISBN (print): 9798331509729
General Matrix Multiplication (GEMM) is a critical computational operation in scientific computing and machine learning. While traditional GEMM performs well on large matrices, it is inefficient in data transfer and computation for small matrices, and many High-Performance Computing (HPC) tasks decompose into large batches of small matrix multiplications. Multi-core Digital Signal Processors (DSPs) are commonly used to accelerate such workloads. We present batched fusion small matrix multiplication (BFMM), a design tailored to multi-core DSP architectures. To address the storage and computation inefficiencies and redundancy of batched small matrix multiplication, we design a matrix fusion concatenation strategy, an access coordination mechanism, and a fragment aggregation mechanism. BFMM supports an efficient K-dimension multi-core parallelization strategy, and a parameter constraint model makes it highly portable. BFMM also includes a performance evaluation model that facilitates assessment and verification. Experimental results demonstrate that BFMM outperforms both traditional GEMM (TGEMM) on multi-core DSP and traditional GEMM with concatenated data access (TGEMM Op). For large batches of small matrices, our design achieves 1.21x to 18x higher performance than TGEMM Op on a single-core DSP and 1.14x to 18.1x on a multi-core DSP.
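The batching idea behind BFMM (though not its DSP-specific implementation) can be illustrated in NumPy: instead of issuing one GEMM call per small matrix pair, the whole batch is fused into a single call whose per-call overhead is amortized across all multiplications. This is a minimal sketch with illustrative sizes, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, m, k, n = 1024, 8, 8, 8  # a large batch of small matrices

A = rng.standard_normal((batch, m, k))
B = rng.standard_normal((batch, k, n))

# Naive approach: one small GEMM call per matrix pair.
C_loop = np.stack([A[i] @ B[i] for i in range(batch)])

# Batched approach: np.matmul treats the leading axis as a batch
# dimension, so the entire batch is one fused call.
C_batched = np.matmul(A, B)

assert np.allclose(C_loop, C_batched)
```

The same effect is what batched BLAS interfaces (e.g., batched GEMM routines in vendor libraries) provide on accelerators.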
This paper proposes a multi-objective constrained minimum weighted bipartite assignment problem (MCMWBAP), an extension of the classical bipartite matching problem (BMP). We first formulate the MCMWBAP and prove that it is an NP-hard combinatorial optimization problem. Based on this formulation, we study a multi-objective energy-aware shortwave radio broadcast resource allocation problem (MSRBRAP). The goal of this problem is to allocate radio programs to transmission devices so that all programs are broadcast properly, maximizing the total number of qualified monitoring sites while minimizing energy consumption. We then develop a novel multi-objective hybrid evolutionary algorithm (MOHEA), which integrates push-and-pull initialization, a dynamic resource allocation strategy, and an aggregate local search procedure, to solve the problem. The proposed method is evaluated on two categories of MCMWBAP benchmarks together with a real-scenario case study for MSRBRAP. Furthermore, the key components of MOHEA are analyzed, and the experimental results demonstrate that MOHEA outperforms two classical multi-objective evolutionary algorithms (NSGA-II and MOEA/D), improving working efficiency.
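The single-objective core that MCMWBAP extends, minimum-weight bipartite assignment, can be sketched with an exhaustive search over one-to-one assignments. The cost matrix below is illustrative, not real broadcast data, and the brute force is only viable for tiny instances.

```python
from itertools import permutations

# Cost of assigning program i (row) to device j (column); illustrative values.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]

# Exhaustive search over all one-to-one assignments (fine for tiny
# instances; practical solvers use the Hungarian algorithm instead).
n = len(cost)
best_cols, best_total = min(
    ((cols, sum(cost[i][cols[i]] for i in range(n)))
     for cols in permutations(range(n))),
    key=lambda t: t[1],
)
print(best_cols, best_total)  # minimum-weight perfect matching
```

At scale, `scipy.optimize.linear_sum_assignment` solves this exactly in polynomial time; the NP-hardness in the paper comes from the added multi-objective constraints, not from this base problem.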
ISBN (digital): 9798350317152
ISBN (print): 9798350317169
Instant delivery has become a fundamental service in people's daily lives. Unlike traditional express service, instant delivery has a strict shipping time constraint after an order is placed, and labor shortages make efficient instant delivery challenging. To tackle this problem, researchers have studied introducing vehicles (e.g., taxis) or Unmanned Aerial Vehicles (UAVs, or drones) into instant delivery tasks. Unfortunately, the delivery detours of taxis and the limited battery of UAVs make it hard to meet rapidly increasing instant delivery demand. Under these circumstances, this paper proposes an air-ground cooperative instant delivery paradigm that maximizes delivery performance while minimizing the negative effects on taxi passengers. Specifically, a data-driven, delivery-potential- and demand-aware cooperative strategy is designed to improve the overall delivery performance of both UAVs and taxis as well as the taxi passengers' experience. Experimental results show that the proposed method improves the number of deliveries by 30.1% and 114.5% compared to taxi-based and UAV-based instant delivery, respectively, and shortens delivery time by 35.7% compared to taxi-based instant delivery.
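The complementary constraints the abstract describes, UAV battery range versus taxi detour, can be caricatured with a hypothetical greedy dispatcher. Everything here (the range limit, the dictionaries, the function name) is an assumption for illustration; the paper's strategy is data-driven and far richer.

```python
# Hypothetical greedy dispatcher illustrating the air-ground idea: send an
# order by UAV when it fits the UAV's battery range, otherwise hand it to a
# taxi if one can serve it with an acceptable detour. All numbers are made up.
UAV_RANGE_KM = 5.0

def dispatch(orders, taxi_detours):
    """orders: {order_id: distance_km}; taxi_detours: {order_id: detour_km}."""
    plan = {}
    for oid, dist in orders.items():
        if dist <= UAV_RANGE_KM:
            plan[oid] = "uav"          # short trip: within drone battery range
        elif oid in taxi_detours:
            plan[oid] = "taxi"         # long trip: piggyback on a taxi route
        else:
            plan[oid] = "unserved"
    return plan

print(dispatch({"a": 2.0, "b": 8.5}, {"b": 1.2}))
```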
General Matrix Multiplication (GEMM) has a wide range of applications in scientific simulation and artificial intelligence. Although traditional libraries can achieve high performance on large regular-shaped GEMMs, th...
详细信息
While serverless computing offers more efficient and cost-effective application deployment, the diversity of serverless platforms presents challenges to users, including platform lock-in and costly migration. Moreover, because function computing is a black box, traditional performance benchmarking methods do not apply, necessitating new studies. This article presents a detailed comparison of six major public cloud function computing platforms and introduces a benchmarking framework for function computing performance. The framework aims to help users make comprehensive comparisons and select the platform best suited to their specific needs.
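The black-box nature of function computing means performance must be measured from the outside, by timing invocations. A minimal sketch of such a harness follows; the `invoke` callable stands in for an HTTP call to a deployed function, and the warmup count is an assumed way to separate cold from warm starts, not the article's framework.

```python
import time
import statistics

def benchmark(invoke, warmups=1, runs=20):
    """Time repeated invocations of a function-compute endpoint.

    `invoke` stands in for an HTTP request to a deployed function; the
    discarded warmup calls roughly separate cold starts from warm ones.
    Returns latency statistics in milliseconds.
    """
    for _ in range(warmups):
        invoke()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        samples.append((time.perf_counter() - start) * 1000.0)  # ms
    return {
        "p50_ms": statistics.median(samples),
        "max_ms": max(samples),
    }

stats = benchmark(lambda: sum(range(1000)))  # local stand-in workload
print(stats)
```

A real cross-platform comparison would run the same harness against each provider's deployed endpoint and also track cost per invocation.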
Mixed-type data with both categorical and numerical features are ubiquitous in network security, but few existing methods can handle them. Existing methods usually process mixed-type data through feature conversion, and their performance is degraded by the information loss and noise the transformation introduces. Meanwhile, existing methods usually superimpose domain knowledge on machine learning with fixed thresholds; they cannot adapt the anomaly threshold to the actual scenario, so the anomalies obtained are inaccurate and performance suffers. To address these issues, this paper proposes a novel Anomaly Detection method based on Reinforcement Learning, termed ADRL, which uses reinforcement learning to dynamically search for thresholds and accurately obtain anomaly candidate sets, fully fusing domain knowledge and machine learning so that each reinforces the other. Specifically, ADRL uses prior domain knowledge to label known anomalies, and applies entropy and a deep autoencoder in the categorical and numerical feature spaces, respectively, to obtain anomaly scores that incorporate the known-anomaly information; these are combined into overall anomaly scores via a dynamic integration strategy. To obtain accurate anomaly candidate sets, ADRL uses reinforcement learning to search for the best threshold. In detail, it initializes the anomaly threshold to get an initial anomaly candidate set and performs frequent rule mining on that set to form new knowledge. ADRL then uses the obtained knowledge to adjust the anomaly scores and computes the score modification rate. According to the modification rate, different threshold modification strategies are executed, and the best threshold, that is, the threshold with the maximum modification rate, is finally obtained along with the modified anomaly scores. The scores are used to re-run machine learning to improve the algorithm's accuracy for anomaly detection.
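The threshold-search loop, score candidates, get feedback, keep the threshold with the highest modification rate, can be sketched as a simple scan over candidate thresholds. The `modification_rate` callback below is a toy stand-in for ADRL's rule-mining feedback, and the grid scan replaces its reinforcement-learning policy; both are assumptions for illustration.

```python
import numpy as np

def search_threshold(scores, modification_rate, thresholds):
    """Pick the anomaly threshold with the highest score-modification rate.

    `modification_rate` stands in for ADRL's rule-mining feedback: given the
    candidate set implied by a threshold, it returns the fraction of scores
    the mined rules would change. Here we simply scan a grid; the paper
    drives this search with reinforcement learning instead.
    """
    best_t, best_rate = None, -1.0
    for t in thresholds:
        candidates = scores >= t          # anomaly candidate set at threshold t
        rate = modification_rate(candidates)
        if rate > best_rate:
            best_t, best_rate = t, rate
    return best_t, best_rate

scores = np.array([0.1, 0.2, 0.8, 0.9, 0.95])
# Toy feedback: the rate peaks when exactly the two largest scores are flagged.
feedback = lambda cand: 1.0 - abs(cand.sum() - 2) / len(cand)
t, r = search_threshold(scores, feedback, np.linspace(0, 1, 21))
print(t, r)
```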
Mixed-type data containing categorical and numerical features are pervasive in real life, but very few outlier detection methods are available for such data. Some existing methods handle mixed-type data by feature conversion, and their performance is degraded by the information loss and noise the transformation introduces. Meanwhile, existing general-purpose algorithms cannot incorporate the characteristics of outliers in specific fields, leading to unsatisfying performance in actual scenarios such as network security. This paper proposes a novel Entropy and Autoencoder-based Outlier Detection method for mixed-type network traffic data, termed EAOD, which combines the characteristics of outliers in specific fields with machine learning models. EAOD uses expert rules, built from domain knowledge about the characteristics of known outliers, to label known outlier data. On the unlabeled data, it applies holoentropy and a deep autoencoder to the categorical and numerical feature spaces, respectively, to obtain outlier scores, which are combined into final outlier scores via a dynamic integration strategy. In the numerical feature space in particular, to fully mine known outlier behavior patterns, separate deep autoencoders are constructed for the outlier and normal types, and they jointly capture unknown outliers. Experiments show that EAOD significantly outperforms eight state-of-the-art outlier detectors on seven real network traffic datasets.
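The autoencoder half of the method scores numerical points by reconstruction error: points a compressing model cannot reconstruct are suspicious. As a minimal stand-in for a deep autoencoder, the sketch below uses a one-component PCA projection (a linear "autoencoder") on synthetic data with one planted outlier; the data and the linear model are assumptions, not EAOD itself.

```python
import numpy as np

rng = np.random.default_rng(1)
# Numerical feature space: normal points near a 1-D subspace, plus one
# planted outlier that lies far off that subspace.
normal = rng.standard_normal((200, 1)) @ np.array([[1.0, 2.0, 3.0]])
normal += 0.01 * rng.standard_normal(normal.shape)
X = np.vstack([normal, [[5.0, -5.0, 1.0]]])  # last row is the outlier

# Linear "autoencoder": project onto the top principal component and back;
# the reconstruction error plays the role of the deep autoencoder's score.
mean = X.mean(axis=0)
Xc = X - mean
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
W = Vt[:1]                          # 1-D bottleneck
recon = Xc @ W.T @ W + mean
scores = np.linalg.norm(X - recon, axis=1)

print(int(np.argmax(scores)))       # index of the planted outlier
```

A deep autoencoder replaces the linear projection with nonlinear encoder and decoder networks but scores points by the same reconstruction-error principle.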
Even though pre-trained language models like BERT and XLNet have produced significant consequences on a variety of tasks of natural language processing, they are difficult to deploy in practical applications due to th...
Attack payloads are often short segments hidden in HTTP requests, so they can be found by HTTP payload anomaly detection. Deep learning models learn data features during training without manual feature extraction, and their strong performance has attracted increasing attention. Recurrent Neural Network (RNN) models process sequences directly and are widely used in payload anomaly detection. However, due to the vanishing gradient problem, RNNs have limits on processing long sequences. Moreover, an RNN uses its final hidden state for detection and therefore pays more attention to the content at the end of the payload. In addition, deep learning generally lacks interpretability. This paper proposes an unsupervised deep learning model for HTTP payload anomaly detection, the Attention-based Encoder-Decoder Recurrent Neural Networks Anomaly Detection model (AEDRAD). AEDRAD utilizes an encoder-decoder RNN and an attention mechanism to detect anomalies by reconstructing the original sequences. AEDRAD filters out the HTTP protocol fields that cannot contain anomalies, focusing on the suspicious segments. Through the encoder-decoder network, normal payloads are well reconstructed while anomalous payloads are not. With the attention mechanism, AEDRAD generates practical features for further anomaly detection from a global perspective, and it marks abnormal fragments visually, which facilitates subsequent analysis by experts. Experimental results show that AEDRAD significantly outperforms three state-of-the-art unsupervised algorithms on two real datasets.
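The field-filtering step, keeping only the request segments that can carry a payload, can be sketched with a tiny parser. The exact filtering rules here (keep the query string and body, drop everything else) are assumed for illustration, not the paper's rule set.

```python
# Sketch of AEDRAD-style preprocessing (assumed rules, not the paper's):
# drop protocol fields that cannot carry an attack payload and keep the
# suspicious free-form segments (here: the query string and the body).
from urllib.parse import urlsplit

def suspicious_segments(request_line, body=""):
    _method, target, _version = request_line.split(" ", 2)
    query = urlsplit(target).query      # free-form, attacker-controlled
    return [seg for seg in (query, body) if seg]

# A toy SQL-injection attempt hidden in the query string:
print(suspicious_segments("GET /index.php?id=1%27%20OR%201=1 HTTP/1.1"))
```

Only these retained segments would then be fed to the encoder-decoder network for reconstruction.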