The perception module of self-driving vehicles relies on a multi-sensor system to understand its environment. Recent advancements in deep learning have led to the rapid development of approaches that integrate multi-s...
ISBN (digital): 9798331509712
ISBN (print): 9798331509729
General Matrix Multiplication (GEMM) is a critical computational operation in scientific computing and machine learning. While traditional GEMM performs well on large matrices, it is inefficient in data transfer and computation for small matrices, and many High-Performance Computing (HPC) tasks decompose into large batches of small matrix multiplications. Multi-core Digital Signal Processors (DSPs) are commonly used to accelerate such workloads. We present batched fusion small matrix multiplication (BFMM), a design tailored to multi-core DSP architectures. To address the storage and computation inefficiencies and redundancy of batched small matrix multiplication, we design a matrix fusion concatenation strategy, an access coordination mechanism, and a fragment aggregation mechanism. BFMM supports an efficient K-dimension multi-core parallelization strategy, and a parameter constraint model makes it highly portable. BFMM also includes a performance evaluation model that facilitates assessment and verification. Experimental results demonstrate that BFMM outperforms both traditional GEMM (TGEMM) on multi-core DSP and traditional GEMM with concatenated data access (TGEMM Op). For large batches of small matrices, our design achieves 1.21x to 18x higher performance than TGEMM Op on a single-core DSP and 1.14x to 18.1x on a multi-core DSP.
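The batching idea behind BFMM (though not its DSP-specific implementation) can be illustrated in NumPy: instead of issuing one GEMM call per small matrix pair, the whole batch is fused into a single call whose per-call overhead is amortized across all multiplications. This is a minimal sketch with illustrative sizes, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, m, k, n = 1024, 8, 8, 8  # a large batch of small matrices

A = rng.standard_normal((batch, m, k))
B = rng.standard_normal((batch, k, n))

# Naive approach: one small GEMM call per matrix pair.
C_loop = np.stack([A[i] @ B[i] for i in range(batch)])

# Batched approach: np.matmul treats the leading axis as a batch
# dimension, so the entire batch is one fused call.
C_batched = np.matmul(A, B)

assert np.allclose(C_loop, C_batched)
```

The same effect is what batched BLAS interfaces (e.g., batched GEMM routines in vendor libraries) provide on accelerators.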
This paper proposes a multi-objective constrained minimum weighted bipartite assignment problem (MCMWBAP), an extension of the classical bipartite matching problem (BMP). We first formulate the MCMWBAP and prove that it is an NP-hard combinatorial optimization problem. Based on this formulation, we study a multi-objective energy-aware shortwave radio broadcast resource allocation problem (MSRBRAP). The goal of this problem is to allocate radio programs to transmission devices so that all programs are broadcast properly, maximizing the total number of qualified monitoring sites while minimizing energy consumption. We then develop a novel multi-objective hybrid evolutionary algorithm (MOHEA), which integrates push-and-pull initialization, a dynamic resource allocation strategy, and an aggregate local search procedure, to solve the problem. The proposed method is evaluated on two categories of MCMWBAP benchmarks together with a real-scenario case study for MSRBRAP. Furthermore, the key components of MOHEA are analyzed, and the experimental results demonstrate that MOHEA outperforms two classical multi-objective evolutionary algorithms (NSGA-II and MOEA/D), improving working efficiency.
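The single-objective core that MCMWBAP extends, minimum-weight bipartite assignment, can be sketched with an exhaustive search over one-to-one assignments. The cost matrix below is illustrative, not real broadcast data, and the brute force is only viable for tiny instances.

```python
from itertools import permutations

# Cost of assigning program i (row) to device j (column); illustrative values.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]

# Exhaustive search over all one-to-one assignments (fine for tiny
# instances; practical solvers use the Hungarian algorithm instead).
n = len(cost)
best_cols, best_total = min(
    ((cols, sum(cost[i][cols[i]] for i in range(n)))
     for cols in permutations(range(n))),
    key=lambda t: t[1],
)
print(best_cols, best_total)  # minimum-weight perfect matching
```

At scale, `scipy.optimize.linear_sum_assignment` solves this exactly in polynomial time; the NP-hardness in the paper comes from the added multi-objective constraints, not from this base problem.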
ISBN (digital): 9798350317152
ISBN (print): 9798350317169
Instant delivery has become a fundamental service in people's daily lives. Unlike traditional express service, instant delivery has a strict shipping time constraint after an order is placed, and labor shortages make efficient instant delivery challenging. To tackle this problem, researchers have studied introducing vehicles (e.g., taxis) or Unmanned Aerial Vehicles (UAVs, or drones) into instant delivery tasks. Unfortunately, the delivery detours of taxis and the limited battery of UAVs make it hard to meet rapidly increasing instant delivery demand. Under these circumstances, this paper proposes an air-ground cooperative instant delivery paradigm that maximizes delivery performance while minimizing the negative effects on taxi passengers. Specifically, a data-driven, delivery-potential- and demand-aware cooperative strategy is designed to improve the overall delivery performance of both UAVs and taxis as well as the taxi passengers' experience. Experimental results show that the proposed method improves the number of deliveries by 30.1% and 114.5% compared to taxi-based and UAV-based instant delivery, respectively, and shortens delivery time by 35.7% compared to taxi-based instant delivery.
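The complementary constraints the abstract describes, UAV battery range versus taxi detour, can be caricatured with a hypothetical greedy dispatcher. Everything here (the range limit, the dictionaries, the function name) is an assumption for illustration; the paper's strategy is data-driven and far richer.

```python
# Hypothetical greedy dispatcher illustrating the air-ground idea: send an
# order by UAV when it fits the UAV's battery range, otherwise hand it to a
# taxi if one can serve it with an acceptable detour. All numbers are made up.
UAV_RANGE_KM = 5.0

def dispatch(orders, taxi_detours):
    """orders: {order_id: distance_km}; taxi_detours: {order_id: detour_km}."""
    plan = {}
    for oid, dist in orders.items():
        if dist <= UAV_RANGE_KM:
            plan[oid] = "uav"          # short trip: within drone battery range
        elif oid in taxi_detours:
            plan[oid] = "taxi"         # long trip: piggyback on a taxi route
        else:
            plan[oid] = "unserved"
    return plan

print(dispatch({"a": 2.0, "b": 8.5}, {"b": 1.2}))
```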
General Matrix Multiplication (GEMM) has a wide range of applications in scientific simulation and artificial intelligence. Although traditional libraries can achieve high performance on large regular-shaped GEMMs, th...
详细信息
While serverless computing offers more efficient and cost-effective application deployment, the diversity of serverless platforms presents challenges to users, including platform lock-in and costly migration. Moreover, because function computing is a black box, traditional performance benchmarking methods do not apply, necessitating new studies. This article presents a detailed comparison of six major public cloud function computing platforms and introduces a benchmarking framework for function computing performance. The framework aims to help users make comprehensive comparisons and select the platform best suited to their specific needs.
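The black-box nature of function computing means performance must be measured from the outside, by timing invocations. A minimal sketch of such a harness follows; the `invoke` callable stands in for an HTTP call to a deployed function, and the warmup count is an assumed way to separate cold from warm starts, not the article's framework.

```python
import time
import statistics

def benchmark(invoke, warmups=1, runs=20):
    """Time repeated invocations of a function-compute endpoint.

    `invoke` stands in for an HTTP request to a deployed function; the
    discarded warmup calls roughly separate cold starts from warm ones.
    Returns latency statistics in milliseconds.
    """
    for _ in range(warmups):
        invoke()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        samples.append((time.perf_counter() - start) * 1000.0)  # ms
    return {
        "p50_ms": statistics.median(samples),
        "max_ms": max(samples),
    }

stats = benchmark(lambda: sum(range(1000)))  # local stand-in workload
print(stats)
```

A real cross-platform comparison would run the same harness against each provider's deployed endpoint and also track cost per invocation.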
Mixed-type data with both categorical and numerical features are ubiquitous in network security, but few existing methods can handle them. Existing methods usually process mixed-type data through feature conversion, and their performance is degraded by the information loss and noise the transformation introduces. Meanwhile, existing methods usually superimpose domain knowledge on machine learning with fixed thresholds; they cannot adapt the anomaly threshold to the actual scenario, so the anomalies obtained are inaccurate and performance suffers. To address these issues, this paper proposes a novel Anomaly Detection method based on Reinforcement Learning, termed ADRL, which uses reinforcement learning to dynamically search for thresholds and accurately obtain anomaly candidate sets, fully fusing domain knowledge and machine learning so that each reinforces the other. Specifically, ADRL uses prior domain knowledge to label known anomalies, and applies entropy and a deep autoencoder in the categorical and numerical feature spaces, respectively, to obtain anomaly scores that incorporate the known-anomaly information; these are combined into overall anomaly scores via a dynamic integration strategy. To obtain accurate anomaly candidate sets, ADRL uses reinforcement learning to search for the best threshold. In detail, it initializes the anomaly threshold to get an initial anomaly candidate set and performs frequent rule mining on that set to form new knowledge. ADRL then uses the obtained knowledge to adjust the anomaly scores and computes the score modification rate. According to the modification rate, different threshold modification strategies are executed, and the best threshold, that is, the threshold with the maximum modification rate, is finally obtained along with the modified anomaly scores. The scores are used to re-run machine learning to improve the algorithm's accuracy for anomaly detection.
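The threshold-search loop, score candidates, get feedback, keep the threshold with the highest modification rate, can be sketched as a simple scan over candidate thresholds. The `modification_rate` callback below is a toy stand-in for ADRL's rule-mining feedback, and the grid scan replaces its reinforcement-learning policy; both are assumptions for illustration.

```python
import numpy as np

def search_threshold(scores, modification_rate, thresholds):
    """Pick the anomaly threshold with the highest score-modification rate.

    `modification_rate` stands in for ADRL's rule-mining feedback: given the
    candidate set implied by a threshold, it returns the fraction of scores
    the mined rules would change. Here we simply scan a grid; the paper
    drives this search with reinforcement learning instead.
    """
    best_t, best_rate = None, -1.0
    for t in thresholds:
        candidates = scores >= t          # anomaly candidate set at threshold t
        rate = modification_rate(candidates)
        if rate > best_rate:
            best_t, best_rate = t, rate
    return best_t, best_rate

scores = np.array([0.1, 0.2, 0.8, 0.9, 0.95])
# Toy feedback: the rate peaks when exactly the two largest scores are flagged.
feedback = lambda cand: 1.0 - abs(cand.sum() - 2) / len(cand)
t, r = search_threshold(scores, feedback, np.linspace(0, 1, 21))
print(t, r)
```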
Mixed-type data containing categorical and numerical features are pervasive in real life, but very few outlier detection methods are available for such data. Some existing methods handle mixed-type data by feature conversion, and their performance is degraded by the information loss and noise the transformation introduces. Meanwhile, existing general-purpose algorithms cannot incorporate the characteristics of outliers in specific fields, leading to unsatisfying performance in actual scenarios such as network security. This paper proposes a novel Entropy and Autoencoder-based Outlier Detection method for mixed-type network traffic data, termed EAOD, which combines the characteristics of outliers in specific fields with machine learning models. EAOD uses expert rules, built from domain knowledge about the characteristics of known outliers, to label known outlier data. On the unlabeled data, it applies holoentropy and a deep autoencoder to the categorical and numerical feature spaces, respectively, to obtain outlier scores, which are combined into final outlier scores via a dynamic integration strategy. In the numerical feature space in particular, to fully mine known outlier behavior patterns, separate deep autoencoders are constructed for the outlier and normal types, and they jointly capture unknown outliers. Experiments show that EAOD significantly outperforms eight state-of-the-art outlier detectors on seven real network traffic datasets.
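The autoencoder half of the method scores numerical points by reconstruction error: points a compressing model cannot reconstruct are suspicious. As a minimal stand-in for a deep autoencoder, the sketch below uses a one-component PCA projection (a linear "autoencoder") on synthetic data with one planted outlier; the data and the linear model are assumptions, not EAOD itself.

```python
import numpy as np

rng = np.random.default_rng(1)
# Numerical feature space: normal points near a 1-D subspace, plus one
# planted outlier that lies far off that subspace.
normal = rng.standard_normal((200, 1)) @ np.array([[1.0, 2.0, 3.0]])
normal += 0.01 * rng.standard_normal(normal.shape)
X = np.vstack([normal, [[5.0, -5.0, 1.0]]])  # last row is the outlier

# Linear "autoencoder": project onto the top principal component and back;
# the reconstruction error plays the role of the deep autoencoder's score.
mean = X.mean(axis=0)
Xc = X - mean
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
W = Vt[:1]                          # 1-D bottleneck
recon = Xc @ W.T @ W + mean
scores = np.linalg.norm(X - recon, axis=1)

print(int(np.argmax(scores)))       # index of the planted outlier
```

A deep autoencoder replaces the linear projection with nonlinear encoder and decoder networks but scores points by the same reconstruction-error principle.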
Even though pre-trained language models like BERT and XLNet have produced significant consequences on a variety of tasks of natural language processing, they are difficult to deploy in practical applications due to th...
Attack payloads are often short segments hidden in HTTP requests, so they can be found by HTTP payload anomaly detection. Deep learning models learn data features during training without manual feature extraction, and their strong performance has attracted increasing attention. Recurrent Neural Network (RNN) models process sequences directly and are widely used in payload anomaly detection. However, due to the vanishing gradient problem, RNNs have limits on processing long sequences. Moreover, an RNN uses its final hidden state for detection and therefore pays more attention to the content at the end of the payload. In addition, deep learning generally lacks interpretability. This paper proposes an unsupervised deep learning model for HTTP payload anomaly detection, the Attention-based Encoder-Decoder Recurrent Neural Networks Anomaly Detection model (AEDRAD). AEDRAD utilizes an encoder-decoder RNN and an attention mechanism to detect anomalies by reconstructing the original sequences. AEDRAD filters out the HTTP protocol fields that cannot contain anomalies, focusing on the suspicious segments. Through the encoder-decoder network, normal payloads are well reconstructed while anomalous payloads are not. With the attention mechanism, AEDRAD generates practical features for further anomaly detection from a global perspective, and it marks abnormal fragments visually, which facilitates subsequent analysis by experts. Experimental results show that AEDRAD significantly outperforms three state-of-the-art unsupervised algorithms on two real datasets.
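The field-filtering step, keeping only the request segments that can carry a payload, can be sketched with a tiny parser. The exact filtering rules here (keep the query string and body, drop everything else) are assumed for illustration, not the paper's rule set.

```python
# Sketch of AEDRAD-style preprocessing (assumed rules, not the paper's):
# drop protocol fields that cannot carry an attack payload and keep the
# suspicious free-form segments (here: the query string and the body).
from urllib.parse import urlsplit

def suspicious_segments(request_line, body=""):
    _method, target, _version = request_line.split(" ", 2)
    query = urlsplit(target).query      # free-form, attacker-controlled
    return [seg for seg in (query, body) if seg]

# A toy SQL-injection attempt hidden in the query string:
print(suspicious_segments("GET /index.php?id=1%27%20OR%201=1 HTTP/1.1"))
```

Only these retained segments would then be fed to the encoder-decoder network for reconstruction.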