Search Results - Inner Mongolia University Library

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Kun Hu, Mingyu Cao, Mengzhu Wang, Long Lan, Wenjing Yang, Huibin Tan. Affiliation: Institute for Quantum Information & State Key Laboratory of High Performance Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha, China

Discriminative correlation filter (DCF) is a highly efficient tracking technique using the circulant shifted samples of search images to update the template, so the reliability of input samples determines template quality. In this paper, we rethink the reliability problem of input samples in advance during template updating and propose an enhanced DCF tracking method regularized by a novel sparse representation based reliable sample construction term, called enhanced sparse correlation filter (ESCF). Specifically, the reconstructed reliable samples are the sparse representation of circulant shifted samples of unfiltered input samples, in which the target will approach the center to preserve target visual cues into the template when using the cosine window. Besides, we jointly perform template learning and reliable sample construction into a unified learning paradigm to benefit from each other, which further can be carried out in the frequency domain without incurring excessive time cost by skillful decomposition. Experiments on several popular visual tracking datasets verify the efficacy of ESCF and show that ESCF performs favorably against several well-established representative counterparts.
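As a rough illustration of the baseline this abstract builds on, the sketch below trains a plain single-sample correlation filter in the frequency domain and applies a cosine window. It is not ESCF: the sparse reliable-sample term is omitted, and the function names, the Gaussian label, and the regularization value `lam` are assumptions made for the example.

```python
import numpy as np

def cosine_window(h, w):
    # 2D Hann window: damps the patch borders so the centred target dominates.
    return np.outer(np.hanning(h), np.hanning(w))

def gaussian_label(h, w, sigma=2.0):
    # Desired response: a Gaussian peak at the patch centre (assumed label shape).
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))

def train_filter(patch, label, lam=1e-2):
    # Ridge regression over all circular shifts of the patch,
    # solved independently per frequency bin.
    X = np.fft.fft2(patch * cosine_window(*patch.shape))
    Y = np.fft.fft2(label)
    return Y * np.conj(X) / (X * np.conj(X) + lam)

def respond(filt, patch):
    # Correlation response of the learned filter on a (windowed) search patch.
    Z = np.fft.fft2(patch * cosine_window(*patch.shape))
    return np.real(np.fft.ifft2(filt * Z))

# Toy usage: a random patch stands in for image features.
rng = np.random.default_rng(0)
patch = rng.standard_normal((32, 32))
filt = train_filter(patch, gaussian_label(32, 32))
response = respond(filt, patch)
peak = np.unravel_index(np.argmax(response), response.shape)
```

On the training patch itself the response peak falls at the patch centre, which is where the label places it; a tracker would read the target displacement from how far the peak moves on the next frame.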

Keywords: Training; Visualization; Correlation; Costs; Target tracking; Frequency-domain analysis; Signal processing

Decomposition, Interaction, Reconstruction Meets Global Context Learning In Visual Tracking

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Huibin Tan, Kun Hu, Mingyu Cao, Mengzhu Wang, Liyang Xu, Wenjing Yang. Affiliation: Institute for Quantum Information & State Key Laboratory of High Performance Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha, China

Tensor decomposition and reconstruction attention is a promising global context learning approach because it can remain efficient while avoiding feature compression. To exploit its potential even further in visual tracking, we redesign a 3D tensor modeling paradigm, namely tensor Decomposition, Interaction, Reconstruction attention (DIR), respectively corresponding to three function components, Tensor Decomposition Module (TDM), Tensor Interaction Module (TIM) and Context Reconstruction Module (CRM). Specifically, TDM decomposes a 3D tensor feature into rank-1 context fragments in different dimension views. The ingenuity here lies in the introduction of Circular Convolution for processing features at arbitrary scales and channel-sharing segments to enhance the interaction of the two branches in the Siamese network architecture. TIM obtains the tensor planes of each dimension by the Cross-Similarity operation of rank-1 tensors and fused cubic features, which brings more interactions between all feature dimensions. CRM reconstructs 3D context representations with the outputs of the above modules. In experiments, DIR is embedded into the tracker to verify its effectiveness.

Keywords: Visualization; Tensors; Three-dimensional displays; Target tracking; Convolution; Customer relationship management; Time division multiplexing

An Efficient Vectorization Scheme for Stencil Computation

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Kun Li, Liang Yuan, Yunquan Zhang, Yue Yue, Hang Cao. Affiliations: State Key Laboratory of Computer Architecture, Chinese Academy of Sciences, Institute of Computing Technology, Beijing; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing

Stencil computation is one of the most important kernels in various scientific and engineering applications. A variety of work has focused on vectorization and tiling techniques, aiming at exploiting the in-core data parallelism and data locality respectively. In this paper, the downsides of existing vectorization schemes are analyzed. Briefly, they either incur data alignment conflicts or hurt the data locality when integrated with tiling. Then we propose a novel transpose layout to preserve the data locality for tiling and reduce the data reorganization overhead for vectorization simultaneously. To further improve the data reuse at the register level, a time loop unroll-and-jam strategy is designed to perform multistep stencil computation along the time dimension. Experimental results on the AVX2 and AVX-512 CPUs show that our approach obtains a competitive performance with the classic vectorization methods (Auto Vectorization and Data Reorganization), state-of-the-art compilers (Pluto and SDSL), and highly-optimized work (DLT and Tessellation).
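The multistep idea mentioned in this abstract can be sketched on a toy 1D 3-point stencil. The code below is only an illustration under assumed coefficients and fixed boundary cells; it does not reproduce the paper's transpose layout or any SIMD code, and `step` and `two_steps_fused` are hypothetical names.

```python
import numpy as np

COEFFS = (0.25, 0.5, 0.25)  # assumed stencil weights, for illustration only

def step(u, c=COEFFS):
    # One time step of a 1D 3-point stencil; the two boundary cells stay fixed.
    v = u.copy()
    v[1:-1] = c[0] * u[:-2] + c[1] * u[1:-1] + c[2] * u[2:]
    return v

def two_steps_fused(u, c=COEFFS):
    # Two time steps in a single sweep over the array. The intermediate
    # (first-step) values live only in three scalars, which play the role
    # of registers, instead of being written to a full temporary array.
    n = len(u)
    out = u.copy()
    if n < 3:
        return out

    def first(i):
        # Value of cell i after the first time step.
        if i == 0 or i == n - 1:
            return u[i]
        return c[0] * u[i - 1] + c[1] * u[i] + c[2] * u[i + 1]

    left, mid = first(0), first(1)
    for i in range(1, n - 1):
        right = first(i + 1)
        out[i] = c[0] * left + c[1] * mid + c[2] * right
        left, mid = mid, right  # slide the three-value window
    return out

u0 = np.random.default_rng(1).random(64)
reference = step(step(u0))
fused = two_steps_fused(u0)
```

Both routes give the same result; the fused version touches the input array once per pair of time steps, which is the data-reuse effect the abstract attributes to unroll-and-jam along the time dimension.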

Keywords: Concurrent computing; Distributed processing; Layout; Parallel processing; Registers; Computational efficiency; Kernel

IMPULP: A Hardware Approach for In-Process Memory Protection via User-Level Partitioning

Journal of Computer Science & Technology, 2020, Vol. 35, No. 2, pp. 418-432

Authors: Yang-Yang Zhao, Ming-Yu Chen, Yu-Hang Liu, Zong-Hao Yang, Xiao-Jing Zhu, Zong-Hui Hong, Yun-Ge Guo. Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Peng Cheng Laboratory, Shenzhen 518055, China

In recent years many security attacks occur when malicious codes abuse in-process memory *** to the increasing complexity, an application program may call third-party code which cannot be controlled by programmers but may contain security *** a result, the users have the risk of suffering information leakage and control flow ***, current solutions like Intel memory protection extensions (MPX) severely degrade performance, while other approaches like Intel memory protection keys (MPK) lack flexibility in dividing security *** this paper, we propose IMPULP, an effective and efficient hardware approach for in-process memory *** rationale of IMPULP is user-level partitioning that user code segments are divided into different security domains according to their instruction addresses, and accessible memory spaces are specified dynamically for each domain via a set of boundary *** instruction related to memory access will be checked according to its security domain and the corresponding boundaries, and illegal in-process memory access of untrusted code segments will be *** can be leveraged to prevent a wide range of in-process memory abuse attacks, such as buffer overflows and memory *** verification, an FPGA prototype based on RISC-V instruction set architecture has been *** present eight tests to verify the effectiveness of IMPULP, including five memory protection function tests, a test to defend against a typical buffer overflow, a test to defend against the well-known memory leakage attack Heartbleed, and a test for security *** execute the SPEC CPU2006 benchmark programs to evaluate the efficiency of *** performance overhead of IMPULP is less than 0.2% runtime on average, which is ***, the resource overhead is less than 5.5% for hardware modification of IMPULP.
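The checking rule described in this abstract (pick the security domain from the instruction address, then test the memory access against that domain's boundaries) can be modelled in a few lines of software. This is a hedged sketch only: the abstract is truncated where it names the boundary mechanism, so the `Domain` and `AccessChecker` classes, the half-open ranges, and the deny-by-default policy for unknown code are all assumptions, not the IMPULP hardware design.

```python
from dataclasses import dataclass, field

@dataclass
class Domain:
    name: str
    code_start: int  # first instruction address of the domain (inclusive)
    code_end: int    # end of the domain's code range (exclusive)
    # Accessible data ranges as (lo, hi) pairs, hi exclusive (assumed encoding).
    bounds: list = field(default_factory=list)

class AccessChecker:
    def __init__(self, domains):
        self.domains = domains

    def domain_of(self, pc):
        # Select the security domain from the instruction address.
        for d in self.domains:
            if d.code_start <= pc < d.code_end:
                return d
        return None

    def set_bounds(self, name, bounds):
        # Boundaries can be re-specified at run time for a given domain.
        for d in self.domains:
            if d.name == name:
                d.bounds = list(bounds)

    def allowed(self, pc, addr, size=1):
        # An access passes only if it lies fully inside one permitted range.
        d = self.domain_of(pc)
        if d is None:
            return False  # assumed policy: code outside any domain is denied
        return any(lo <= addr and addr + size <= hi for lo, hi in d.bounds)

checker = AccessChecker([
    Domain("trusted", 0x1000, 0x2000, [(0x8000, 0xA000)]),
    Domain("third_party", 0x2000, 0x3000, [(0x9000, 0x9100)]),
])
```

With this setup, a 16-byte read at `0x90F8` issued from third-party code is rejected because it runs past `0x9100`, which is the kind of out-of-bounds access the abstract says is blocked.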

Keywords: in-process isolation; memory protection; out-of-bounds; user-level partitioning

TOWARDS RADAR EMITTER RECOGNITION IN CHANGING ENVIRONMENTS WITH DOMAIN GENERALIZATION

arXiv, 2023

Authors: Wu, Honglin; Li, Xueqiong; Lan, Long; Xu, Liyang; Tang, Yuhua. Affiliation: Institute for Quantum Information, State Key Laboratory of High Performance Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha, China

Analyzing radar signals from complex Electronic Warfare (EW) environment is a non-trivial task. However, in the real world, the changing EW environment results in inconsistent signal distribution, such as the pulse repetition interval (PRI) mismatch between different detected scenes. In this paper, we propose a novel domain generalization framework to improve the adaptability of signal recognition in changing environments. Specifically, we first design several noise generators to simulate varied scenes. Different from conventional augmentation methods, our introduced generators carefully enhance the diversity of the detected signals and meanwhile maintain the semantic features of the signals. Moreover, we propose a signal scene domain classifier that works in the manner of adversarial learning. The proposed classifier guarantees the signal predictor to generalize to different scenes. Extensive comparative experiments prove the proposed method’s superiority. Copyright © 2023, The Authors. All rights reserved.

Keywords: Electronic warfare

Memory-based Exploration-value Evaluation Model for Visual Navigation

IEEE International Conference on Robotics and Automation (ICRA)

Authors: Yongquan Feng, Liyang Xu, Minglong Li, Ruochun Jin, Da Huang, Shaowu Yang, Wenjing Yang. Affiliation: Institute for Quantum Information, State Key Laboratory of High Performance Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha, China

We propose a hierarchical visual navigation solution, called Memory-based Exploration-value Evaluation Model (MEEM), to improve the agent's navigation performance. MEEM employs a hierarchical policy to tackle the challenge of sparse rewards, holds an episodic memory to store the historical information of the agent, and applies an Exploration-value Evaluation Model to calculate an exploration-value for action planning at each location in the observable area. We experimentally verify MEEM by navigation performance comparison on two datasets including the grid-map dataset and the 3D scenes Gibson dataset, where our approach achieves state-of-the-art performance on both. Specifically, the overall success rate of MEEM is 95% on the grid-map dataset while the best competitor reaches 68% only. As for the Gibson dataset, the success rate of ours and the best competitor SemExp are 69.8% and 54.4%, respectively. Ablation analysis on the tile-map dataset indicates that all three components of MEEM have positive effects.

TileSpMSpV: A Tiled Algorithm for Sparse Matrix-Sparse Vector Multiplication on GPUs

Proceedings of the 51st International Conference on Parallel Processing

Authors: Haonan Ji, Huimin Song, Shibo Lu, Zhou Jin, Guangming Tan, Weifeng Liu. Affiliations: Super Scientific Software Laboratory, China University of Petroleum-Beijing, China; Northeastern University, United States of America; State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China

ISBN: (print) 9781450397339

Sparse matrix-sparse vector multiplication (SpMSpV) is an important primitive for graph algorithms and machine learning applications. The sparsity of the input and output vectors makes its floating point efficiency in general lower than sparse matrix-vector multiplication (SpMV) and sparse matrix-matrix multiplication (SpGEMM). Existing parallel SpMSpV methods focused on various row- and column-wise storage formats and merging operations. However, the data locality and sparsity pattern of the input matrix and vector are largely ignored. We in this paper propose TileSpMSpV, a tiled algorithm for accelerating SpMSpV on GPUs. Firstly, tile-wise storage structures are developed for fast positioning a group of nonzeros in matrix and vectors. Then, we develop the TileSpMSpV algorithm on top of the storage structures. In addition, to accelerate directional optimization breadth-first search (BFS) by using TileSpMSpV, we propose a TileBFS algorithm including three kernels called Push-CSC, Push-CSR and Pull-CSC. In the experiments running on a high-end NVIDIA GPU and using 2757 sparse matrices, the TileSpMSpV algorithm outperforms TileSpMV, cuSPARSE and CombBLAS by a factor of on average 1.83, 17.18 and 17.20 (up to 7.68, 1050.02 and 235.90), respectively. Moreover, our TileBFS algorithm outperforms Gunrock and GSwitch by a factor of on average 2.88 and 4.52 (up to 21.35 and 1000.85), respectively.
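For readers unfamiliar with the primitive, the sketch below shows a plain column-wise SpMSpV over a CSC matrix, the kind of baseline the abstract contrasts with. It is not the tiled GPU algorithm; the function name `spmspv_csc` and the dictionary accumulator are assumptions chosen for clarity.

```python
def spmspv_csc(col_ptr, row_idx, vals, x_idx, x_val):
    # y = A @ x, with A in CSC form and x given as (indices, values).
    # Only the columns matching nonzeros of x are visited.
    acc = {}
    for j, xj in zip(x_idx, x_val):
        for p in range(col_ptr[j], col_ptr[j + 1]):
            i = row_idx[p]
            acc[i] = acc.get(i, 0.0) + vals[p] * xj
    y_idx = sorted(acc)
    return y_idx, [acc[i] for i in y_idx]

# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]  stored column by column
col_ptr = [0, 2, 3, 5]
row_idx = [0, 2, 1, 0, 2]
vals = [1.0, 4.0, 3.0, 2.0, 5.0]

# x = [1, 0, 10] stored as a sparse vector
y_idx, y_val = spmspv_csc(col_ptr, row_idx, vals, [0, 2], [1.0, 10.0])
```

Here the result has nonzeros only in rows 0 and 2 (values 21 and 54), and column 1 of the matrix is never read. The scattered writes into the accumulator are the part with poor data locality, which is the issue the abstract raises.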

Keywords: GPU; Sparse matrix; BFS; Tiling; SpMSpV

Integrating Velox into TinkerPop for Graph Queries on Vectorized Engine

International Conference on Electronic Information Engineering and Computer Technology (EIECT)

Authors: Zihao Li, Liyang Xu, Ruochun Jin, Huan Chen, Yuhua Tang. Affiliation: Institute for Quantum Information & State Key Laboratory of High Performance Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha, China

To enhance the query efficiency of relational databases and build a unified computing backend, Meta has developed Velox, a vectorized execution engine library based on columnar storage. Currently, there is no standardized specification for the computation engine and storage in graph databases, so they fail to effectively utilize the vectorized processing capability of modern CPUs. In this paper, we propose a middleware that primarily focuses on (1) non-invasively integrating Velox into the TinkerPop framework to provide unified vectorized engine acceleration for all graph databases supporting the TinkerPop specification; (2) conducting graph queries based on the relational data storage model, eliminating the overhead of transforming the storage model into a graph storage model; (3) validating the acceleration effect of the vectorized engine on interactive workload of graph queries under a single-node environment.

Task2Morph: Differentiable Task-Inspired Framework for Contact-Aware Robot Design

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Authors: Yishuai Cai, Shaowu Yang, Minglong Li, Xinglin Chen, Yunxin Mao, Xiaodong Yi, Wenjing Yang. Affiliation: State Key Laboratory of High Performance Computing, Institute for Quantum Information, College of Computer Science and Technology, National University of Defense Technology, Changsha, China

Optimizing the morphologies and the controllers that adapt to various tasks is a critical issue in the field of robot design, a.k.a. embodied intelligence. Previous works typically model it as a joint optimization problem and use search-based methods to find the optimal solution in the morphology space. However, they ignore the implicit knowledge of task-to-morphology mapping which can directly inspire robot design. For example, flipping heavier boxes tends to require more muscular robot arms. This paper proposes a novel and general differentiable task-inspired framework for contact-aware robot design called Task2Morph. We abstract task features highly related to task performance and use them to build a task-to-morphology mapping. Further, we embed the mapping into a differentiable robot design process, where the gradient information is leveraged for both the mapping learning and the whole optimization. The experiments are conducted on three scenarios, and the results validate that Task2Morph outperforms DiffHand, which lacks a task-inspired morphology module, in terms of efficiency and effectiveness.

OL-CBBA: An Online Task Allocation Algorithm under Weak Communication Conditions

International Conference on Parallel and Distributed Systems (ICPADS)

Authors: Jun Li, Wanrong Huang, Honglin Wu, Zhongxuan Cai, Yongjun Zhang. Affiliation: Institute for Quantum Information & State Key Laboratory of High Performance Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha, China

The task allocation problem for multiple unmanned aerial vehicles (UAVs) and multiple tasks is difficult to solve. Existing task allocation algorithms assume that the UAVs' positions are static and cannot assign tasks while the UAVs' positions change during task execution, which reduces the efficiency of task allocation. In this paper, we propose an online consensus-based bundle algorithm (OL-CBBA) under weak communication for dynamic task allocation. We consider the situation in which the location information of UAVs constantly changes during the dynamic execution of tasks. The algorithm first improves the static CBBA to an online algorithm by updating the task marginal score in the task path. Moreover, we specify a flag for the convergence of individual tasks, allowing UAVs to start executing tasks earlier. Extensive comparative experiments prove the highly consistent efficiency of OL-CBBA under weak communication conditions. Specifically, the proposed OL-CBBA attains up to 22% improvement compared with CBBA.
