Search Results - Inner Mongolia University Library

arXiv, 2023

Authors: Wu, Jingqian; Xu, Rongtao; Wood-Doughty, Zach; Wang, Changwei; Xu, Shibiao; Lam, Edmund Y.
Affiliations: The University of Hong Kong, Pokfulam, Hong Kong; The State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100190, China; Northwestern University, Evanston, IL 60201, United States; The Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, National Supercomputer Center in Jinan, Qilu University of Technology, Shandong Academy of Sciences, Jinan 250013, China; Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China; The School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China

Local feature detection and description, which aim to detect and describe keypoints in any scene and for any downstream task, play an important role in many computer vision tasks. Data-driven local feature learning methods rely on pixel-level correspondence for training. However, many existing approaches ignore the semantic information that humans rely on to describe image pixels. In addition, generic scene keypoint detection and description cannot be enhanced simply by using conventional semantic segmentation models, because they recognize only a limited number of coarse-grained object classes. In this paper, we propose SAMFeat, which introduces SAM (Segment Anything Model), a foundation model trained on 11 million images, as a teacher to guide local feature learning. SAMFeat learns additional semantic information from SAM and thus achieves higher performance even with limited training samples. To do so, first, we construct an auxiliary task of Attention-weighted Semantic Relation Distillation (ASRD), which adaptively distills feature relations carrying the category-agnostic semantic information learned by the SAM encoder into a local feature learning network, improving local feature description through semantic discrimination. Second, we develop a technique called Weakly Supervised Contrastive Learning Based on Semantic Grouping (WSC), which uses semantic groupings derived from SAM as weakly supervised signals to optimize the metric space of local descriptors. Third, we design Edge Attention Guidance (EAG), which further improves the accuracy of local feature detection and description by prompting the network to pay more attention to the edge regions identified by SAM. SAMFeat's performance on tasks such as image matching on HPatches and long-term visual localization on Aachen Day-Night demonstrates its superiority over previous local features. The released code is available at https://***/vignywang/SAMFeat

Keywords: computer vision

HARQ-IR Aided Short Packet Communications: BLER Analysis and Throughput Maximization

arXiv, 2023

Authors: He, Fuchao; Shi, Zheng; Yang, Guanghua; Li, Xiaofan; Ye, Xinrong; Ma, Shaodan
Affiliations: School of Intelligent Systems Science and Engineering and GBA B&R International Joint Research Center for Smart Logistics, Jinan University, Zhuhai 519070, China; School of Physics and Electronic Information, Anhui Normal University, Wuhu 241002, China; State Key Laboratory of Internet of Things for Smart City and Department of Electrical and Computer Engineering, University of Macau, China

This paper introduces hybrid automatic repeat request with incremental redundancy (HARQ-IR) to boost the reliability of short packet communications. Finite blocklength information theory and correlated decoding events greatly complicate the analysis of the average block error rate (BLER). Fortunately, the recursive form of the average BLER allows its value to be calculated through the trapezoidal approximation and Gauss–Laguerre quadrature. Moreover, an asymptotic analysis is performed to derive a simple expression for the average BLER at high signal-to-noise ratio (SNR). We then study the maximization of the long term average throughput (LTAT) via power allocation while satisfying the power and BLER constraints. For tractability, the asymptotic BLER is used to solve the problem through geometric programming (GP). However, the GP-based solution underestimates the LTAT at low SNR because of the large approximation error in that regime. As an alternative, we also develop a deep reinforcement learning (DRL)-based framework to learn the power allocation policy. In particular, the optimization problem is transformed into a constrained Markov decision process, which is solved by integrating deep deterministic policy gradient (DDPG) with the subgradient method. Numerical results demonstrate that the DRL-based method outperforms the GP-based one at low SNR, albeit at the cost of a higher computational burden. Copyright © 2023, The Authors. All rights reserved.
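
The abstract above relies on Gauss–Laguerre quadrature to evaluate averages over the channel. As a minimal sketch of that numerical tool only (not the paper's BLER recursion), the snippet below approximates E[f(γ)] for an SNR γ that is assumed to be exponentially distributed, as under Rayleigh fading; the fading model, the function name `gauss_laguerre_expectation`, and the test integrand are illustrative assumptions that do not come from the record.

```python
import numpy as np


def gauss_laguerre_expectation(f, mean_snr, order=32):
    """Approximate E[f(gamma)] for gamma ~ Exponential(mean=mean_snr).

    Substituting x = gamma / mean_snr turns the expectation into
    integral_0^inf exp(-x) * f(mean_snr * x) dx, which is exactly the
    weight function that Gauss-Laguerre quadrature is built for.
    """
    # Nodes and weights of the order-point Gauss-Laguerre rule.
    nodes, weights = np.polynomial.laguerre.laggauss(order)
    return float(np.sum(weights * f(mean_snr * nodes)))


# Sanity check with an integrand whose answer is known in closed form:
# for an exponential variable, E[gamma^2] = 2 * mean_snr^2.
second_moment = gauss_laguerre_expectation(lambda g: g**2, mean_snr=3.0)
```

For a polynomial integrand such as this one the rule is exact up to rounding, so `second_moment` should agree with 2 · 3² = 18; a real error-rate integrand would need a larger `order` or a convergence check.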

Keywords: Reinforcement learning

Adaptive Optimal Electrical Resistance Tomography for Large-Area Tactile Sensing

IEEE International Conference on Robotics and Automation (ICRA)

Authors: Wendong Zheng; Huaping Liu; Di Guo; Wuqiang Yang
Affiliations: Department of Computer Science and Technology, Tsinghua University, Beijing, China; State Key Laboratory of Intelligent Technology and Systems, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China; School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China; Department of Electrical and Electronic Engineering, The University of Manchester, Manchester, U.K.

Perceiving physical contact is critical for intelligent robots to interact safely in dynamic, unstructured environments. Because physical contact can occur at any location, a well-performing tactile sensing system should be able to cover a large area of the robot's surface. Some researchers have implemented large-area tactile sensors using sensing arrays, but deploying many sensing elements is challenging. Electrical resistance tomography (ERT) has recently been introduced into tactile sensing to overcome some of the limitations of conventional tactile sensing arrays, and good results have been achieved in some robotic applications. However, a particular challenge is its low spatial resolution. Although various attempts have been made to improve the performance of ERT-based tactile sensors, the intrinsic resolution issue remains unsolved. In this paper, we propose a novel adaptive optimal drive strategy for efficient ERT-based large-area tactile sensing in robotic applications, which adaptively selects the current injection and voltage measurement pattern best suited to the tactile stimulus. In particular, regions of tactile contact are first detected and localized by a base scanning pattern that needs only a few measurements. Based on the detected region, the adaptive strategy selects the optimal current injection and voltage measurement pattern to improve sensing performance by maximizing the current density. The proposed strategy is evaluated comprehensively in simulation and experiments. The results show that the optimal strategy effectively improves both spatial and temporal resolution.

Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Beibei Jin; Yu Hu; Qiankun Tang; Jingyu Niu; Zhiping Shi; Yinhe Han; Xiaowei Li
Affiliations: Research Center for Intelligent Computing Systems, State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Capital Normal University

ISBN (electronic): 9781728171685

ISBN (print): 9781728171692

Video prediction is a pixel-wise dense prediction task that infers future frames from past frames. Missing appearance details and motion blur are still two major problems for current models, leading to image distortion and temporal inconsistency. We point out the necessity of exploring multi-frequency analysis to deal with these two problems. Inspired by the frequency band decomposition characteristic of the Human Vision System (HVS), we propose a video prediction network based on multi-level wavelet analysis that handles spatial and temporal information in a unified manner. Specifically, a multi-level spatial discrete wavelet transform decomposes each video frame into anisotropic sub-bands with multiple frequencies, helping to enrich structural information and preserve fine details. In addition, a multi-level temporal discrete wavelet transform, which operates on the time axis, decomposes the frame sequence into sub-band groups of different frequencies to accurately capture multi-frequency motions under a fixed frame rate. Extensive experiments on diverse datasets demonstrate that our model achieves significant improvements in fidelity and temporal consistency over state-of-the-art works. Source code and videos are available at https://***/Bei-Jin/STMFANet.
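
To make the two decompositions in the abstract above concrete, here is a minimal sketch using a hand-written Haar wavelet in NumPy. The choice of the Haar filter, the function names, and the sub-band labels are assumptions made for illustration; the record does not say which wavelet or implementation the authors use.

```python
import numpy as np


def haar_1d(x, axis):
    """One-level orthonormal Haar DWT along `axis` (its length must be even)."""
    x = np.moveaxis(np.asarray(x, dtype=float), axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)    # local averages (low frequency)
    high = (even - odd) / np.sqrt(2.0)   # local differences (high frequency)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)


def spatial_dwt(frame):
    """Split an (H, W) frame into four sub-bands: LL, LH, HL, HH.

    Naming conventions for the mixed bands differ between libraries;
    here the first letter refers to the height axis, the second to width.
    """
    low, high = haar_1d(frame, axis=0)
    ll, lh = haar_1d(low, axis=1)
    hl, hh = haar_1d(high, axis=1)
    return ll, lh, hl, hh


def temporal_dwt(frames, levels):
    """Multi-level Haar DWT along the time axis of a (T, H, W) clip.

    Returns the high-frequency bands from finest to coarsest,
    followed by the final low-frequency band.
    """
    bands = []
    low = np.asarray(frames, dtype=float)
    for _ in range(levels):
        low, high = haar_1d(low, axis=0)
        bands.append(high)
    bands.append(low)
    return bands


# Hypothetical toy clip: 8 frames of 4x4 pixels.
clip = np.random.default_rng(0).random((8, 4, 4))
ll, lh, hl, hh = spatial_dwt(clip[0])
time_bands = temporal_dwt(clip, levels=2)
```

Because the transform is orthonormal, the sub-bands together carry the same energy as the input, so nothing is lost by working in the wavelet domain; each temporal level halves the number of time steps while isolating faster motion in its high band.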

Keywords: Discrete wavelet transforms; Predictive models; Wavelet analysis; Streaming media; Time-frequency analysis

Exploring spatial-temporal multi-frequency analysis for high-fidelity and temporal-consistency video prediction

arXiv, 2020

Authors: Jin, Beibei; Hu, Yu; Tang, Qiankun; Niu, Jingyu; Shi, Zhiping; Han, Yinhe; Li, Xiaowei
Affiliations: Research Center for Intelligent Computing Systems, State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Capital Normal University

Video prediction is a pixel-wise dense prediction task that infers future frames from past frames. Missing appearance details and motion blur are still two major problems for current predictive models, leading to image distortion and temporal inconsistency. In this paper, we point out the necessity of exploring multi-frequency analysis to deal with these two problems. Inspired by the frequency band decomposition characteristic of the Human Vision System (HVS), we propose a video prediction network based on multi-level wavelet analysis that handles spatial and temporal information in a unified manner. Specifically, a multi-level spatial discrete wavelet transform decomposes each video frame into anisotropic sub-bands with multiple frequencies, helping to enrich structural information and preserve fine details. In addition, a multi-level temporal discrete wavelet transform, which operates on the time axis, decomposes the frame sequence into sub-band groups of different frequencies to accurately capture multi-frequency motions under a fixed frame rate. Extensive experiments on diverse datasets demonstrate that our model achieves significant improvements in fidelity and temporal consistency over state-of-the-art works. Copyright © 2020, The Authors. All rights reserved.

Keywords: Discrete wavelet transforms

Decentralized Optimization on Compact Submanifolds by Quantized Riemannian Gradient Tracking

IEEE Transactions on Signal Processing, 2025, vol. 73, pp. 1851-1861

Authors: Chen, Jun; Liu, Lina; Zhu, Tianyi; Liu, Yong; Dai, Guang; Jiang, Yunliang; Tsang, Ivor W.
Affiliations: Zhejiang Normal University, National Special Education Resource Center for Children with Autism, Hangzhou 311231, China; Zhejiang Normal University, School of Computer Science and Technology, Jinhua 321004, China; China Mobile Research Institute, Beijing 100032, China; Zhejiang University, Institute of Cyber-Systems and Control, Hangzhou 310027, China; State Grid Corporation of China, SGIT AI Lab, China; Zhejiang Normal University, Zhejiang Key Laboratory of Intelligent Education Technology and Application, Jinhua 321004, China; Huzhou University, School of Information Engineering, Huzhou 313000, China; Centre for Frontier Artificial Intelligence Research, A*STAR, Singapore

This paper considers the problem of decentralized optimization on compact submanifolds, where a finite sum of smooth (possibly non-convex) local functions is minimized by n agents forming an undirected and connected graph. However, the efficiency of distributed optimization is often hindered by communication bottlenecks. To mitigate this, we propose the Quantized Riemannian Gradient Tracking (Q-RGT) algorithm, where agents update their local variables using quantized gradients. The introduction of quantization noise allows our algorithm to bypass the constraints of the accurate Riemannian projection operator (such as retraction), further improving iterative efficiency. To the best of our knowledge, this is the first algorithm to achieve an O(1/K) convergence rate in the presence of quantization, matching the convergence rate of methods without quantization. Additionally, we explicitly derive lower bounds on decentralized consensus associated with a function of quantization levels. Numerical experiments demonstrate that Q-RGT performs comparably to non-quantized methods while reducing communication bottlenecks and computational overhead. © 1991-2012 IEEE.
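
The abstract above says that agents update their variables using quantized gradients. The sketch below shows one generic way to quantize a gradient vector, an unbiased stochastic uniform quantizer; it is an illustrative assumption, since the record does not specify the quantizer Q-RGT uses, and the function name `stochastic_quantize` and the step size are hypothetical.

```python
import numpy as np


def stochastic_quantize(x, step, rng):
    """Unbiased uniform quantizer: map each entry to a multiple of `step`.

    An entry is rounded up with probability equal to its fractional
    position inside the grid cell, so the expected output equals the input.
    """
    x = np.asarray(x, dtype=float)
    dither = rng.random(x.shape)              # u ~ Uniform[0, 1)
    return step * np.floor(x / step + dither)


rng = np.random.default_rng(0)
grad = rng.normal(size=5)                     # stand-in for a local gradient
q_grad = stochastic_quantize(grad, step=0.25, rng=rng)
```

Each quantized entry lies on the grid and differs from the original by less than one step, so it can be transmitted as a small integer index plus the shared step size, which is where the communication saving comes from.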

Keywords: Iterative methods

Architecture and Key Technologies of Parallel Dispatching System for Railway Technical Operation Stations

Complex Systems and Intelligent Science (CSIS-IAC), International Annual Conference on

Authors: Gang Xiong; Donghu Yang; Wei Xu; Runmei Li; Shichao Chen; Bing Song; Xisong Dong; Fenghua Zhu
Affiliations: State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Beijing Engineering Research Center of Intelligent Systems and Technology, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; Signal and Communication Research Institute, China Academy of Railway Science Corporation Limited, Beijing, China; The School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China; Beijing Huairou Academy of Parallel Sensing, Beijing, China; Guangdong Engineering Research Center of 3D Printing and Intelligent Manufacturing, Cloud Computing Center, Chinese Academy of Sciences, Dongguan, China

A railroad technical operation station is an extensive and complex operation system, and a variety of random disturbances further increase the complexity of its command and dispatching. To raise the intelligence and integration level of railroad technical operation stations and to strengthen their ability to manage the dispatching of complex operation plans, this paper designs a parallel dispatching system (PDS) architecture based on the ACP approach and describes its key technologies, including the artificial dispatching system (ADS), computational experiments, and parallel execution. Built on the ADS, computational experiments can optimize various types of technical operations and the overall operational process, achieving parallel execution between the actual dispatching system and the ADS. PDS can provide more efficient and intelligent command and dispatching solutions for railroad technical operation stations and promote the development of the railroad transportation industry toward higher quality and efficiency.

Hyperspectral Image Classification via Neighborhood Adaptive Graph Isomorphism Network

IEEE Transactions on Geoscience and Remote Sensing, 2025, vol. 63

Authors: Zhang, Jian; Tu, Bing; Liu, Bo; Li, Jun; Plaza, Antonio
Affiliations: Nanjing University of Information Science and Technology, Institute of Optics and Electronics, State Key Laboratory Cultivation Base of Atmospheric Optoelectronic Detection and Information Fusion, Jiangsu International Joint Laboratory on Meteorological Photonics and Optoelectronic Detection, Jiangsu Engineering Research Center for Intelligent Optoelectronic Sensing Technology of Atmosphere, Nanjing 210044, China; Faculty of Computer Science, Wuhan 430074, China; University of Extremadura, Hyperspectral Computing Laboratory, Department of Technology of Computers and Communications, Escuela Politécnica, Cáceres 10003, Spain

Graph convolutional networks (GCNs) have garnered significant attention in hyperspectral image (HSI) classification due to their ability to model non-Euclidean structured data. Compared with convolutional neural networks (CNNs), GCNs can perform convolutions over irregular image regions and learn global dependencies among pixels across the whole image. Most existing GCN-based methods in the HSI community rely on average or weighted average aggregation strategies to aggregate neighboring node features. This process tends to obscure the differences between nodes, which makes average aggregation a suboptimal choice for HSI classification tasks with obvious intra-class variability. Moreover, the quality of the initial graph structure plays a crucial role in the model's capacity to represent spectral relationships effectively. To mitigate these issues, we propose a neighborhood adaptive graph isomorphism network (NAGIN) for HSI classification, which ensures that the diverse spectral representations of land cover are effectively captured. The neighborhood adaptive block (NAB) enhances spectral discriminability between land-cover classes via spectral reconstruction, enabling more precise removal of anomalous pixels among neighboring nodes. The graph isomorphism network (GIN) aggregates the features of neighboring nodes in an isomorphic manner to obtain multiple spectral expressions of the same type of land cover, ensuring that the spectral features of different land-cover classes can be accurately distinguished. The Kolmogorov-Arnold network (KAN) leverages its ability to learn adaptive activation functions to better extract and refine the spectral features aggregated by GIN. Experimental results demonstrate that NAB effectively improves the quality of the graph structure, that the GIN aggregation method is competitive in HSI classification, and that the proposed NAGIN outperforms state-of-the-art methods on several public HSI datasets. © 1980-2012 IEEE.
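
The contrast the abstract above draws between average aggregation and GIN-style aggregation can be illustrated with a small sketch. It uses the standard GIN sum update, (1 + eps) · h_v plus the sum of neighbor features, and omits the MLP that a full GIN layer applies afterwards; the toy graph, the feature values, and the function names are illustrative assumptions and are not taken from NAGIN.

```python
import numpy as np


def mean_aggregate(adj, feats):
    """Average the features of each node's neighbors."""
    degree = adj.sum(axis=1, keepdims=True)
    return adj @ feats / np.maximum(degree, 1)


def gin_aggregate(adj, feats, eps=0.0):
    """GIN-style sum aggregation, without the MLP a full layer would add."""
    return (1.0 + eps) * feats + adj @ feats


# Toy undirected graph with edges 0-2, 1-2 and 1-3.
adj = np.array([
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
], dtype=float)

# Nodes 2 and 3 share one spectrum; nodes 0 and 1 start from zeros.
feats = np.array([
    [0.0, 0.0],
    [0.0, 0.0],
    [1.0, 2.0],
    [1.0, 2.0],
])

mean_out = mean_aggregate(adj, feats)
gin_out = gin_aggregate(adj, feats)
```

Node 0 has one neighbor and node 1 has two neighbors with the same features. Averaging gives both nodes the same vector, whereas the sum keeps the neighbor count, so the two neighborhoods stay distinguishable.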

Keywords: Network theory (graphs)

GSLB: The Graph Structure Learning Benchmark

arXiv, 2023

Authors: Li, Zhixun; Wang, Liang; Sun, Xin; Luo, Yifan; Zhu, Yanqiao; Chen, Dingshuo; Luo, Yingtao; Zhou, Xiangxin; Liu, Qiang; Wu, Shu; Yu, Jeffrey Xu
Affiliations: Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong; Center for Research on Intelligent Perception Computing, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, China; Department of Automation, University of Science and Technology of China, China; School of Cyberspace Security, Beijing University of Posts and Telecommunications, China; Department of Computer Science, University of California, Los Angeles, United States; Heinz College of Information Systems and Public Policy, Machine Learning Department, School of Computer Science, Carnegie Mellon University, United States

Graph Structure Learning (GSL) has recently garnered considerable attention due to its ability to optimize both the parameters of Graph Neural Networks (GNNs) and the computation graph structure simultaneously. Despite the proliferation of GSL methods developed in recent years, there is no standard experimental setting or fair comparison for performance evaluation, which creates a great obstacle to understanding the progress in this field. To fill this gap, we systematically analyze the performance of GSL in different scenarios and develop a comprehensive Graph Structure Learning Benchmark (GSLB) curated from 20 diverse graph datasets and 16 distinct GSL algorithms. Specifically, GSLB systematically investigates the characteristics of GSL in terms of three dimensions: effectiveness, robustness, and complexity. We comprehensively evaluate state-of-the-art GSL algorithms in node- and graph-level tasks, and analyze their performance in robust learning and model complexity. Further, to facilitate reproducible research, we have developed an easy-to-use library for training, evaluating, and visualizing different GSL methods. Empirical results of our extensive experiments demonstrate the ability of GSL and reveal its potential benefits on various downstream tasks, offering insights and opportunities for future research. The code of GSLB is available at: https://***/GSL-Benchmark/GSLB. © 2023, CC BY.

Keywords: Benchmarking

S$^{4}$TP: Social-Suitable and Safety-Sensitive Trajectory Planning for Autonomous Vehicles

IEEE Transactions on Intelligent Vehicles, 2023, vol. 9, no. 2, pp. 3220-3231

Authors: Xiao Wang; Ke Tang; Xingyuan Dai; Jintao Xu; Quancheng Du; Rui Ai; Yuxiao Wang; Weihao Gu
Affiliations: Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Anhui University, Hefei, China; Haomo Technology Company Ltd., Beijing, China; State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Qingdao Academy of Intelligent Industries, Qingdao, China; School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China

On public roads, autonomous vehicles (AVs) face the challenge of frequent interactions with human-driven vehicles (HDVs), which exhibit uncertain driving behavior due to varying social characteristics among humans. To effectively assess the risks in the vicinity of AVs in socially interactive traffic scenarios and achieve safe autonomous driving, this article proposes a social-suitable and safety-sensitive trajectory planning (S$^{4}$TP) framework. Specifically, S$^{4}$TP integrates the Social-Aware Trajectory Prediction (SATP) and Social-Aware Driving Risk Field (SADRF) modules. SATP uses Transformers to encode the driving scene and incorporates the AV's planned trajectory during the prediction decoding process. SADRF assesses the expected surrounding risk levels during AV-HDV interactions, each with different social characteristics, visualized as two-dimensional heat maps centered on the AV. SADRF models the driving intentions of the surrounding HDVs and predicts trajectories based on the representation of vehicular interactions. S$^{4}$TP employs an optimization-based approach for motion planning, using the predicted HDV trajectories as input. With the integration of SADRF, S$^{4}$TP performs real-time online optimization of the AV's planned trajectory within low-risk regions, thus improving the safety and interpretability of the planned trajectory. We have conducted comprehensive tests of the proposed method using the SMARTS simulator. Experimental results in complex social scenarios, such as unprotected left-turn intersections, merging, cruising, and overtaking, validate the superiority of the proposed S$^{4}$TP in terms of safety and rationality. S$^{4}$TP achieves a pass rate of 100% across all scenarios, surpassing the current state-of-the-art methods Fanta (98.25%) and Predictive-Decision (94.75%).

Keywords: Trajectory; Safety; Vehicles; Trajectory planning; Autonomous vehicles; Risk management; Behavioral sciences
