Search results - Inner Mongolia University Library

UniGNN: A unified framework for graph and hypergraph neural networks

arXiv, 2021

Authors: Huang, Jing; Yang, Jie. Affiliation: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

Hypergraph, an expressive structure with flexibility to model the higher-order correlations among entities, has recently attracted increasing attention from various research domains. Despite the success of Graph Neural Networks (GNNs) for graph representation learning, how to adapt the powerful GNN variants directly to hypergraphs remains a challenging problem. In this paper, we propose UniGNN, a unified framework for interpreting the message passing process in graph and hypergraph neural networks, which can generalize general GNN models to hypergraphs. In this framework, meticulously designed architectures aiming to deepen GNNs can also be incorporated into hypergraphs with the least effort. Extensive experiments have been conducted to demonstrate the effectiveness of UniGNN on multiple real-world datasets, where it outperforms the state-of-the-art approaches by a large margin. In particular, on the DBLP dataset, we increase the accuracy from 77.4% to 88.8% in the semi-supervised hypernode classification task. We further prove that the proposed message-passing based UniGNN models are at most as powerful as the 1-dimensional Generalized Weisfeiler-Leman (1-GWL) algorithm in terms of distinguishing non-isomorphic hypergraphs. Our code is available at https://***/OneForward/UniGNN. © 2021, CC BY-NC-SA.

Keywords: Graph neural networks
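
The message-passing view described in the UniGNN abstract above can be illustrated with a minimal sketch. It assumes a two-stage scheme (node features are first aggregated into hyperedge features, then hyperedge features are aggregated back into nodes) with plain mean aggregation and no learnable weights. The abstract does not state the aggregation functions, so the function names and the choice of mean are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List


def mean(vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_stage_message_passing(
    node_feats: Dict[str, List[float]],
    hyperedges: Dict[str, List[str]],
) -> Dict[str, List[float]]:
    """One round of node -> hyperedge -> node aggregation (mean in both stages)."""
    # Stage 1 (assumed): each hyperedge summarises the features of its member nodes.
    edge_feats = {
        e: mean([node_feats[v] for v in members])
        for e, members in hyperedges.items()
    }
    # Stage 2 (assumed): each node averages its own feature with those of
    # the hyperedges it belongs to.
    updated = {}
    for v, feat in node_feats.items():
        incident = [edge_feats[e] for e, members in hyperedges.items() if v in members]
        updated[v] = mean([feat] + incident)
    return updated


# Toy hypergraph: three nodes, one hyperedge of size 3 and one of size 2.
node_feats = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
hyperedges = {"e1": ["a", "b", "c"], "e2": ["b", "c"]}
new_feats = two_stage_message_passing(node_feats, hyperedges)
```

An ordinary graph is the special case in which every hyperedge contains exactly two nodes, so the same routine covers both settings.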

Unsupervised motion representation enhanced network for action recognition

arXiv, 2021

Authors: Yang, Xiaohang; Kong, Lingtong; Yang, Jie. Affiliation: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

Learning reliable motion representation between consecutive frames, such as optical flow, has proven to greatly benefit video understanding. However, the TV-L1 method, an effective optical flow solver, is time-consuming and expensive in storage for caching the extracted optical flow. To fill the gap, we propose UF-TSN, a novel end-to-end action recognition approach enhanced with an embedded lightweight unsupervised optical flow estimator. UF-TSN estimates motion cues from adjacent frames in a coarse-to-fine manner and focuses on small displacement at each level by extracting a feature pyramid and warping one feature to the other according to the estimated flow of the last level. Due to the lack of labeled motion for action datasets, we constrain the flow prediction with multi-scale photometric consistency and edge-aware smoothness. Compared with state-of-the-art unsupervised motion representation learning methods, our model achieves better accuracy while maintaining efficiency, which is competitive with some supervised or more complicated approaches. Copyright © 2021, The Authors. All rights reserved.

Keywords: Optical flows
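
The two unsupervised constraints named in the UF-TSN abstract above, photometric consistency and edge-aware smoothness, can be sketched on a 1D signal. This is a simplified single-scale illustration, not the authors' multi-scale loss: the L1 penalties, the exponential edge weighting, and the `alpha` parameter are assumptions chosen because they are common in unsupervised flow estimation.

```python
import math
from typing import List


def warp_1d(signal: List[float], flow: List[float]) -> List[float]:
    """Sample signal at x + flow[x] with linear interpolation, clamped at the borders."""
    last = len(signal) - 1
    out = []
    for x, d in enumerate(flow):
        pos = min(max(x + d, 0.0), float(last))
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out


def photometric_loss(frame1: List[float], frame2: List[float], flow: List[float]) -> float:
    """Mean absolute difference between frame1 and frame2 warped back by the flow."""
    warped = warp_1d(frame2, flow)
    return sum(abs(a - b) for a, b in zip(frame1, warped)) / len(frame1)


def edge_aware_smoothness(frame1: List[float], flow: List[float], alpha: float = 10.0) -> float:
    """Penalise flow gradients, down-weighted where the image itself has an edge."""
    total = 0.0
    for x in range(len(flow) - 1):
        flow_grad = abs(flow[x + 1] - flow[x])
        image_grad = abs(frame1[x + 1] - frame1[x])
        total += flow_grad * math.exp(-alpha * image_grad)
    return total / (len(flow) - 1)


# frame2 is frame1 shifted right by one sample, so the true flow is +1 everywhere.
frame1 = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
frame2 = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0]
good_flow = [1.0] * 6
zero_flow = [0.0] * 6

loss_good = photometric_loss(frame1, frame2, good_flow)
loss_zero = photometric_loss(frame1, frame2, zero_flow)
smooth_good = edge_aware_smoothness(frame1, good_flow)
```

The correct flow yields a lower photometric loss than the zero flow, which is the signal that lets a flow estimator train without labeled motion.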

Scene Text Detection based on Dual-branch Multi-resolution Feature-aware Enhancement Network

10th IEEE Joint International Information Technology and Artificial Intelligence Conference, ITAIC 2022

Authors: Wang, Ruirui; Li, Qishen; Huang, Hua; Li, Qiufeng. Affiliations: School of Information Engineering, Nanchang Hangkong University, Jiangxi Nanchang, China; School of Software, Nanchang Hangkong University, Jiangxi Nanchang, China; Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition, Jiangxi Nanchang, China; Jiangxi, China

ISBN (digital): 9781665422079

ISBN (print): 9781665422079

Arbitrary-shape scene text detection is a challenging task due to background complexity and shape diversity. In this paper, we propose a dual-branch multi-resolution feature-aware enhancement network (DMFE). The lower branch constructs multi-resolution features through a bidirectional feature pyramid network with weights, and the upper branch enhances the perception of multi-scale text at each level through parallel pooling modules with receptive field enhancement. The global and local co-action integrates high-level semantic information and low-level location information to generate a high-quality feature map. Extensive experiments on the ICDAR2015, CTW1500 and Total-Text datasets show that the proposed method effectively improves the detection performance of natural scene text. © 2022 IEEE.

Keywords: Semantics
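
The DMFE abstract above mentions a bidirectional feature pyramid network "with weights" but does not say how the weights are applied. One common choice, shown here purely as an assumption, is normalized weighted fusion: each input feature gets a non-negative scalar weight, and the weights are normalized to sum to roughly one. The function name `fuse` and the `eps` value are hypothetical.

```python
from typing import List


def fuse(features: List[List[float]], raw_weights: List[float], eps: float = 1e-4) -> List[float]:
    """Weighted average of same-sized feature vectors with non-negative weights."""
    # Clip negative weights to zero so that no input is subtracted from the result.
    weights = [max(w, 0.0) for w in raw_weights]
    # eps keeps the division stable when all weights are clipped to zero.
    norm = sum(weights) + eps
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(weights, features)) / norm
        for i in range(length)
    ]


# Two feature vectors, assumed to be already resized to the same resolution.
coarse = [1.0, 1.0]
fine = [3.0, 3.0]
fused_equal = fuse([coarse, fine], [1.0, 1.0], eps=0.0)
fused_clipped = fuse([coarse, fine], [1.0, -5.0], eps=0.0)
```

In a trained network the raw weights would be learned parameters; here they are fixed numbers so that the arithmetic is easy to follow.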

Unsupervised Motion Representation Enhanced Network for Action Recognition

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: Xiaohang Yang; Lingtong Kong; Jie Yang. Affiliation: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

ISBN (print): 9781728176055; 9781728176062

Keywords: Learning systems; Visualization; Image edge detection; Speech recognition; Signal processing; Feature extraction; Reliability

Hybrid Data-Free Knowledge Distillation

arXiv, 2024

Authors: Tang, Jialiang; Chen, Shuo; Gong, Chen. Affiliations: School of Computer Science and Engineering, Nanjing University of Science and Technology, China; Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, China; Jiangsu Key Laboratory of Image and Video Understanding for Social Security, China; Center for Advanced Intelligence Project, RIKEN, Japan; Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

Data-free knowledge distillation aims to learn a compact student network from a pre-trained large teacher network without using the original training data of the teacher network. Existing collection-based and generation-based methods train student networks by collecting massive real examples and generating synthetic examples, respectively. However, they inevitably become weak in practical scenarios due to the difficulties in gathering or emulating sufficient real-world data. To solve this problem, we propose a novel method called Hybrid Data-Free Distillation (HiDFD), which leverages only a small amount of collected data as well as generates sufficient examples for training student networks. Our HiDFD comprises two primary modules, i.e., the teacher-guided generation and student distillation. The teacher-guided generation module guides a Generative Adversarial Network (GAN) by the teacher network to produce high-quality synthetic examples from very few real-world collected examples. Specifically, we design a feature integration mechanism to prevent the GAN from overfitting and facilitate the reliable representation learning from the teacher network. Meanwhile, we drive a category frequency smoothing technique via the teacher network to balance the generative training of each category. In the student distillation module, we explore a data inflation strategy to properly utilize a blend of real and synthetic data to train the student network via a classifier-sharing-based feature alignment technique. Intensive experiments across multiple benchmarks demonstrate that our HiDFD can achieve state-of-the-art performance using 120 times less collected data than existing methods. Code is available at https://***/tangjialiang97/HiDFD. Copyright © 2024, The Authors. All rights reserved.

Keywords: Students
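
The HiDFD abstract above builds on knowledge distillation, in which a student network is trained to match a teacher's outputs. The sketch below shows the generic temperature-scaled distillation loss, not HiDFD's classifier-sharing-based feature alignment, which the abstract does not specify in enough detail to reproduce. The temperature value and function names are illustrative assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: List[float],
    student_logits: List[float],
    temperature: float = 4.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


teacher = [5.0, 1.0, -2.0]
student_close = [4.5, 1.2, -1.8]
student_far = [-2.0, 1.0, 5.0]

loss_close = distillation_loss(teacher, student_close)
loss_far = distillation_loss(teacher, student_far)
```

A student whose logits rank the classes like the teacher's gets a small loss, while one that reverses the ranking gets a large one. In the data-free setting the inputs that produce these logits would be collected or synthetic examples rather than the teacher's original training data.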

OAS-Net: Occlusion aware sampling network for accurate optical flow

arXiv, 2021

Authors: Kong, Lingtong; Yang, Xiaohang; Yang, Jie. Affiliation: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

Optical flow estimation is an essential step for many real-world computer vision tasks. Existing deep networks have achieved satisfactory results by mostly employing a pyramidal coarse-to-fine paradigm, where a key process is to warp the target feature based on the previous flow prediction and correlate it with the source feature to build a 3D matching cost volume. However, the warping operation can lead to a troublesome ghosting problem that results in ambiguity. Moreover, occluded areas are treated equally with non-occluded regions in most existing works, which may cause performance degradation. To deal with these challenges, we propose a lightweight yet efficient optical flow network, named OAS-Net (occlusion aware sampling network), for accurate optical flow. First, a new sampling-based correlation layer is employed without the noisy warping operation. Second, a novel occlusion aware module is presented to make the raw cost volume conscious of occluded regions. Third, a shared flow and occlusion awareness decoder is adopted for structure compactness. Experiments on the Sintel and KITTI datasets demonstrate the effectiveness of the proposed approaches. Copyright © 2021, The Authors. All rights reserved.

Keywords: Optical flows
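
The cost volume mentioned in the OAS-Net abstract above can be sketched in 1D. This is one possible reading of a sampling-based correlation, not the authors' layer: instead of warping the whole target feature map first, each source position samples the target directly at its predicted location plus a small set of search offsets. The function names, the zero padding, and the `radius` parameter are assumptions. In 2D images the result would be a 3D volume; in this 1D toy it is a positions-by-offsets table.

```python
from typing import List

Vector = List[float]


def sample(features: List[Vector], pos: float) -> Vector:
    """Linearly interpolate a feature vector at a fractional position; zeros outside."""
    dim = len(features[0])
    if pos < 0 or pos > len(features) - 1:
        return [0.0] * dim
    lo = int(pos)
    hi = min(lo + 1, len(features) - 1)
    frac = pos - lo
    return [(1 - frac) * a + frac * b for a, b in zip(features[lo], features[hi])]


def sampled_cost_volume(
    source: List[Vector],
    target: List[Vector],
    flow: List[float],
    radius: int = 1,
) -> List[List[float]]:
    """cost[x][k] = dot(source[x], target sampled at x + flow[x] + offset_k)."""
    offsets = range(-radius, radius + 1)
    volume = []
    for x, src_vec in enumerate(source):
        row = []
        for d in offsets:
            tgt_vec = sample(target, x + flow[x] + d)
            row.append(sum(a * b for a, b in zip(src_vec, tgt_vec)))
        volume.append(row)
    return volume


# The target features are the source features shifted right by one position.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
target = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
volume = sampled_cost_volume(source, target, flow=[0.0] * 4, radius=1)
```

With a zero initial flow, the highest cost in each of the first three rows sits at offset +1, matching the one-position shift between the two feature maps.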

Gait Planning and Motion Control Based on Vrep Simulation for Quadruped Robot

WRC Symposium on Advanced Robotics and Automation (WRC SARA)

Authors: Linqi Zhou; Zhihua Chen; Jun Liu; Zhi Liu; Yumeng Chen; Liting Zhang. Affiliations: Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition and MOE Key Lab of Nondestructive Testing Technology, Nanchang Hangkong University, Nanchang, China; State Key Laboratory of Intelligent Control and Decision of Complex Systems, School of Automation, Beijing Institute of Technology, Beijing, China

Gait planning of quadruped robots plays an important role in achieving less walking, including dynamic and static gait. In this article, a static and dynamic gait control method based on the center of gravity stability margin is proposed. First, the robot model and kinematics modeling are introduced. Second, the static and dynamic gaits of the robot's feet are planned and the foot trajectory is designed. Finally, the two types of gait are simulated using Vrep simulation software, and the differences in stability and speed between the static and dynamic gaits of a 12-degree-of-freedom robot are analyzed, verifying the effectiveness of the gait control method proposed in this paper.
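
The abstract above does not define its center of gravity stability margin. A common definition for static gaits, used here as an assumption, is the shortest distance from the ground projection of the center of gravity to the boundary of the support polygon formed by the feet in contact. The sketch returns a signed value: positive when the projection is inside the polygon, negative when it is outside. All names are illustrative.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_to_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the segment and clamp the projection to its end points.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def is_inside_convex(p: Point, polygon: List[Point]) -> bool:
    """True if p lies inside or on a convex polygon (either winding order)."""
    signs = []
    n = len(polygon)
    for i in range(n):
        a, b = polygon[i], polygon[(i + 1) % n]
        cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
        if cross != 0.0:
            signs.append(cross > 0.0)
    return all(signs) or not any(signs)


def stability_margin(cog: Point, support_polygon: List[Point]) -> float:
    """Signed distance from the projected center of gravity to the polygon boundary."""
    n = len(support_polygon)
    dist = min(
        point_to_segment_distance(cog, support_polygon[i], support_polygon[(i + 1) % n])
        for i in range(n)
    )
    return dist if is_inside_convex(cog, support_polygon) else -dist


# Three feet on the ground while the fourth leg swings (static gait).
support_triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
margin_stable = stability_margin((1.0, 1.0), support_triangle)
margin_unstable = stability_margin((3.0, 3.0), support_triangle)
```

A gait planner built on this idea would shift the body before lifting a leg so that the margin stays above a chosen threshold.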

Remote Sensing Image Object Detection Method with Feature Denoising Fusion Module

IEEE Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)

Authors: Penghui Chen; Qishen Li; Qiufeng Li; Zhongyu Wu. Affiliations: School of Information Engineering, Nanchang Hangkong University, Nanchang, Jiangxi, China; Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition, Nanchang, Jiangxi, China; School of Software, Nanchang Hangkong University, Nanchang, Jiangxi, China; Key Laboratory of Nondestructive Testing (Ministry of Education), Nanchang Hangkong University, Nanchang, Jiangxi, China

Remote sensing object detection is an important research area in computer vision, widely applied in both military and civilian domains. However, remote sensing image object detection faces challenges such as large image sizes, complex backgrounds, and significant variations in target scales. To address these issues, this paper proposes a new Feature Denoising and Fusion Module (FDFM) aimed at enhancing the accuracy and robustness of object detection. This module comprises a Multi-Scale Denoising Submodule (MDS) and an Attention Optimization Submodule (AOS). The Multi-Scale Denoising Submodule aims to suppress lower-level texture noise by utilizing higher-level semantic features before the fusion process, reducing the impact of lower-level noise on subsequent multi-scale feature fusion. Meanwhile, the Attention Optimization Submodule seeks to enhance the precision of self-attention computations within the Multi-Scale Denoising Submodule without increasing the parameter count. The efficacy of this method was evaluated on the public datasets DOTA, VisDrone, VOC and COCO, showing improvements in comparison to baseline models.

Object Detector based on Enhanced Multi-scale Feature Fusion Pyramid Network

IEEE Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)

Authors: Luan Zhao; Xiaofeng Zhang. Affiliation: Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition, Nanchang Hangkong University, Nanchang, China

ISBN (digital): 9781728180281

ISBN (print): 9781728180298

Constructing a pyramidal architecture for features is currently a very effective way to obtain feature information of objects at different scales. Although the feature pyramid handles the recognition and detection of multi-scale objects well in the object detection task, it still has some limitations. Since the feature information of different levels is often not from the same layer of the network, it is difficult to obtain the feature information of different objects at a certain scale from a single-level feature map of the pyramid network. To solve this problem, we present a novel object detection architecture, named Enhanced Multi-scale Feature Fusion Pyramid Network (EMFFPNet). Our network consists of an Enhanced Multi-scale Feature Fusion Module (EMFFM) and a Predictor Optimization Module (POM). In EMFFM, features at different levels are fused into enhanced features as outputs, which are more representative and deterministic. In order to enable the enhanced features to play their respective roles in the pyramid network, we assign different weights to the fusion features of different levels in POM. We perform the experiments on the COCO detection benchmark. The experimental results indicate that the performance of our model is much better than the state-of-the-art model.

Keywords: Object detection; Predictive models; Feature extraction; Task analysis; Information technology; Optimization; Standards