Drone swarm systems, equipped with photoelectric imaging and intelligent target perception, are essential for reconnaissance and strike missions in complex and high-risk environments. They excel in information sharing, anti-jamming capabilities, and combat performance, making them critical for future warfare. However, varied perspectives in collaborative combat scenarios pose challenges to object detection, hindering traditional detection algorithms and reducing accuracy. Limited angle-prior data and sparse samples further complicate detection. This paper presents a multi-view collaborative detection system, which tackles the challenges of multi-view object detection in collaborative combat scenarios. The system is designed to enhance multi-view image generation and detection algorithms, thereby improving the accuracy and efficiency of object detection across varying viewpoints. First, an observation model for three-dimensional targets through line-of-sight angle transformation is constructed, and a multi-view image generation algorithm based on the Pix2Pix network is proposed. For object detection, YOLOX is utilized, and a deep feature extraction network, BA-RepCSPDarknet, is developed to address challenges related to small target scale and feature extraction. Furthermore, a feature fusion network, NS-PAFPN, is developed to mitigate the issue of deep feature map information loss in UAV images. A visual attention module (BAM) is employed to manage appearance differences under varying angles, while a feature mapping module (DFM) prevents fine-grained feature loss. These advancements lead to BA-YOLOX, a multi-view object detection network model suitable for drone platforms, enhancing accuracy and effectively detecting small objects.
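The abstract's observation model rests on a line-of-sight angle transformation between an observer and a 3-D target. A minimal sketch of such a transformation is below; the function name, coordinate frame, and angle conventions are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def line_of_sight_angles(observer, target):
    """Azimuth/elevation (degrees) of the line of sight from an observer
    (e.g. a UAV) to a 3-D target, both given as (x, y, z) points in a
    shared frame. Illustrative sketch, not the paper's exact model."""
    d = np.asarray(target, float) - np.asarray(observer, float)
    azimuth = np.degrees(np.arctan2(d[1], d[0]))                      # bearing in the ground plane
    elevation = np.degrees(np.arctan2(d[2], np.hypot(d[0], d[1])))    # angle above/below the plane
    return azimuth, elevation

# A ground target diagonally ahead of a drone flying at 100 m altitude:
az, el = line_of_sight_angles(observer=(0.0, 0.0, 100.0), target=(100.0, 100.0, 0.0))
```

Sweeping such angles over a grid of observer poses is one way to parameterize the varying viewpoints that a Pix2Pix-style generator would then be conditioned on.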
In recent years, automated whole breast ultrasound (ABUS) has attracted attention in breast disease detection and diagnosis applications. However, reviewing ABUS volumes is a time-consuming task and subtle tumors may be missed. In this paper, a 3D multi-view tumor detection method is proposed for ABUS volumes. Firstly, a layer-connected feature extraction network is designed for Faster R-CNN. Then, orthogonal multi-view slices are reconstructed and detected using this modified Faster R-CNN to extract 2D candidates. Finally, a 3D multi-view position analysis scheme is designed to fuse the 2D detection results into final 3D bounding boxes. The performance of the proposed method is evaluated on a data set of 158 volumes from 75 patients by 5-fold cross-validation. Experimental results show that the method achieves a sensitivity of 95.06% with 0.57 false positives (FPs) per volume. Compared with existing detection methods, the proposed method is more effective and general.
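The fusion step above combines 2D boxes from orthogonal slice views into one 3D box. A minimal sketch of that geometric idea, assuming axial (x-y) and sagittal (y-z) views whose shared y-extent is intersected; the paper's actual position-analysis scheme also scores and filters candidates, which is omitted here.

```python
def fuse_orthogonal_boxes(axial_box, sagittal_box):
    """Fuse a 2D detection from an axial (x-y) slice with one from a
    sagittal (y-z) slice into a 3D box (xmin, ymin, zmin, xmax, ymax, zmax)
    in voxel coordinates. Returns None when the boxes do not overlap along
    the shared y axis. Illustrative sketch only."""
    ax_xmin, ax_ymin, ax_xmax, ax_ymax = axial_box
    sa_ymin, sa_zmin, sa_ymax, sa_zmax = sagittal_box
    y_min, y_max = max(ax_ymin, sa_ymin), min(ax_ymax, sa_ymax)
    if y_min >= y_max:                 # no agreement between the two views
        return None
    return (ax_xmin, y_min, sa_zmin, ax_xmax, y_max, sa_zmax)

box3d = fuse_orthogonal_boxes((10, 20, 50, 60), (25, 5, 55, 40))
```

Requiring agreement across views in this way is what suppresses single-view false positives and yields the low FP-per-volume rates reported.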
ISBN: (print) 9781450392037
3D object detection from multiple image views is a fundamental and challenging task for visual scene understanding. However, accurately detecting objects through perspective views in 3D space is extremely difficult due to the lack of depth information. Recently, DETR3D [50] introduced a novel 3D-2D query paradigm for aggregating multi-view images for 3D object detection and achieves state-of-the-art performance. In this paper, through intensive pilot experiments, we quantify the objects located at different regions and find that the "truncated instances" (i.e., those at the border regions of each image) are the main bottleneck hindering the performance of DETR3D. Although it merges features from two adjacent views in the overlapping regions, DETR3D still suffers from insufficient feature aggregation, thus missing the chance to fully boost detection performance. To tackle this problem, we propose Graph-DETR3D, which automatically aggregates multi-view imagery information through graph structure learning. It constructs a dynamic 3D graph between each object query and the 2D feature maps to enhance the object representations, especially at the border regions. Besides, Graph-DETR3D benefits from a novel depth-invariant multi-scale training strategy, which maintains visual depth consistency by simultaneously scaling the image size and the object depth. Extensive experiments on the nuScenes dataset demonstrate the effectiveness and efficiency of Graph-DETR3D. Notably, our best model achieves 49.5 NDS on the nuScenes test leaderboard, setting a new state of the art among published image-view 3D object detectors.
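At the core of the 3D-2D query paradigm described above is projecting a 3D reference point into each camera view and gathering features where it lands; truncated instances are precisely the points that fall near (or beyond) image borders in every view. A rough sketch of that projection step, with a made-up pinhole projection matrix rather than real nuScenes calibration:

```python
import numpy as np

def project_to_views(point_3d, projections, img_w, img_h):
    """Project one 3D query reference point into every camera view and
    return (view index, u, v) for the views whose image actually contains
    it. `projections` are 3x4 camera matrices. Illustrative sketch of the
    3D->2D query step, not the DETR3D implementation."""
    p = np.append(np.asarray(point_3d, float), 1.0)   # homogeneous coordinates
    hits = []
    for i, P in enumerate(projections):
        uvw = P @ p
        if uvw[2] <= 0:                               # point is behind this camera
            continue
        u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
        if 0 <= u < img_w and 0 <= v < img_h:
            hits.append((i, u, v))
    return hits

# One toy pinhole camera: focal length 500 px, principal point (320, 240).
P = np.array([[500.0, 0.0, 320.0, 0.0],
              [0.0, 500.0, 240.0, 0.0],
              [0.0,   0.0,   1.0, 0.0]])
hits = project_to_views((1.0, 0.5, 5.0), [P], img_w=640, img_h=480)
```

A graph-structured aggregation like Graph-DETR3D's can then connect each query to multiple sampled locations across views, rather than the single per-view hit returned here.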
ISBN: (print) 9783031200793; 9783031200809
Although deep-learning-based methods for monocular pedestrian detection have made great progress, they are still vulnerable to heavy occlusions. Using multi-view information fusion is a potential solution but has limited applications, due to the lack of annotated training samples in existing multi-view datasets, which increases the risk of overfitting. To address this problem, a data augmentation method is proposed that randomly generates 3D cylinder occlusions on the ground plane, sized to the average pedestrian and projected to multiple views, to relieve the impact of overfitting during training. Moreover, the feature map of each view is projected to multiple parallel planes at different heights using homographies, which allows the CNNs to fully utilize features across the height of each pedestrian to infer pedestrian locations on the ground plane. The proposed 3DROM method greatly improves performance in comparison with state-of-the-art deep-learning-based methods for multi-view pedestrian detection.
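The homography projection described above maps per-view feature-map coordinates onto a common plane. A minimal sketch of applying a 3x3 homography to 2D points; the matrix here is a made-up pure translation, not a calibrated camera-to-ground mapping.

```python
import numpy as np

def warp_points(H, points):
    """Apply a 3x3 homography H to an array of 2D points (e.g. image
    coordinates -> ground-plane coordinates). In multi-view pedestrian
    detection, entire per-view feature maps are warped this way onto
    parallel planes at several heights. Illustrative sketch only."""
    pts = np.hstack([np.asarray(points, float), np.ones((len(points), 1))])
    warped = pts @ H.T                       # homogeneous transform
    return warped[:, :2] / warped[:, 2:3]    # back to inhomogeneous coords

H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, -5.0],
              [0.0, 0.0,  1.0]])             # toy translation-only homography
out = warp_points(H, [(0.0, 0.0), (2.0, 3.0)])
```

Stacking such warps for several plane heights is what lets the network reason about a pedestrian's full vertical extent rather than only the ground-contact point.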