Drone swarm systems, equipped with photoelectric imaging and intelligent target perception, are essential for reconnaissance and strike missions in complex and high-risk environments. They excel in information sharing, anti-jamming capabilities, and combat performance, making them critical for future warfare. However, varied perspectives in collaborative combat scenarios pose challenges to object detection, hindering traditional detection algorithms and reducing accuracy. Limited angle-prior data and sparse samples further complicate detection. This paper presents a multi-view collaborative detection system, which tackles the challenges of multi-view object detection in collaborative combat scenarios. The system is designed to enhance multi-view image generation and detection algorithms, thereby improving the accuracy and efficiency of object detection across varying viewpoints. First, an observation model for three-dimensional targets through line-of-sight angle transformation is constructed, and a multi-view image generation algorithm based on the Pix2Pix network is proposed. For object detection, YOLOX is utilized, and a deep feature extraction network, BA-RepCSPDarknet, is developed to address challenges related to small target scale and feature extraction. Furthermore, a feature fusion network, NS-PAFPN, is developed to mitigate the issue of deep feature map information loss in UAV images. A visual attention module (BAM) is employed to manage appearance differences under varying angles, while a feature mapping module (DFM) prevents fine-grained feature loss. These advancements lead to BA-YOLOX, a multi-view object detection network model suitable for drone platforms, enhancing accuracy and effectively detecting small objects.
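The abstract's observation model rests on a line-of-sight angle transformation between an observer and a 3-D target. A minimal sketch of such a transformation is below; the function name, coordinate frame, and angle conventions are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def line_of_sight_angles(observer, target):
    """Azimuth/elevation (degrees) of the line of sight from an observer
    (e.g. a UAV) to a 3-D target, both given as (x, y, z) points in a
    shared frame. Illustrative sketch, not the paper's exact model."""
    d = np.asarray(target, float) - np.asarray(observer, float)
    azimuth = np.degrees(np.arctan2(d[1], d[0]))                      # bearing in the ground plane
    elevation = np.degrees(np.arctan2(d[2], np.hypot(d[0], d[1])))    # angle above/below the plane
    return azimuth, elevation

# A ground target diagonally ahead of a drone flying at 100 m altitude:
az, el = line_of_sight_angles(observer=(0.0, 0.0, 100.0), target=(100.0, 100.0, 0.0))
```

Sweeping such angles over a grid of observer poses is one way to parameterize the varying viewpoints that a Pix2Pix-style generator would then be conditioned on.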
In recent years, automated whole breast ultrasound (ABUS) has attracted attention in breast disease detection and diagnosis applications. However, reviewing ABUS volumes is a time-consuming task and subtle tumors may be missed. In this paper, a 3D multi-view tumor detection method is proposed for ABUS volumes. Firstly, a layer-connected feature extraction network is designed for Faster R-CNN. Then, orthogonal multi-view slices are reconstructed and detected using this modified Faster R-CNN to extract 2D candidates. Finally, a 3D multi-view position analysis scheme is designed to fuse the 2D detection results into final 3D bounding boxes. The performance of the proposed method is evaluated on a data set of 158 volumes from 75 patients by 5-fold cross-validation. Experimental results show that the method achieves a sensitivity of 95.06% with 0.57 false positives (FPs) per volume. Compared with existing detection methods, the proposed method is more effective and general.
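The fusion step above combines 2D boxes from orthogonal slice views into one 3D box. A minimal sketch of that geometric idea, assuming axial (x-y) and sagittal (y-z) views whose shared y-extent is intersected; the paper's actual position-analysis scheme also scores and filters candidates, which is omitted here.

```python
def fuse_orthogonal_boxes(axial_box, sagittal_box):
    """Fuse a 2D detection from an axial (x-y) slice with one from a
    sagittal (y-z) slice into a 3D box (xmin, ymin, zmin, xmax, ymax, zmax)
    in voxel coordinates. Returns None when the boxes do not overlap along
    the shared y axis. Illustrative sketch only."""
    ax_xmin, ax_ymin, ax_xmax, ax_ymax = axial_box
    sa_ymin, sa_zmin, sa_ymax, sa_zmax = sagittal_box
    y_min, y_max = max(ax_ymin, sa_ymin), min(ax_ymax, sa_ymax)
    if y_min >= y_max:                 # no agreement between the two views
        return None
    return (ax_xmin, y_min, sa_zmin, ax_xmax, y_max, sa_zmax)

box3d = fuse_orthogonal_boxes((10, 20, 50, 60), (25, 5, 55, 40))
```

Requiring agreement across views in this way is what suppresses single-view false positives and yields the low FP-per-volume rates reported.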
ISBN: (print) 9781450392037
3D object detection from multiple image views is a fundamental and challenging task for visual scene understanding. However, accurately detecting objects through perspective views in 3D space is extremely difficult due to the lack of depth information. Recently, DETR3D [50] introduced a novel 3D-2D query paradigm for aggregating multi-view images for 3D object detection and achieves state-of-the-art performance. In this paper, through intensive pilot experiments, we quantify the objects located at different regions and find that the "truncated instances" (i.e., those at the border regions of each image) are the main bottleneck hindering the performance of DETR3D. Although it merges features from two adjacent views in the overlapping regions, DETR3D still suffers from insufficient feature aggregation, thus missing the chance to fully boost detection performance. To tackle this problem, we propose Graph-DETR3D, which automatically aggregates multi-view imagery information through graph structure learning. It constructs a dynamic 3D graph between each object query and the 2D feature maps to enhance the object representations, especially at the border regions. Besides, Graph-DETR3D benefits from a novel depth-invariant multi-scale training strategy, which maintains visual depth consistency by simultaneously scaling the image size and the object depth. Extensive experiments on the nuScenes dataset demonstrate the effectiveness and efficiency of Graph-DETR3D. Notably, our best model achieves 49.5 NDS on the nuScenes test leaderboard, setting a new state of the art among published image-view 3D object detectors.
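At the core of the 3D-2D query paradigm described above is projecting a 3D reference point into each camera view and gathering features where it lands; truncated instances are precisely the points that fall near (or beyond) image borders in every view. A rough sketch of that projection step, with a made-up pinhole projection matrix rather than real nuScenes calibration:

```python
import numpy as np

def project_to_views(point_3d, projections, img_w, img_h):
    """Project one 3D query reference point into every camera view and
    return (view index, u, v) for the views whose image actually contains
    it. `projections` are 3x4 camera matrices. Illustrative sketch of the
    3D->2D query step, not the DETR3D implementation."""
    p = np.append(np.asarray(point_3d, float), 1.0)   # homogeneous coordinates
    hits = []
    for i, P in enumerate(projections):
        uvw = P @ p
        if uvw[2] <= 0:                               # point is behind this camera
            continue
        u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
        if 0 <= u < img_w and 0 <= v < img_h:
            hits.append((i, u, v))
    return hits

# One toy pinhole camera: focal length 500 px, principal point (320, 240).
P = np.array([[500.0, 0.0, 320.0, 0.0],
              [0.0, 500.0, 240.0, 0.0],
              [0.0,   0.0,   1.0, 0.0]])
hits = project_to_views((1.0, 0.5, 5.0), [P], img_w=640, img_h=480)
```

A graph-structured aggregation like Graph-DETR3D's can then connect each query to multiple sampled locations across views, rather than the single per-view hit returned here.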
ISBN: (print) 9783031200793; 9783031200809
Although deep-learning-based methods for monocular pedestrian detection have made great progress, they are still vulnerable to heavy occlusions. Using multi-view information fusion is a potential solution but has limited applications, due to the lack of annotated training samples in existing multi-view datasets, which increases the risk of overfitting. To address this problem, a data augmentation method is proposed that randomly generates 3D cylinder occlusions on the ground plane, sized to the average pedestrian and projected to multiple views, to relieve the impact of overfitting during training. Moreover, the feature map of each view is projected to multiple parallel planes at different heights using homographies, which allows the CNNs to fully utilize features across the height of each pedestrian to infer pedestrian locations on the ground plane. The proposed 3DROM method greatly improves performance in comparison with state-of-the-art deep-learning-based methods for multi-view pedestrian detection.
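The homography projection described above maps per-view feature-map coordinates onto a common plane. A minimal sketch of applying a 3x3 homography to 2D points; the matrix here is a made-up pure translation, not a calibrated camera-to-ground mapping.

```python
import numpy as np

def warp_points(H, points):
    """Apply a 3x3 homography H to an array of 2D points (e.g. image
    coordinates -> ground-plane coordinates). In multi-view pedestrian
    detection, entire per-view feature maps are warped this way onto
    parallel planes at several heights. Illustrative sketch only."""
    pts = np.hstack([np.asarray(points, float), np.ones((len(points), 1))])
    warped = pts @ H.T                       # homogeneous transform
    return warped[:, :2] / warped[:, 2:3]    # back to inhomogeneous coords

H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, -5.0],
              [0.0, 0.0,  1.0]])             # toy translation-only homography
out = warp_points(H, [(0.0, 0.0), (2.0, 3.0)])
```

Stacking such warps for several plane heights is what lets the network reason about a pedestrian's full vertical extent rather than only the ground-contact point.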