Monocular 6D pose estimation is a functional task in the field of computer vision and *** recent years, 2D-3D correspondence-based methods have achieved improved performance in multiview and depth data-based ***, for monocular 6D pose estimation, these methods are affected by the prediction results of the 2D-3D correspondences and the robustness of the perspective-n-point (PnP) *** is still a difference in the distance from the expected estimation *** obtain a more effective feature representation result, edge enhancement is proposed to increase the shape information of the object by analyzing the influence of inaccurate 2D-3D matching on 6D pose regression and comparing the effectiveness of the intermediate ***, although the transformation matrix is composed of rotation and translation matrices from 3D model points to 2D pixel points, the two variables are essentially different and the same network cannot be used for both variables in the regression ***, to improve the effectiveness of the PnP algorithm, this paper designs a dual-branch PnP network to predict rotation and translation ***, the proposed method is verified on the public LM, LM-O and YCB-Video *** ADD(-S) values of the proposed method are 94.2 and 62.84 on the LM and LM-O datasets, *** AUC of ADD(-S) value on YCB-Video is *** experimental results show that the performance of the proposed method is superior to that of similar methods.
Architectural advances in deep neural networks have led to remarkable leaps forward across a broad array of computer vision tasks. Instead of relying on human expertise, neural architecture search (NAS) has em...
In this correspondence, we propose a movable antenna (MA)-aided multi-user hybrid beamforming scheme with a sub-connected structure, where multiple movable sub-arrays can independently change their positions within di...
In the field of autonomous vehicles (AVs), accurately discerning commander intent and executing linguistic commands within a visual context presents a significant *** paper introduces a sophisticated encoder-decoder framework, developed to address visual grounding in *** Context-Aware Visual Grounding (CAVG) model is an advanced system that integrates five core encoders (Text, Emotion, Image, Context, and Cross-Modal) with a multimodal *** integration enables the CAVG model to adeptly capture contextual semantics and to learn human emotional features, augmented by state-of-the-art Large Language Models (LLMs) including *** architecture of CAVG is reinforced by the implementation of multi-head cross-modal attention mechanisms and a Region-Specific Dynamic (RSD) layer for attention *** architectural design enables the model to efficiently process and interpret a range of cross-modal inputs, yielding a comprehensive understanding of the correlation between verbal commands and corresponding visual *** evaluations on the Talk2Car dataset, a real-world benchmark, demonstrate that CAVG establishes new standards in prediction accuracy and operational ***, the model exhibits exceptional performance even with limited training data, ranging from 50% to 75% of the full *** feature highlights its effectiveness and potential for deployment in practical AV ***, CAVG has shown remarkable robustness and adaptability in challenging scenarios, including long-text command interpretation, low-light conditions, ambiguous command contexts, inclement weather conditions, and densely populated urban environments.
With the advancement of artificial intelligence (AI) technologies, novel and inventive approaches for addressing complex problems are coming to the forefront. Neuromorphic computing based on AI technologies stands as ...
Field-Programmable Gate Arrays (FPGAs) and Field-Programmable Analog Arrays (FPAAs) are reconfigurable circuits that enable flexible digital and analog implementations post-manufacturing. FPGAs are widely used in tele...
The proliferation of peer-to-peer (P2P) lending platforms has ushered in a new era of financial accessibility, but it has also brought to the forefront the growing concern of loan defaults. This paper explores the inc...
We apply implicit neural representations—which naturally capture spectral regularity—to reconstruct color Fourier ptychographic microscopy images from spectrally-sparse measurements. We conduct experiments on real-w...
DL techniques have increased the efficiency of decision making in different areas. However, when uncertainties are present in the data or in the environment, decision-making requires the explainability o...
Petri-net-based information flow analysis offers an effective approach for detecting information leakage through the concept of non-interference. Although the related studies propose efficient solutions, they lack quan...