In this paper, a 3D dangerous goods detection method based on RetinaNet is proposed. This method uses the bidirectional feature pyramid network structure of RetinaNet to extract multi-scale features from point cloud d...
Automatic game cameras are commonly used for monitoring wildlife because they allow animal activity to be documented in a non-invasive manner. By utilizing a large number of cameras and identifying individual animals...
In skeleton-based action recognition, graph convolutional networks (GCN) have been applied to extract features based on the dynamics of the human body, and the method has recently achieved excellent results. However, GC...
Human pose estimation is a fundamental yet challenging task in computer vision. However, difficult scenarios such as invisible keypoints, occlusions and small-scale persons are still not well handled. In this paper, we...
In this article, a new vision- and grating-sensor-based intelligent unmanned settlement (IUS) system is proposed for convenience stores to automatically recognize the shopping behavior of customers, record their ident...
Monocular 6D pose estimation is a functional task in the field of computer vision and *** recent years, 2D-3D correspondence-based methods have achieved improved performance in multiview and depth data-based ***, for monocular 6D pose estimation, these methods are affected by the prediction results of the 2D-3D correspondences and the robustness of the perspective-n-point (PnP) *** is still a difference in the distance from the expected estimation *** obtain a more effective feature representation result, edge enhancement is proposed to increase the shape information of the object by analyzing the influence of inaccurate 2D-3D matching on 6D pose regression and comparing the effectiveness of the intermediate ***, although the transformation matrix is composed of rotation and translation matrices from 3D model points to 2D pixel points, the two variables are essentially different and the same network cannot be used for both variables in the regression ***, to improve the effectiveness of the PnP algorithm, this paper designs a dual-branch PnP network to predict rotation and translation ***, the proposed method is verified on the public LM, LM-O and YCB-Video *** ADD(S) values of the proposed method are 94.2 and 62.84 on the LM and LM-O datasets, *** AUC of ADD(-S) value on YCB-Video is *** experimental results show that the performance of the proposed method is superior to that of similar methods.
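The ADD and ADD-S scores reported in this abstract are built on standard pose-error distances. As a rough illustration only, the sketch below computes both distances for a toy point set in plain Python. The function names (`transform`, `add_metric`, `add_s_metric`) and the toy data are assumptions made for this example and are not taken from the paper. The reported percentages additionally depend on a distance threshold and on the dataset's object models, which are not reproduced here.

```python
import math


def transform(points, R, t):
    """Apply rotation R (3x3 nested lists) and translation t (length 3) to each 3D point."""
    return [
        tuple(sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))
        for p in points
    ]


def add_metric(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding model points under the two poses."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(math.dist(a, b) for a, b in zip(gt, pred)) / len(points)


def add_s_metric(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to its closest predicted point."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(min(math.dist(a, b) for b in pred) for a in gt) / len(points)


# Toy object (hypothetical): two points that are symmetric about the z-axis.
MODEL = [(1, 0, 0), (-1, 0, 0)]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ROT_Z_180 = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]
ZERO = [0, 0, 0]

add_value = add_metric(MODEL, IDENTITY, ZERO, ROT_Z_180, ZERO)
add_s_value = add_s_metric(MODEL, IDENTITY, ZERO, ROT_Z_180, ZERO)
```

For this symmetric toy object, a 180-degree rotation about the z-axis is penalized by ADD but not by ADD-S, which is why the symmetric variant is used for objects whose appearance does not change under such rotations.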
We present a task from the critical infrastructure field in materials engineering. We created a surrogate model for the bridge construction object to determine the material parameters' values. The work aims to use...
Remote photoplethysmography (rPPG) aims to measure non-contact physiological signals from facial videos, which has shown great potential in many applications. Most existing methods directly extract video-based rPPG fe...
In this short paper, we present our solutions for the subject identification and relapse detection tasks, which are part of the e-Prevention Challenge hosted at the ICASSP 2023 conference [1] [2] [3]. We design an ensemble of six models (five transformer-based ones and a CNN) for the identification of subjects from wearable devices, while a personalized scheme, with one for each subject, is used for relapse detection in psychotic disorders. Our final submitted solutions yield top performance on both tracks of the challenge: we ranked 2nd on the subject identification task (with an accuracy of 93.85%) and 1st on the relapse detection task (with a ROC-AUC and PR-AUC of about 0.65). Code and details are available at https://***/perceivelab/e-prevention-icassp-2023.
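The abstract does not say how the six models' predictions are combined. One common option is soft voting, sketched below under that assumption: each model outputs per-class probabilities, the probabilities are averaged, and the class with the highest average is chosen. The function name `soft_vote` and the example numbers are hypothetical and do not come from the authors' code.

```python
def soft_vote(prob_lists):
    """Average per-class probabilities across models.

    prob_lists: one list of class probabilities per model, all of equal length.
    Returns (index of the winning class, list of averaged probabilities).
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=avg.__getitem__)
    return best, avg


# Hypothetical outputs of three models for a two-subject problem.
predicted_subject, averaged = soft_vote([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])
```

In this example the first model prefers subject 0, but the averaged probabilities favor subject 1, so the ensemble overrules the single dissenting model.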
Forcing the answer of the Question Answering (QA) task to be a single text span might be restrictive since the answer can be multiple spans in the context. Moreover, we found that multi-span answers often appear with ...