Search Results - Inner Mongolia University Library

2023 International Conference on Algorithm, Imaging Processing, and Machine Vision, AIPMV 2023

Authors: He, Jingze; Guo, Yao; Song, Qing. Department of Computer Science and Technology, Tsinghua University, Beijing, China; Pattern Recognition and Intelligent Vision Laboratory, Beijing University of Posts and Telecommunications, Beijing, China

ISBN: (print) 9781510672444

In this paper, a 3D dangerous goods detection method based on RetinaNet is proposed. This method uses the bidirectional feature pyramid network structure of RetinaNet to extract multi-scale features from point cloud data and trains the system with the Focal Loss function to achieve fast and accurate detection of dangerous goods. In addition, to improve detection accuracy, this paper introduces a 3D region proposal network (3D RPN) and the non-maximum suppression (NMS) algorithm. The experimental results show that the proposed method performs well on our self-built CT dataset, with high accuracy and a low false positive rate, and is suitable for dangerous goods detection tasks in practical scenarios. © 2024 SPIE.

Keywords: Computerized tomography
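The abstract above names the Focal Loss function without giving its form. As a reading aid, here is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function name `binary_focal_loss` and the default values `alpha=0.25`, `gamma=2.0` are assumptions chosen for illustration; the record does not state the settings used by the authors.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Focal loss for a single binary prediction (illustrative sketch).

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 0 or 1
    alpha, gamma: assumed defaults, not taken from the paper
    """
    # Clamp to avoid log(0) for saturated predictions.
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t)^gamma factor down-weights well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = binary_focal_loss(0.9, 1)  # confident and correct
    hard = binary_focal_loss(0.1, 1)  # confident and wrong
    print(easy < hard)
```

With `gamma=0` the expression reduces to alpha-weighted cross-entropy, which is a quick way to sanity-check an implementation.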


Re-identification of Saimaa Ringed Seals from Image Sequences


22nd Scandinavian Conference on Image Analysis, SCIA 2023

Authors: Nepovinnykh, Ekaterina; Vilkman, Antti; Eerola, Tuomas; Kälviäinen, Heikki. Computer Vision and Pattern Recognition Laboratory, Department of Computational Engineering, Lappeenranta-Lahti University of Technology LUT, Lappeenranta, Finland

ISBN: (print) 9783031314346

Automatic game cameras are commonly used for monitoring wildlife as they allow documenting the activity of animals in a non-invasive manner. By utilizing a large number of cameras and identifying individual animals from the images, it is possible to, for example, estimate the population size and study the migration patterns of the animals. Large image volumes produced by the cameras call for automated methods for the analysis. Re-identification of animals has commonly been implemented through one-to-one matching, where images are processed individually and the best match is searched from the database of known individuals one by one. Game cameras can be configured to produce a sequence of images that allows capturing the animal from multiple angles, potentially improving the re-identification accuracy. In this work, the re-identification of the endangered Saimaa ringed seal (Pusa hispida saimensis) from image sequences is studied. The individual identification is realized through the Saimaa ringed seal's unique pelage pattern. The proposed one-to-many and many-to-many matching methods aggregate the pelage pattern features over the whole sequence, providing better embeddings for the re-identification tasks. We show that the proposed aggregation method outperforms traditional one-to-one matching based re-identification by a large margin. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: Animals


Spatio-temporal Attention Graph Convolutions for Skeleton-based Action Recognition


22nd Scandinavian Conference on Image Analysis, SCIA 2023

Authors: Le, Cuong; Liu, Xin. Computer Vision and Pattern Recognition Laboratory, School of Engineering Science, Lappeenranta-Lahti University of Technology LUT, Lappeenranta, Finland; Computer Vision Laboratory, Department of Electrical Engineering, Linköping University, Linköping, Sweden

ISBN: (print) 9783031314346

In skeleton-based action recognition, graph convolutional networks (GCN) have been applied to extract features based on the dynamics of the human body, and the method has achieved excellent results recently. However, GCN-based techniques only focus on the spatial correlations between human joints and often overlook the temporal relationships. In an action sequence, the consecutive frames in a neighborhood contain similar poses, and using only temporal convolutions for extracting local features limits the flow of useful information into the calculations. In many cases, the discriminative features can be present across long-range time steps, and it is important to also consider them in the calculations to create stronger representations. We propose an attentional graph convolutional network, which adapts self-attention mechanisms to respectively model the correlations between human joints and between all time steps for skeleton-based action recognition. On two common datasets, the NTU-RGB+D60 and the NTU-RGB+D120, the proposed method achieved competitive classification results compared to state-of-the-art methods. The project's GitHub page: STA-GCN. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: Convolution


UformPose: A U-Shaped Hierarchical Multi-Scale Keypoint-Aware Framework for Human Pose Estimation


IEEE Transactions on Circuits and Systems for Video Technology, 2023, vol. 33, no. 4, pp. 1697-1709

Authors: Wang, You-Jie; Luo, Yan-Min; Bai, Gui-Hu; Guo, Jing-Ming. Huaqiao University, Key Laboratory for Computer Vision and Pattern Recognition, The College of Computer Science and Technology, Xiamen 361021, China; National Taiwan University of Science and Technology, Department of Electrical Engineering, Taipei 10607, Taiwan

Human pose estimation is a fundamental yet challenging task in computer vision. However, difficult scenarios such as invisible keypoints, occlusions and small-scale persons are still not well handled. In this paper, we present a novel pose estimation framework named UformPose, which aims to alleviate these issues. UformPose has two core designs: Shared Feature Pyramid Stem (SFPS) and U-shaped hierarchical Multi-scale Keypoint-aware Attention Module (U-MKAM). SFPS is a feature pyramid stem with a shared mechanism to learn stronger low-level features at the initial stage, and the shared mechanism can facilitate cross-resolution commonality learning. Our U-MKAM attempts to generate high-quality high-resolution representations by integrating all levels of feature representation of the backbone layer by layer. More importantly, we utilize the flexibility of attention operations for keypoint-aware modeling, which explicitly captures and trades off the dependencies between keypoints. We empirically demonstrate the effectiveness of our framework through the competitive pose estimation results on the COCO dataset. Extensive experiments and visual analysis on CrowdPose demonstrate the robustness of our model in crowd scenes. © 1991-2012 IEEE.

Keywords: Computer vision


Design and Implementation of a Vision- and Grating-Sensor-Based Intelligent Unmanned Settlement System


IEEE Transactions on Artificial Intelligence, 2022, vol. 3, no. 2, pp. 254-264

Authors: Zhang, Hong-Bo; Zhou, Yi-Zhong; Dong, Li-Jia; Lei, Qing; Du, Ji-Xiang. Department of Computer Science and Technology, Huaqiao University, Xiamen 361000, China; Xiamen Key Laboratory of Computer Vision and Pattern Recognition, The Fujian Key Laboratory of Big Data Intelligence and Security, Huaqiao University, Xiamen 361000, China

In this article, a new vision- and grating-sensor-based intelligent unmanned settlement (IUS) system is proposed for convenience stores to automatically recognize the shopping behavior of customers, record their identities, and generate invoices. First, we design a new IUS architecture, which includes a shelf module and exit module. To achieve automatic settlement for each customer, a shopping event detection method is proposed. In this method, a vision-based human pose estimation algorithm is used to detect a human form standing in front of a shelf. The hand actions of each customer are detected by a grating sensor, and an image recognition method based on a convolutional neural network (CNN) is applied to recognize the items in the hands of customers. To reduce the image annotation workload, we propose a semisupervised training method for the recognition network. Based on hand action detection and item recognition, a shopping event recognition method is designed for the system, and a facial image of the customer corresponding to each shopping behavior is captured. Finally, each detected shopping event is added to the invoice of the corresponding customer via a facial recognition method. To verify the effectiveness of the proposed IUS system, we have built a handheld item image dataset and a shopping event dataset for an unmanned convenience store. The experimental results show that the proposed system can accurately recognize shopping behaviors and generate invoices. © 2020 IEEE.

Keywords: Sales


Dual Branch PnP Based Network for Monocular 6D Pose Estimation


Intelligent Automation & Soft Computing, 2023, vol. 36, no. 6, pp. 3243-3256

Authors: Jia-Yu Liang; Hong-Bo Zhang; Qing Lei; Ji-Xiang Du; Tian-Liang Lin. Department of Computer Science and Technology, Huaqiao University, Xiamen 361000, China; Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen 361000, China; Fujian Key Laboratory of Big Data Intelligence and Security, Huaqiao University, Xiamen 361000, China; College of Mechanical Engineering and Automation, Xiamen 361000, China

Monocular 6D pose estimation is a functional task in the field of computer vision and *** recent years, 2D-3D correspondence-based methods have achieved improved performance in multiview and depth data-based ***, for monocular 6D pose estimation, these methods are affected by the prediction results of the 2D-3D correspondences and the robustness of the perspective-n-point (PnP) *** is still a difference in the distance from the expected estimation *** obtain a more effective feature representation result, edge enhancement is proposed to increase the shape information of the object by analyzing the influence of inaccurate 2D-3D matching on 6D pose regression and comparing the effectiveness of the intermediate ***, although the transformation matrix is composed of rotation and translation matrices from 3D model points to 2D pixel points, the two variables are essentially different and the same network cannot be used for both variables in the regression ***, to improve the effectiveness of the PnP algorithm, this paper designs a dual-branch PnP network to predict rotation and translation ***, the proposed method is verified on the public LM, LM-O and YCB-Video *** ADD(S) values of the proposed method are 94.2 and 62.84 on the LM and LM-O datasets, *** AUC of ADD(-S) value on YCB-Video is *** experimental results show that the performance of the proposed method is superior to that of similar methods.

Keywords: 6D pose; monocular RGB; edge enhancement; dual-branch PnP; 2D-3D correspondence
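The abstract above reports ADD(-S) scores without defining them. As a reading aid, the sketch below computes the commonly used ADD distance (mean distance between model points under the ground-truth and predicted poses) and its symmetric-object variant ADD-S (mean closest-point distance). The function names, the tiny point set, and the example poses are hypothetical and are not taken from the paper; the reported percentages are conventionally the share of test poses whose distance falls below a threshold tied to the object diameter, which this sketch does not compute.

```python
import math


def transform(R, t, p):
    """Apply rotation matrix R (3x3 nested lists) and translation t to point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def add_distance(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: average distance between corresponding transformed model points."""
    total = 0.0
    for p in points:
        total += math.dist(transform(R_gt, t_gt, p), transform(R_pred, t_pred, p))
    return total / len(points)


def add_s_distance(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: average distance to the closest predicted point (symmetric objects)."""
    predicted = [transform(R_pred, t_pred, p) for p in points]
    total = 0.0
    for p in points:
        g = transform(R_gt, t_gt, p)
        total += min(math.dist(g, q) for q in predicted)
    return total / len(points)


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Hypothetical model points and a prediction offset along z.
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    print(add_distance(model, identity, (0, 0, 0), identity, (0, 0, 0.1)))
    print(add_s_distance(model, identity, (0, 0, 0), identity, (0, 0, 0.1)))
```

ADD-S can never exceed ADD for the same pair of poses, because the closest predicted point is at least as near as the corresponding one.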


Design of surrogate models in civil engineering by neural networks


9th South-East Europe Design Automation, Computer Engineering, Computer Networks and Social Media Conference, SEEDA-CECNSM 2024

Authors: Drahy, Vojtech; Marik, Radek; Kalviainen, Heikki. Czech Technical University in Prague, Department of Computer Science, Prague, Czech Republic; Czech Technical University in Prague, Department of Telecommunication Engineering, Prague, Czech Republic; Lappeenranta-Lahti University of Technology LUT, Computer Vision and Pattern Recognition Laboratory, Lappeenranta, Finland

ISBN: (print) 9798350342482

We present a task from the critical infrastructure field in materials engineering. We created a surrogate model for the bridge construction object to determine the values of the material parameters. The work aims to use neural networks to conduct an initial investigation of the task and to identify the relevant aspects of applying machine learning. To reduce the computational complexity of the models, we designed specific neural networks whose architecture corresponds to the structure and characteristics of the processed data. Furthermore, we also address the interpretability and justification of the model's decision-making. The main contribution of the work is the replacement of the unknown or too complex physical and mathematical description of material objects with a neural network model. © 2024 IEEE.

Keywords: Neural network models


CodePhys: Robust Video-Based Remote Physiological Measurement Through Latent Codebook Querying


IEEE Journal of Biomedical and Health Informatics, 2025, vol. PP, pp. PP

Authors: Chu, Shuyang; Xia, Menghan; Yuan, Mengyao; Liu, Xin; Seppanen, Tapio; Zhao, Guoying; Shi, Jingang. Xi'an Jiaotong University, School of Software Engineering, Xi'an, China; Tencent AI Lab, Shenzhen, China; Lappeenranta-Lahti University of Technology LUT, Computer Vision and Pattern Recognition Laboratory, Lappeenranta 53850, Finland; University of Oulu, Center for Machine Vision and Signal Analysis, Finland

Remote photoplethysmography (rPPG) aims to measure non-contact physiological signals from facial videos, which has shown great potential in many applications. Most existing methods directly extract video-based rPPG features by designing neural networks for heart rate estimation. Although they can achieve acceptable results, the recovery of rPPG signal faces intractable challenges when interference from real-world scenarios takes place on facial video. Specifically, facial videos are inevitably affected by non-physiological factors (e.g., camera device noise, defocus, and motion blur), leading to the distortion of extracted rPPG signals. Recent rPPG extraction methods are easily affected by interference and degradation, resulting in noisy rPPG signals. In this paper, we propose a novel method named CodePhys, which innovatively treats rPPG measurement as a code query task in a noise-free proxy space (i.e., codebook) constructed by ground-truth PPG signals. We consider noisy rPPG features as queries and generate high-fidelity rPPG features by matching them with noise-free PPG features from the codebook. Our approach also incorporates a spatial-aware encoder network with a spatial attention mechanism to highlight physiologically active areas and uses a distillation loss to reduce the influence of non-periodic visual interference. Experimental results on four benchmark datasets demonstrate that CodePhys outperforms state-of-the-art methods in both intra-dataset and cross-dataset settings. © 2025 IEEE.

Keywords: Heart


Ensemble and Personalized Transformer Models for Subject Identification and Relapse Detection in E-Prevention Challenge


International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Salvatore Calcagno; Raffaele Mineo; Daniela Giordano; Concetto Spampinato. Department of Electrical, Electronics and Computer Engineering, Pattern Recognition and Computer Vision Laboratory (PeRCeiVe Lab), University of Catania, Italy

In this short paper, we present the devised solutions for the subject identification and relapse detection tasks, which are part of the e-Prevention Challenge hosted at the ICASSP 2023 conference [1] [2] [3]. We specifically design an ensemble scheme of six models - five transformer-based ones and a CNN model - for the identification of subjects from wearable devices, while a personalized - one for each subject - scheme is used for relapse detection in psychotic disorder. Our final submitted solutions yield top performance on both tracks of the challenge: we ranked 2nd on the subject identification task (with an accuracy of 93.85%) and 1st on the relapse detection task (with a ROC-AUC and PR-AUC of about 0.65). Code and details are available at https://***/perceivelab/e-prevention-icassp-2023.

Keywords: Performance evaluation; Codes; Wearable computers; Signal processing; Transformers; Acoustics; Object recognition


CMQA: A Dataset of Conditional Question Answering with Multiple-Span Answers


29th International Conference on Computational Linguistics, COLING 2022

Authors: Ju, Yiming; Wang, Weikang; Zhang, Yuanzhe; Zheng, Suncong; Liu, Kang; Zhao, Jun. National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; Machine Learning Platform Department, Tencent, China

Forcing the answer of the Question Answering (QA) task to be a single text span might be restrictive since the answer can be multiple spans in the context. Moreover, we found that multi-span answers often appear with two characteristics when building the QA system for a real-world application. First, multi-span answers might be caused by users lacking domain knowledge and asking ambiguous questions, which makes the question need to be answered with conditions. Second, there might be hierarchical relations among multiple answer spans. Some recent span-extraction QA datasets include multi-span samples, but they only contain unconditional and parallel answers, which cannot be used to tackle this problem. To bridge the gap, we propose a new task: conditional question answering with hierarchical multi-span answers, where both the hierarchical relations and the conditions need to be extracted. Correspondingly, we introduce CMQA, a Conditional Multiple-span Chinese Question Answering dataset to study the new proposed task. The final release of CMQA consists of 7,861 QA pairs and 113,089 labels, where all samples contain multi-span answers, 50.4% of samples are conditional, and 56.6% of samples are hierarchical. CMQA can serve as a benchmark to study the new proposed task and help study building QA systems for real-world applications. The low performance of models drawn from related literature shows that the new proposed task is challenging for the community to solve. CMQA can be accessed at https://***/juyiming/CMQA. © 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved.

Keywords: Domain Knowledge
