Search Results - Inner Mongolia University Library

Multi-task learning and joint refinement between camera localization and object detection

Computational Visual Media, 2024, Vol. 10, No. 5, pp. 993-1011

Authors: Junyi Wang; Yue Qi. Affiliations: State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing 100191, China; Peng Cheng Laboratory, Shenzhen 518052, China; Qingdao Research Institute of Beihang University, Qingdao 266104, China; School of Computer Science and Technology, Shandong University, Qingdao, China

Visual localization and object detection both play important roles in various *** many indoor application scenarios where some detected objects have fixed positions, the two techniques work closely ***, few researchers consider these two tasks simultaneously, because of a lack of datasets and the little attention paid to such *** this paper, we explore multi-task network design and joint refinement of detection and *** address the dataset problem, we construct a medium indoor scene of an aviation exhibition hall through a semi-automatic *** dataset provides localization and detection information, and is publicly available at https://***/drive/folders/1U28zk0N4_I0db zkqyIAK1A15k9oUKOjI?usp=sharing for benchmarking localization and object detection *** this dataset, we have designed a multi-task network, JLDNet, based on YOLO v3, that outputs a target point cloud and object bounding *** dynamic environments, the detection branch also promotes the perception of *** includes image feature learning, point feature learning, feature fusion, detection construction, and point cloud ***, object-level bundle adjustment is used to further improve localization and detection *** test JLDNet and compare it to other methods, we have conducted experiments on 7 static scenes, our constructed dataset, and the dynamic TUM RGB-D and Bonn *** results show state-of-the-art accuracy for both tasks, and the benefit of jointly working on both tasks is demonstrated.

Keywords: visual localization; object detection; joint optimization; multi-task learning


A CNN-based Multi-Task Regressor Network for Three-dimensional Reconstruction From 3D Point Cloud

1st International Conference on Global Aeronautical Engineering and Satellite Technology, GAST 2024

Authors: Hassnae, Remmach Driss, Essabbar Raja, Mouachi Yasser, Chadli Saad Mohamed, Sadgal. Affiliations: Lamigep Laboratory, Emsi Marrakesh, Marrakesh, Morocco; Cadi Ayyad University, Computer Systems Engineering Laboratory, Department of Computer Science, Marrakesh, Morocco

ISBN (print): 9798350371598

The field of three-dimensional reconstruction plays a pivotal role across diverse domains such as computer graphics, virtual reality, robotics, archaeology, and medical imaging. This paper presents a novel deep learning framework for three-dimensional object reconstruction utilizing 3D point clouds, specifically focusing on a Multi-Task Regressor within a Convolutional Neural Network (CNN) architecture. Acknowledging the challenges associated with applying convolutions directly to point clouds, we adopt an innovative approach by building upon the PointNet architecture. Our adaptation integrates a Multi-Task Regressor to achieve highly accurate reconstructions of Supershapes from 3D point clouds. Our methodology involves employing PointNet to extract features from the input 3D point cloud. Subsequently, these features are inputted into a Multi-Task Regressor, which simultaneously predicts multiple outputs. In the context of Supershape reconstruction, the Multi-Task Regressor predicts the essential parameters needed to faithfully recreate the intricate shape. Unlike traditional single-output regressors, our approach considers and predicts multiple facets of the reconstruction process concurrently, enhancing the overall efficiency and accuracy of the model. The results of our experiments surpassed expectations, demonstrating notable improvements in precision and predictive cost. This research not only contributes to the advancement of three-dimensional reconstruction techniques but also highlights the efficacy of employing a Multi-Task Regressor for handling diverse regression objectives within the context of complex shape reconstruction from 3D point clouds. © 2024 IEEE.

Keywords: Convolution
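
The abstract above says the regressor predicts the parameters needed to recreate a Supershape but does not list them. As a minimal sketch, assuming the standard 2D Gielis superformula with parameters `m`, `n1`, `n2`, `n3`, `a`, `b` (the function names below are hypothetical and are not taken from the paper), a predicted parameter set could be turned back into an outline like this:

```python
import math

def superformula_radius(phi, m, n1, n2, n3, a=1.0, b=1.0):
    """Radius r(phi) of a 2D supershape under the Gielis superformula.

    Assumes n1 != 0 and non-negative n2, n3 (illustrative restriction).
    """
    t1 = abs(math.cos(m * phi / 4.0) / a) ** n2
    t2 = abs(math.sin(m * phi / 4.0) / b) ** n3
    return (t1 + t2) ** (-1.0 / n1)

def supershape_outline(params, samples=8):
    """Sample (x, y) points on the outline for params = (m, n1, n2, n3)."""
    m, n1, n2, n3 = params
    points = []
    for k in range(samples):
        phi = 2.0 * math.pi * k / samples
        r = superformula_radius(phi, m, n1, n2, n3)
        points.append((r * math.cos(phi), r * math.sin(phi)))
    return points

# With m = 4 and n1 = n2 = n3 = 2 the formula reduces to the unit circle.
circle = supershape_outline((4, 2, 2, 2))
```

Because a handful of numbers fully determines the shape, a multi-output regressor over these parameters is one plausible reading of "predicts multiple outputs"; the actual parametrization used by the authors may differ.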


UKF-MOT: An unscented Kalman filter-based 3D multi-object tracker

CAAI Transactions on Intelligence Technology, 2024, Vol. 9, No. 4, pp. 1031-1041

Authors: Meng Liu; Jianwei Niu; Yu Liu. Affiliations: Collective Intelligence & Collaboration Laboratory, China North Artificial Intelligence and Innovation Research Institute, Beijing, China; China North Vehicle Research Institute, Beijing, China; State Key Laboratory of Software Development Environment, School of Computer Science and Engineering, Beihang University, Beijing, China; State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing, China; School of Computer Science and Engineering, Beihang University, Beijing, China

Multi-object tracking in autonomous driving is a non-linear *** better address the tracking problem, this paper leveraged an unscented Kalman filter to predict the object's *** the association stage, the Mahalanobis distance was employed as an affinity metric, and a Non-minimum Suppression method was designed for *** the detections fed into the tracker and continuous 'predicting-matching' steps, the states of each object at different time steps were described as their own continuous *** conducted extensive experiments to evaluate tracking accuracy on three challenging datasets (KITTI, nuScenes and Waymo). The experimental results demonstrated that our method effectively achieved multi-object tracking with satisfactory accuracy and real-time efficiency.

Keywords: autonomous vehicle transportation
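
The abstract above names the Mahalanobis distance as the affinity metric in the association stage. The following is a minimal sketch of that metric only, assuming a 2D position and a 2x2 innovation covariance for brevity; the function names and the gating threshold are hypothetical, and the paper's Non-minimum Suppression matching step is not reproduced here.

```python
import math

def mahalanobis_2d(z, x, cov):
    """Mahalanobis distance between detection z and predicted position x.

    cov is the 2x2 covariance [[a, b], [c, d]] of the prediction error.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance must be invertible")
    dx, dy = z[0] - x[0], z[1] - x[1]
    # v^T S^-1 v, using the closed-form inverse of a 2x2 matrix
    q = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(q)

def best_match(prediction, cov, detections, gate=3.0):
    """Index of the closest detection within the gate, or None (hypothetical helper)."""
    best_idx, best_dist = None, gate
    for i, z in enumerate(detections):
        dist = mahalanobis_2d(z, prediction, cov)
        if dist <= best_dist:
            best_idx, best_dist = i, dist
    return best_idx

identity = [[1.0, 0.0], [0.0, 1.0]]
wide = [[4.0, 0.0], [0.0, 4.0]]
```

Unlike a plain Euclidean distance, the metric shrinks along directions where the predicted state is uncertain, so the same offset of (3, 4) scores 5.0 under the identity covariance but 2.5 under the wider one.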


ViGT: proposal-free video grounding with a learnable token in the transformer

Science China (Information Sciences), 2023, Vol. 66, No. 10, pp. 196-212

Authors: Kun LI; Dan GUO; Meng WANG. Affiliations: School of Computer Science and Information Engineering, Hefei University of Technology; Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education; Intelligent Interconnected Systems Laboratory of Anhui Province; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center

The video grounding (VG) task aims to locate the queried action or event in an untrimmed video based on rich linguistic descriptions. Existing proposal-free methods are trapped in the complex interaction between video and query, overemphasizing cross-modal feature fusion and feature correlation for VG. In this paper, we propose a novel boundary regression paradigm that performs regression token learning in a transformer. Particularly, we present a simple but effective proposal-free framework, namely video grounding transformer (ViGT), which predicts the temporal boundary using a learnable regression token rather than multi-modal or cross-modal features. In ViGT, the benefits of a learnable token are manifested as follows. (1) The token is unrelated to the video or the query and avoids data bias toward the original video and query. (2) The token simultaneously performs global context aggregation from video and query ***, we employed a sharing feature encoder to project both video and query into a joint feature space before performing cross-modal co-attention (i.e., video-to-query attention and query-to-video attention) to highlight discriminative features in each modality. Furthermore, we concatenated a learnable regression token [REG] with the video and query features as the input of a vision-language transformer. Finally, we utilized the token [REG] to predict the target moment and visual features to constrain the foreground and background probabilities at each timestamp. The proposed ViGT performed well on three public datasets: ANet-Captions, TACoS, and YouCookⅡ. Extensive ablation studies and qualitative analysis further validated the interpretability of ViGT.

Keywords: video grounding; temporal sentence grounding; boundary regression; token learning; proposal-free


Design of a Remote Controlled Puppet Robot Imitating a Manual Driven Puppet Using Deep Learning Pose Detection

12th RSI International Conference on Robotics and Mechatronics, ICRoM 2024

Authors: Hamzeh, Hamed; Shahroozi, Kiarash; Nabipour, Ahmad; Kazemi, Parham; Malek-Zahedi, Mohammad; Hallajian, Mehdi; Moradia, Hadi. Affiliations: School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Advanced Robotics and Intelligent Systems Laboratory, Iran

ISBN (print): 9798331529734

Traditional puppet manipulation systems often require human operators to be physically present in tight and cramped locations. This leads to challenges in the positioning and effective operation of puppets, particularly in the presence of children in educational settings. In this paper, we present the design, development, and integration of a cost-effective robotic system for remotely manipulating puppets in real-time. Additionally, the system integrates a YOLOv8 Pose Detection model, trained on a custom dataset with an accuracy of 91.4%, enabling real-time gesture recognition and allowing the robot to be controlled by imitating the motions of a manually operated puppet. The system demonstrates the potential for many applications, including interactive entertainment, teleoperation, and education. © 2024 IEEE.

Keywords: Gesture recognition


A survey on RGB images classification using convolutional neural network (CNN) architectures: Applications and challenges

2024 International Conference on Circuit, Systems and Communication, ICCSC 2024

Authors: Khayya, El Kharrachi; El Oirrak, Ahmed; Datsi, Toufik. Affiliation: Cadi Ayyad University, Faculty of Sciences Semlalia, Computer Systems Engineering Laboratory, Marrakech, Morocco

ISBN (print): 9798350365306

The process of classifying RGB images is a basic and noxious activity in computer vision with an array of applications in face recognition, traffic analysis, and security protocols. An important part of this mission encompasses Convolutional Neural Networks, deemed to be core technologies, as they are capable in many domains. This short survey paper aims at conducting an in-depth analysis of CNN-based RGB image classification approaches in a way that considers complex architectures, multiple uses, and the difficulties that are entailed in them. The motivation behind this survey is to talk about the ever-increasing importance of RGB image classification across different applications. It hopes to explore and attend to the challenges emerging in association with this technology while also discussing the latest architectures and techniques advancing the field. In this paper, we go through different methods used while building the CNN architectures, thereby explaining their way of handling RGB image classification tasks. Moreover, it gives a detailed discussion about various other areas where CNNs seem to be efficient, such as agriculture and veterinary. This short survey will also dwell on the main challenges faced when dealing with RGB image classification, such as dataset variability, model robustness, and computational complexity issues. We construct the comprehensive understanding of CNN-based RGB image classification by synthesizing ideas from well-known datasets, such as CIFAR-10 and others, that will provide the foundation for future research in this area © 2024 IEEE.

Keywords: Convolutional neural networks


Edge-aware Feature Aggregation Network for Polyp Segmentation

Machine Intelligence Research, 2025, Vol. 22, No. 1, pp. 101-116

Authors: Tao Zhou; Yizhe Zhang; Geng Chen; Yi Zhou; Ye Wu; Deng-Ping Fan. Affiliations: PCA Lab, Key Laboratory of Intelligent Perception and Systems for High-dimensional Information of Ministry of Education, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; School of Computer Science and Engineering, Northwestern Polytechnical University (NPU), Xi'an 710129, China; School of Computer Science and Engineering, Southeast University, Nanjing 211189, China; Computer Vision Lab, ETH Zürich, Zürich 8092, Switzerland

Precise polyp segmentation is vital for the early diagnosis and prevention of colorectal cancer (CRC) in clinical ***, due to scale variation and blurry polyp boundaries, it is still a challenging task to achieve satisfactory segmentation performance with different scales and *** this study, we present a novel edge-aware feature aggregation network (EFA-Net) for polyp segmentation, which can fully make use of cross-level and multi-scale features to enhance the performance of polyp ***, we first present an edge-aware guidance module (EGM) to combine the low-level features with the high-level features to learn an edge-enhanced feature, which is incorporated into each decoder unit using a layer-by-layer ***, a scale-aware convolution module (SCM) is proposed to learn scale-aware features by using dilated convolutions with different ratios, in order to effectively deal with scale ***, a cross-level fusion module (CFM) is proposed to effectively integrate the cross-level features, which can exploit the local and global contextual ***, the outputs of CFMs are adaptively weighted by using the learned edge-aware feature, which are then used to produce multiple side-out segmentation *** results on five widely adopted colonoscopy datasets show that our EFA-Net outperforms state-of-the-art polyp segmentation methods in terms of generalization and *** implementation code and segmentation maps will be publicly available at https://***/taozh2017/EFANet.

Keywords: Colorectal cancer; polyp segmentation; edge-aware guidance module; scale-aware convolution module; cross-level fusion module


Speed Harmonic based Saturation Free Inductance Modeling and Estimation of Interior PMSM Using Measurements Under One Load Condition

CES Transactions on Electrical Machines and Systems, 2025, Vol. 9, No. 1, pp. 91-99

Authors: Guodong Feng; Yuting Lu; Zhe Tong; Beichen Ding; Guishan Yan; Chunyan Lai. Affiliations: School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China; School of Advanced Manufacturing, Sun Yat-sen University; Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), China; Department of Electrical and Computer Engineering, Concordia University, QC, Canada

For permanent magnet synchronous machines (PMSMs), accurate inductance is critical for control design and condition *** to magnetic saturation, existing methods require a nonlinear saturation model and measurements from multiple load/current conditions, and the estimation relies on the accuracy of the saturation model and other machine parameters in the *** harmonic produced by harmonic currents is inductance-dependent, and thus this paper explores the use of magnitude and phase angle of the speed harmonic for accurate inductance *** estimation models are built based on either the magnitude or phase angle, and the inductances can be estimated from d-axis voltage and the magnitude or phase angle, in which the filter influence in harmonic extraction is considered to ensure the estimation *** inductances can be estimated from the measurements under one load condition, which is free of saturation ***, the inductance estimation is robust to the change of other machine *** proposed approach can effectively improve estimation accuracy especially under the condition with low current *** and comparisons are conducted on a test PMSM to validate the proposed approach.

Keywords: PMSM; Inductance estimation; Speed harmonic; High frequency signal injection


Lattice Design for Multiple Topologically Protected Edge Modes

16th Pacific Rim Conference on Lasers and Electro-Optics, CLEO-PR 2024

Authors: Kim, Gyunghun; Suh, Joseph; Lee, Dayeong; Park, Namkyoo; Yu, Sunkyu. Affiliations: Seoul National University, Intelligent Wave Systems Laboratory, Department of Electrical and Computer Engineering, Seoul 08826, Republic of Korea; Seoul National University, Photonic Systems Laboratory, Department of Electrical and Computer Engineering, Seoul 08826, Republic of Korea

We propose the lattice design that allows multiple topologically protected edge modes. The scattering between these modes, which is linear, energy preserving, and robust against local disorders, is discussed in terms ...

ISBN (print): 9798350372076


Mutative Evolution for Calculating Maximally Localized Wannier Functions in Photonic Crystals

2024 Conference on Lasers and Electro-Optics/Pacific Rim, CLEO-PR 2024

Authors: Lee, Dayeong; Lee, Gitae; Park, Seungkyun; Park, Hyungchul; Park, Namkyoo; Yu, Sunkyu. Affiliations: Intelligent Wave Systems Laboratory, Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Republic of Korea; Photonic Systems Laboratory, Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Republic of Korea

We propose a neuroevolutionary algorithm to calculate maximally localized Wannier functions in photonic crystals, as an alternative method of conventional gradient-based approaches. © 2024 The Author(s)

Keywords: Photonic crystals
