Simultaneous localization and mapping (SLAM) is an active research topic. In visual SLAM for dynamic environments, however, inaccurate detection of object motion states and incomplete culling of dynamic regions lead to large localization errors. To address these issues, this paper proposes an RGB-D SLAM method based on feature association. The method exploits features that are strongly correlated in time and space across the input image sequence. The moving probability of the feature points in the previous frame is combined with the dynamic corner points screened in the current frame to calculate the movement of the feature points in the current frame. The motion state of each object is then determined from the proportion of the different types of feature points. Next, semantic information and object depth information are combined, and a fast search method is used to obtain accurate dynamic regions. Finally, the selected valid feature points are used to estimate the camera pose and build a static map of the environment. We evaluate the robustness and accuracy of the method on the TUM dataset and in a real environment; the results show that, compared with other SLAM methods for dynamic environments, it significantly improves tracking and reduces tracking error.
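As a rough illustration of the proportion-based motion decision described in this abstract, the sketch below blends a point's previous-frame moving probability with a current-frame dynamic-corner flag and then thresholds the share of dynamic points on an object. The abstract does not give the actual update rule or thresholds, so `FeaturePoint`, `update_moving_prob`, `object_is_moving`, the blending weight, and both thresholds are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeaturePoint:
    prev_moving_prob: float   # moving probability carried over from the previous frame
    is_dynamic_corner: bool   # flagged as a dynamic corner in the current frame


def update_moving_prob(point: FeaturePoint, weight: float = 0.5) -> float:
    """Blend the prior with the current observation (assumed rule, not the paper's)."""
    observation = 1.0 if point.is_dynamic_corner else 0.0
    return weight * point.prev_moving_prob + (1.0 - weight) * observation


def object_is_moving(points: List[FeaturePoint],
                     prob_threshold: float = 0.5,
                     ratio_threshold: float = 0.5) -> bool:
    """Label an object as moving when enough of its feature points look dynamic."""
    if not points:
        return False
    dynamic = sum(1 for p in points if update_moving_prob(p) > prob_threshold)
    return dynamic / len(points) > ratio_threshold
```

Feature points on objects labeled as moving would then be excluded before pose estimation; how the paper refines the dynamic region with semantic and depth information is not covered by this sketch.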
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Kolmogorov-Arnold Networks (KAN) are an emerging neural network architecture in machine learning. The research community has shown great interest in whether KAN can be a promising alternative to the commonly used Multi-Layer Perceptrons (MLP). Experiments in various fields have demonstrated that KAN-based machine learning can achieve performance comparable to, if not better than, MLP-based methods, with far fewer parameters and better explainability. In this paper, we explore incorporating KAN into the actor and critic networks for offline reinforcement learning (RL). We evaluate the performance, parameter scales, and training efficiency of various KAN-based and MLP-based conservative Q-learning (CQL) variants on the classical D4RL benchmark for offline RL. Our study demonstrates that KAN can achieve performance close to that of the commonly used MLP with significantly fewer parameters. This allows the base network to be chosen according to the requirements of the offline RL task.
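For readers unfamiliar with CQL, the sketch below shows the conservative penalty in its simplest discrete-action form: the log-sum-exp of the Q-values over all actions minus the Q-value of the action recorded in the dataset, averaged over a batch. This is an assumption made for illustration; the paper works with actor and critic networks on D4RL, and the abstract does not state which CQL variant it uses. The names `logsumexp` and `cql_penalty` are hypothetical.

```python
import math
from typing import List, Sequence


def logsumexp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v))) over a non-empty sequence."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_penalty(q_values: List[Sequence[float]], dataset_actions: List[int]) -> float:
    """Mean of logsumexp_a Q(s, a) - Q(s, a_data) over the batch.

    q_values[i] holds the critic's Q-values for every action in state i,
    and dataset_actions[i] is the index of the action logged in the dataset.
    """
    gaps = [logsumexp(q) - q[a] for q, a in zip(q_values, dataset_actions)]
    return sum(gaps) / len(gaps)
```

In training, this penalty would be scaled by a coefficient and added to the usual Bellman error, which pushes down Q-values for actions the dataset does not support. The critic that produces `q_values` could be either a KAN or an MLP.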
Video question answering (VideoQA) is a typical cross-modal understanding task. Its challenge lies in how to learn appropriate multimodal representation and cross-modal correlation for answer inference. Most existing ...
Direct Preference Optimization (DPO) has proven effective in complex reasoning tasks like math word problems and code generation. However, when applied to Text-to-SQL datasets, it often fails to improve performance an...
Video-based human pose estimation has long been a fundamental yet challenging problem in computer vision. Previous studies focus on spatio-temporal modeling through the enhancement of architecture design and optimizat...
Various traditional visual Simultaneous Localization and Mapping (SLAM) algorithms work well in static scenes, but mismatches occur in dynamic scenes, which makes the positioning and mapping of the SLAM...
Blockchain technologies offer a promising way to implement inter-organizational processes. Most current research translates the execution logic of process models into smart contracts, which can run independently on the blockchain without an external process engine. However, these approaches usually incur high execution and storage costs, since the translation must be done when the processes are deployed. In this paper, we customize a process engine to execute inter-organizational business processes via a blockchain-style procedure: checking the validity of transactions, adding the valid transactions to the blockchain through the consensus mechanism, and then updating the process states according to the committed transactions. We then build a blockchain system by embedding the customized process engine into the blockchain nodes. Moreover, to enable interactions between the inter-organizational processes running on the blockchain and services outside it, we propose a blockchain-based approach for service registration, binding, and invocation, and design a lease-based concurrency control protocol that logically isolates transactions from each other when they invoke services simultaneously. Finally, we implement a prototype system based on the permissioned blockchain platform Hyperledger Fabric and the process engine Activiti. The experimental results show that the proposed blockchain system can execute inter-organizational processes correctly and efficiently.
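The general idea behind lease-based concurrency control can be sketched as follows: a transaction must hold a time-limited lease on a service before invoking it, and other transactions are refused until the lease is released or expires. The abstract does not describe the protocol's details, so the `LeaseTable` class, its methods, and the explicit `now` parameter are hypothetical and only illustrate the concept; this sketch does not use any Hyperledger Fabric or Activiti API.

```python
from typing import Dict, Tuple


class LeaseTable:
    """Tracks which transaction currently holds the lease on each service."""

    def __init__(self) -> None:
        # service_id -> (holder transaction id, expiry time)
        self._leases: Dict[str, Tuple[str, float]] = {}

    def acquire(self, service_id: str, tx_id: str, now: float, duration: float) -> bool:
        """Grant or renew a lease unless another transaction holds an unexpired one."""
        holder = self._leases.get(service_id)
        if holder is not None and holder[0] != tx_id and holder[1] > now:
            return False
        self._leases[service_id] = (tx_id, now + duration)
        return True

    def release(self, service_id: str, tx_id: str) -> bool:
        """Release the lease, but only if tx_id is the current holder."""
        holder = self._leases.get(service_id)
        if holder is not None and holder[0] == tx_id:
            del self._leases[service_id]
            return True
        return False
```

Passing the clock value in as `now` keeps the sketch deterministic; in a blockchain setting a shared notion of time, such as a block height or block timestamp, would be one plausible choice, though that is an assumption rather than something the abstract states.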
Person search aims to localize a specific target person from a gallery set of images with various scenes. As the scene of a moving pedestrian changes, the captured person images inevitably bring in lots of background noi...
Unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) and data collection (DC) have been popular research issues. Different from existing works that consider MEC and DC scenarios separately, this paper in...
Zero-shot learning aims to recognize unseen classes using seen-class samples as the training set. It is challenging because the feature representations of unseen-class samples are unavailable. Existing methods transfer the mapping from seen classes to unseen classes using class correlation as a bridge, with semantic representations used to discriminate the classes. However, the unavailability of visual representations for unseen classes and the insufficient discriminative power of semantic representations limit these methods. Therefore, this paper learns visual representations as complements to the semantic representations, constructs a multi-modal knowledge graph (KG), and proposes a zero-shot learning method based on it. Specifically, a semantic KG is introduced to capture the correlation between classes, and with this correlation, the visual feature representations of all classes are learned. Then, the discriminative visual representations and the semantic representations are used together to construct a multi-modal KG. With the multi-modal KG, the classifier for seen classes is transferred to unseen classes. Extensive experimental results show the effectiveness of our method.