In recent years, the country has proposed the strategic development goal of 'Made in China 2025', and the intelligent manufacturing industry has gradually received national attention. Intelligent robots rely o...
Robotics is a discipline that has experienced rapid development in recent decades. One of the main focuses in robotics development is the advancement of control systems. The primary objective is to create control syst...
6DoF (six degrees of freedom, also written 6D) pose estimation has practical applications in robot vision, 3D scene understanding, and other fields. 6D pose estimation based on RGB-D images has an important role in robotic...
ISBN: 9781665482509 (digital), 9781665482509 (print)
The open-ended surveillance task for a robot in an unspecified environment, using only an RGB camera, has not been addressed at length in the literature. This is unlike the popular path-planning scenario, where both the target and the environment are often known. We focus on a robot that needs to estimate a realistic depiction of the surrounding 3D environment, including the location of obstacles and the free space available for navigation within the field of view. In this paper, we propose an unsupervised algorithm that iteratively computes an optimal direction for maximal unhindered movement in the scene. This task is challenging when only a single RGB view of the scene is available, without any online depth sensor. Our process combines cues from two deep-learning processes, semantic segmentation and depth map estimation, to automatically decide plausible robot movement paths while avoiding obstruction by objects in the scene. We assume a low-end RGB USB camera, a pre-set camera view direction (angle) and field of view, incremental movement of the robot in the view field, and iterative analysis of the scene, all catering to open-ended (target-free) surveillance and patrolling applications. Inverse perspective geometry is used to map the optimal direction estimated in the view field to the corresponding direction on the floor of the scene for navigation. Evaluation on a dataset of videos captured in indoor environments (offices, labs, meeting rooms and classrooms, corridors, lounges) shows the success of the proposed approach.
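The idea of combining a segmentation mask with a depth map to pick a movement direction can be illustrated with a minimal sketch. This is not the authors' algorithm: the scoring rule (the farthest depth reached along an unbroken floor run from the bottom of each image column), the pinhole conversion from column to heading angle, and the names `free_run_depth` and `best_heading` are all assumptions made for illustration.

```python
import math


def free_run_depth(floor_mask, depth, col):
    """Largest depth reached walking up column `col` from the bottom row
    while every pixel is labeled floor (hypothetical scoring rule)."""
    reach = 0.0
    for row in range(len(floor_mask) - 1, -1, -1):
        if not floor_mask[row][col]:
            break  # first non-floor pixel ends the unhindered run
        reach = max(reach, depth[row][col])
    return reach


def best_heading(floor_mask, depth, hfov_deg):
    """Return (column, heading in degrees) with the deepest free run.
    Positive headings point to the right of the optical axis."""
    width = len(floor_mask[0])
    scores = [free_run_depth(floor_mask, depth, c) for c in range(width)]
    best_col = max(range(width), key=scores.__getitem__)
    # Pinhole model: focal length in pixels from the horizontal field of view.
    focal = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    center = (width - 1) / 2
    heading = math.degrees(math.atan((best_col - center) / focal))
    return best_col, heading


# Toy 3x5 scene: rows run top to bottom, 1 marks floor pixels.
floor_mask = [
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [1, 1, 0, 1, 1],
]
depth = [
    [9, 9, 9, 4, 9],
    [9, 2, 9, 3, 9],
    [1, 1, 1, 1, 1],
]
col, heading = best_heading(floor_mask, depth, hfov_deg=90)
```

In the toy scene, the fourth column is the only one with floor pixels all the way up, so it wins. A real system would take the mask and depth map from the two networks and then project the chosen direction onto the floor plane.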
A multipurpose robot is a system for the COVID-19 ward that supports serving food and medication, performing temperature checks, etc., for COVID-positive patients without making more efforts by the medical or fro...
ISBN: 9781665405409 (print)
Recent research has shown that deep-learning-based methods offer more accurate detection for image steganalysis than the traditional detection paradigm based on rich media models. Existing deep-learning network architectures, however, stack more and more convolutional layers to enlarge the local receptive fields used for image steganalysis. Limited by hardware, a detector with only several convolutional layers may not effectively extract features of steganographic images from a global perspective. In this paper, we propose a Convolutional Vision Transformer for image steganalysis, which can capture both local and global dependencies among noise features. In the image processing phase, our network retains a CNN framework for its capacity to produce image noise residuals. Unlike previous methods, we use the attention mechanism of the Vision Transformer for feature extraction and classification. The proposed network is validated on two public image datasets (BOSSbase 1.01 and ALASKA #2). Experimental results demonstrate that our network performs well on both fixed-size and arbitrary-size datasets.
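To show how an attention mechanism relates every feature to every other feature, regardless of spatial distance, here is a minimal sketch of single-head scaled dot-product self-attention. It is a generic illustration, not the proposed network: it omits the learned query, key, and value projections (each token serves as all three), and the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections
    (an assumption made to keep the sketch short)."""
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # Each output is a weighted average over all tokens (global context).
        outputs.append(
            [sum(w * value[j] for w, value in zip(weights, tokens))
             for j in range(dim)]
        )
    return outputs


# Two orthogonal feature vectors standing in for noise-residual patches.
mixed = self_attention([[1.0, 0.0], [0.0, 1.0]])
```

Every output token mixes information from all input tokens in one step, whereas a stack of convolutions needs many layers to connect distant regions.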
ISBN: 9781665496209 (digital), 9781665496209 (print)
Continual learning aims to continuously learn new tasks from new data while retaining the knowledge of tasks learned in the past. Recently, the Vision Transformer, which applies the Transformer originally proposed for natural language processing to computer vision, has shown higher accuracy than Convolutional Neural Networks (CNNs) in image recognition tasks. However, few methods have achieved continual learning with the Vision Transformer. In this paper, we compare and improve continual learning methods that can be applied to both CNNs and Vision Transformers. In our experiments, we compare several continual learning methods and their combinations to show the differences in accuracy and the number of parameters.
Recent research highlights the potential of multimodal foundation models in tackling complex decision-making challenges. However, their large parameters make real-world deployment resource-intensive and often impracti...
This study aims to improve the overall performance of welding operations through image processing. It will use a combination of computer vision and machine learning to better detect and track we...
Existing interior design platforms can draw three-dimensional interior design images according to the shape of interior objects. On the basis of existing research, according to the location of indoor objects, gener...