To accurately segment building features in high-resolution remote sensing images, a semantic segmentation method based on multi-task learning with a U-net network is proposed. First, a boundary distance map was generated based on the remote sensing image of the ground truth map of the *** remote sensing image and its truth map were used as the input to the U-net network, followed by the addition of the building ground prediction layer at the end of the U-net ***. Based on the ResNet network, a multi-task network with the boundary distance prediction layer was ***. Experiments involving the ISPRS aerial remote sensing image building and feature annotation data set show that, compared with the fully convolutional network combined with the multi-layer perceptron method, the intersection over union (IoU) of the VGG16 network, VGG16 + boundary prediction, ResNet50, and the method in this paper increased by 5.15%, 6.946%, 6.41%, and 7.86%, respectively. The accuracy of the networks increased to 94.71%, 95.39%, 95.30%, and 96.10%, respectively, which resulted in high-precision extraction of building features.
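The abstract does not say how the boundary distance map is computed, so the following is only a minimal sketch of one common way to derive such a map from a binary ground truth mask. The function name `boundary_distance_map`, the city-block metric, the 4-neighbour boundary definition, and the `max_dist` truncation are all assumptions made for illustration, not details taken from the paper.

```python
from collections import deque


def boundary_distance_map(mask, max_dist=5):
    """Truncated city-block distance to the nearest building boundary pixel.

    mask: list of rows holding 1 (building) or 0 (background).
    A boundary pixel is assumed to be a building pixel with at least one
    background 4-neighbour inside the image (an illustrative definition).
    """
    height, width = len(mask), len(mask[0])
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    dist = [[None] * width for _ in range(height)]
    queue = deque()

    # Seed the search with every boundary pixel at distance 0.
    for r in range(height):
        for c in range(width):
            if mask[r][c] != 1:
                continue
            for dr, dc in neighbours:
                nr, nc = r + dr, c + dc
                if 0 <= nr < height and 0 <= nc < width and mask[nr][nc] == 0:
                    dist[r][c] = 0
                    queue.append((r, c))
                    break

    # Multi-source breadth-first search spreads distances outward and inward.
    while queue:
        r, c = queue.popleft()
        for dr, dc in neighbours:
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))

    # Clip distances; pixels never reached (no boundary in the mask) get max_dist.
    return [[max_dist if d is None else min(d, max_dist) for d in row]
            for row in dist]


# Hypothetical 5x5 ground truth mask with a 3x3 building in the centre.
example_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
example_map = boundary_distance_map(example_mask)
```

In a multi-task setup of the kind the abstract describes, a map like this would presumably serve as the regression target for the boundary distance prediction layer, alongside the building mask. If SciPy is available, `scipy.ndimage.distance_transform_edt` offers a Euclidean alternative to the breadth-first search used here.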
Rehabilitation robots play an important role in the motor function rehabilitation for stroke survivors with hemiplegia. However, the rehabilitation effect of current robots is still limited partly because a single tra...
Virtual human soccer is a concentrated embodiment of virtual human technology and kinematics. To make the control of passing and receiving the ball natural and realistic, the passing and receiving control of the virtual human and the trajectory of the soccer ball are studied, and a message mechanism combined with an inverse dynamics control strategy is proposed. The interaction of a virtual human passing and receiving the ball is divided into execution behavior and ball contact effect. The execution behavior is controlled by the message mechanism, which sends and receives the messages that are transmitted to the virtual human. The ball contact effect is handled by inverse dynamics, which adjusts the contact part when passing and receiving the ball. Finally, the simulation of football passing and receiving is realized, and the resulting animation is smooth and natural. The experimental results verify the feasibility of this method. Guangxi Key Laboratory of Machine Vision and Intelligent Control.
With the breakthrough of large models, the Segment Anything Model (SAM) and its extensions have been applied to diverse computer vision tasks. Underwater salient instance segmentation is a foundational and vital step for various underwater vision tasks, but it often suffers from low segmentation accuracy due to complex underwater conditions and the limited adaptive ability of models. Moreover, the lack of large-scale datasets with pixel-level salient instance annotations has impeded the development of machine learning techniques in this field. To address these issues, we construct the first large-scale underwater salient instance segmentation dataset (USIS10K), which contains 10,632 underwater images with pixel-level annotations in 7 categories from various underwater scenes. Then, we propose an Underwater Salient Instance Segmentation architecture based on the Segment Anything Model (USIS-SAM) specifically for the underwater domain. We devise an Underwater Adaptive Visual Transformer (UA-ViT) encoder to incorporate underwater domain visual prompts into the segmentation network. We further design an out-of-the-box underwater Salient Feature Prompter Generator (SFPG) to automatically generate salient prompters instead of explicitly providing foreground points or boxes as prompts in SAM. Comprehensive experimental results show that our USIS-SAM method achieves superior performance on the USIS10K dataset compared to state-of-the-art methods. Datasets and code are released on GitHub. Copyright 2024 by the author(s).
Reinforcement learning is used in many applications in artificial intelligence, such as computer vision and environment-contextualized decision-making scenarios; virtual human swarms provide research directions for multi-agent collaboration and environmental field detection in multi-agent swarms. Football is a group sport characterized by its holistic nature, confrontation, versatility, and ease of implementation, and it exhibits both individual intelligence and group intelligence, which makes it a typical application scenario for multi-agent collaboration. This paper takes football as the research object to study the team collaboration and team gaming problems of the virtual human swarm in a specific environment, and it addresses the cooperative and competitive relationships between agents through multi-agent reinforcement learning.
We propose to leverage the local information in image sequences to support global camera relocalization. In contrast to previous methods that regress global poses from single images, we exploit the spatial-temporal co...
Depth estimation is an essential task for understanding the geometry of 3D scenes. Compared with multi-view-based methods, monocular depth estimation is more challenging for the requirement of integrating not only glo...
In this paper, a virtual human dribbling strategy is proposed to realize the programmed animation of virtual human dribbling. First, the interactive elements of dribbling are extracted, and their shapes, sizes, and positions are planned. Then the balls are attributed and divided. In addition, control of the ball trajectory is assigned to a specific virtual human to solve the problem of controlling the ball's movement curve in the virtual environment. Finally, according to the division of dribbling behavior, the states of the feet and the ball are analyzed and calculated under different behaviors, and the action transitions of the virtual human are carried out through two basic actions to overcome the limited number of actions; combined with the ball curve, this realizes the virtual human dribbling behavior strategy. The actions are verified in the "Cuju" virtual environment, which confirms the feasibility of individual dribbling by the virtual human.
Remote sensing image analysis is a basic and practical research hotspot in remote sensing ***. Remote sensing images contain abundant ground object information and can be used in urban planning, agricultural monitoring, ecological services, geological exploration, and other ***. In this paper, we propose a lightweight model combining VGG-16 and U-net ***. By combining two convolutional neural networks, we classify scenes of remote sensing *** ensuring the accuracy of the model, we try to reduce the memory of ***. According to the experimental results of this paper, we have improved the accuracy of the model to 98%. The memory size of the model is 3.4 ***. At the same time, the classification and convergence speed of the model are greatly ***. We simultaneously feed remote sensing scene images of 64×64 as input into the designed *** the accuracy of the model is 97%, it is shown that the model designed in this paper is also suitable for remote sensing images with few target feature points and low ***, the model has good application prospects in the classification of remote sensing images with few target feature points and low pixel counts.
ISBN (digital): 9781728148038
ISBN (print): 9781728148045
We propose to leverage the local information in an image sequence to support global camera relocalization. In contrast to previous methods that regress global poses from single images, we exploit the spatial-temporal consistency in sequential images to alleviate uncertainty due to visual ambiguities by incorporating a visual odometry (VO) component. Specifically, we introduce two effective steps called content-augmented pose estimation and motion-based refinement. The content-augmentation step focuses on alleviating the uncertainty of pose estimation by augmenting the observation based on the co-visibility in local maps built by the VO stream. In addition, the motion-based refinement is formulated as a pose graph, where the camera poses are further optimized by adopting the relative poses provided by the VO component as additional motion constraints. Thus, global consistency can be guaranteed. Experiments on the public indoor 7-Scenes and outdoor Oxford RobotCar benchmark datasets demonstrate that, by benefiting from the local information inherent in the sequence, our approach outperforms state-of-the-art methods, especially in some challenging cases, e.g., insufficient texture, highly repetitive textures, similar appearances, and over-exposure.
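The motion-based refinement idea can be illustrated with a deliberately simplified toy, assuming 1D scalar poses instead of full 6-DoF camera poses. The sketch below treats per-frame global estimates as noisy absolute measurements and VO outputs as relative constraints between consecutive frames, then minimizes the weighted squared residuals with Gauss-Seidel sweeps. The function name `refine_poses_1d`, the weights `w_abs` and `w_rel`, and the solver choice are assumptions for illustration and are not taken from the paper.

```python
def refine_poses_1d(absolute, relative, w_abs=1.0, w_rel=10.0, iters=200):
    """Refine 1D poses on a chain-shaped pose graph (toy illustration).

    absolute: per-frame global pose estimates (e.g. from a pose regressor).
    relative: relative[i] is the VO estimate of pose[i + 1] - pose[i].
    Minimizes  sum_i w_abs * (x_i - absolute[i])**2
             + sum_i w_rel * (x_{i+1} - x_i - relative[i])**2
    by repeatedly solving for one pose while holding its neighbours fixed.
    """
    poses = list(absolute)
    n = len(poses)
    for _ in range(iters):
        for i in range(n):
            numerator = w_abs * absolute[i]
            denominator = w_abs
            if i > 0:
                # Constraint from the previous frame: x_i = x_{i-1} + relative[i-1]
                numerator += w_rel * (poses[i - 1] + relative[i - 1])
                denominator += w_rel
            if i < n - 1:
                # Constraint from the next frame: x_i = x_{i+1} - relative[i]
                numerator += w_rel * (poses[i + 1] - relative[i])
                denominator += w_rel
            poses[i] = numerator / denominator
    return poses


# Hypothetical data: the third global estimate (5.0) is an outlier,
# while the VO constraints say consecutive frames are 1.0 apart.
noisy_absolute = [0.0, 1.0, 5.0, 3.0]
vo_relative = [1.0, 1.0, 1.0]
refined = refine_poses_1d(noisy_absolute, vo_relative)
```

With the relative constraints weighted more heavily than the absolute ones, the outlying estimate is pulled back toward a trajectory that agrees with the odometry, which is the intuition behind using VO as additional motion constraints.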