When searching for a dynamic target in an unknown real-world scene, search efficiency is greatly reduced if users lack information about the spatial structure of the scene. Most target search studies, especially in robotics, focus on determining either the shortest path when the target's position is known, or a strategy to find the target as quickly as possible when the target's position is unknown. However, the target's position is often known intermittently in the real world, e.g., in the case of using surveillance cameras. Our goal is to help the user find a dynamic target efficiently in the real world when the target's position is intermittently known. To achieve this purpose, we have designed an AR guidance assistance system that provides optimal current directional guidance to users, based on searching over predicted target positions. We assume that a certain number of depth cameras are fixed in a real scene to obtain the dynamic target's position. Our system automatically analyzes all possible meetings between the user and the target, and generates optimal directional guidance to help the user catch up with the target. A user study was used to evaluate our method, and its results showed that, compared to free search and a top-view method, our method significantly improves target search efficiency.
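The abstract does not describe the underlying computation, but the idea of analyzing possible meetings between the user and an intermittently observed target can be illustrated with a minimal sketch. The linear motion prediction, the helper names (predict_target, best_direction), and the speed values below are hypothetical assumptions, not details from the paper.

```python
import numpy as np

def predict_target(last_pos, velocity, horizon, dt=1.0):
    """Extrapolate the target's last observed position along its velocity.
    (Hypothetical linear prediction; the paper's prediction model is not given.)"""
    t = np.arange(1, horizon + 1) * dt
    return last_pos[None, :] + t[:, None] * velocity[None, :]

def best_direction(user_pos, user_speed, last_pos, velocity, horizon=30):
    """Return the heading (unit vector) that lets the user reach a predicted
    target position at the earliest feasible time step."""
    predicted = predict_target(np.asarray(last_pos, float),
                               np.asarray(velocity, float), horizon)
    best, best_t = None, np.inf
    for t, p in enumerate(predicted, start=1):
        dist = np.linalg.norm(p - user_pos)
        # The meeting at step t is feasible only if the user can cover the
        # distance within t steps at their walking speed.
        if dist <= user_speed * t and t < best_t:
            best_t, best = t, (p - user_pos) / max(dist, 1e-9)
    return best, best_t

if __name__ == "__main__":
    direction, t_meet = best_direction(user_pos=np.array([0.0, 0.0]),
                                       user_speed=1.4,        # metres per step
                                       last_pos=[10.0, 5.0],
                                       velocity=[-0.5, 0.2])
    print("suggested heading:", direction, "earliest meeting step:", t_meet)
```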
Background Three-dimensional (3D) shape representation using mesh data is essential in various applications, such as virtual reality and simulation. Existing methods for extracting features from mesh edges or faces struggle with complex 3D models, because edge-based approaches miss global contexts and face-based methods overlook variations in adjacent areas, which affects the overall accuracy. To address these issues, we propose the Feature Discrimination and Context Propagation Network (FDCPNet), a novel approach that synergistically integrates local and global features in mesh data. FDCPNet is composed of two modules: (1) the Feature Discrimination Module, which employs an attention mechanism to enhance the identification of key local features, and (2) the Context Propagation Module, which enriches key local features by integrating global contextual information, thereby facilitating a more detailed and comprehensive representation of crucial areas within the mesh model. Experiments on popular datasets validated the effectiveness of FDCPNet, showing an improvement in classification accuracy over the baseline model. Furthermore, even with reduced numbers of mesh faces and limited training data, FDCPNet achieved promising results, demonstrating its robustness in scenarios of variable complexity.
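As a rough illustration of the two-module design described above, the PyTorch sketch below applies an attention-style re-weighting to per-face features and then fuses each face with a pooled global context vector. The layer sizes, pooling choice, and input format (precomputed per-face feature vectors) are assumptions; the paper's actual architecture is not specified in the abstract.

```python
import torch
import torch.nn as nn

class FeatureDiscrimination(nn.Module):
    """Attention over per-face features to emphasize key local features."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                   nn.Linear(dim, 1))

    def forward(self, x):                       # x: (batch, faces, dim)
        attn = torch.softmax(self.score(x), dim=1)   # per-face weights
        return x * attn                         # re-weighted local features

class ContextPropagation(nn.Module):
    """Enrich local features with a pooled global context vector."""
    def __init__(self, dim):
        super().__init__()
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, x):                       # x: (batch, faces, dim)
        ctx = x.mean(dim=1, keepdim=True).expand_as(x)   # global context
        return torch.relu(self.fuse(torch.cat([x, ctx], dim=-1)))

class MeshClassifier(nn.Module):
    def __init__(self, dim=64, num_classes=30):
        super().__init__()
        self.fd = FeatureDiscrimination(dim)
        self.cp = ContextPropagation(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, face_feats):
        h = self.cp(self.fd(face_feats))
        return self.head(h.mean(dim=1))         # pool over faces, then classify

faces = torch.randn(2, 500, 64)                 # 2 meshes, 500 faces each
print(MeshClassifier()(faces).shape)            # -> torch.Size([2, 30])
```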
Motion retargeting is an active research area in computer graphics and animation, allowing for the transfer of motion from one character to another, thereby creating diverse animated character data. While this technology has numerous applications in animation, games, and movies, current methods often produce unnatural or semantically inconsistent motion when applied to characters with different shapes or joint counts. This is primarily due to a lack of consideration for the geometric and spatial relationships between the body parts of the source and target characters. To tackle this challenge, we introduce a novel spatially-preserving Skinned Motion Retargeting Network (SMRNet) capable of handling motion retargeting for characters with varying shapes and skeletal structures while maintaining semantic consistency. By learning a hybrid representation of the character's skeleton and shape in a rest pose, SMRNet transfers the rotation and root joint position of the source character's motion to the target character through embedded rest pose feature alignment. Additionally, it incorporates a differentiable loss function to further preserve the spatial consistency of body parts between the source and target. Comprehensive quantitative and qualitative evaluations demonstrate the superiority of our approach over existing alternatives, particularly in preserving spatial relationships more effectively.
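The abstract mentions a differentiable loss that preserves the spatial consistency of body parts between source and target, without giving its form. Below is a minimal sketch of one plausible such loss, penalizing changes in size-normalized pairwise joint distances; it is illustrative only and not SMRNet's actual formulation.

```python
import torch

def spatial_consistency_loss(src_joints, tgt_joints, eps=1e-8):
    """Penalize changes in the relative spatial arrangement of body parts.

    src_joints, tgt_joints: (batch, frames, joints, 3) positions of
    corresponding joints on the source and retargeted motion. Illustrative
    loss on normalized pairwise joint distances (an assumption)."""
    def norm_pairwise(j):
        B, F, J, _ = j.shape
        flat = j.reshape(B * F, J, 3)
        d = torch.cdist(flat, flat).reshape(B, F, J, J)     # pairwise distances
        scale = d.amax(dim=(-2, -1), keepdim=True)           # per-frame size
        return d / (scale + eps)

    return torch.mean((norm_pairwise(src_joints) - norm_pairwise(tgt_joints)) ** 2)

src = torch.randn(4, 60, 22, 3)                  # 4 clips, 60 frames, 22 joints
tgt = src * 1.3 + 0.01 * torch.randn_like(src)   # roughly rescaled copy
print(spatial_consistency_loss(src, tgt))        # small scalar loss
```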
It is a challenging task to create realistic 3D avatars that accurately replicate individuals' speech and unique talking styles for speech-driven facial animation. Existing techniques have made remarkable progress but still struggle to achieve lifelike mimicry. This paper proposes "TalkingStyle", a novel method to generate personalized talking avatars while retaining the talking style of the person. Our approach uses a set of audio and animation samples from an individual to create new facial animations that closely resemble their specific talking style, synchronized with speech. We disentangle the style codes from the motion patterns, allowing our method to associate a distinct identifier with each person. To manage each aspect effectively, we employ three separate encoders for style, speech, and motion, ensuring the preservation of the original style while maintaining consistent motion in our stylized talking avatars. Additionally, we propose a new style-conditioned transformer decoder, offering greater flexibility and control over the facial avatar styles. We comprehensively evaluate TalkingStyle through qualitative and quantitative assessments, as well as user studies, demonstrating its superior realism and lip synchronization accuracy compared to current state-of-the-art methods. To promote transparency and further advancements in the field, we also make the source code publicly available at https://***/wangxuanx/TalkingStyle.
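To make the style-conditioned decoding idea concrete, here is a toy PyTorch sketch in which a learned per-speaker style embedding conditions a small transformer decoder over speech features. All dimensions, the additive conditioning scheme, and module names are assumptions, not the TalkingStyle architecture.

```python
import torch
import torch.nn as nn

class StyleConditionedDecoder(nn.Module):
    """Toy style-conditioned transformer decoder: per-frame audio features are
    decoded into vertex offsets, with a learned per-speaker style embedding
    added to every query token (conditioning scheme is an assumption)."""
    def __init__(self, audio_dim=128, model_dim=256, n_styles=8, n_verts=5023):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, model_dim)    # speech features -> tokens
        self.style_emb = nn.Embedding(n_styles, model_dim)   # one style code per identity
        layer = nn.TransformerDecoderLayer(d_model=model_dim, nhead=4,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.head = nn.Linear(model_dim, n_verts * 3)        # per-frame vertex offsets

    def forward(self, audio_feats, style_id):
        # audio_feats: (batch, frames, audio_dim); style_id: (batch,)
        memory = self.audio_proj(audio_feats)
        style = self.style_emb(style_id)[:, None, :]         # (batch, 1, model_dim)
        queries = memory + style                              # condition queries on style
        out = self.decoder(tgt=queries, memory=memory)
        return self.head(out)                                 # (batch, frames, n_verts*3)

dec = StyleConditionedDecoder()
offsets = dec(torch.randn(2, 50, 128), torch.tensor([0, 3]))
print(offsets.shape)   # torch.Size([2, 50, 15069])
```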
Fine-grained visual parsing, including fine-grained part segmentation and fine-grained object recognition, has attracted considerable attention due to its importance in many real-world applications, e.g., agriculture, remote sensing, and space technologies. Predominant research efforts tackle these fine-grained sub-tasks following different paradigms, while the inherent relations between these tasks are neglected. Moreover, given that most of the research remains fragmented, we conduct an in-depth study of the advanced work from the new perspective of learning the part relationship. From this perspective, we first consolidate recent research and benchmark syntheses with new taxonomies. Based on this consolidation, we revisit the universal challenges in fine-grained part segmentation and recognition tasks and propose new solutions based on part relationship learning for these important challenges. Furthermore, we identify several promising lines of research in fine-grained visual parsing for future work.
Medical image segmentation has important application value in the modern medical field: it can help doctors accurately locate and analyze tissue structures, lesion areas, and organ boundaries in an image, which...
Depth information can benefit various computer vision tasks on both images and videos. However, depth maps may suffer from invalid values in many pixels, as well as large holes. To improve such data, we propose a joint self-supervised and reference-guided learning approach for depth restoration. In the self-supervised learning strategy, we introduce an improved spatial convolutional sparse coding module in which total variation regularization is employed to enhance the structural information while preserving edge details. This module alternately learns a convolutional dictionary and sparse coding from a corrupted depth map. Then, both the learned convolutional dictionary and sparse coding are convolved to yield an initial depth map, which is effectively smoothed using local contextual information. The reference-guided learning part is inspired by the fact that adjacent pixels with close colors in the RGB image tend to have similar depth values. We thus construct a hierarchical joint bilateral filter module that uses the corresponding color image to fill in large holes. In summary, our approach integrates a convolutional sparse coding module to preserve local contextual information and a hierarchical joint bilateral filter module for hole filling using specific adjacent pixels. Experimental results show that the proposed approach works well for both invalid value restoration and large hole inpainting.
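As an illustration of the reference-guided part, the sketch below implements a single-scale joint bilateral fill: invalid depth pixels are replaced by an average of valid neighbours weighted by spatial distance and RGB colour similarity. The hierarchical (coarse-to-fine) aspect and the convolutional sparse coding module are omitted, and all parameter values are arbitrary assumptions.

```python
import numpy as np

def joint_bilateral_fill(depth, rgb, radius=5, sigma_s=3.0, sigma_c=10.0):
    """Fill invalid depth pixels (value 0) with a weighted average of valid
    neighbours, weighted by spatial distance and RGB colour similarity."""
    h, w = depth.shape
    out = depth.astype(np.float64).copy()
    ys, xs = np.nonzero(depth == 0)                 # pixels to fill
    for y, x in zip(ys, xs):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        x0, x1 = max(0, x - radius), min(w, x + radius + 1)
        patch_d = depth[y0:y1, x0:x1].astype(np.float64)
        patch_c = rgb[y0:y1, x0:x1].astype(np.float64)
        valid = patch_d > 0
        if not valid.any():
            continue                                 # left for a coarser level
        yy, xx = np.mgrid[y0:y1, x0:x1]
        w_s = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
        w_c = np.exp(-np.sum((patch_c - rgb[y, x].astype(np.float64)) ** 2, axis=-1)
                     / (2 * sigma_c ** 2))
        weights = w_s * w_c * valid
        if weights.sum() > 0:
            out[y, x] = (weights * patch_d).sum() / weights.sum()
    return out

depth = np.random.randint(500, 2000, (64, 64)).astype(np.float64)
depth[20:30, 20:30] = 0                              # synthetic hole
rgb = np.random.randint(0, 255, (64, 64, 3)).astype(np.uint8)
filled = joint_bilateral_fill(depth, rgb)
print((filled[20:30, 20:30] > 0).mean())             # fraction of the hole filled
```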
Visual localization and object detection both play important roles in various applications. In many indoor application scenarios where some detected objects have fixed positions, the two techniques work closely together. However, few researchers consider these two tasks simultaneously, because of a lack of datasets and the little attention paid to such scenarios. In this paper, we explore multi-task network design and joint refinement of detection and localization. To address the dataset problem, we construct a medium-scale indoor scene of an aviation exhibition hall through a semi-automatic process. The dataset provides localization and detection information, and is publicly available at https://***/drive/folders/1U28zk0N4_I0dbzkqyIAK1A15k9oUKOjI?usp=sharing for benchmarking localization and object detection tasks. Using this dataset, we have designed a multi-task network, JLDNet, based on YOLO v3, that outputs a target point cloud and object bounding boxes. In dynamic environments, the detection branch also promotes the perception of dynamic objects. JLDNet includes image feature learning, point feature learning, feature fusion, detection construction, and point cloud regression. Finally, object-level bundle adjustment is used to further improve localization and detection accuracy. To test JLDNet and compare it to other methods, we have conducted experiments on 7 static scenes, our constructed dataset, and the dynamic TUM RGB-D and Bonn datasets. Our results show state-of-the-art accuracy for both tasks, and the benefit of jointly working on both tasks is demonstrated.
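A minimal sketch of the multi-task idea follows, assuming a shared backbone feature map feeding a scene-coordinate (point cloud) regression branch and a detection branch. It is not JLDNet's actual YOLO v3-based design, and the loss terms are stand-ins.

```python
import torch
import torch.nn as nn

class JointHead(nn.Module):
    """Toy two-branch head on top of shared backbone features: one branch
    regresses per-cell scene coordinates (for localization), the other
    predicts detection outputs. Purely illustrative of the multi-task idea."""
    def __init__(self, feat_ch=256, num_classes=20, num_anchors=3):
        super().__init__()
        self.coord_branch = nn.Conv2d(feat_ch, 3, kernel_size=1)   # x, y, z per cell
        self.det_branch = nn.Conv2d(feat_ch, num_anchors * (5 + num_classes),
                                    kernel_size=1)                  # box + objectness + classes

    def forward(self, feats):
        return self.coord_branch(feats), self.det_branch(feats)

head = JointHead()
feats = torch.randn(1, 256, 13, 13)            # shared backbone feature map
coords, det = head(feats)
# Joint training simply sums the two task losses with a balancing weight.
loss = nn.functional.l1_loss(coords, torch.zeros_like(coords)) \
       + 0.5 * det.square().mean()              # stand-ins for the real task losses
print(coords.shape, det.shape, float(loss))
```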
Time-triggered architecture, as a mainstream design for distributed real-time systems, has been successfully applied in the aerospace, automotive, and mechanical industries. However, time-triggered scheduling is a challenging NP-hard problem, and there are few studies that can quickly solve the scheduling problem of large distributed time-triggered systems. To solve this problem, a communication affinity parameter is defined in this paper to describe the degree of bias of a shaper task towards sending or receiving messages. Based on this, an innovative task-message decoupling model named D-scheduler is built to reduce the computational complexity of the scheduling problem in large-scale systems. Furthermore, we provide a mathematical proof that our model is a convex optimization problem that is easy to solve with existing computational tools. Extensive experiments substantiate the efficacy of the D-scheduler: it dramatically reduces the scheduling complexity of large-scale real-time systems with only a small loss of solution space compared to the federated scheduler.
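The abstract defines the communication affinity parameter only verbally, as the degree of bias of a shaper task towards sending or receiving messages. One simple way such a parameter could be computed is sketched below; the formula, task names, and byte counts are guesses for illustration, not the paper's definition.

```python
def communication_affinity(sent_bytes, received_bytes):
    """Signed ratio in [-1, 1]: positive if a task mainly sends messages,
    negative if it mainly receives them (an assumed, illustrative formula)."""
    total = sent_bytes + received_bytes
    return 0.0 if total == 0 else (sent_bytes - received_bytes) / total

# Hypothetical per-task traffic volumes (bytes sent, bytes received).
tasks = {"sensor_fusion": (4096, 512), "logger": (128, 8192), "relay": (2048, 2048)}
for name, (tx, rx) in tasks.items():
    print(f"{name:>14}: affinity = {communication_affinity(tx, rx):+.2f}")
```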
Recently, a new research trend in our video salient object detection (VSOD) research community has focused on enhancing detection results via model self-fine-tuning using sparsely mined high-quality keyframes from the given video. Although such a learning scheme is generally effective, it has a critical limitation: the model learned on sparse frames only possesses weak generalization ability. This situation could become worse on "long" videos, since they tend to have intensive scene variations. Moreover, in such videos, keyframe information from a longer time span is less relevant to the previous frames, which could also cause learning conflicts and deteriorate model performance. Thus, the learning scheme is usually incapable of handling complex pattern variations. To solve this problem, we propose a divide-and-conquer framework, which can convert a complex problem domain into multiple simple ones. Specifically, we devise a novel background consistency analysis (BCA), which effectively divides the mined frames into disjoint groups. Then, for each group, we assign an individual deep model to capture its key attribute during the fine-tuning process. In the testing phase, we design a model-matching strategy which dynamically selects the best-matched model from those fine-tuned ones to handle the given testing video. Extensive experiments show that our method can adapt to severe background appearance variation coupled with object movement, and obtains robust saliency detection compared with the previous scheme and state-of-the-art methods.
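To illustrate the divide-and-conquer idea, the sketch below groups keyframes by a coarse background colour histogram and, at test time, picks the group (and hence the fine-tuned model) with the closest centroid. The real background consistency analysis and model-matching strategy are certainly more elaborate; the descriptor, threshold, and update rule here are assumptions.

```python
import numpy as np

def background_descriptor(frame, bins=16):
    """Coarse colour histogram as a stand-in background descriptor."""
    hist, _ = np.histogramdd(frame.reshape(-1, 3), bins=(bins,) * 3,
                             range=[(0, 256)] * 3)
    hist = hist.ravel()
    return hist / (hist.sum() + 1e-8)

def group_keyframes(frames, threshold=0.5):
    """Greedily assign frames to disjoint groups of consistent background."""
    groups, centroids = [], []
    for f in frames:
        d = background_descriptor(f)
        dists = [np.abs(d - c).sum() for c in centroids]
        if dists and min(dists) < threshold:
            i = int(np.argmin(dists))
            groups[i].append(f)
            centroids[i] = 0.9 * centroids[i] + 0.1 * d   # running centroid
        else:
            groups.append([f])
            centroids.append(d)
    return groups, centroids

def match_model(test_frame, centroids):
    """At test time, pick the group (and hence fine-tuned model) whose
    centroid is closest to the test frame's background descriptor."""
    d = background_descriptor(test_frame)
    return int(np.argmin([np.abs(d - c).sum() for c in centroids]))

frames = [np.random.randint(0, 255, (120, 160, 3), dtype=np.uint8) for _ in range(8)]
groups, centroids = group_keyframes(frames)
print(len(groups), "groups; test frame matched to group", match_model(frames[0], centroids))
```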