Recent advancements in Vision-Language Pre-training (VLP) techniques have greatly improved performance in Scene Text Detection tasks by leveraging the rich visual and textual content in scene text images. We propose a...
We introduce a novel BiMoeFormer based on the Transformer architecture for 3D human motion prediction. Previous approaches primarily focus on the relationships between body joints in human poses, while neglecting thei...
Motion retargeting is an active research area in computer graphics and animation, allowing for the transfer of motion from one character to another, thereby creating diverse animated character data. While this technology has numerous applications in animation, games, and movies, current methods often produce unnatural or semantically inconsistent motion when applied to characters with different shapes or joint counts. This is primarily due to a lack of consideration for the geometric and spatial relationships between the body parts of the source and target characters. To tackle this challenge, we introduce a novel spatially-preserving Skinned Motion Retargeting Network (SMRNet) capable of handling motion retargeting for characters with varying shapes and skeletal structures while maintaining semantic consistency. By learning a hybrid representation of the character's skeleton and shape in a rest pose, SMRNet transfers the rotation and root joint position of the source character's motion to the target character through embedded rest pose feature alignment. Additionally, it incorporates a differentiable loss function to further preserve the spatial consistency of body parts between the source and target. Comprehensive quantitative and qualitative evaluations demonstrate the superiority of our approach over existing alternatives, particularly in preserving spatial relationships more effectively. (IEEE)
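To make concrete what "transferring the rotation and root joint position" means, the sketch below shows the naive copy-rotations baseline that retargeting methods are commonly compared against. It is not SMRNet: it involves no learning and ignores character shape, which is the kind of limitation the abstract describes. The frame layout (`root_pos`, `rotations`), the function name, and the height-ratio scaling are all hypothetical choices made for illustration.

```python
def retarget_naive(source_frames, source_height, target_height):
    """Copy per-joint rotations and rescale the root position (hypothetical baseline).

    Each frame is assumed to be a dict with:
      "root_pos":  (x, y, z) position of the root joint
      "rotations": {joint_name: rotation}, e.g. a quaternion tuple
    """
    if source_height <= 0:
        raise ValueError("source_height must be positive")

    # Assumption: root translation scales with the ratio of character heights.
    scale = target_height / source_height

    target_frames = []
    for frame in source_frames:
        scaled_root = tuple(c * scale for c in frame["root_pos"])
        target_frames.append({
            "root_pos": scaled_root,
            # Rotations are copied unchanged, so body-part contacts and
            # interpenetration on the target character are not checked.
            "rotations": dict(frame["rotations"]),
        })
    return target_frames


frames = [{"root_pos": (0.0, 1.0, 2.0), "rotations": {"hip": (1.0, 0.0, 0.0, 0.0)}}]
half_size = retarget_naive(frames, source_height=2.0, target_height=1.0)
```

Because the rotations are copied without regard to limb proportions or body volume, a baseline like this can place a hand inside the torso of a bulkier target character, which is the spatial inconsistency the abstract sets out to address.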
Mammography screening is one of the important applications of the intelligent Internet of Things (IoT). With an efficient and personalized cyber-medicine system, early diagnosis through AI-driven healthcare can successfully reduce the breast cancer mortality rate. However, extending conventional single-center screening to multicenter mammography screening, and thereby improving the effectiveness and robustness of intelligent IoT-based devices, remains a major challenge. To address this problem, we apply a modified capsule neural network to multicenter mammograms and propose a novel framework called multicenter transformation between unified capsules (MLT-UniCaps) in this article. The proposed MLT-UniCaps is composed of Attentional Pose Embedding, Dynamic Source Capsule Traversal, and Adaptive Target Capsule Fusion to realize an intelligent remote assistant diagnosis. Attentional Pose Embedding extracts feature vectors via variations in position, orientation, scale, and lighting as the poses through an adversarial convolutional neural network with an attention-based layer. Based on the pose presentation, Dynamic Source Capsule Traversal deploys a dynamic routing mechanism between neurons to build a source cancer classifier for single-center mammography screening. Using the source cancer classifier, Adaptive Target Capsule Fusion integrates mammograms from various centers as universal cancer detectors and optimizes the heterogeneous distribution among them by transformation-likelihood maximization. Owing to these three components, MLT-UniCaps effectively improves the results of single-center mammography screening and also works in multicenter breast cancer diagnosis. In comprehensive experiments on 58,965 samples, the proposed MLT-UniCaps achieves an overall classification accuracy of 90.1% on single-center trials and an overall F1 score of 73.8% on multicenter trials. All the experimental results illustrate that our MLT-UniCaps, an intelligent IoT-based clinical tool, inures the be
Dialogue policy trains an agent to select dialogue actions frequently implemented via deep reinforcement learning (DRL). The model-based reinforcement methods built a world model to generate simulated data to alleviat...
Knowledge Graph Completion (KGC) aims to predict the missing information in the (head entity)-[relation]-(tail entity) triplet. Deep Neural Networks have achieved significant progress in the relation prediction task. ...
Human motion prediction aims to predict future human motion based on past observations, playing important roles in several fields. However, previous works have often focused on the temporal sequential nature of human ...
Scene text removal is a recent development in computer vision that replaces text patches in natural images with the appropriate background. Text removal is a difficult process leading to faulty areas of text containing text strokes with their hazy backgrounds. Text in the real world uses a variety of font types, some of which are difficult to localize due to their chaotic shapes, varied shading degrees, and orientation *** text erasing may include the subtasks of text detection as well as text inpainting. Both subtasks require a large amount of data to be successful; however, existing approaches are limited by insufficient real-world data for scene text removal. Even though existing works have achieved considerable performance improvements in scene text removal, they often leave text remnants such as text strokes, thus producing low-quality visual outcomes. Therefore, this paper proposes an automatic text inpainting and video quality elevation model using the Improved Convolutional Network-based ***, the video samples are collected from diverse datasets and then converted into frames. Next, the frames are deblurred using an enhanced Convolutional Neural Network (CNN) model that has three convolutional layers for accurately localizing the texts in frames. Subsequently, the texts are detected using the CLARA-based VGG-16 network. Afterward, the text strokes are removed using a convolutional encoder-decoder network to eliminate the presence of text on complex backgrounds and textures. Here, the coordinates of text in the deblurred frames are used to crop out the text stroke regions. The texts are then inpainted, and the inpainted regions are pasted back to their original positions in the frames. Furthermore, the video quality is elevated with the help of the DenseNet-centric Enhancement network. The experimental outcomes demonstrate that the proposed model effectively removed scene texts and enhanced the video qu
As a key task in natural language processing, the current knowledge extraction methods mostly involve joint extraction, simultaneously extracting named entities and relationships. When conducting relationship extracti...
As a pivotal enabler of intelligent transportation systems (ITS), the Internet of Vehicles (IoV) has attracted extensive attention from academia and industry. The exponential growth of computation-intensive, latency-sensitive, and privacy-aware vehicular applications in IoV has driven the shift from cloud computing to edge computing, which enables tasks to be offloaded to edge nodes (ENs) closer to vehicles for efficient execution. In an ITS environment, however, computation offloading requests are dynamic and stochastic, which makes it challenging to orchestrate offloading decisions that meet application requirements. Accomplishing complex computation offloading for vehicles while ensuring data privacy also remains challenging. In this paper, we propose an intelligent computation offloading scheme with privacy protection, named COPP. In particular, an Advanced Encryption Standard-based encryption method is utilized to implement privacy protection. Furthermore, an online offloading scheme is proposed to find optimal offloading policies. Finally, experimental results demonstrate that COPP significantly outperforms benchmark schemes in terms of both delay and energy consumption.
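The delay and energy trade-off behind an offloading decision can be illustrated with a small sketch. The abstract does not specify COPP's online policy, so the code below is not that scheme: it is a greedy per-task comparison under a textbook edge-computing cost model. All names (`Task`, `Link`, `choose_target`), the weights, and the energy coefficient `kappa` are assumptions introduced for illustration, and the encryption step is omitted.

```python
from dataclasses import dataclass


@dataclass
class Task:
    data_bits: float    # input data to upload if offloaded, in bits
    cpu_cycles: float   # CPU cycles needed to finish the task


@dataclass
class Link:
    rate_bps: float     # uplink rate from vehicle to edge node, bits/s
    tx_power_w: float   # vehicle transmit power, watts


def local_cost(task, f_local_hz, kappa, w_delay, w_energy):
    """Weighted delay/energy cost of running the task on the vehicle."""
    delay = task.cpu_cycles / f_local_hz
    # Assumed dynamic-power model: energy = kappa * f^2 * cycles.
    energy = kappa * f_local_hz ** 2 * task.cpu_cycles
    return w_delay * delay + w_energy * energy


def edge_cost(task, link, f_edge_hz, w_delay, w_energy):
    """Weighted delay/energy cost of uploading and running on an edge node."""
    upload_s = task.data_bits / link.rate_bps
    delay = upload_s + task.cpu_cycles / f_edge_hz
    # Only the vehicle's transmission energy is counted here.
    energy = link.tx_power_w * upload_s
    return w_delay * delay + w_energy * energy


def choose_target(task, link, f_local_hz, f_edge_hz,
                  kappa=1e-27, w_delay=0.5, w_energy=0.5):
    """Greedy choice: return "edge" if offloading is cheaper, else "local"."""
    c_local = local_cost(task, f_local_hz, kappa, w_delay, w_energy)
    c_edge = edge_cost(task, link, f_edge_hz, w_delay, w_energy)
    return "edge" if c_edge < c_local else "local"


small_upload = Task(data_bits=1e6, cpu_cycles=1e9)
fast_link = Link(rate_bps=1e7, tx_power_w=0.5)
decision = choose_target(small_upload, fast_link, f_local_hz=1e9, f_edge_hz=1e10)
```

With a small upload and a fast link the sketch prefers the edge node; a large upload over a slow link tips the comparison back to local execution. An online scheme such as the one the abstract describes would additionally have to handle requests arriving over time and contention at the edge nodes, which this per-task comparison does not model.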