Indoor positioning is of great importance in the era of mobile computing. Currently, much attention has been paid to RSS-based localization because it can provide position information without additional equipment. Howeve...
In a real scenario, the image is often corrupted by complex degradation, and a lot of useful information is lost, which makes super-resolution (SR) reconstruction seriously ill-posed. To effectively solve such a problem, it is crucial to correctly exploit image prior knowledge. Although existing deep learning-based methods can obtain excellent results, they cannot deal with complex degradation effectively, which leads to the loss of texture details and the destruction of edge details. In this paper, an efficient multi-regularization method for SR is proposed, which can simultaneously exploit both internal and external image priors within a unified framework. The hybrid Tikhonov-TV prior and a deep denoiser prior are introduced to constrain the reconstruction process. That is, the proposed model combines the strengths of the piecewise-smooth prior and the deep prior. Moreover, an adaptive weight parameter is employed to make the hybrid component more detail-preserving. Experimental results demonstrate that the proposed method achieves better performance in image detail protection than state-of-the-art methods.
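As an illustrative sketch only (the notation below is assumed, not taken from the paper: y is the degraded observation, H the degradation operator, and the weights are placeholders), a multi-regularization SR objective of the kind this abstract describes could be written as:

```latex
\hat{x} \;=\; \arg\min_{x}\;
\tfrac{1}{2}\,\lVert y - Hx \rVert_2^2                % data-fidelity term
\;+\; \lambda_1 \lVert \nabla x \rVert_2^2            % Tikhonov (smoothness) prior
\;+\; \lambda_2(x)\,\mathrm{TV}(x)                    % total-variation (edge-preserving) prior
\;+\; \mu\,\mathcal{R}_{\mathrm{deep}}(x)             % implicit deep denoiser prior
```

Here \(\lambda_2(x)\) stands for the adaptive weight that shifts emphasis between the smooth and edge-preserving components; the exact form used by the authors is not given in the abstract.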
Mathematical optimization is now widely regarded as an indispensable modeling and solution tool for the design of wireless communications systems. While optimization has played a significant role in the revolutionary ...
In many underwater application scenarios, recognition tasks need to be executed promptly on computationally limited platforms. However, models designed for this field often exhibit spatial locality, and existing works lack the ability to capture crucial details in images. Therefore, a lightweight and detail-aware vision network (LDVNet) for resource-constrained environments is proposed to overcome the limitations of these approaches. Firstly, in order to enhance the accuracy of target image recognition, we introduce transformer modules to acquire global information, thus addressing the issue of spatial locality inherent in traditional convolutional neural networks (CNNs). Secondly, to maintain the network’s lightweight nature, we integrate the transformer module with convolutional operations, thereby mitigating the substantial parameter and floating point operations (FLOPs) overhead. Thirdly, for the efficient extraction of crucial fine-grained details from feature maps, we have devised a channel and spatial attention module (C&SA). This module aids in recognizing intricate and fine-grained visual tasks and enhances image understanding. It is seamlessly integrated into LDVNet with nearly negligible parameter overhead. The experimental results demonstrate that LDVNet outperforms other lightweight networks and hybrid networks in different recognition tasks, while being suitable for resource-constrained environments.
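The channel and spatial attention module (C&SA) mentioned above can be sketched as follows. This is a minimal NumPy illustration of the general channel-then-spatial attention pattern, not the authors' implementation; the bottleneck weight `w_c` and the gating choices are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_spatial_attention(x, w_c):
    """Illustrative channel-then-spatial attention over a (C, H, W) feature map.

    x   : feature map, shape (C, H, W)
    w_c : (C, C) weight matrix for the channel-attention branch (assumed)
    """
    C, H, W = x.shape
    # Channel attention: global average pooling -> linear -> sigmoid gate
    pooled = x.reshape(C, -1).mean(axis=1)           # (C,)
    ch_gate = sigmoid(w_c @ pooled)                  # (C,) gate in (0, 1)
    x = x * ch_gate[:, None, None]
    # Spatial attention: mean and max over channels -> sigmoid gate
    sp_gate = sigmoid(x.mean(axis=0) + x.max(axis=0))  # (H, W)
    return x * sp_gate[None, :, :]
```

Because both gates lie in (0, 1), the module reweights rather than adds features, which is why its parameter overhead stays small.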
The objective of Multimodal Knowledge Graph Completion (MKGC) is to forecast absent entities within a knowledge graph by leveraging additional textual and visual modalities. Existing studies commonly utilize a single relationship embedding to depict all modalities within an entity pair, thus coupling several relationships derived from diverse modalities. However, this coupling may introduce interference from conflicting information between modalities, as the relationships between modalities for a given entity pair can be contradictory. Moreover, existing ensemble inference methods fail to dynamically adjust modal weights based on their differences and importance, despite the varying contributions of different modalities. In this paper, we propose the Multimodal Decouple and Relation-based Ensemble inference (MDRE) model. For each modality, we construct corresponding relationship embeddings and build separate triple representations to avoid interference among modalities. During the training phase, we employ confidence-constrained training with temperature scaling to alleviate conflicting information in the textual and visual modalities. For inference, we utilize the Relation-based Ensemble Inference method to adjust modal weights at the relationship level, thus achieving improved prediction results. Experimental results on two datasets demonstrate that MDRE outperforms existing single-modal and multimodal knowledge graph completion methods.
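A relation-level weighted ensemble with temperature scaling, as described above, can be sketched in a few lines. This is a hedged illustration under assumed inputs (per-modality score vectors and scalar per-relation confidences), not the MDRE model itself:

```python
import numpy as np

def softmax(z, tau=1.0):
    """Temperature-scaled softmax; larger tau flattens the weights."""
    z = np.asarray(z, dtype=float) / tau
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

def ensemble_scores(modal_scores, modal_confidence, tau=2.0):
    """Combine per-modality entity scores with relation-level weights.

    modal_scores     : dict modality -> (num_entities,) score array (assumed)
    modal_confidence : dict modality -> scalar confidence for this relation
    """
    names = sorted(modal_scores)
    weights = softmax([modal_confidence[m] for m in names], tau=tau)
    return sum(w * np.asarray(modal_scores[m], dtype=float)
               for w, m in zip(weights, names))
```

The temperature `tau` plays the role hinted at in the abstract: it controls how sharply the ensemble trusts the most confident modality for a given relation.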
Image inpainting, which aims to reconstruct reasonably clear and realistic images from known pixel information, is one of the core problems in computer vision. However, due to the complexity and variability of the underwater environment, the inability to extract valid pixel points and insufficient correlation between feature information in existing image inpainting techniques lead to blurring in the generated images. Therefore, a novel gated attention feature fusion image inpainting network based on generative adversarial networks (GAF-GAN) is proposed. The accuracy of feature similarity matching depends heavily on the validity of the information contained in the features. On the one hand, gating values are dynamically generated by gated convolution to reduce the interference of invalid information. On the other hand, semantic information at distant locations in an image is accurately acquired by the attention mechanism. For these reasons, we designed an improved gated attention mechanism. The gated attention mechanism makes the network focus on effective information such as high-frequency texture and color fidelity of restored images. In addition, a dense feature fusion module is added to expand the overall receptive field of the network to fully learn the image features. Experimental results show that the proposed method can effectively repair defective images with complex texture structures and improve the realism and integrity of image details and structures.
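The gated-convolution idea above, where a learned soft gate replaces a hard 0/1 validity mask, can be illustrated at a single spatial location. The weight matrices here are placeholders, and this is a sketch of the general mechanism, not the GAF-GAN code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_feature(x, w_feat, w_gate):
    """Gated-convolution style update for one input vector (illustrative).

    x      : input feature vector at a pixel location
    w_feat : weights of the feature branch (assumed)
    w_gate : weights of the gating branch (assumed)
    """
    features = np.tanh(w_feat @ x)   # candidate features
    gate = sigmoid(w_gate @ x)       # learned soft validity mask in (0, 1)
    return features * gate           # invalid pixels are smoothly suppressed
```

When the gating branch is strongly negative (i.e., the pixel looks invalid), the gate approaches zero and the candidate features are suppressed rather than cut off abruptly.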
The development of the Internet has made people more closely connected and has put forward higher requirements for recommendation models. Most recommendation models consider only the long-term interests of users. In this paper, the interaction time between the user and the item is introduced as auxiliary information in the model construction. Interaction time is used to determine users' long-term and short-term preferences. In this paper, temporal features are extracted by building a convolutional gated recurrent unit with attention neural network (CNN-GRU-Attention). Firstly, for the problem of accurate feature extraction, a CNN is constructed to extract higher-level, more abstract features and transform high-dimensional data into low-dimensional data; secondly, for the problem of social temporality, a GRU is used not only to extract temporal information but also to effectively reduce gradient dispersion, making model convergence and training easier; finally, graph attention networks are used to aggregate the social relationship information of users and items, which constitutes the final feature representation of users and items respectively. In particular, a modified cosine similarity is used to reduce the error caused by data insensitivity when constructing the social information of the item. In this study, simulation experiments are conducted on two publicly available datasets (Epinions and Ciao), and the experimental results show that the proposed recommendation model performs better than other social recommendation models, improving the evaluation metrics of MAE and RMSE by 1.06%-1.33% and 1.19%-1.37%, respectively. The effectiveness of the model innovation is proved.
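The "modified cosine similarity" mentioned above is usually the adjusted cosine, which subtracts each user's mean rating before computing the cosine so that different users' rating scales do not distort item-item similarity. A minimal sketch, assuming a dense user-item matrix where 0 marks a missing rating (the paper's exact variant is not specified in the abstract):

```python
import numpy as np

def adjusted_cosine(ratings, i, j):
    """Adjusted cosine similarity between items i and j.

    ratings : (num_users, num_items) matrix; 0 marks a missing rating (assumed).
    """
    mask = (ratings[:, i] > 0) & (ratings[:, j] > 0)   # users who rated both
    if not mask.any():
        return 0.0
    observed = np.where(ratings > 0, ratings, np.nan)
    user_mean = np.nanmean(observed, axis=1)           # per-user mean rating
    a = ratings[mask, i] - user_mean[mask]             # mean-centered columns
    b = ratings[mask, j] - user_mean[mask]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0
```

Plain cosine would score two items highly whenever users rate both above zero; mean-centering makes the measure sensitive to whether an item is rated above or below each user's own baseline.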
Underwater images are often affected by problems such as light attenuation, color distortion, noise and scattering, resulting in image defects. A novel image inpainting method is proposed to intelligently predict and fill damaged areas for complete and continuous visualization of the image. First, in order to effectively solve the problem of color distortion caused by light refraction in underwater environments, the improved gated attention mechanism is used. This mechanism improves the local details by learning and weighting the important features of the image. Second, gated convolution automatically determines the degree of restoration for each pixel based on local features of the original image. It eliminates distractions such as low contrast and scattering, retaining more original detailed information. By doing so, image inpainting techniques improve the quality and visualization of underwater images.
ISBN (digital): 9781665410205
ISBN (print): 9781665410212
Chinese short text similarity computation stands as a pivotal task within natural language processing, garnering significant attention. However, existing models grapple with limitations in handling intricate semantic relationships, such as the challenge of discerning subtle semantic nuances in text, inadequacies in effectively integrating diverse levels of semantic information, and the struggle to capture polysemous meanings accurately. To address these issues, this paper introduces an innovative Chinese short text similarity computation model, SMGC-SBERT. This model addresses the shortcomings of existing models by employing a multi-module fusion strategy, thereby enabling a more precise measurement of semantic similarity between texts. Primarily, the model incorporates SAT embedding to acquire phrase-level semantic information and leverages the MS-BERT model to encode text, improving the model's comprehension of textual polysemy and obtaining richer semantic representations. Then, the fusion of module features, including multi-branch convolutional networks and mix pooling, enables the extraction of textual features at varied levels, bolstering the model's representational capacity. Additionally, to further reduce overfitting risks while improving accuracy and other performance metrics, a multi-layer feature adjustment network is utilized for short text similarity calculation. The experimental findings showcase the superiority of the SMGC-SBERT model over other neural network models, demonstrating significant advancements across both the Chinese-SNLI and CCKS2018_Task3 Chinese short text datasets.
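Mix pooling, one of the fusion components named above, is commonly a convex blend of max and average pooling. A one-line sketch under that assumption (the paper's exact formulation and mixing coefficient are not given in the abstract):

```python
import numpy as np

def mix_pool(x, alpha=0.5):
    """Mix pooling over a 1-D feature sequence.

    alpha : assumed mixing coefficient; alpha=1 recovers max pooling,
            alpha=0 recovers average pooling.
    """
    x = np.asarray(x, dtype=float)
    return alpha * x.max() + (1.0 - alpha) * x.mean()
```

Blending the two poolings keeps the salient activation that max pooling preserves while retaining the smoothing effect of averaging, which is one reason it is used to reduce overfitting.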
ISBN (digital): 9781665410205
ISBN (print): 9781665410212
Image-text retrieval refers to querying the text or image of another modality given an image or text, and its key lies in the ability to accurately measure the similarity between text and image. Most existing retrieval methods utilize only the intra-modal relations of each modality or the inter-modal relations between the two modalities to perform the retrieval task. To address this problem, we integrate the intra-modal and inter-modal relations and propose a Semantic Relation-based Cross Attention Network (SRCAN). In our proposed method, we first mine the possible intra-modal associations between regions in an image and between words in a text to obtain features with semantic relationships, and then capture the fine-grained associations between segments through the cross-attention mechanism. Finally, we balance the intra-modal and inter-modal relationships to improve the performance of the model. Our proposed method is experimentally validated on two datasets, Flickr30K and MS-COCO, and the results show that our method achieves superior performance.
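The cross-attention step described above, in which each word attends over image regions and the attended context is compared back to the word, can be sketched as follows. This is an illustrative NumPy version of the general pattern (inputs are assumed to be L2-normalized feature matrices), not the SRCAN implementation:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_similarity(regions, words):
    """Fine-grained image-text similarity via cross attention (illustrative).

    regions : (R, d) L2-normalized image-region features (assumed)
    words   : (W, d) L2-normalized word features (assumed)
    """
    sims = words @ regions.T                  # (W, R) word-region cosines
    attn = softmax(sims, axis=1)              # each word attends to regions
    context = attn @ regions                  # (W, d) attended visual context
    context /= np.linalg.norm(context, axis=1, keepdims=True)
    # Average word-context cosine similarity as the image-text score
    return float((words * context).sum(axis=1).mean())
```

A matching image-text pair should produce word-context cosines near 1, while a mismatched pair attends to whatever regions are least dissimilar and still scores low.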