GPT-4 (Generative Pre-Trained Transformer 4) is often heralded as a leading commercial AI offering, sparking debates over its potential as a steppingstone toward Artificial General Intelligence. But does it possess co...
Gaze estimation technology is essential for applications such as human-computer interaction, augmented reality, and virtual reality. However, its accuracy is significantly compromised in low-light conditions due to de...
Image-text retrieval aims to capture the semantic correspondence between images and texts, which serves as a foundation and crucial component in multi-modal recommendations, search systems, and online *** mainstream methods primarily focus on modeling the association of image-text pairs while neglecting the advantageous impact of multi-task learning on image-text *** this end, a multi-task visual semantic embedding network (MVSEN) is proposed for image-text ***, we design two auxiliary tasks, including text-text matching and multi-label classification, for semantic constraints to improve the generalization and robustness of visual semantic embedding from a training ***, we present an intra- and inter-modality interaction scheme to learn discriminative visual and textual feature representations by facilitating information flow within and between ***, we utilize multi-layer graph convolutional networks in a cascading manner to infer the correlation of image-text *** results show that MVSEN outperforms state-of-the-art methods on two publicly available datasets, Flickr30K and MSCOCO, with rSum improvements of 8.2% and 3.0%, respectively.
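For readers unfamiliar with the rSum metric quoted above: in the image-text retrieval literature it is conventionally the sum of Recall@1, Recall@5, and Recall@10 over both retrieval directions, so its maximum is 600. The following is a minimal sketch of that convention, not the MVSEN evaluation code. It assumes a square similarity matrix with exactly one matching text per image on the diagonal (Flickr30K and MSCOCO actually provide several captions per image), and the function names are hypothetical.

```python
def recall_at_k(sim, k):
    """Percentage of queries (rows) whose true match ranks in the top k.

    Assumption: the true match of query i is candidate i (the diagonal).
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank = number of candidates scoring strictly higher than the match.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(sim)


def rsum(sim, ks=(1, 5, 10)):
    """Sum of Recall@K over image-to-text and text-to-image retrieval."""
    # Transposing the matrix turns text columns into query rows.
    sim_t = [list(col) for col in zip(*sim)]
    return sum(recall_at_k(sim, k) + recall_at_k(sim_t, k) for k in ks)


# Toy similarity matrix: rows are images, columns are texts.
toy_sim = [
    [0.9, 0.1, 0.2],
    [0.3, 0.2, 0.8],
    [0.1, 0.4, 0.7],
]
toy_rsum = rsum(toy_sim)
```

With only three candidates, Recall@5 and Recall@10 are trivially 100 in this toy case; on a real test set the six terms differ and rSum summarizes them in one number.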
Social media applications and websites have become a crucial part of people’s lives, whether for sharing moments, contacting family and friends, or even for work. However, the fact that these val...
Cervical cell segmentation is a significant task in medical image analysis and can be used for screening various cervical diseases. In recent years, substantial progress has been made in cervical cell segmentation tec...
Conditional semantic textual similarity (C-STS) assesses the similarity between pairs of sentence representations under different conditions. The current method encounters the overestimation issue of positive and nega...
Robot calligraphy visually reflects the motion capability of robotic *** traditional research mainly focuses on image generation and the writing of simple calligraphic strokes or characters, this article presents a generative adversarial network (GAN)-based motion learning method for robotic calligraphy synthesis (Gan2CS) that can enhance the efficiency in writing complex calligraphy words and reproducing classic calligraphy *** key technologies in the proposed approach include: (1) adopting the GAN to learn the motion parameters from the robot writing operation; (2) converting the learnt motion data into the style font and realising the transition from static calligraphy images to dynamic writing demonstration; (3) reproducing high-precision calligraphy works by synthesising the writing motion data *** this study, the motion trajectories of sample calligraphy images are first extracted and converted into the robot *** robot performs the writing with motion planning, and the writing motion parameters of calligraphy strokes are learnt with *** the motion data of basic strokes is synthesised based on the hierarchical process of ‘stroke-radicalpart-character’. The robot re-writes the synthesised characters whose similarity with the original calligraphy characters is *** calligraphy characters have been tested in the experiments for method validation, and the results show that the robot can actualise the robotic calligraphy synthesis of writing motion data with GAN.
With the development of artificial intelligence, deep-learning-based log anomaly detection proves to be an important research topic. In this paper, we propose LogCSS, a novel log anomaly detection framework based on t...
Multimodal hate detection aims to identify hate content across multiple modalities for promoting a harmonious online environment. Despite promising progress, three critical challenges, the absence of implicit hateful ...
With the advent of the information security era, it is necessary to guarantee the privacy, accuracy, and dependable transfer of *** study presents a new approach to the encryption and compression of color *** is predicated on 2D compressed sensing (CS) and the hyperchaotic ***, an optimized Arnold scrambling algorithm is applied to the initial color images to ensure strong ***, the processed images are concurrently encrypted and compressed using 2D *** them, chaotic sequences replace traditional random measurement matrices to increase the system’s ***, the processed images are re-encrypted using a combination of permutation and diffusion *** addition, the 2D projected gradient with an embedding decryption (2DPG-ED) algorithm is used to reconstruct *** with the traditional reconstruction algorithm, the 2DPG-ED algorithm can improve security and reduce computational ***, it has better *** experimental outcome and the performance analysis indicate that this algorithm can withstand malicious attacks and that the method is effective.
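The scrambling stage named in this abstract can be illustrated with the classic Arnold cat map, which moves the pixel at (x, y) of an N×N image to ((x + y) mod N, (x + 2y) mod N). The sketch below shows only that textbook map and its inverse; the "optimized" variant, the hyperchaotic system, and the 2DPG-ED reconstruction are not specified in the abstract and are not reproduced here. The function names and the `rounds` parameter are hypothetical, and a color image is assumed to be processed one channel at a time.

```python
def arnold_scramble(channel, rounds=1):
    """Apply the classic Arnold cat map to a square 2D list `rounds` times."""
    n = len(channel)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Pixel (x, y) moves to ((x + y) mod n, (x + 2y) mod n).
                out[(x + y) % n][(x + 2 * y) % n] = channel[x][y]
        channel = out
    return channel


def arnold_unscramble(channel, rounds=1):
    """Invert arnold_scramble by applying the inverse map `rounds` times."""
    n = len(channel)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Inverse map: (x, y) came from ((2x - y) mod n, (y - x) mod n).
                out[(2 * x - y) % n][(y - x) % n] = channel[x][y]
        channel = out
    return channel


# Toy 4x4 channel with distinct values so that pixel moves are visible.
plain = [[4 * r + c for c in range(4)] for r in range(4)]
scrambled = arnold_scramble(plain, rounds=2)
restored = arnold_unscramble(scrambled, rounds=2)
```

Because the map is a bijection on the pixel grid, scrambling only permutes positions and leaves the pixel values unchanged, which is why schemes of this kind typically follow it with further encryption steps such as the permutation and diffusion mentioned in the abstract.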