With recent advancements in robotic surgery, notable strides have been made in visual question answering (VQA). Existing VQA systems typically generate textual answers to questions but fail to indicate the location of the relevant content within the *** limitation restricts the interpretative capacity of the VQA models and their ability to explore specific image *** address this issue, this study proposes a grounded VQA model for robotic surgery, capable of localizing a specific region during answer *** inspiration from prompt learning in language models, a dual-modality prompt model was developed to enhance precise multimodal information ***, two complementary prompters were introduced to effectively integrate visual and textual prompts into the encoding process of the model. A visual complementary prompter merges visual prompt knowledge with visual information features to guide accurate *** textual complementary prompter aligns visual information with textual prompt knowledge and textual information, guiding textual information towards a more accurate inference of the ***, a multiple iterative fusion strategy was adopted for comprehensive answer reasoning, to ensure high-quality generation of textual and grounded *** experimental results validate the effectiveness of the model, demonstrating its superiority over existing methods on the EndoVis-18 and EndoVis-17 datasets.
Complex networking analysis is a powerful technique for understanding both complex networks and big graphs in ubiquitous computing. Particularly, there are several novel metrics, such as k-clique and k-core are propos...
Mining potential and valuable medical knowledge from massive medical data to support clinical decision-making has become an important research field. Personalized medicine recommendation is an important research direc...
Texture plays an important role in cartoon illustrations to display object materials and enrich visual experiences. Unfortunately, manually designing and drawing an appropriate texture is not easy even for proficient ...
The Internet of Underwater Things (IoUT) has garnered significant interest due to its potential applications in monitoring underwater environments. However, the unique characteristics of acoustic communication, such as long propagation delays and high attenuation, present considerable obstacles to efficient and dependable data transmission. Opportunistic routing is a crucial technique for enhancing packet delivery ratios: it selects a set of forwarding nodes and uses their cooperative forwarding to boost network throughput. Nevertheless, choosing an excessive number of forwarding nodes can lead to wasteful energy usage and extended communication delays. Moreover, most existing work overlooks the trustworthiness of forwarding nodes, which can undermine the effectiveness of opportunistic routing. Therefore, this study presents a novel trust opportunistic routing scheme that employs reinforcement learning to achieve resilience in constantly changing underwater settings. The combination of reinforcement learning and trust management enables the proposed scheme to adapt to the unstable underwater environment and to unknown malicious attacks. First, a method is introduced for measuring environmental fitness by considering multiple trust factors, including communication success rate, data reliability, and location dynamics. The proposed scheme then uses reinforcement learning to develop a reliable opportunistic routing method based on the quantified state information. This component uses the obtained state to formulate action strategies and obtains reward values from environmental inputs. The reward update equation integrates these qualities to optimize the deployment of superior action strategies, ultimately achieving trust opportunistic routing for underwater data collection. Fundamental experimental results demonstrate that the proposed protocol performs exceptionally well in demanding underwater conditions, outperforming existing method
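The general idea of combining a trust score with a reinforcement-learning update can be illustrated with a minimal sketch. It is not the authors' protocol. It assumes a plain weighted average of the three trust factors (with "location dynamics" represented as a stability value in [0, 1]) and a standard tabular Q-learning update. The weights, node names, and function names are hypothetical.

```python
from collections import defaultdict

def trust_score(success_rate, data_reliability, location_stability,
                weights=(0.4, 0.4, 0.2)):
    """Combine three trust factors in [0, 1] into one score (weights are assumed)."""
    factors = (success_rate, data_reliability, location_stability)
    return sum(w * f for w, f in zip(weights, factors))

def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q(state, action)."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]

q = defaultdict(float)
# Hypothetical neighbours of node "A": a well-behaved node and a suspicious one.
neighbours = {
    "B": trust_score(0.9, 0.8, 0.7),
    "C": trust_score(0.3, 0.2, 0.5),
}
# Use each neighbour's trust score as the reward for forwarding through it.
for forwarder, trust in neighbours.items():
    q_update(q, "A", forwarder, reward=trust, next_state=forwarder, next_actions=[])

# The forwarder with the highest learned value is preferred.
best_forwarder = max(neighbours, key=lambda n: q[("A", n)])
```

In this toy setup the reward is simply the neighbour's trust score, so the more trustworthy neighbour ends up with the higher Q-value. A real scheme would also account for delay, energy, and multi-hop returns.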
Preservation of crops depends on early and accurate detection of pests, which cause several diseases that decrease crop production and quality. Several deep-learning techniques have been applied to pest detection on crops. We have developed the YOLOCSP-PEST model for pest localization and classification. The proposed model is a modified version of You Only Look Once Version 7 (YOLOv7) with a Cross Stage Partial Network (CSPNET) backbone, intended primarily for pest localization and classification. It gives exceptionally good results under conditions that are very challenging for comparable models, especially when the images have luminance and orientation issues. It helps farmers working on their crops in distant areas to detect infestations quickly and accurately, which benefits both the quality and the quantity of the yield. The model has been trained and tested on two datasets, the IP102 dataset and a local crop dataset, and has shown exceptional results on both. It achieved a mean average precision (mAP) of 88.40%, a precision of 85.55%, and a recall of 84.25% on the IP102 dataset, and a mAP of 97.18%, a precision of 97.50%, and a recall of 94.88% on the local dataset. These findings demonstrate that the proposed model is very effective at detecting pests in real-life scenarios and can help improve both the quality and the quantity of crop yield.
Knowledge of medication and disease has been rapidly accumulated. Also, an increasing number of researchers have paid more attention to predicting medicine-disease associations by machine learning methods. The associa...
Emotion-cause pair extraction (ECPE) aims to extract all the pairs of emotions and corresponding causes in a *** generally contains three subtasks, emotions extraction, causes extraction, and causal relations detection between emotions and *** works adopt pipelined approaches or multi-task learning to address the ECPE ***, the pipelined approaches easily suffer from error propagation in real-world *** multi-task learning cannot optimize all tasks globally and may lead to suboptimal extraction *** address these issues, we propose a novel framework, Pairwise Tagging Framework (PTF), tackling the complete emotion-cause pair extraction in one unified tagging *** prior works, PTF innovatively transforms all subtasks of ECPE, i.e., emotions extraction, causes extraction, and causal relations detection between emotions and causes, into one unified clause-pair tagging *** this unified tagging task, we can optimize the ECPE task globally and extract more accurate emotion-cause *** validate the feasibility and effectiveness of PTF, we design an end-to-end PTF-based neural network and conduct experiments on the ECPE benchmark *** experimental results show that our method outperforms pipelined approaches significantly and typical multi-task learning approaches.
Multi-view multi-person 3D human pose estimation is a hot topic in the field of human pose estimation due to its wide range of application *** the introduction of end-to-end direct regression methods, the field has entered a new stage of ***, the regression results of joints that are more heavily influenced by external factors are not accurate enough even for the optimal *** this paper, we propose an effective feature recalibration module based on the channel attention mechanism and a relative optimal calibration strategy, which is applied to the multi-view multi-person 3D human pose estimation task to achieve improved detection accuracy for joints that are more severely affected by external ***, it achieves relative optimal weight adjustment of joint feature information through the recalibration module and strategy, which enables the model to learn the dependencies between joints and the dependencies between people and their corresponding *** call this method the Efficient Recalibration Network (ER-Net). Finally, experiments were conducted on two benchmark datasets for this task, Campus and Shelf, in which the PCP reached 97.3% and 98.3%, respectively.
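As a rough illustration of what channel-attention-based feature recalibration can look like, the sketch below follows the common squeeze-and-excitation pattern in plain Python. It is an assumption that ER-Net's module resembles this pattern, since the abstract does not give the details. The function name, feature values, and weights are made up for the example.

```python
import math

def channel_recalibrate(features, w1, w2):
    """SE-style channel attention on features[c][i] (channel c, flattened position i)."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation: bottleneck layer with ReLU, then a layer with sigmoid gates.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Recalibrate: scale every channel by its gate in (0, 1).
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates

features = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]   # 3 channels, 2 positions
w1 = [[1.0, 0.0, 0.0]]                            # 3 -> 1 bottleneck (made-up weights)
w2 = [[1.0], [0.0], [-1.0]]                       # 1 -> 3 expansion (made-up weights)
scaled, gates = channel_recalibrate(features, w1, w2)
```

With these made-up weights, the first channel is amplified relative to the third, so channels judged more informative contribute more to later layers. In a trained network the weights would be learned rather than fixed.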
Vision-based semantic scene completion task aims to predict dense geometric and semantic 3D scene representations from 2D images. However, 3D modeling from a single view is an ill-posed problem, limited by the field o...