Search results - Inner Mongolia University Library

Dual modality prompt learning for visual question-grounded answering in robotic surgery

Visual Computing for Industry, Biomedicine, and Art, 2024, Vol. 7, No. 1, pp. 316-328

Authors: Yue Zhang, Wanshu Fan, Peixi Peng, Xin Yang, Dongsheng Zhou, Xiaopeng Wei. Affiliations: National and Local Joint Engineering Laboratory of Computer Aided Design, School of Software Engineering, Dalian University, Dalian 116622, Liaoning, China; School of Computer Science and Technology, Dalian University of Technology, Dalian 116081, Liaoning, China

With recent advancements in robotic surgery, notable strides have been made in visual question answering (VQA). Existing VQA systems typically generate textual answers to questions but fail to indicate the location of the relevant content within the *** limitation restricts the interpretative capacity of the VQA models and their ability to explore specific image *** address this issue, this study proposes a grounded VQA model for robotic surgery, capable of localizing a specific region during answer *** inspiration from prompt learning in language models, a dual-modality prompt model was developed to enhance precise multimodal information ***, two complementary prompters were introduced to effectively integrate visual and textual prompts into the encoding process of the model. A visual complementary prompter merges visual prompt knowledge with visual information features to guide accurate *** textual complementary prompter aligns visual information with textual prompt knowledge and textual information, guiding textual information towards a more accurate inference of the ***, a multiple iterative fusion strategy was adopted for comprehensive answer reasoning, to ensure high-quality generation of textual and grounded *** experimental results validate the effectiveness of the model, demonstrating its superiority over existing methods on the EndoVis-18 and EndoVis-17 datasets.

Keywords: Prompt learning; Visual prompt; Textual prompt; Grounding-answering; Visual question answering


Research on lightweight pavement disease detection model based on YOLOv7


Journal of Intelligent and Fuzzy Systems, 2024, Vol. 46, No. 4, pp. 10573-10589

Authors: Wang, Chishe; Li, Jun; Wang, Jie; Zhao, Weikang. Affiliations: School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, China; Jinling Institute of Technology, Nanjing, China

Rapid urbanization has made road construction and maintenance imperative, but detecting road diseases has been time-consuming with limited accuracy. To overcome these challenges, we propose an efficient YOLOv7 road disease detection model. Our approach involves integrating MobileNetV3 as the backbone feature extraction network to reduce the network's parameters and computational requirements. Additionally, we introduce the BRA attention module into the spatial pyramid pooling module to eliminate redundant information and enhance the network's feature representation capability. Moreover, we utilize the F-ReLU activation function in the backbone network, expanding the convolutional layers' receptive field range. To optimize the model's boundary loss, we employ the Wise-IoU loss function, which places more emphasis on the quality of ordinary samples and enhances the overall performance and generalization ability of the network. Experimental results demonstrate that our improved detection algorithm achieves a higher recall rate and mean average precision (mAP) on the public dataset (RDD) and the NJdata dataset in Nanjing's urban area. Specifically, compared to YOLOv7, our model increases the recall rate and mAP on RDD by 3.3% and 2.6%, respectively. On the NJdata dataset, our model improves the recall rate and mAP by 1.9% and 1.3%, respectively. Furthermore, our model reduces parameter and computational requirements by 30% and 22.5%, respectively, striking a balance between detection accuracy and speed. In conclusion, our road disease detection model presents an effective solution to address the challenges associated with road disease detection in urban areas. It offers improved accuracy, efficiency, and generalization capabilities compared to existing models. © 2024 - IOS Press. All rights reserved.

Keywords: Roads and streets
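
The abstract above names the Wise-IoU loss but does not give its formula. As background only, the plain intersection-over-union (IoU) measure that IoU-based box losses build on can be sketched as follows. This is not the Wise-IoU weighting itself; the function name `iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative helper only: this is the basic overlap ratio, not the
    Wise-IoU loss mentioned in the abstract.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    # Union = sum of both areas minus the overlap counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A common basic loss is `1 - iou(pred, target)`; weighted variants modify how that term is scaled per sample.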


Domain generalization with semi-supervised learning for people-centric activity recognition


Science China (Information Sciences), 2025, Vol. 68, No. 1, pp. 171-188

Authors: Jing LIU, Wei ZHU, Di LI, Xing HU, Liang SONG. Affiliations: Academy for Engineering & Technology, Fudan University; Shanghai East-bund Research Institute on Networking Systems of AI; School of Optoelectronic Information and Computer Engineering, University of Shanghai for Science & Technology

People-centric activity recognition is one of the most critical technologies in a wide range of real-world applications, including intelligent transportation systems, healthcare services, and brain-computer interfaces. Large-scale data collection and annotation make the application of machine learning algorithms prohibitively expensive when adapting to new tasks. One way of circumventing this limitation is to train the model in a semi-supervised learning manner that utilizes a percentage of unlabeled data to reduce the labeling burden in prediction tasks. Despite their appeal, these models often assume that labeled and unlabeled data come from similar distributions, which leads to the domain shift problem caused by the presence of distribution gaps. To address these limitations, we propose herein a novel method for people-centric activity recognition, called domain generalization with semi-supervised learning (DGSSL), that effectively enhances the representation learning and domain alignment capabilities of a model. We first design a new autoregressive discriminator for adversarial training between unlabeled and labeled source domains, extracting domain-specific features to reduce the distribution gaps. Second, we introduce two reconstruction tasks to capture the task-specific features to avoid losing information related to representation learning while maintaining task-specific consistency. Finally, benefiting from the collaborative optimization of these two tasks, the model can accurately predict both the domain and category labels of the source domains for the classification task. We conduct extensive experiments on three real-world sensing datasets. The experimental results show that DGSSL surpasses the three state-of-the-art methods with better performance and generalization.

Keywords: activity recognition; deep learning; domain generalization; semi-supervised learning; adversarial training


A Novel Fuzzy Marine White Shark Optimization Based Efficient Routing and Enhancing Network Lifetime in MANET


Wireless Personal Communications, 2023, Vol. 132, No. 4, pp. 2363-2385

Authors: Devi, K. Lalitha; Bhat, C. Rohith; Devi, K. Lalitha. Affiliations: KSR College of Engineering, Tiruchengode, Namakkal, India; Department of Computer Science and Engineering, Saveetha School of Engineering, SIMATS, Chennai, India; Department of Computer Science and Engineering, Sathyabama Institute of Science and Technology, Chennai, India

A mobile ad hoc network (MANET) is an independent, temporary wireless network established by a set of mobile nodes (e.g., laptops, smartphones, iPods) and suited to environments in which the network infrastructure is not fixed. The most common problems faced by MANETs are poor energy efficiency, high energy consumption, low network lifetime, and high traffic overhead, all of which affect the overall network topology. Hence, an energy-effective cluster head (CH) election is needed to counter these issues. This paper therefore proposes a novel model to enhance the network lifetime and energy efficiency by performing a routing strategy in MANET. An optimal CH is selected by a novel Fuzzy Marine White Shark Optimization (FMWSO) algorithm, which is obtained by integrating fuzzy operation with two optimization algorithms, namely the marine predator algorithm and the white shark optimizer. The proposed approach comprises three stages: generation of data, cluster generation, and CH selection. The FMWSO algorithm determines the CH selection in MANET, thereby enhancing the network topology and network lifetime while minimizing the overhead rate and energy consumption. Finally, the performance of the proposed FMWSO approach is compared with various existing techniques to determine the effectiveness of the system. The proposed FMWSO approach consumes a minimum energy of 0.62 mJ, which is lower than the other approaches.

Keywords: MANET; Cluster head; Fuzzy Marine White Shark; Cluster generation; Network lifetime; Energy consumption


Accelerating Distributed Urban Traffic Simulation via Enhanced Stale Synchronous Parallelism


11th IEEE International Conference on Behavioural and Social Computing, BESC 2024

Authors: Zhu, Haojia; Ma, Ran; Jin, Jiahui; Lu, Hongru. Affiliations: School of Computer Science and Engineering, Southeast University, China; School of Software Engineering, Southeast University, China; School of Computer Science, Nanjing Audit University, Nanjing, China

ISBN (print): 9798331531904

Modeling urban mobility behaviours with microscopic traffic flow simulation is now crucial for studying intelligent urban decision-making algorithms, such as traffic light control and road congestion charging. However, in urban-scale traffic environments, simulation is computationally intensive. Most existing studies use distributed traffic simulations based on the Bulk Synchronous Parallel (BSP) model, leading to significant time and data transfer costs due to the need for coordination and data synchronization. To address these issues, we propose an enhanced Stale Synchronous Parallel (ESSP) model. Our model reduces the waiting time of the simulation compute nodes, and we design it to control and measure errors during simulation. We also develop a distributed traffic simulator to validate our model, and the results show a performance improvement of 30% to 50% compared to conventional synchronization methods. © 2024 IEEE.

Keywords: Traffic congestion
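
The abstract above contrasts BSP with stale synchronous parallelism but does not describe the enhanced (ESSP) rule. As a minimal sketch of the general idea only, the gate below lets a worker start its next step when it is at most `staleness` completed steps ahead of the slowest worker; `staleness = 0` then behaves like a BSP barrier. The function names, the clock representation, and this exact form of the rule are assumptions for illustration, not the authors' model.

```python
def may_advance(worker_clock, all_clocks, staleness):
    """Simplified staleness gate (illustrative, not the paper's ESSP).

    Clocks count completed simulation steps. A worker may begin another
    step only if it is at most `staleness` steps ahead of the slowest worker.
    """
    return worker_clock - min(all_clocks) <= staleness


def blocked_workers(all_clocks, staleness):
    """Return the indices of workers that must wait under the gate above."""
    return [
        index
        for index, clock in enumerate(all_clocks)
        if not may_advance(clock, all_clocks, staleness)
    ]


if __name__ == "__main__":
    clocks = [5, 3, 4]  # completed steps per worker (hypothetical values)
    print(blocked_workers(clocks, staleness=0))  # BSP-like: fast workers wait
    print(blocked_workers(clocks, staleness=2))  # looser bound: nobody waits
```

Raising `staleness` trades waiting time for the age of the neighbouring state a worker may read, which is why an error-control mechanism of the kind the abstract mentions matters.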


Exploring diversity and time-aware recommendations: an LSTM-DNN model with novel bidirectional dynamic time warping algorithm


Soft Computing, 2025, Vol. 29, No. 4, pp. 2003-2013

Authors: Li, Te; Chen, Liqiong; Sun, Huaiying; Hou, Mengxia; Lei, Yunjie; Zhi, Kaiwen. Affiliations: School of Computer and Engineering, East China University of Science and Technology, Meilong Road, Shanghai 200237, China; School of Computer Science and Information Engineering, Shanghai Institute of Technology, Haiquan Road, Shanghai 201400, China; School of Chemistry and Environmental Engineering, Shanghai Institute of Technology, Haiquan Road, Shanghai 201400, China

With the advent of the Web 3.0 era, the amount and types of data in the network have sharply increased, and the application scenarios of recommendation algorithms are continuously expanding. Location recommendation has gradually become one of the popular application scenarios for recommendation algorithms. Traditional recommendation algorithms not only ignore the temporal attribute of data when recommending information to users, but also blindly pursue recommendation accuracy, which causes "information cocoon" problems. Therefore, this article treats user historical data as a time series and proposes an LSTM-DNN model based on a novel bidirectional Dynamic Time Warping (DTW) algorithm. Firstly, in response to the issue of different users consuming different amounts of information, this article proposes a novel bidirectional DTW algorithm to calculate the similarity between different users. Secondly, this article supplements the user dataset from three perspectives: "utilization" and "exploration" of information, and spatiotemporal attributes of data, which alleviates the problems of data sparsity and cold start in the dataset to a certain extent. Moreover, it effectively enhances the diversity of recommendation results. Finally, this paper constructs a Long Short-Term Memory-Deep Neural Networks (LSTM-DNN) model to dynamically obtain user interests and preferences, and proposes a new metric, Cumulative Self-System Diversity (CSSD), to measure the diversity of algorithm recommendation results. Experiments have shown that the model effectively enhances the diversity of recommendation results while ensuring recommendation accuracy. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.

Keywords: Time series
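
The abstract above builds on Dynamic Time Warping to compare users whose histories differ in length, but it does not specify the bidirectional variant. The sketch below shows only the classic DTW distance, as a reminder of how sequences of different lengths can be aligned; the function name `dtw_distance` and the absolute-difference local cost are assumptions for illustration.

```python
def dtw_distance(series_a, series_b):
    """Classic dynamic time warping distance between two numeric sequences.

    Illustrative baseline only: this is textbook DTW, not the bidirectional
    variant proposed in the paper.
    """
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning the first i items
    # of series_a with the first j items of series_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(series_a[i - 1] - series_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # series_a advances alone
                cost[i][j - 1],      # series_b advances alone
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    # Sequences of different lengths can still align with zero cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because one element may be matched to several elements of the other sequence, the two inputs need not have equal length, which fits the abstract's point about users consuming different amounts of information.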


ER-Net: Efficient Recalibration Network for Multi-View Multi-Person 3D Pose Estimation


Computer Modeling in Engineering & Sciences, 2023, Vol. 136, No. 8, pp. 2093-2109

Authors: Mi Zhou, Rui Liu, Pengfei Yi, Dongsheng Zhou. Affiliations: National and Local Joint Engineering Laboratory of Computer Aided Design, School of Software Engineering, Dalian University, Dalian 116622, China; School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China

Multi-view multi-person 3D human pose estimation is a hot topic in the field of human pose estimation due to its wide range of application *** the introduction of end-to-end direct regression methods, the field has entered a new stage of ***, the regression results of joints that are more heavily influenced by external factors are not accurate enough even for the optimal *** this paper, we propose an effective feature recalibration module based on the channel attention mechanism and a relative optimal calibration strategy, which is applied to the multi-view multi-person 3D human pose estimation task to achieve improved detection accuracy for joints that are more severely affected by external ***, it achieves relative optimal weight adjustment of joint feature information through the recalibration module and strategy, which enables the model to learn the dependencies between joints and the dependencies between people and their corresponding *** call this method the Efficient Recalibration Network (ER-Net). Finally, experiments were conducted on two benchmark datasets for this task, Campus and Shelf, in which the PCP reached 97.3% and 98.3%, respectively.

Keywords: Multi-view multi-person pose estimation; attention mechanism; computer vision


Location Enhancement and Multi-template Fusion for Fast Long-Term Tracking


5th International Conference on Artificial Intelligence and Electromechanical Automation, AIEA 2024

Author: Chen, Shixin. Affiliation: Software Engineering Institute of Guangzhou, Department of Computer Science, Guangzhou 510980, China

ISBN (print): 9798350366174

Because target occlusion often causes tracking failure, efficient and robust tracking of targets under occlusion has become a focal point of research. This paper addresses tracking failure caused by the reduction or disappearance of target appearance information in occlusion scenarios, and proposes a deep-learning-based approach to target tracking under occlusion. A foundation tracker is constructed that leverages location enhancement and multi-template fusion. Built upon this, graph attention mechanisms and target loss detection mechanisms are introduced. In cases where the target is lost due to occlusion, a re-detection strategy is employed to resume tracking once the target reappears. Experiments conducted on datasets demonstrate that the proposed algorithm can effectively determine whether a target is undergoing occlusion and significantly enhances tracking performance under such conditions. © 2024 IEEE.

Keywords: Target tracking


A Method Based on Genetic Programming to Automatically Construct Factors for Annual Report Scenarios


20th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, ICNC-FSKD 2024

Authors: Ma, Yan; Zhang, Changsheng; Gao, Yan; Guo, Ying. Affiliations: School of Computer Science and Engineering, Northeastern University, Shenyang, China; Software College, Northeastern University, Shenyang, China; College of Computer Science and Engineering, Ningxia Institute of Science and Technology, Shizuishan, China

ISBN (print): 9798350356328

Factors have always played an important role in stock analysis, but they are only effective for specific problems in specific scenarios. Therefore, constructing factors timely and quickly for different scenarios is an urgent problem to be solved. Although some experts have constructed factors, they need to manually construct factors for each scenario, and the construction process consumes time and effort. The annual report is an important and common scenario. It is an important form of information disclosure and financial reporting. Therefore, this paper proposes a method for automatically constructing factors based on genetic programming that combines expert experience for this scenario. It incorporates the knowledge and insights of experts in the field of stock analysis into the process of automatically constructing factors, and continuously adjusts and improves the combination of factors through genetic programming to adapt to the needs of the scenario. The effectiveness of this method is verified through empirical analysis and its advantages in specific scenarios are demonstrated. © 2024 IEEE.

Keywords: Genetic programming


Theoretical Analysis of an Adaptive Closeness Centrality-Based Algorithm for Dynamic Optimization of Transportation Networks


2024 International Conference on Engineering Management of Communication and Technology, EMCTECH 2024

Author: Mann, Michael. Affiliation: Afeka - the Academic College of Engineering, School of Software Engineering and Computer Science, Tel Aviv, Israel

ISBN (digital): 9798331507909

ISBN (print): 9798331507909

Purpose: This paper presents a theoretical analysis of the DynaTrans algorithm, a novel approach for dynamic optimization of urban transportation networks. Design/methodology/approach: We introduce an Adaptive Closeness Centrality (ACC) metric and the DynaTrans algorithm, providing formal proofs of correctness, convergence, and efficiency. The analysis employs graph theory, algorithmic complexity theory, and competitive analysis techniques. Findings: We prove that DynaTrans converges to a locally optimal state in O(|V|/ϵ) iterations, with each iteration having limited computational and memory requirements. The algorithm achieves solutions within a factor of O(log|V|) from the global optimum, outperforming simple greedy approaches. Practical implications: DynaTrans offers a theoretically sound foundation for real-time traffic management systems, potentially improving urban mobility and reducing congestion. Originality/value: This work introduces a new paradigm for dynamic transportation network optimization, combining adaptive centrality measures with efficient graph algorithms. The rigorous theoretical analysis provides a solid basis for practical implementation and future research in urban traffic management. © 2024 IEEE.

Keywords: Traffic congestion
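
The abstract above introduces an Adaptive Closeness Centrality metric without defining it. For orientation only, the sketch below computes the classic (non-adaptive) closeness centrality of one node in an unweighted graph using breadth-first search. The adjacency-dictionary format, the function name, and the convention of normalising by the number of reachable nodes are assumptions for illustration, not the ACC metric or the DynaTrans algorithm.

```python
from collections import deque


def closeness_centrality(graph, source):
    """Classic closeness centrality of `source` in an unweighted graph.

    `graph` maps each node to an iterable of its neighbours. Illustrative
    baseline only: this is not the adaptive metric from the paper.
    """
    # Breadth-first search yields shortest hop counts from `source`.
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    total = sum(distance.values())
    if total == 0:
        return 0.0  # isolated node: nothing else is reachable
    # (reachable nodes excluding source) / (sum of shortest distances)
    return (len(distance) - 1) / total


if __name__ == "__main__":
    # Hypothetical three-intersection road segment: a - b - c
    road = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(closeness_centrality(road, "b"))
    print(closeness_centrality(road, "a"))
```

In a road network, a high value marks an intersection that is few hops from everything else, which is the intuition a centrality-driven optimizer would exploit.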

