Refine search results

Document type

  • 30 journal articles
  • 23 conference papers

Collection scope

  • 53 electronic documents
  • 0 print holdings

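Subject classification

  • 48 papers: Engineering
    • 43 papers: Computer Science and Technology...
    • 14 papers: Electrical Engineering
    • 10 papers: Information and Communication Engineering
    • 9 papers: Software Engineering
    • 6 papers: Control Science and Engineering
    • 3 papers: Mechanical Engineering
    • 3 papers: Biomedical Engineering (may confer...
    • 2 papers: Electronic Science and Technology (may...
    • 2 papers: Surveying and Mapping Science and Technology
    • 2 papers: Environmental Science and Engineering (may...
    • 2 papers: Cyberspace Security
    • 1 paper: Instrument Science and Technology
  • 7 papers: Management
    • 6 papers: Management Science and Engineering (may...
    • 1 paper: Library, Information and Archives Manag...
  • 6 papers: Medicine
    • 4 papers: Clinical Medicine
    • 2 papers: Special Medicine
  • 5 papers: Science
    • 2 papers: Mathematics
    • 2 papers: Geophysics
    • 1 paper: Biology
  • 1 paper: Education
    • 1 paper: Education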

Topics

  • 53 papers: vision language ...
  • 14 papers: large language m...
  • 4 papers: multimodal learn...
  • 4 papers: multimodal model
  • 4 papers: foundation model
  • 4 papers: training
  • 3 papers: deep learning
  • 3 papers: visual reasoning
  • 3 papers: visualization
  • 3 papers: multimodal
  • 3 papers: prompt learning
  • 2 papers: vision transform...
  • 2 papers: tuning
  • 2 papers: visual question ...
  • 2 papers: smart manufactur...
  • 2 papers: multimodal retri...
  • 2 papers: computational mo...
  • 2 papers: semantics
  • 2 papers: multimodal large...
  • 2 papers: data models

Institutions

  • 2 papers: peng cheng lab p...
  • 1 paper: natl univ singap...
  • 1 paper: east china norma...
  • 1 paper: fudan univ sch c...
  • 1 paper: karlsruhe inst t...
  • 1 paper: shanghai univ en...
  • 1 paper: tulane univ dept...
  • 1 paper: cent south univ ...
  • 1 paper: univ connecticut...
  • 1 paper: google res mount...
  • 1 paper: univ auckland ct...
  • 1 paper: qualcomm ai res ...
  • 1 paper: meta fair menlo ...
  • 1 paper: calif state univ...
  • 1 paper: nanjing universi...
  • 1 paper: south china univ...
  • 1 paper: univ calif merce...
  • 1 paper: univ s florida c...
  • 1 paper: src biosci tampa...
  • 1 paper: hitachi ltd

Authors

  • 3 papers: zhu xinliang
  • 3 papers: dhua arnab
  • 2 papers: jayaprakash s.l.
  • 2 papers: gray douglas
  • 2 papers: tran son
  • 2 papers: fan haolin
  • 2 papers: mukesh k.
  • 2 papers: fuh jerry ying h...
  • 2 papers: li bingbing
  • 1 paper: zheng hongwei
  • 1 paper: zhang youzhi
  • 1 paper: gao yongbin
  • 1 paper: li yonghao
  • 1 paper: zhang hongji
  • 1 paper: chen hang
  • 1 paper: peng chengli
  • 1 paper: liu zhe
  • 1 paper: komamizu takahir...
  • 1 paper: yu runsheng
  • 1 paper: kang byung ok

Language

  • 53 papers: English

Search criteria: "Subject term = Vision Language Model"
53 records; showing 1-10
Soccer-CLIP: Vision Language Model for Soccer Action Spotting
IEEE ACCESS, 2025, vol. 13, pp. 44354-44365
Authors: Shin, Yoonho; Park, Sanghoon; Han, Youngsub; Jeon, Byoung-Ki; Lee, Soonyoung; Kang, Byung Jun
Affiliations: LG UPlus Seoul 07795 South Korea; LG AI Res Seoul 07336 South Korea
In the rapidly advancing field of computer vision, the application of multimodal models, specifically vision-language frameworks, has shown substantial promise for complex tasks such as video-based action spotting. Thi...
Vision language model for interpretable and fine-grained detection of safety compliance in diverse workplaces
EXPERT SYSTEMS WITH APPLICATIONS, 2025, vol. 265
Authors: Chen, Zhiling; Chen, Hanning; Imani, Mohsen; Chen, Ruimin; Imani, Farhad
Affiliations: Univ Connecticut Sch Mech Aerosp & Mfg Engn Storrs CT 06269 USA; Univ Calif Irvine Dept Comp Sci Irvine CA USA
Workplace accidents due to personal protective equipment (PPE) non-compliance raise serious safety concerns and lead to legal liabilities, financial penalties, and reputational damage. While object detection models ha...
Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Lee, Donghoon; Luu, Tung M.; Lee, Younghwan; Yoo, Chang D.
Affiliations: Robotics Program KAIST Daejeon Korea Republic of; Electrical Engineering KAIST Daejeon Korea Republic of
Recent research highlights the potential of multimodal foundation models in tackling complex decision-making challenges. However, their large parameters make real-world deployment resource-intensive and often impracti...
VLM-MSGraph: Vision Language Model-enabled Multi-hierarchical Scene Graph for robotic assembly
ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING, 2025, vol. 94
Authors: Li, Shufei; Yan, Zhijie; Wang, Zuoxu; Gao, Yiping
Affiliations: Beihang Univ Sch Mech Engn & Automat Beijing Peoples R China; City Univ Hong Kong Dept Syst Engn Hong Kong Peoples R China; Huazhong Univ Sci & Technol State Key Lab Digital Mfg Equipment & Technol Wuhan Peoples R China
Intelligent robotic assembly is becoming a pivotal component of the manufacturing sector, driven by growing demands for flexibility, sustainability, and resilience. Robots in manufacturing environments need perception...
Alzheimer's disease recognition using graph neural network by leveraging image-text similarity from vision language model
SCIENTIFIC REPORTS, 2025, vol. 15, no. 1, pp. 1-14
Authors: Lee, Byounghwa; Bang, Jeong-Uk; Song, Hwa Jeon; Kang, Byung Ok
Affiliations: Elect & Telecommun Res Inst ETRI Integrated Intelligence Res Sect Daejeon 34129 South Korea
Alzheimer's disease (AD), a progressive neurodegenerative condition, notably impacts cognitive functions and daily activity. One method of detecting dementia involves a task where participants describe a given pic...
Vision Language Model Empowered Surgical Planning
2024 International Conference on Intelligent Robotics and Automatic Control
Authors: Chen, Yihe; Yu, Runsheng; Wang, Xin; Wang, Wensheng; Tan, Ning; Zhang, Youzhi
Affiliations: Nanjing Univ Sch Artificial Intelligence Nanjing Peoples R China; Hong Kong Univ Sci & Technol Dept Comp Sci Hong Kong Peoples R China; Sun Yat Sen Univ Sch Comp Sci & Engn Guangzhou Peoples R China; Chinese Acad Sci Hong Kong Inst Sci & Innovat Ctr Artificial Intelligence & Robot Hong Kong Peoples R China
The integration of a flexible endoscope with a surgical manipulator is crucial in minimally invasive surgery (MIS), facilitating detailed visualization of the operative field within the patient's body. During MIS,...
ZEN-IQA: Zero-Shot Explainable and No-Reference Image Quality Assessment With Vision Language Model
IEEE ACCESS, 2024, vol. 12, pp. 70973-70983
Authors: Miyata, Takamichi
Affiliations: Chiba Inst Technol Chiba 2750016 Japan
No-reference image quality assessment (NR-IQA), which aims to estimate the perceptual quality of a degraded image without accessing the corresponding original image, is a key challenge in low-level computer vision. Re...
Enhancing metal additive manufacturing training with the advanced vision language model: A pathway to immersive augmented reality training for non-experts
JOURNAL OF MANUFACTURING SYSTEMS, 2024, vol. 75, pp. 257-269
Authors: Fan, Haolin; Zhang, Hongji; Ma, Changyu; Wu, Tongzi; Fuh, Jerry Ying Hsi; Li, Bingbing
Affiliations: Natl Univ Singapore Dept Mech Engn Singapore 117575 Singapore; Calif State Univ Northridge Auton Res Ctr STEAHM ARCS Northridge CA 91330 USA
This paper introduces an innovative training system for the Renishaw AM400 metal printer, leveraging the synergy of the advanced vision language model (VLM) with Augmented Reality (AR) within the Digital Twins (DT) fr...
Rugby Scene Classification Enhanced by Vision Language Model
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Nonaka, Naoki; Fujihira, Ryo; Koshiba, Toshiki; Maeda, Akira; Seita, Jun
Affiliations: RIKEN Informat R&D & Strategy Headquarters Adv Data Sci Project Wako Saitama Japan; Hakata Knee & Sports Clin Fukuoka Japan
This study investigates the integration of vision language models (VLM) to enhance the classification of situations within rugby match broadcasts. The importance of accurately identifying situations in sports videos i...
QViLa: Quantum Infused Vision-Language Model for Enhanced Multimodal Understanding
SN Computer Science, 2024, vol. 5, no. 8, p. 1023
Authors: Mukesh, K.; Jayaprakash, S.L.; Kumar, R. Prasanna
Affiliations: Department of Computer Science and Engineering Amrita School of Computing Amrita Vishwa Vidyapeetham Tamilnadu Chennai 601103 India
Vision-language models have emerged as transformative tools, revolutionizing the integration of visual and textual information, forging pathways for nuanced interpretations across various applications. The evolution o...