
Refine Search Results

Document Type

  • 90 Conference papers
  • 70 Journal articles
  • 1 Dissertation

Holdings

  • 161 Electronic resources
  • 0 Print holdings

Date Distribution

Subject Classification

  • 153 Engineering
    • 124 Computer Science and Technology...
    • 35 Electrical Engineering
    • 15 Software Engineering
    • 12 Information and Communication Engineering
    • 12 Control Science and Engineering
    • 9 Surveying and Mapping Science and Technology
    • 8 Electronic Science and Technology (...
    • 7 Biomedical Engineering (...
    • 4 Mechanical Engineering
    • 4 Instrument Science and Technology
    • 4 Materials Science and Engineering (...
    • 2 Transportation Engineering
    • 1 Aeronautics and Astronautics Science and Tech...
    • 1 Environmental Science and Engineering (...
    • 1 Bioengineering
  • 31 Medicine
    • 22 Clinical Medicine
    • 8 Special Medicine
    • 4 Basic Medicine (...
    • 1 Integrated Traditional Chinese and Western Medicine
    • 1 Medical Technology (...
  • 23 Science
    • 10 Geophysics
    • 8 Physics
    • 6 Chemistry
    • 5 Biology
    • 3 Geography
    • 1 Astronomy
    • 1 Geology
  • 5 Management
    • 4 Management Science and Engineering (...
    • 1 Library, Information and Archival Manage...
  • 1 Philosophy
    • 1 Philosophy
  • 1 Agronomy

Topics

  • 161 vision-language ...
  • 16 large language m...
  • 13 prompt learning
  • 11 clip
  • 11 few-shot learnin...
  • 10 visualization
  • 7 contrastive lear...
  • 6 foundation model...
  • 6 remote sensing
  • 6 training
  • 6 adaptation model...
  • 5 object detection
  • 5 deep learning
  • 5 feature extracti...
  • 5 image classifica...
  • 4 long-tailed reco...
  • 4 computational mo...
  • 4 artificial intel...
  • 4 computer vision
  • 4 domain generaliz...

Institutions

  • 4 chinese acad sci...
  • 4 carnegie mellon ...
  • 4 univ chinese aca...
  • 3 inesc tec porto
  • 3 sichuan univ col...
  • 3 univ chinese aca...
  • 3 zhejiang univ pe...
  • 3 chinese univ hon...
  • 2 shanghai ai lab ...
  • 2 ecole polytech f...
  • 2 tsinghua univ de...
  • 2 harbin inst tech...
  • 2 univ porto fac e...
  • 2 cent south univ ...
  • 2 beijing univ pos...
  • 2 city univ hong k...
  • 2 china univ geosc...
  • 2 sichuan univ col...
  • 2 tech univ munich...
  • 2 westlake univ sc...

Authors

  • 4 banerjee biplab
  • 4 zhang yi
  • 4 jha ankit
  • 3 wang donglin
  • 3 singha mainak
  • 3 ding kun
  • 3 zhang ce
  • 3 tuia devis
  • 2 men aidong
  • 2 li haifeng
  • 2 zhang min
  • 2 liu xuyang
  • 2 chen honggang
  • 2 ma chao
  • 2 guo miaotian
  • 2 yang yang
  • 2 ricci elisa
  • 2 ye mao
  • 2 tian liang
  • 2 patricio cristia...

Language

  • 159 English
  • 1 Other
Search query: "Subject = Vision-language Models"
161 records; showing results 1-10.
Large vision-language models enabled novel objects 6D pose estimation for human-robot collaboration
ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING, 2025, Vol. 95
Authors: Xia, Wanqing; Zheng, Hao; Xu, Weiliang; Xu, Xun. Univ Auckland Dept Mech & Mechatron Engn, Auckland, New Zealand
Six-Degree-of-Freedom (6D) pose estimation is essential for robotic manipulation tasks, especially in human-robot collaboration environments. Recently, 6D pose estimation has been extended from seen objects to novel o...
Consistent prompt learning for vision-language models
KNOWLEDGE-BASED SYSTEMS, 2025, Vol. 310
Authors: Zhang, Yonggang; Tian, Xinmei. Hong Kong Baptist Univ Dept Comp Sci, Hong Kong, Peoples R China; Univ Sci & Technol China Natl Engn Lab Brain Inspired Intelligence Technol, Hefei 230000, Anhui, Peoples R China
Pre-trained vision-language models, such as CLIP, have shown remarkable capabilities across various downstream tasks by learning prompts that consist of context concatenated with a class name; for example, 'a photo...
H2R Bridge: Transferring vision-language models to few-shot intention meta-perception in human robot collaboration
JOURNAL OF MANUFACTURING SYSTEMS, 2025, Vol. 80, pp. 524-535
Authors: Wu, Duidi; Zhao, Qianyou; Fan, Junming; Qi, Jin; Zheng, Pai; Hu, Jie. Shanghai Jiao Tong Univ Sch Mech Engn, Shanghai 200240, Peoples R China; State Key Lab Mech Syst & Vibrat, Shanghai 200240, Peoples R China; Hong Kong Polytech Univ Dept Ind & Syst Engn, Hong Kong, Peoples R China
Human-robot collaboration enhances efficiency by enabling robots to work alongside human operators in shared tasks. Accurately understanding human intentions is critical for achieving a high level of collaboration. Ex...
APOVIS: Automated pixel-level open-vocabulary instance segmentation through integration of pre-trained vision-language models and foundational segmentation models
IMAGE AND VISION COMPUTING, 2025, Vol. 154
Authors: Ma, Qiujie; Yang, Shuqi; Zhang, Lijuan; Lan, Qing; Yang, Dongdong; Chen, Honghan; Tan, Ying. Southwest Minzu Univ State Ethn Affairs Commiss Key Lab Comp Syst, Chengdu, Peoples R China; Hosp Chengdu Univ TCM, Chengdu, Peoples R China
In recent years, substantial advancements have been achieved in vision-language integration and image segmentation, particularly through the use of pre-trained models like BERT and Vision Transformer (ViT). Within the...
Fine-grained multi-modal prompt learning for vision-language models
NEUROCOMPUTING, 2025, Vol. 636
Authors: Liu, Yunfei; Deng, Yunziwei; Liu, Anqi; Liu, Yanan; Li, Shengyang. Chinese Acad Sci Technol & Engn Ctr Space Utilizat, Beijing 100094, Peoples R China; Chinese Acad Sci Key Lab Space Utilizat, Beijing 100094, Peoples R China; Univ Chinese Acad Sci, Beijing 100094, Peoples R China
Recently advanced pre-trained vision-language models have demonstrated outstanding performance in many downstream tasks via prompt learning. Prompt learning provides task-specific prompt information to exploit benefic...
Advancements in Vision-Language Models for Remote Sensing: Datasets, Capabilities, and Enhancement Techniques
REMOTE SENSING, 2025, Vol. 17, No. 1, p. 162
Authors: Tao, Lijie; Zhang, Haokui; Jing, Haizhao; Liu, Yu; Yan, Dawei; Wei, Guoting; Xue, Xizhe. Northwestern Polytech Univ Sch Cybersecur Int Cooperat Dept, Xian 710072, Peoples R China; Zhejiang Lab Inst Intelligent Percept, Hangzhou 311500, Peoples R China; Nanjing Univ Sci & Technol Sch Comp Sci & Engn, Nanjing 210094, Peoples R China; Tech Univ Munich Dept Aerosp & Geodesy, DE-80333 Munich, Germany
Recently, the remarkable success of ChatGPT has sparked a renewed wave of interest in artificial intelligence (AI), and the advancements in vision-language models (VLMs) have pushed this enthusiasm to new heights. Dif...
Integrating With Multimodal Information for Enhancing Robotic Grasping With Vision-Language Models
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2025, Vol. 22, pp. 13073-13086
Authors: Zhao, Zhou; Zheng, Dongyuan; Chen, Yizi; Luo, Jing; Wang, Yanjun; Huang, Panfeng; Yang, Chenguang. Cent China Normal Univ Sch Comp Sci, Wuhan 430079, Peoples R China; Hubei Engn Res Ctr Intelligent Detect & Identifica, Wuhan 430205, Peoples R China; ETH Inst Cartog & Geoinformat, CH-8093 Zurich, Switzerland; Wuhan Univ Technol Sch Automat, Wuhan 430070, Peoples R China; Shanghai Jiao Tong Univ Inst Marine Equipment, Shanghai 200240, Peoples R China; Northwestern Polytech Univ Sch Astronaut Natl Key Lab Aerosp Flight Dynam, Xian 710072, Peoples R China; Northwestern Polytech Univ Res Ctr Intelligent Robot Sch Astronaut, Xian 710072, Peoples R China; Univ Liverpool Dept Comp Sci, Liverpool L69 3BX, England
As robots grow increasingly intelligent and utilize data from various sensors, relying solely on unimodal data sources is becoming inadequate for their operational needs. Consequently, integrating multimodal data has ...
IFShip: Interpretable fine-grained ship classification with domain knowledge-enhanced vision-language models
PATTERN RECOGNITION, 2025, Vol. 166
Authors: Guo, Mingning; Wu, Mengwei; Shen, Yuxiang; Li, Haifeng; Tao, Chao. Cent South Univ Sch Geosci & Info Phys, Changsha 410083, Peoples R China
End-to-end interpretation currently dominates the remote sensing fine-grained ship classification (RS-FGSC) task. However, the inference process remains uninterpretable, leading to criticisms of these models as "...
Bootstrapping Vision-Language Models for Frequency-Centric Self-Supervised Remote Physiological Measurement
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025, pp. 1-22
Authors: Yue, Zijie; Shi, Miaojing; Wang, Hanli; Ding, Shuai; Chen, Qijun; Yang, Shanlin. Tongji Univ Coll Elect & Informat Engn, Shanghai, Peoples R China; Tongji Univ Shanghai Inst Intelligent Sci & Technol, Shanghai, Peoples R China; Hefei Univ Technol Sch Management, Hefei, Peoples R China
Facial video-based remote physiological measurement is a promising research area for detecting human vital signs (e.g., heart rate, respiration frequency) in a non-contact way. Conventional approaches are mostly super...
Unleash the Power of Vision-Language Models by Visual Attention Prompt and Multimodal Interaction
IEEE TRANSACTIONS ON MULTIMEDIA, 2025, Vol. 27, pp. 2399-2411
Authors: Zhang, Wenyao; Wu, Letian; Zhang, Zequn; Yu, Tao; Ma, Chao; Jin, Xin; Yang, Xiaokang; Zeng, Wenjun. Shanghai Jiao Tong Univ AI Inst MoE Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China; Ningbo Inst Digital Twin Eastern Inst Technol, Ningbo 315200, Peoples R China; Southeast Univ Sch Automat, Nanjing 210096, Peoples R China; Univ Sci & Technol China Dept Elect Engn & Informat Sci, Hefei 230026, Peoples R China
Pre-trained vision-language models (VLMs), equipped with parameter-efficient tuning (PET) methods like prompting, have shown impressive knowledge transferability on new downstream tasks, but they are still prone to be...