Refine search results

Document type

  • 91 records: Conference papers
  • 62 records: Journal articles
  • 1 record: Thesis

Holdings

  • 154 records: Electronic resources
  • 0 titles: Print holdings

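Subject classification

  • 146 records: Engineering
    • 120 records: Computer Science and Technology...
    • 30 records: Electrical Engineering
    • 15 records: Software Engineering
    • 13 records: Control Science and Engineering
    • 11 records: Information and Communication Engineering
    • 8 records: Electronic Science and Technology (may...
    • 7 records: Biomedical Engineering (may confer...
    • 6 records: Surveying and Mapping Science and Technology
    • 4 records: Mechanical Engineering
    • 4 records: Instrument Science and Technology
    • 4 records: Materials Science and Engineering (may...
    • 3 records: Transportation Engineering
    • 1 record: Aeronautical and Astronautical Science and Tech...
    • 1 record: Environmental Science and Engineering (may...
    • 1 record: Bioengineering
    • 1 record: Safety Science and Engineering
  • 28 records: Medicine
    • 19 records: Clinical Medicine
    • 8 records: Special Medicine
    • 4 records: Basic Medicine (may confer medicine...
  • 21 records: Science
    • 8 records: Physics
    • 7 records: Geophysics
    • 6 records: Chemistry
    • 5 records: Biology
    • 2 records: Geography
    • 1 record: Mathematics
    • 1 record: Astronomy
    • 1 record: Geology
    • 1 record: Statistics (may confer science, ...
  • 6 records: Management
    • 5 records: Management Science and Engineering (may...
  • 1 record: Philosophy
    • 1 record: Philosophy
  • 1 record: Agriculture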

Topics

  • 154 records: vision-language ...
  • 15 records: large language m...
  • 12 records: prompt learning
  • 10 records: clip
  • 10 records: few-shot learnin...
  • 6 records: contrastive lear...
  • 6 records: foundation model...
  • 6 records: visualization
  • 5 records: deep learning
  • 4 records: multimodal learn...
  • 4 records: object detection
  • 4 records: long-tailed reco...
  • 4 records: remote sensing
  • 4 records: image classifica...
  • 4 records: artificial intel...
  • 4 records: computer vision
  • 4 records: domain generaliz...
  • 4 records: prompt tuning
  • 3 records: representation l...
  • 3 records: image captioning

Institutions

  • 4 records: carnegie mellon ...
  • 4 records: univ chinese aca...
  • 3 records: inesc tec porto
  • 3 records: sichuan univ col...
  • 3 records: univ chinese aca...
  • 3 records: chinese univ hon...
  • 3 records: chinese acad sci...
  • 2 records: shanghai ai lab ...
  • 2 records: ecole polytech f...
  • 2 records: tsinghua univ de...
  • 2 records: harbin inst tech...
  • 2 records: zhejiang univ pe...
  • 2 records: univ porto fac e...
  • 2 records: beijing univ pos...
  • 2 records: city univ hong k...
  • 2 records: sichuan univ col...
  • 2 records: tech univ munich...
  • 2 records: westlake univ sc...
  • 2 records: univ elect sci &...
  • 2 records: johns hopkins un...

Authors

  • 4 records: banerjee biplab
  • 4 records: zhang yi
  • 4 records: jha ankit
  • 3 records: wang donglin
  • 3 records: singha mainak
  • 3 records: zhang ce
  • 3 records: tuia devis
  • 2 records: men aidong
  • 2 records: zhang min
  • 2 records: liu xuyang
  • 2 records: chen honggang
  • 2 records: guo miaotian
  • 2 records: yang yang
  • 2 records: ricci elisa
  • 2 records: ye mao
  • 2 records: tian liang
  • 2 records: patricio cristia...
  • 2 records: wang haiying
  • 2 records: teixeira luis f.
  • 2 records: mukhopadhyay sou...

Language

  • 152 records: English
  • 2 records: Other
Search criteria: "Subject term = Vision-language Models"
154 records; showing 1-10
Large vision-language models enabled novel objects 6D pose estimation for human-robot collaboration
ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING, 2025, vol. 95
Authors: Xia, Wanqing; Zheng, Hao; Xu, Weiliang; Xu, Xun
Affiliations: Univ Auckland Dept Mech & Mechatron Engn Auckland New Zealand
Six-Degree-of-Freedom (6D) pose estimation is essential for robotic manipulation tasks, especially in human-robot collaboration environments. Recently, 6D pose estimation has been extended from seen objects to novel o...
Consistent prompt learning for vision-language models
KNOWLEDGE-BASED SYSTEMS, 2025, vol. 310
Authors: Zhang, Yonggang; Tian, Xinmei
Affiliations: Hong Kong Baptist Univ Dept Comp Sci Hong Kong Peoples R China; Univ Sci & Technol China Natl Engn Lab Brain Inspired Intelligence Technol Hefei 230000 Anhui Peoples R China
Pre-trained vision-language models, such as CLIP, have shown remarkable capabilities across various downstream tasks by learning prompts that consist of context concatenated with a class name; for example, 'a photo...
H2R Bridge: Transferring vision-language models to few-shot intention meta-perception in human robot collaboration
JOURNAL OF MANUFACTURING SYSTEMS, 2025, vol. 80, pp. 524-535
Authors: Wu, Duidi; Zhao, Qianyou; Fan, Junming; Qi, Jin; Zheng, Pai; Hu, Jie
Affiliations: Shanghai Jiao Tong Univ Sch Mech Engn Shanghai 200240 Peoples R China; State Key Lab Mech Syst & Vibrat Shanghai 200240 Peoples R China; Hong Kong Polytech Univ Dept Ind & Syst Engn Hong Kong Peoples R China
Human-robot collaboration enhances efficiency by enabling robots to work alongside human operators in shared tasks. Accurately understanding human intentions is critical for achieving a high level of collaboration. Ex...
APOVIS: Automated pixel-level open-vocabulary instance segmentation through integration of pre-trained vision-language models and foundational segmentation models
IMAGE AND VISION COMPUTING, 2025, vol. 154
Authors: Ma, Qiujie; Yang, Shuqi; Zhang, Lijuan; Lan, Qing; Yang, Dongdong; Chen, Honghan; Tan, Ying
Affiliations: Southwest Minzu Univ State Ethn Affairs Commiss Key Lab Comp Syst Chengdu Peoples R China; Hosp Chengdu Univ TCM Chengdu Peoples R China
In recent years, substantial advancements have been achieved in vision-language integration and image segmentation, particularly through the use of pre-trained models like BERT and Vision Transformer (ViT). Within the...
Fine-grained multi-modal prompt learning for vision-language models
NEUROCOMPUTING, 2025, vol. 636
Authors: Liu, Yunfei; Deng, Yunziwei; Liu, Anqi; Liu, Yanan; Li, Shengyang
Affiliations: Chinese Acad Sci Technol & Engn Ctr Space Utilizat Beijing 100094 Peoples R China; Chinese Acad Sci Key Lab Space Utilizat Beijing 100094 Peoples R China; Univ Chinese Acad Sci Beijing 100094 Peoples R China
Recently advanced pre-trained vision language models have demonstrated outstanding performance in many downstream tasks via prompt learning. Prompt learning provides task-specific prompt information to exploit benefic...
Advancements in Vision-Language Models for Remote Sensing: Datasets, Capabilities, and Enhancement Techniques
REMOTE SENSING, 2025, vol. 17, no. 1, p. 162
Authors: Tao, Lijie; Zhang, Haokui; Jing, Haizhao; Liu, Yu; Yan, Dawei; Wei, Guoting; Xue, Xizhe
Affiliations: Northwestern Polytech Univ Sch Cybersecur Int Cooperat Dept Xian 710072 Peoples R China; Zhejiang Lab Inst Intelligent Percept Hangzhou 311500 Peoples R China; Nanjing Univ Sci & Technol Sch Comp Sci & Engn Nanjing 210094 Peoples R China; Tech Univ Munich Dept Aerosp & Geodesy DE-80333 Munich Germany
Recently, the remarkable success of ChatGPT has sparked a renewed wave of interest in artificial intelligence (AI), and the advancements in vision-language models (VLMs) have pushed this enthusiasm to new heights. Dif...
Integrating With Multimodal Information for Enhancing Robotic Grasping With Vision-Language Models
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2025, vol. 22, pp. 13073-13086
Authors: Zhao, Zhou; Zheng, Dongyuan; Chen, Yizi; Luo, Jing; Wang, Yanjun; Huang, Panfeng; Yang, Chenguang
Affiliations: Cent China Normal Univ Sch Comp Sci Wuhan 430079 Peoples R China; Hubei Engn Res Ctr Intelligent Detect & Identifica Wuhan 430205 Peoples R China; ETH Inst Cartog & Geoinformat CH-8093 Zurich Switzerland; Wuhan Univ Technol Sch Automat Wuhan 430070 Peoples R China; Shanghai Jiao Tong Univ Inst Marine Equipment Shanghai 200240 Peoples R China; Northwestern Polytech Univ Sch Astronaut Natl Key Lab Aerosp Flight Dynam Xian 710072 Peoples R China; Northwestern Polytech Univ Res Ctr Intelligent Robot Sch Astronaut Xian 710072 Peoples R China; Univ Liverpool Dept Comp Sci Liverpool L69 3BX England
As robots grow increasingly intelligent and utilize data from various sensors, relying solely on unimodal data sources is becoming inadequate for their operational needs. Consequently, integrating multimodal data has ...
Bootstrapping Vision-Language Models for Frequency-Centric Self-Supervised Remote Physiological Measurement
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025, pp. 1-22
Authors: Yue, Zijie; Shi, Miaojing; Wang, Hanli; Ding, Shuai; Chen, Qijun; Yang, Shanlin
Affiliations: Tongji Univ Coll Elect & Informat Engn Shanghai Peoples R China; Tongji Univ Shanghai Inst Intelligent Sci & Technol Shanghai Peoples R China; Hefei Univ Technol Sch Management Hefei Peoples R China
Facial video-based remote physiological measurement is a promising research area for detecting human vital signs (e.g., heart rate, respiration frequency) in a non-contact way. Conventional approaches are mostly super...
Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-label Medical Image Classification
7th Chinese Conference on Pattern Recognition and Computer Vision
Authors: Ye, Yaoqin; Zhang, Junjie; Shi, Hongwei
Affiliations: Sichuan Univ Coll Comp Sci Chengdu Peoples R China
The task of medical image recognition is notably complicated by the presence of varied and multiple pathological indications, presenting a unique challenge in multi-label classification with unseen labels. This comple...
Adapting Vision-Language Models to Open Classes via Test-Time Prompt Tuning
7th Chinese Conference on Pattern Recognition and Computer Vision
Authors: Gao, Zhengqing; Ao, Xiang; Zhang, Xu-Yao; Liu, Cheng-Lin
Affiliations: Chinese Acad Sci Inst Automat MAIS Beijing Peoples R China; Univ Chinese Acad Sci Sch Artificial Intelligence Beijing Peoples R China
Adapting pre-trained models to open classes is a challenging problem in machine learning. Vision-language models fully explore the knowledge of text modality, demonstrating strong zero-shot recognition performance, wh...