Search criteria: "Subject term = Vision-language Models"
154 records; showing 51-60

GraphVL: Graph-Enhanced Semantic Modeling via Vision-Language Models for Generalized Class Discovery
15th Indian Conference on Computer Vision, Graphics and Image Processing
Authors: Solanki, Bhupendra; Nair, Ashwin R.; Singha, Mainak; Mukhopadhyay, Souradeep; Jha, Ankit; Banerjee, Biplab
Affiliations: Indian Inst Technol, Mumbai, Maharashtra, India; Indian Inst Sci Educ & Res Thiruvananthapuram, Thiruvananthapuram, Kerala, India; Indian Inst Sci, Bangalore, Karnataka, India; LNM Inst Informat Technol, Jaipur, Rajasthan, India
Generalized Category Discovery (GCD) aims to cluster unlabeled images into known and novel categories using labeled images from known classes. To address the challenge of transferring features from known to unknown cl...

Cross-Modal Concept Learning and Inference for Vision-Language Models
NEUROCOMPUTING, 2024, vol. 583
Authors: Zhang, Yi; Zhang, Ce; Tang, Yushun; He, Zhihai
Affiliations: Harbin Inst Technol, Harbin 150001, Peoples R China; Southern Univ Sci & Technol, Shenzhen 518055, Peoples R China; Pengcheng Lab, Shenzhen 518000, Peoples R China
Large-scale pre-trained vision-language models (VLMs), such as CLIP, establish the correlation between texts and images, achieving remarkable success on various downstream tasks with fine-tuning. In existing fine-tu...

Reflectance estimation for proximity sensing by vision-language models: utilizing distributional semantics for low-level cognition in robotics
ADVANCED ROBOTICS, 2024, vol. 38, no. 18, pp. 1287-1306
Authors: Osada, Masashi; Ricardez, Gustavo A. Garcia; Suzuki, Yosuke; Taniguchi, Tadahiro
Affiliations: Ritsumeikan Univ, Coll Informat Sci & Engn, Kusatsu, Japan; Kanazawa Univ, Coll Sci & Engn, Kanazawa, Japan
Large language models (LLMs) and vision-language models (VLMs) have been increasingly used in robotics for high-level cognition, but their use for low-level cognition, such as interpreting sensor information, remains ...

Reflex-based open-vocabulary navigation without prior knowledge using omnidirectional camera and multiple vision-language models
ADVANCED ROBOTICS, 2024, vol. 38, no. 18, pp. 1307-1317
Authors: Kawaharazuka, Kento; Obinata, Yoshiki; Kanazawa, Naoaki; Tsukamoto, Naoto; Okada, Kei; Inaba, Masayuki
Affiliations: Univ Tokyo, Grad Sch Informat Sci & Technol, Dept Mechanoinformat, Tokyo, Japan
Various robot navigation methods have been developed, but they are mainly based on Simultaneous Localization and Mapping (SLAM), reinforcement learning, etc., which require prior map construction or learning. In this ...

Few-Shot Image Classification of Crop Diseases Based on Vision-Language Models
SENSORS, 2024, vol. 24, no. 18, p. 6109
Authors: Zhou, Yueyue; Yan, Hongping; Ding, Kun; Cai, Tingting; Zhang, Yan
Affiliations: China Univ Geosci, Sch Informat Engn, Beijing 100083, Peoples R China; Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
Accurate crop disease classification is crucial for ensuring food security and enhancing agricultural productivity. However, the existing crop disease classification algorithms primarily focus on a single image modali...

Compositional Kronecker Context Optimization for Vision-Language Models
NEUROCOMPUTING, 2024, vol. 608
Authors: Ding, Kun; Li, Xiaohui; Yu, Qiang; Wang, Ying; Zhang, Haojian; Xiang, Shiming
Affiliations: Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing, Peoples R China; Chinese Acad Sci, Inst Automat, Engn Lab Intelligent Ind Vis, Beijing, Peoples R China; Chinese Acad Sci, Inst Automat, Res Ctr Aerosp Informat, Beijing, Peoples R China
Context Optimization (CoOp) has emerged as a simple yet effective technique for adapting CLIP-like vision-language models to downstream image recognition tasks. Nevertheless, learning context with satisfactory base-t...

Vision-Language Models Learn Super Images for Efficient Partially Relevant Video Retrieval
ACM Transactions on Multimedia Computing, Communications, and Applications
Authors: Taichi Nishimura; Shota Nakada; Masayoshi Kondo
Affiliations: LY Corporation, Japan
In this paper, we propose an efficient and high-performance method for partially relevant video retrieval. The method aims to retrieve long videos that contain at least one moment relevant to the input text query. The...

A Slim Prompt-Averaged Consistency prompt learning for vision-language model
KNOWLEDGE-BASED SYSTEMS, 2025, vol. 310
Authors: He, Siyu; Wang, Shengsheng; Long, Sifan
Affiliations: Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China; Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
Recent advancements in prompt tuning have enhanced the adaptation of large pre-trained models to target tasks. However, existing methods struggle to establish an effective balance between task-specific knowledge and g...

Generalized Robotic Vision-Language Learning Model via Linguistic Foreground-Aware Contrast
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025, vol. 133, no. 6, pp. 3481-3518
Authors: Liu, Kangcheng; Wang, Chaoqun; Han, Xiaodong; Liu, Yong-Jin; Chen, Baoquan
Affiliations: Hunan Univ, Coll Elect & Informat Engn, Changsha, Peoples R China; CALTECH, Div Engn & Appl Sci, Pasadena, CA 91125 USA; Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China; Shandong Univ, Sch Control Sci & Engn, Jinan, Peoples R China; Minjiang Univ, Sch Control Engn, Fuzhou, Peoples R China; Peking Univ, Sch Artificial Intelligence, Beijing, Peoples R China
Contrastive learning has recently demonstrated great potential for unsupervised pre-training in 3D scene understanding tasks. However, most existing work randomly selects point features as anchors while building contr...

A prompt-free vision-language model for environmental perception in automated driving systems
PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART D-JOURNAL OF AUTOMOBILE ENGINEERING, 2025
Authors: Lin, Chaojun; Shi, Ying; Hu, Qin; Zhang, Lei
Affiliations: Wuhan Univ Technol, Dept Automat, 122 Luoshi Rd, Wuhan 430070, Peoples R China
Environmental perception is a critical component of automated driving systems. Advancing environmental perception algorithms toward applications in open-world road scenarios is a current research trend. However, tradi...