Search criteria: Subject term = "Vision-language Model"
155 records; showing 11-20
CLIP4STR: A Simple Baseline for Scene Text Recognition With Pre-Trained Vision-Language Model
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, vol. 33, pp. 6893-6904
Authors: Zhao, Shuai; Quan, Ruijie; Zhu, Linchao; Yang, Yi
Affiliations: Univ Technol Sydney Australian Artificial Intelligence Inst ReLER Lab Ultimo NSW 2007 Australia; Nanyang Technol Univ Coll Comp & Data Sci Singapore 308232 Singapore; Zhejiang Univ ReLER Lab CCAI Hangzhou 310027 Zhejiang Peoples R China
Pre-trained vision-language models (VLMs) are the de-facto foundation models for various downstream tasks. However, scene text recognition methods still prefer backbones pre-trained on a single modality, namely, the v...
INTEGRATING EXPERT KNOWLEDGE WITH VISION-LANGUAGE MODEL FOR MEDICAL IMAGE RETRIEVAL
21st IEEE International Symposium on Biomedical Imaging (ISBI)
Authors: Wei, Xiaoyang; Vagena, Zografoula; Kurtz, Camille; Cloppet, Florence
Affiliations: Univ Paris Cite France Lab Informat Paris Descartes LIPADE Paris France; Univ Paris Cite France Data Intelligence Inst Paris diiP Paris France
Content-Based Image Retrieval (CBIR) is an image search technique that can offer diagnostic guidance when facing difficult cases in radiology. State-of-the-art approaches propose to extract image features using vision...
Efficient and Long-Tailed Generalization for Pre-trained Vision-Language Model
30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Authors: Shi, Jiang-Xin; Zhang, Chi; Wei, Tong; Li, Yu-Feng
Affiliations: Nanjing Univ Natl Key Lab Novel Software Technol Sch Artificial Intelligence Nanjing Peoples R China; Nanjing Univ Natl Key Lab Novel Software Technol Nanjing Peoples R China; Southeast Univ Sch Comp Sci & Engn Key Lab Comp Network & Informat Integrat Minist Educ Dhaka Bangladesh
Pre-trained vision-language models like CLIP have shown powerful zero-shot inference ability via image-text matching and prove to be strong few-shot learners in various downstream tasks. However, in real-world scenari...
Recognition of Heat-Induced Food State Changes by Time-Series Use of Vision-Language Model for Cooking Robot
18th International Conference on Intelligent Autonomous Systems (IAS)
Authors: Kanazawa, Naoaki; Kawaharazuka, Kento; Obinata, Yoshiki; Okada, Kei; Inaba, Masayuki
Affiliations: Univ Tokyo 7-3-1 Hongo Bunkyo Ku Tokyo Japan
Cooking tasks are characterized by large changes in the state of the food, which is one of the major challenges in robot execution of cooking tasks. In particular, cooking using a stove to apply heat to the foodstuff ...
Exploring Interactive Semantic Alignment for Efficient HOI Detection with Vision-Language Model
IEEE International Conference on Multimedia and Expo (ICME)
Authors: Dong, Jihao; Yang, Hua; Pan, Renjie
Affiliations: Shanghai Jiao Tong Univ Inst Image Commun & Network Engn Shanghai Peoples R China; Shanghai Key Lab Digital Media Proc & Transmiss Shanghai Peoples R China; Shanghai Jiao Tong Univ AI Inst China MoE Key Lab Artificial Intelligence Shanghai Peoples R China
Human-Object Interaction (HOI) detection aims to localize human-object pairs and comprehend their interactions. Recently, two-stage transformer-based methods have demonstrated competitive performance. However, these m...
Subsampling of Frequent Words in Text for Pre-training a Vision-Language Model
1st Workshop on Large Generative Models Meet Multimodal Applications (LGM3A)
Authors: Liang, Mingliang; Larson, Martha
Affiliations: Radboud Univ Nijmegen Nijmegen Netherlands
In this paper, we introduce Subsampling of frequent Words for Contrastive Language-Image Pre-training (SW-CLIP), a novel approach for training vision-language models (VLMs). SW-CLIP uses frequency-based subsampling...
A Vision-language Model Based on Prompt Learner for Few-shot Medical Images Diagnosis
27th International Conference on Computer Supported Cooperative Work in Design (CSCWD)
Authors: Chang, Tianyou; Chen, Shizhan; Fan, Guodong; Feng, Zhiyong
Affiliations: Tianjin Univ Coll Intelligence & Comp Tianjin Peoples R China
In the real world, it can be challenging to annotate a large-scale dataset for all medical images, making few-shot medical image classification an important task. The latest advancements in pre-trained vision-language...
VLM-PL: Advanced Pseudo Labeling Approach for Class Incremental Object Detection via Vision-Language Model
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Kim, Junsu; Ku, Yunhoe; Kim, Jihyeon; Cha, Junuk; Baek, Seungryul
Affiliations: UNIST Ulsan South Korea; MODULABS Seoul South Korea
In the field of Class Incremental Object Detection (CIOD), creating models that can continuously learn like humans is a major challenge. Pseudo-labeling methods, although initially powerful, struggle with multi-scenar...
SWEEPMM: A HIGH-QUALITY MULTIMODAL DATASET FOR SWEEPING ROBOTS IN HOME SCENARIOS FOR VISION-LANGUAGE MODEL
49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Xu, Weichen; Xu, Xinxin; Fu, Tianhao; Cao, Jian; Xu, Xiaoyang; Huang, Yuetian; Cao, Xixin; Zhang, Xing
Affiliations: Peking Univ Sch Software & Microelect Beijing Peoples R China; Peking Univ Shenzhen Grad Sch Beijing Peoples R China
Embodied intelligence based on vision-language models aims to learn from interactions and derive general intelligence. However, existing generalized vision-language models cannot understand domain knowledge in home sc...
A Vision-Language Model-Based Traffic Sign Detection Method for High-Resolution Drone Images: A Case Study in Guyuan, China
SENSORS, 2024, vol. 24, no. 17, p. 5800
Authors: Yao, Jianqun; Li, Jinming; Li, Yuxuan; Zhang, Mingzhu; Zuo, Chen; Dong, Shi; Dai, Zhe
Affiliations: CCCC Infrastruct Maintenance Grp Co Ltd Beijing 100011 Peoples R China; Changan Univ Sch Transportat Engn Xian 710064 Peoples R China
As a fundamental element of the transportation system, traffic signs are widely used to guide traffic behaviors. In recent years, drones have emerged as an important tool for monitoring the conditions of traffic signs...