Search query: Subject term = "Vision-language Model"
155 records; showing 21-30
Vision-Language Model for Generating Textual Descriptions From Clinical Images: Model Development and Validation Study
JMIR FORMATIVE RESEARCH, 2024, Vol. 8, No. 1, p. e32690
Authors: Ji, Jia; Hou, Yongshuai; Chen, Xinyu; Pan, Youcheng; Xiang, Yang
Affiliations: Shenzhen Inst Informat Technol Shenzhen Peoples R China; Peng Cheng Lab 2 Xingke 1st St Shenzhen 518000 Peoples R China; Harbin Inst Technol Shenzhen Peoples R China
Background: The automatic generation of radiology reports, which seeks to create a free-text description from a clinical radiograph, is emerging as a pivotal intersection between clinical medicine and artificial inte...
CLIP-Llama: A New Approach for Scene Text Recognition with a Pre-Trained Vision-Language Model and a Pre-Trained Language Model
SENSORS, 2024, Vol. 24, No. 22, p. 7371
Authors: Zhao, Xiaoqing; Xu, Miaomiao; Silamu, Wushour; Li, Yanbing
Affiliations: Xinjiang Univ Coll Comp Sci & Technol 777 Huarui St Urumqi 830017 Peoples R China
This study focuses on Scene Text Recognition (STR), which plays a crucial role in various applications of artificial intelligence such as image retrieval, office automation, and intelligent transportation systems. Cur...
Unsupervised graph reasoning distillation hashing for multimodal hamming space search with vision-language model
INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2024, Vol. 13, No. 2, p. 16
Authors: Sun, Lina; Dong, Yumin
Affiliations: Chongqing Normal Univ Sch Comp & Informat Sci Chongqing 401331 Peoples R China
Multimodal hash technology maps high-dimensional multimodal data into hash codes, which greatly reduces the cost of data storage and improves query speed through the Hamming similarity calculation. However, existing u...
Practical Techniques for Vision-Language Segmentation Model in Remote Sensing
ISPRS TC II Mid-term Symposium on Role of Photogrammetry for a Sustainable World
Authors: Lin, Yuting; Suzuki, Kumiko; Sogo, Shinichiro
Affiliations: Kokusai Kogyo Co Ltd Tokyo Japan
Traditional semantic segmentation models often struggle with poor generalizability in zero-shot scenarios such as recognizing attributes unseen in the training labels. On the other hand, language-vision models (VLMs)...
UMPA: Unified multi-modal prompt with adapter for vision-language models
MULTIMEDIA SYSTEMS, 2025, Vol. 31, No. 2, pp. 1-11
Authors: Jin, Zhengwei; Wei, Yun
Affiliations: Univ Shanghai Sci & Technol Sch Opt Elect & Comp Engn Shanghai 200093 Peoples R China
Large-scale multi-modal pretraining models, such as CLIP, have shown remarkable generalization in vision-language tasks. However, the transfer of large models to downstream tasks requires large-scale computing resources...
Towards label-free defect detection in additive manufacturing via dual-classifier semi-supervised learning for vision-language models
JOURNAL OF INTELLIGENT MANUFACTURING, 2025, pp. 1-16
Authors: Wang, Kang; Liu, Lanqing; Xu, Cheng; Zou, Jing; Lin, Haoneng; Fang, Naiyu; Jiang, Jingchao
Affiliations: Nanyang Technol Univ Singapore Singapore; Hong Kong Polytech Univ Hung Hom Kowloon Hong Kong Peoples R China; Univ Exeter Exeter England
Complex components can now be fabricated in innovative ways thanks to additive manufacturing (AM) technology, but it also presents a severe challenge in the detection of defects, primarily due to extensive labeling ef...
Federated fine-grained prompts for vision-language models based on open-vocabulary object detection
APPLIED INTELLIGENCE, 2025, Vol. 55, No. 7, pp. 1-15
Authors: Li, Yu
Affiliations: China Univ Petr East China Sch Comp Sci & Technol Qingdao 266580 Peoples R China
Vision-language models can be used for open-vocabulary object detection. The existing methods suffer from low matching accuracy between prompt and image regions, as well as limited generalization capability as they ad...
LVLM-EHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, Vol. 47, No. 3, pp. 1877-1893
Authors: Xu, Peng; Shao, Wenqi; Zhang, Kaipeng; Gao, Peng; Liu, Shuo; Lei, Meng; Meng, Fanqing; Huang, Siyuan; Qiao, Yu; Luo, Ping
Affiliations: Shanghai AI Lab OpenGVLab Shanghai 200232 Peoples R China; Univ Hong Kong Dept Comp Sci Hong Kong Peoples R China; Shanghai AI Lab OpenGVLab Shanghai 200232 Peoples R China; Peking Univ Beijing 100871 Peoples R China
Large vision-language models (LVLMs) have recently played a dominant role in multimodal vision-language learning. Despite their great success, a holistic evaluation of their efficacy is lacking. This paper presents a comp...
RelVid: Relational Learning with Vision-Language Models for Weakly Video Anomaly Detection
SENSORS, 2025, Vol. 25, No. 7, p. 2037
Authors: Wang, Jingxin; Li, Guohan; Liu, Jiaqi; Xu, Zhengyi; Chen, Xinrong; Wei, Jianming
Affiliations: Chinese Acad Sci Shanghai Adv Res Inst Shanghai 201210 Peoples R China; ShanghaiTech Univ Sch Informat Sci & Technol Shanghai 201210 Peoples R China; Univ Chinese Acad Sci Sch Elect Elect & Commun Engn Beijing 100049 Peoples R China; Fudan Univ Acad Engn & Technol Shanghai 200433 Peoples R China
Weakly supervised video anomaly detection aims to identify abnormal events in video sequences without requiring frame-level supervision, which is a challenging task in computer vision. Traditional methods typically re...
VaVLM: Toward Efficient Edge-Cloud Video Analytics With Vision-Language Models
IEEE TRANSACTIONS ON BROADCASTING, 2025
Authors: Zhang, Yang; Wang, Hanling; Bai, Qing; Liang, Haifeng; Zhu, Peican; Muntean, Gabriel-Miro; Li, Qing
Affiliations: Xian Technol Univ Sch Optoelect Engn Xian 710021 Shaanxi Peoples R China; Pengcheng Lab Dept Adv Interdisciplinary Res Shenzhen 518055 Guangdong Peoples R China; Tsinghua Univ Shenzhen Int Grad Sch Shenzhen 518055 Guangdong Peoples R China; Northern Optoelect Co Ltd NORTHEO Dept Qual Safety Xian 710043 Shaanxi Peoples R China; Northwestern Polytech Univ Sch Artificial Intelligence Opt & Elect Xian 710072 Shaanxi Peoples R China; Dublin City Univ Sch Elect Engn Dublin 9 Ireland
The advancement of Large language models (LLMs) with vision capabilities in recent years has elevated video analytics applications to new heights. To address the limited computing and bandwidth resources on edge devic...