
Refine Results

Document Type

  • 81 journal articles
  • 74 conference papers

Collection

  • 155 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 141 papers: Engineering
    • 116 papers: Computer Science and Technology...
    • 48 papers: Electrical Engineering
    • 14 papers: Software Engineering
    • 12 papers: Information and Communication Engineering
    • 8 papers: Control Science and Engineering
    • 6 papers: Biomedical Engineering (...
    • 4 papers: Instrument Science and Technology
    • 3 papers: Transportation Engineering
    • 2 papers: Mechanical Engineering
    • 2 papers: Electronic Science and Technology (...
    • 2 papers: Civil Engineering
    • 2 papers: Surveying and Mapping Science and Technology
    • 1 paper: Materials Science and Engineering (...
    • 1 paper: Hydraulic Engineering
    • 1 paper: Agricultural Engineering
    • 1 paper: Environmental Science and Engineering (...
  • 20 papers: Medicine
    • 10 papers: Clinical Medicine
    • 8 papers: Special Medicine
    • 5 papers: Basic Medicine (...
  • 16 papers: Science
    • 7 papers: Physics
    • 4 papers: Chemistry
    • 3 papers: Mathematics
    • 3 papers: Biology
    • 2 papers: Geography
    • 1 paper: Astronomy
    • 1 paper: Atmospheric Science
    • 1 paper: Geophysics
    • 1 paper: Geology
  • 10 papers: Management
    • 8 papers: Management Science and Engineering (...
    • 2 papers: Public Administration
  • 1 paper: Agriculture

Topic

  • 155 papers: vision-language ...
  • 13 papers: visualization
  • 11 papers: large language m...
  • 11 papers: prompt learning
  • 10 papers: clip
  • 10 papers: training
  • 10 papers: prompt tuning
  • 9 papers: object detection
  • 8 papers: adaptation model...
  • 7 papers: deep learning
  • 7 papers: semantics
  • 7 papers: few-shot learnin...
  • 6 papers: knowledge distil...
  • 5 papers: task analysis
  • 5 papers: zero-shot learni...
  • 5 papers: feature extracti...
  • 4 papers: multimodal learn...
  • 4 papers: tuning
  • 4 papers: continual learni...
  • 4 papers: contrastive lear...

Institution

  • 4 papers: shanghai ai lab ...
  • 4 papers: peng cheng lab p...
  • 3 papers: univ sci & techn...
  • 3 papers: chinese univ hon...
  • 3 papers: sensetime res pe...
  • 2 papers: univ michigan an...
  • 2 papers: hong kong polyte...
  • 2 papers: sun yat sen univ...
  • 2 papers: shanghai univ pe...
  • 2 papers: beijing univ tec...
  • 2 papers: univ chinese aca...
  • 2 papers: wuhan univ sch c...
  • 2 papers: harbin inst tech...
  • 2 papers: tongji univ coll...
  • 2 papers: northeastern uni...
  • 2 papers: chinese acad sci...
  • 2 papers: xidian univ sch ...
  • 2 papers: tianjin univ col...
  • 2 papers: chongqing univ p...
  • 2 papers: tsinghua univ sh...

Author

  • 5 papers: qiao yu
  • 3 papers: gao peng
  • 3 papers: obinata yoshiki
  • 3 papers: inaba masayuki
  • 3 papers: kawaharazuka ken...
  • 3 papers: wang ruixuan
  • 3 papers: dai jifeng
  • 3 papers: okada kei
  • 3 papers: kanazawa naoaki
  • 2 papers: zhou jie
  • 2 papers: wang lei
  • 2 papers: li xin
  • 2 papers: chen zhe
  • 2 papers: guo tao
  • 2 papers: luo ping
  • 2 papers: zhang tong
  • 2 papers: yang xi
  • 2 papers: liu liangchen
  • 2 papers: fang zhen
  • 2 papers: guo song

Language

  • 154 papers: English
  • 1 paper: German
  • 1 paper: French
  • 1 paper: Other

Search query: Subject = "Vision-Language Model"
155 records in total; showing 51-60
Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, Vol. 35, No. 4, pp. 3185-3195
Authors: Zhang, Zicheng; Ke, Wei; Zhu, Yi; Liang, Xiaodan; Liu, Jianzhuang; Ye, Qixiang; Zhang, Tong (Xi An Jiao Tong Univ Sch Software Engn, Xian 710049, Peoples R China; Huawei Technol Noahs Ark Lab, Shenzhen 518129, Peoples R China; Sun Yat Sen Univ Sch Intelligent Syst Engn, Guangzhou 510275, Peoples R China; Univ Chinese Acad Sci Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China; Ecole Polytech Fed Lausanne Image & Visual Representat Lab, CH-1015 Lausanne, Switzerland)
The pre-trained vision-language model, exemplified by CLIP, advances zero-shot semantic segmentation by aligning visual features with class embeddings through a transformer decoder to generate semantic masks. Despite ...
Large language model-augmented learning for auto-delineation of treatment targets in head-and-neck cancer radiotherapy

RADIOTHERAPY AND ONCOLOGY, 2025, Vol. 205, pp. 110740-110740
Authors: Rajendran, Praveenbalaji; Yang, Yong; Niedermayr, Thomas R.; Gensheimer, Michael; Beadle, Beth; Le, Quynh-Thu; Xing, Lei; Dai, Xianjin (Stanford Univ Dept Radiat Oncol, Stanford, CA, USA)
Background and Purpose: Radiation therapy (RT) is highly effective, but its success depends on accurate, manual target delineation, which is time-consuming, labor-intensive, and prone to variability. Despite AI advanc...
Visual primitives as words: Alignment and interaction for compositional zero-shot

PATTERN RECOGNITION, 2025, Vol. 157
Authors: Shuang, Feng; Li, Jiahuan; Huang, Qingbao; Zhao, Wenye; Xu, Dongsheng; Han, Chao; Cheng, Haonan (Guangxi Univ Sch Elect Engn, 100 East Daxue Rd, Nanning 530004, Guangxi, Peoples R China; Guangxi Key Lab Intelligent Control & Maintenance, 100 East Daxue Rd, Nanning 530004, Guangxi, Peoples R China; Commun Univ China State Key Lab Media Convergence & Commun, 1 Dingfuzhuang East St, Beijing 100024, Peoples R China)
Compositional Zero-Shot Learning (CZSL) aims to recognize seen and unseen attribute-object compositions. Recently, some researchers have applied vision-language models to the CZSL task. However, they only roughly match the image...
AITtrack: Attention-Based Image-Text Alignment for Visual Tracking

IEEE ACCESS, 2025, Vol. 13, pp. 67095-67111
Authors: Alawode, Basit; Javed, Sajid (Khalifa Univ Sci Technol Dept Elect Engn & Comp Sci, Abu Dhabi, U Arab Emirates)
Vision-language models (VLMs) have recently advanced Visual Object Tracking (VOT) performance. In VLMs, a vision encoder is employed to obtain visual representation, and a text encoder is employed to estimate the ...
Fine-Tuning of CLIP in Few-Shot Scenarios via Supervised Contrastive Learning

7th Chinese Conference on Pattern Recognition and Computer Vision
Authors: Luo, Jing; Wu, Guangxing; Liu, Hongmei; Wang, Ruixuan (Sun Yat Sen Univ Sch Comp Sci & Engn, Guangzhou, Peoples R China; Peng Cheng Lab, Shenzhen, Peoples R China; MOE Key Lab Machine Intelligence & Adv Comp, Guangzhou, Peoples R China)
Large-scale pretrained visual-language models like CLIP have proven highly effective in learning universal representations and have achieved significant success across various downstream tasks. Recently, there has been inc...
Vision-Language Models Can Identify Distracted Driver Behavior From Naturalistic Videos

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, Vol. 25, No. 9, pp. 11602-11616
Authors: Hasan, Md. Zahid; Chen, Jiajing; Wang, Jiyang; Rahman, Mohammed Shaiqur; Joshi, Ameya; Velipasalar, Senem; Hegde, Chinmay; Sharma, Anuj; Sarkar, Soumik (Iowa State Univ Dept Elect & Comp Engn, Ames, IA 50010, USA; Syracuse Univ Dept Elect Engn & Comp Sci, Syracuse, NY 13244, USA; Iowa State Univ Dept Comp Sci, Ames, IA 50010, USA; NYU Dept Elect & Comp Engn, Brooklyn, NY 11201, USA; Iowa State Univ Dept Civil Construct & Environm Engn, Ames, IA 50010, USA; Iowa State Univ Dept Mech Engn, Ames, IA 50010, USA)
Recognizing the activities causing distraction in real-world driving scenarios is critical for ensuring the safety and reliability of both drivers and pedestrians on the roadways. Conventional computer vision techniqu...
Vision-Language Models for Vision Tasks: A Survey

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, Vol. 46, No. 8, pp. 5625-5644
Authors: Zhang, Jingyi; Huang, Jiaxing; Jin, Sheng; Lu, Shijian (Nanyang Technol Univ Sch Comp Sci & Engn, Singapore 639798, Singapore)
Most visual recognition studies rely heavily on crowd-labelled data in deep neural network (DNN) training, and they usually train a DNN for each single visual recognition task, leading to a laborious and time-consum...
Semi-Automatic Labeling for Action Recognition by Diversity Preserving Sampling

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Ando, Ryuhei; Shibata, Takashi; Takahashi, Toru (NEC Corporation, Japan)
Deep learning for action recognition is an important technology for understanding videos. However, collecting a video training dataset for a deep learning model at low cost while maintaining enough diversity is challeng...
Leveraging Visual Captions for Enhanced Zero-Shot HOI Detection

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Zeng, Yanqing; Mao, Yunyao; Lu, Zhenbo; Zhou, Wengang; Li, Houqiang (EEIS Department, University of Science and Technology of China, Hefei, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China)
Zero-shot Human-Object Interaction (HOI) detection aims to identify both seen and unseen HOI categories in an image. Most existing methods rely on semantic knowledge distilled from CLIP to find novel interactions but ...
Attention Disentanglement for Semantic Diffusion Modeling in Text-to-Image Generation

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Yu, Hsiang-Chun; Chien, Jen-Tsung (Institute of Electrical and Computer Engineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan)
Text-to-image models have recently been improved to generate semantically rich, high-quality images by strengthening natural language processing via transformers in a stable diffusion process. However, the challenges ...