Search query: subject term = "Vision-language Models"
154 records; showing 81-90

Enhanced Cleft Lip and Palate Classification Using SigLIP 2: A Comparative Study with Vision Transformers and Siamese Networks
APPLIED SCIENCES-BASEL, 2025, Vol. 15, No. 9, p. 4766
Authors: Nantha, Oraphan; Sathanarugsawait, Benjaporn; Praneetpolgrang, Prasong
Affiliations: Sripatum Univ, Sch Informat Technol, Bangkok 10900, Thailand
This paper extends our previous work on cleft lip and/or palate (CL/P) classification, which employed vision transformers (ViTs) and Siamese neural networks. We now integrate SigLIP 2, a state-of-the-art multilingual ...

Towards molecular structure discovery from cryo-ET density volumes via modelling auxiliary semantic prototypes
BRIEFINGS IN BIOINFORMATICS, 2025, Vol. 26, No. 1, bbae570
Authors: Nair, Ashwin; Li, Xingjian; Solanki, Bhupendra; Mukhopadhyay, Souradeep; Jha, Ankit; Uddin, Mostofa Rafid; Singha, Mainak; Banerjee, Biplab; Xu, Min
Affiliations: Indian Inst Sci Educ & Res, Dept Data Sci, Vithura 695551, Kerala, India; Carnegie Mellon Univ, Computat Biol Dept, Pittsburgh, PA 15213, USA; Indian Inst Technol, Machine Learning & Visual Comp Lab, Powai 400076, Maharashtra, India; Indian Inst Sci, Comp Sci & Automat, CV Raman Rd, Bengaluru 560012, Karnataka, India; LNM Inst Informat Technol, Comp Sci & Engn, Jaipur 302031, Rajasthan, India
Cryo-electron tomography (cryo-ET) is confronted with the intricate task of unveiling novel structures. General class discovery (GCD) seeks to identify new classes by learning a model that can pseudo-label unannotated...

Exploring the Limits of Large Language Models' Ability to Distinguish Between Objects
APPLIED SCIENCES-BASEL, 2025, Vol. 15, No. 9, p. 4620
Authors: Ju, Hyeongjin; Park, Incheol; Nalcakan, Yagiz; Jin, Youngwan; Yeo, Sanghyeop; Kim, Shiho
Affiliations: Yonsei Univ, Sch Integrated Technol, Incheon 21983, South Korea; Yonsei Univ, BK21 Grad Program Intelligent Semicond Technol, Incheon 21983, South Korea
This paper explores the capability of large language models (LLMs) to accurately classify objects in challenging visual scenarios, focusing on two main tasks: differentiating real objects from artificial replicas and ...

Enhancing open-vocabulary object detection through region-word and region-vision matching
MULTIMEDIA SYSTEMS, 2025, Vol. 31, No. 3, pp. 1-15
Authors: Chen, Yi; Wang, Chong; Li, Zhehao; Lin, Sunqi; Xiang, Jinhui; Li, Yuqi; Qian, Jiangbo
Affiliations: Ningbo Univ, Fac Elect Engn & Comp Sci, Ningbo 315000, Zhejiang, Peoples R China
Open-vocabulary object detection (OVOD) aims to detect novel object categories beyond the training set. Existing OVOD methods have made encouraging progress by leveraging large-scale image-caption pairs and pre-traine...

Universal and extensible language-vision models for organ segmentation and tumor detection from abdominal computed tomography
MEDICAL IMAGE ANALYSIS, 2024, Vol. 97, 103226
Authors: Liu, Jie; Zhang, Yixiao; Wang, Kang; Yavuz, Mehmet Can; Chen, Xiaoxi; Yuan, Yixuan; Li, Haoliang; Yang, Yang; Yuille, Alan; Tang, Yucheng; Zhou, Zongwei
Affiliations: City Univ Hong Kong, Hong Kong, Peoples R China; Johns Hopkins Univ, Baltimore, MD 21218, USA; Univ Calif San Francisco, San Francisco, CA, USA; Univ Illinois, Champaign, IL, USA; Chinese Univ Hong Kong, Hong Kong, Peoples R China; NVIDIA, Santa Clara, CA, USA
The advancement of artificial intelligence (AI) for organ segmentation and tumor detection is propelled by the growing availability of computed tomography (CT) datasets with detailed, per-voxel annotations. However, t...

Extracting Sparse Specialist Models from Generalist Models
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Yu, Tao; Zhao, Xu; An, Yongqi; Zhu, Guibo; Tang, Ming; Wang, Jinqiao
Affiliations: Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; Wuhan AI Research, Wuhan, China
Recently, several generalist models such as Contrastive Language-Image Pre-training (CLIP) have demonstrated their capabilities of performing diverse downstream tasks through zero-shot or few-shot guidance. When these...

Diffusion Models are Zero-Shot Generative Text-Vision Retrievers
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Li, Bao; Xie, Zeke; Zhang, Xiaomei; Zhu, Xiangyu; Lei, Zhen
Affiliations: MAIS, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; Guangzhou, China; CAIR, HKISI, Chinese Academy of Sciences, Hong Kong
Large-scale text-to-image diffusion models have demonstrated impressive capabilities for downstream tasks by leveraging strong vision-language alignment from generative pre-training. Recently, a number of works have e...

KAnoCLIP: Zero-Shot Anomaly Detection through Knowledge-Driven Prompt Learning and Enhanced Cross-Modal Integration
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Li, Chengyuan; Zhou, Suyang; Kong, Jieping; Qi, Lei; Xue, Hui
Affiliations: College of Software Engineering, Southeast University, Nanjing, China; School of Computer Science and Engineering, Southeast University, Nanjing, China
Zero-shot anomaly detection (ZSAD) identifies anomalies without needing training samples from the target dataset, essential for scenarios with privacy concerns or limited data. Vision-language models like CLIP show po...

Adversarial Attacks on Vision-Language Model-Empowered Chatbots in Consumer Electronics
IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2024, Vol. 70, No. 3, pp. 6075-6083
Authors: Shang, Yingjia; Liu, Zhijun; Kang, Jiawen; Hossain, M. Shamim; Wu, Yi
Affiliations: Heilongjiang Univ, Sch Data Sci & Technol, Harbin 150080, Peoples R China; Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China; King Saud Univ, Coll Comp & Informat Sci, Dept Software Engn, Riyadh 13272, Saudi Arabia
Artificial Intelligence-Generated Content (AIGC) technology has revolutionized content creation, distribution, and engagement in the consumer electronics sector, propelling its applications to unprecedented heights. W...

Persian Text-Image Retrieval: A Framework Based on Image Captioning and Scalable Vector Search
29th International Computer Conference, Computer Society of Iran, CSICC 2025
Authors: Asadian, Rasoul; Akhavanpour, Alireza
Affiliations: Shenasa AI, Tehran, Iran
Text-image retrieval systems have seen significant advancements with CLIP-based models and vision-language models, which rely on vast datasets and powerful computational resources. However, these advancements primaril...