ISBN (Print): 9798350353013; 9798350353006
Physically-Based Rendering (PBR) is key to modeling the interaction between light and materials, and finds extensive applications across computer graphics domains. However, acquiring PBR materials is costly and requires special apparatus. In this paper, we propose a method to extract PBR materials from a single real-world image. We do so in two steps: first, we map regions of the image to material concept tokens using a diffusion model, allowing the sampling of texture images resembling each material in the scene. Second, we leverage a separate network to decompose the generated textures into spatially varying BRDFs (SVBRDFs), offering us readily usable materials for rendering applications. Our approach relies on existing synthetic material libraries with SVBRDF ground truth. It exploits a diffusion-generated RGB texture dataset to allow generalization to new samples using unsupervised domain adaptation (UDA). Our contributions are thoroughly evaluated on synthetic and real-world datasets. We further demonstrate the applicability of our method for editing 3D scenes with materials estimated from real photographs. Along with video, we share code and models as open-source on the project page: https://***/astra-vision/MaterialPalette
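To make the two-step pipeline more concrete, below is a minimal PyTorch sketch of the second step only: a decomposition network that maps an RGB texture (as would be sampled from the diffusion stage) to SVBRDF maps. The layer sizes and the choice of output maps (albedo, normals, roughness) are illustrative assumptions, not the authors' architecture.

```python
# Hypothetical sketch of an SVBRDF decomposition stage; layers and map set are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SVBRDFDecomposer(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
        )
        # Separate heads, one per SVBRDF component.
        self.albedo = nn.Conv2d(hidden, 3, 3, padding=1)
        self.normals = nn.Conv2d(hidden, 3, 3, padding=1)
        self.roughness = nn.Conv2d(hidden, 1, 3, padding=1)

    def forward(self, texture):
        f = self.encoder(texture)
        return {
            "albedo": torch.sigmoid(self.albedo(f)),
            "normals": F.normalize(self.normals(f), dim=1),
            "roughness": torch.sigmoid(self.roughness(f)),
        }

texture = torch.rand(1, 3, 256, 256)   # texture sampled from the diffusion stage
maps = SVBRDFDecomposer()(texture)
print({k: v.shape for k, v in maps.items()})
```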
ISBN (Print): 9798350353006
Large pre-trained Vision-Language Models (VLMs) like CLIP, despite having remarkable generalization ability, are highly vulnerable to adversarial examples. This work studies the adversarial robustness of VLMs from the novel perspective of the text prompt instead of the extensively studied model weights (frozen in this work). We first show that the effectiveness of both adversarial attack and defense is sensitive to the text prompt used. Inspired by this, we propose a method to improve resilience to adversarial attacks by learning a robust text prompt for VLMs. The proposed method, named Adversarial Prompt Tuning (APT), is effective while being both computationally and data efficient. Extensive experiments are conducted across 15 datasets and 4 data sparsity schemes (from 1-shot to full training data settings) to show APT's superiority over hand-engineered prompts and other state-of-the-art adaptation methods. APT demonstrates excellent in-distribution performance and generalization under input distribution shift and across datasets. Surprisingly, by simply adding one learned word to the prompts, APT can significantly boost accuracy and robustness (under ε = 4/255) over hand-engineered prompts by +13% and +8.5% on average, respectively. The improvement further increases, in our most effective setting, to +26.4% for accuracy and +16.7% for robustness. Code is available at https://***/TreeLLi/APT.
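As a rough illustration of tuning a prompt against adversarial inputs, the sketch below freezes an image encoder, keeps only a small set of prompt context vectors trainable, and optimizes them on PGD-perturbed inputs. The tiny encoder, the mean-pooled "text encoder", and all dimensions are stand-ins for CLIP components and are assumptions, not the APT implementation.

```python
# Minimal sketch of adversarial prompt tuning with frozen weights; encoders are toy stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
D, C, ctx_len = 32, 5, 4                            # embed dim, classes, context length

image_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, D))
for p in image_encoder.parameters():
    p.requires_grad_(False)                          # model weights stay frozen

class_embed = torch.randn(C, D)                      # frozen per-class token embeddings
prompt_ctx = nn.Parameter(torch.zeros(ctx_len, D))   # the only trained parameters

def text_features():
    # "Text encoder": mean-pool the learned context and add it to each class embedding.
    return F.normalize(prompt_ctx.mean(0, keepdim=True) + class_embed, dim=-1)

def logits(x):
    img = F.normalize(image_encoder(x), dim=-1)
    return 100.0 * img @ text_features().t()

def pgd(x, y, eps=4 / 255, steps=3):
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(logits(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + eps * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

opt = torch.optim.SGD([prompt_ctx], lr=1e-2)
x, y = torch.rand(8, 3, 32, 32), torch.randint(0, C, (8,))
for _ in range(5):                                    # adversarial prompt-tuning loop
    loss = F.cross_entropy(logits(pgd(x, y)), y)
    opt.zero_grad(); loss.backward(); opt.step()
print("final loss", loss.item())
```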
ISBN (Print): 9798350353006
Understanding illumination and reducing the need for supervision pose a significant challenge in low-light enhancement. Current approaches are highly sensitive to data usage during training and to illumination-specific hyper-parameters, limiting their ability to handle unseen scenarios. In this paper, we propose a new zero-reference low-light enhancement framework trainable solely with normal-light images. To accomplish this, we devise an illumination-invariant prior inspired by the theory of physical light transfer. This prior serves as the bridge between normal and low-light images. Then, we develop a prior-to-image framework trained without low-light data. During testing, this framework is able to restore our illumination-invariant prior back to images, automatically achieving low-light enhancement. Within this framework, we leverage a pretrained generative diffusion model for its modeling ability, introduce a bypass decoder to handle detail distortion, and offer a lightweight version for practicality. Extensive experiments demonstrate our framework's superiority in various scenarios as well as good interpretability, robustness, and efficiency. Code is available on our project homepage.
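A toy sketch of the overall idea follows: a simple chromaticity ratio stands in for the paper's illumination-invariant prior, and a tiny decoder is trained to map that prior back to images using normal-light data only; at test time a low-light image yields (approximately) the same prior, so the decoder restores it. The prior choice and network are assumptions for illustration, not the paper's physics-derived prior or diffusion-based framework.

```python
# Illustrative zero-reference sketch: illumination-invariant prior + prior-to-image decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

def illumination_invariant_prior(img, eps=1e-6):
    # Per-pixel chromaticity: unchanged when all channels are scaled by the same illumination.
    return img / (img.sum(dim=1, keepdim=True) + eps)

decoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
)
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)

normal_light = torch.rand(4, 3, 64, 64)          # training uses normal-light images only
for _ in range(10):
    prior = illumination_invariant_prior(normal_light)
    loss = F.mse_loss(decoder(prior), normal_light)
    opt.zero_grad(); loss.backward(); opt.step()

# A darkened image produces a near-identical prior, so the decoder brightens it.
low_light = normal_light * 0.1
restored = decoder(illumination_invariant_prior(low_light))
print(restored.shape)
```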
ISBN (Print): 9798350353006
Vision Transformers (ViTs) have emerged as a compelling alternative to Convolutional Neural Networks (CNNs) in the realm of computer vision, showcasing tremendous potential. However, recent research has unveiled a susceptibility of ViTs to adversarial attacks, akin to their CNN counterparts. Adversarial training and randomization are two representative effective defenses for CNNs. Some researchers have attempted to apply adversarial training to ViTs and achieved robustness comparable to CNNs, whereas it is not easy to directly apply randomization to ViTs because of the architectural differences between CNNs and ViTs. In this paper, we delve into the structural intricacies of ViTs and propose a novel defense mechanism termed Random entangled image Transformer (ReiT), which seamlessly integrates adversarial training and randomization to bolster the adversarial robustness of ViTs. Recognizing the challenge posed by the structural disparities between ViTs and CNNs, we introduce a novel module, input-independent random entangled self-attention (II-ReSA). This module optimizes random entangled tokens that lead to "dissimilar" self-attention outputs by leveraging model parameters and the sampled random tokens, thereby synthesizing the self-attention module outputs and random entangled tokens to diminish adversarial similarity. ReiT incorporates two distinct random entangled tokens and employs dual randomization, offering an effective countermeasure against adversarial examples while ensuring comprehensive deduction guarantees. Through extensive experiments conducted on various ViT variants and benchmarks, we substantiate the superiority of our proposed method in enhancing the adversarial robustness of Vision Transformers.
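The sketch below only conveys the general flavour of entangling randomly sampled tokens with self-attention: fresh random tokens are appended to the input sequence on every forward pass, so the attention output depends on per-inference randomness. It is not the II-ReSA module, whose entangled tokens are optimized against the model parameters rather than drawn independently.

```python
# Loose sketch of randomized self-attention via appended random tokens (not II-ReSA).
import torch
import torch.nn as nn

class RandomEntangledAttention(nn.Module):
    def __init__(self, dim=64, heads=4, n_random=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.n_random, self.dim = n_random, dim

    def forward(self, tokens):                       # tokens: (B, N, dim)
        b = tokens.size(0)
        rand = torch.randn(b, self.n_random, self.dim, device=tokens.device)
        x = torch.cat([tokens, rand], dim=1)         # entangle freshly sampled tokens
        out, _ = self.attn(x, x, x)
        return out[:, : tokens.size(1)]              # drop the random positions

layer = RandomEntangledAttention()
print(layer(torch.randn(2, 16, 64)).shape)           # torch.Size([2, 16, 64])
```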
ISBN (Print): 9798350365474
Neural Radiance Fields (NeRFs) have emerged as a standard framework for representing 3D scenes and objects, introducing a novel data type for information exchange and storage. Concurrently, significant progress has been made in multimodal representation learning for text and image data. This paper explores a novel research direction that aims to connect the NeRF modality with other modalities, similar to established methodologies for images and text. To this end, we propose a simple framework that exploits pre-trained models for NeRF representations alongside multimodal models for text and image processing. Our framework learns a bidirectional mapping between NeRF embeddings and those obtained from corresponding images and text. This mapping unlocks several novel and useful applications, including NeRF zero-shot classification and NeRF retrieval from images or text.
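A minimal sketch of such a bidirectional mapping follows, assuming frozen pretrained encoders that yield fixed-size NeRF and image/text embeddings; the MLP mappings, the dimensions, and the simple reconstruction objective are illustrative assumptions rather than the paper's training setup.

```python
# Sketch: bidirectional mapping between NeRF embeddings and a shared image/text space.
import torch
import torch.nn as nn
import torch.nn.functional as F

nerf_dim, clip_dim = 1024, 512
nerf2clip = nn.Sequential(nn.Linear(nerf_dim, 512), nn.ReLU(), nn.Linear(512, clip_dim))
clip2nerf = nn.Sequential(nn.Linear(clip_dim, 512), nn.ReLU(), nn.Linear(512, nerf_dim))
opt = torch.optim.Adam(list(nerf2clip.parameters()) + list(clip2nerf.parameters()), lr=1e-3)

# Placeholder embeddings standing in for a pretrained NeRF encoder and a CLIP-style encoder.
nerf_emb = torch.randn(32, nerf_dim)
clip_emb = torch.randn(32, clip_dim)

for _ in range(10):
    loss = (F.mse_loss(nerf2clip(nerf_emb), clip_emb)
            + F.mse_loss(clip2nerf(clip_emb), nerf_emb))
    opt.zero_grad(); loss.backward(); opt.step()

# Zero-shot classification: compare mapped NeRF embeddings with per-class text embeddings.
text_emb = torch.randn(10, clip_dim)
mapped = F.normalize(nerf2clip(nerf_emb), dim=-1)
scores = mapped @ F.normalize(text_emb, dim=-1).t()
print(scores.argmax(dim=-1)[:5])
```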
ISBN (Print): 9798350353006
We introduce a framework for online learning from a single continuous video stream - the way people and animals learn, without mini-batches, data augmentation, or shuffling. This poses great challenges given the high correlation between consecutive video frames, and there is very little prior work on it. Our framework allows us to do a first deep dive into the topic and includes a collection of streams and tasks composed from two existing video datasets, plus a methodology for performance evaluation that considers both adaptation and generalization. We employ pixel-to-pixel modelling as a practical and flexible way to switch between pre-training and single-stream evaluation as well as between arbitrary tasks, without ever requiring changes to models and always using the same pixel loss. Equipped with this framework, we obtained large single-stream learning gains from pre-training with a novel family of future prediction tasks, found that momentum hurts, and that the pace of weight updates matters. The combination of these insights leads to matching the performance of IID learning with batch size 1, when using the same architecture and without costly replay buffers. An overview of the paper is available online at https://***/view/one-stream-video.
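A minimal sketch of the single-stream protocol described above, assuming a synthetic frame stream and a tiny convolutional predictor: frames arrive strictly in order with batch size 1, the objective is a pixel-wise future-frame prediction loss, and the optimizer is plain SGD without momentum (in line with the reported finding that momentum hurts).

```python
# Single-stream online learning sketch: sequential frames, batch size 1, no replay or shuffling.
import torch
import torch.nn as nn
import torch.nn.functional as F

predictor = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
opt = torch.optim.SGD(predictor.parameters(), lr=1e-3, momentum=0.0)

stream = torch.rand(100, 3, 64, 64)                  # stand-in for a continuous video stream
prev = stream[0:1]
for t in range(1, stream.size(0)):                   # strictly sequential, one frame at a time
    nxt = stream[t:t + 1]
    loss = F.mse_loss(predictor(prev), nxt)          # pixel loss on the predicted future frame
    opt.zero_grad(); loss.backward(); opt.step()
    prev = nxt                                       # the stream is never revisited
print("last pixel loss", loss.item())
```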
ISBN (Print): 9798350353006
This paper addresses the critical challenges of sparsity and occlusion in LiDAR-based 3D object detection. Current methods often rely on supplementary modules or specific architectural designs, potentially limiting their applicability to new and evolving architectures. To our knowledge, we are the first to propose a versatile technique that seamlessly integrates into any existing framework for 3D object detection, marking the first instance of Weak-to-Strong generalization in 3D computer vision. We introduce a novel framework, X-Ray Distillation with Object-Complete Frames, suitable for both supervised and semi-supervised settings, that leverages the temporal aspect of point cloud sequences. This method extracts crucial information from both previous and subsequent LiDAR frames, creating Object-Complete frames that represent objects from multiple viewpoints, thus addressing occlusion and sparsity. Since Object-Complete frames cannot be generated during online inference, we utilize Knowledge Distillation within a Teacher-Student framework. This technique encourages the strong Student model to emulate the behavior of the weaker Teacher, which processes simple and informative Object-Complete frames, effectively offering a comprehensive view of objects as if seen through X-ray vision. Our proposed methods surpass the state of the art in semi-supervised learning by 1-1.5 mAP and enhance the performance of five established supervised models by 1-2 mAP on standard autonomous driving datasets, even with default hyperparameters. Code for Object-Complete frames is available here: https://***/sakharok13/X-Ray-TeacherPatching-Tools.
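The sketch below illustrates the two ingredients in isolation: merging one object's points from neighbouring frames into an "Object-Complete" set (assuming they are already in a shared coordinate frame), and distilling a teacher that sees the complete object into a student that only sees the sparse current frame. The tiny point classifier and the KL-based distillation loss are placeholder assumptions, not the paper's detectors.

```python
# Sketch: Object-Complete aggregation across frames + Teacher-Student distillation.
import torch
import torch.nn as nn
import torch.nn.functional as F

def object_complete(points_per_frame):
    # points_per_frame: list of (Ni, 3) tensors for one object across LiDAR frames.
    return torch.cat(points_per_frame, dim=0)

class TinyPointNet(nn.Module):                        # stand-in for a detector head
    def __init__(self, classes=3):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, classes))
    def forward(self, pts):                           # pts: (N, 3)
        return self.mlp(pts).mean(dim=0)              # (classes,) logits

teacher, student = TinyPointNet(), TinyPointNet()
frames = [torch.randn(40, 3), torch.randn(25, 3), torch.randn(30, 3)]
complete = object_complete(frames)                    # dense, multi-view object points
sparse = frames[-1]                                   # occluded current frame only

with torch.no_grad():
    t_logits = teacher(complete)                      # weak Teacher sees the easy, complete view
s_log = F.log_softmax(student(sparse), dim=-1).unsqueeze(0)
t_prob = F.softmax(t_logits, dim=-1).unsqueeze(0)
kd_loss = F.kl_div(s_log, t_prob, reduction="batchmean")
kd_loss.backward()                                    # Student learns from the Teacher's view
print("distillation loss", kd_loss.item())
```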
ISBN (Print): 9798350353006
With the rapid development of face recognition (FR) systems, the privacy of face images on social media is facing severe challenges due to the abuse of unauthorized FR systems. Some studies utilize adversarial attack techniques to defend against malicious FR systems by generating adversarial examples. However, the generated adversarial examples, i.e., the protected face images, tend to suffer from subpar visual quality and low transferability. In this paper, we propose a novel face protection approach, dubbed DiffAM, which leverages the powerful generative ability of diffusion models to generate high-quality protected face images with adversarial makeup transferred from reference images. To be specific, we first introduce a makeup removal module to generate non-makeup images using a fine-tuned diffusion model guided by textual prompts in CLIP space. As the inverse process of makeup transfer, makeup removal makes it easier to establish the deterministic relationship between the makeup and non-makeup domains regardless of elaborate text prompts. Then, with this relationship, a CLIP-based makeup loss along with an ensemble attack strategy is introduced to jointly guide the direction of the adversarial makeup domain, achieving the generation of protected face images with natural-looking makeup and high black-box transferability. Extensive experiments demonstrate that DiffAM achieves higher visual quality and attack success rates, with a gain of 12.98% under the black-box setting compared with the state of the art. The code will be available at https://***/HansSunY/DiffAM.
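To give a feel for a CLIP-style directional makeup loss, the sketch below encourages the change from the non-makeup image to the protected image to align, in a frozen embedding space, with the change from a non-makeup reference to a makeup reference. The tiny frozen encoder stands in for CLIP, and the loss form is an assumption rather than DiffAM's exact objective (which also includes an ensemble attack term).

```python
# Sketch of a directional makeup loss in a frozen embedding space (CLIP stand-in).
import torch
import torch.nn as nn
import torch.nn.functional as F

img_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
for p in img_encoder.parameters():
    p.requires_grad_(False)                           # encoder frozen, like CLIP

def directional_loss(protected, non_makeup, ref_makeup, ref_non_makeup):
    d_img = F.normalize(img_encoder(protected) - img_encoder(non_makeup), dim=-1)
    d_ref = F.normalize(img_encoder(ref_makeup) - img_encoder(ref_non_makeup), dim=-1)
    return 1.0 - F.cosine_similarity(d_img, d_ref, dim=-1).mean()

non_makeup = torch.rand(2, 3, 64, 64)
protected = non_makeup.clone().requires_grad_(True)   # image being optimized
ref_makeup, ref_plain = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)

loss = directional_loss(protected, non_makeup, ref_makeup, ref_plain)
loss.backward()                                       # gradient on `protected` guides the edit
print("makeup loss", loss.item())
```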
ISBN (Print): 9798350353006
Multimodal intent recognition (MIR) aims to perceive the human intent polarity via language, visual, and acoustic modalities. The inherent intent ambiguity makes it challenging to recognize in multimodal scenarios. Existing MIR methods tend to model each video independently, ignoring global contextual information across videos. This learning manner inevitably introduces perception biases, exacerbated by the inconsistencies of the multimodal representation, amplifying the intent uncertainty. This challenge motivates us to explore effective global context modeling. Thus, we propose a context-augmented global contrast (CAGC) method to capture rich global context features by mining both intra- and cross-video context interactions for MIR. Concretely, we design a context-augmented transformer module to extract global context dependencies across videos. To further alleviate error accumulation and interference, we develop a cross-video bank that retrieves effective video sources by considering both intentional tendency and video similarity. Furthermore, we introduce a global context-guided contrastive learning scheme, designed to mitigate inconsistencies arising from the global context and individual modalities in different feature spaces. This scheme incorporates global cues as supervision to capture a robust multimodal intent representation. Experiments demonstrate that CAGC outperforms state-of-the-art MIR methods. We also generalize our approach to a closely related task, multimodal sentiment analysis, achieving comparable performance.
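A rough sketch of the global-context idea: retrieve the most similar entries from a cross-video bank, fuse them with the current video feature through a small transformer, and use the fused global feature as the positive in an InfoNCE-style contrastive loss. Feature dimensions, the bank contents, and the loss form are assumptions for illustration, not the CAGC modules themselves.

```python
# Sketch: cross-video retrieval + context fusion + global context-guided contrast.
import torch
import torch.nn as nn
import torch.nn.functional as F

dim, bank_size, k = 128, 64, 4
bank = F.normalize(torch.randn(bank_size, dim), dim=-1)       # cross-video bank (placeholder)
fuser = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=1)

def global_context(video_feat):                               # video_feat: (B, dim)
    sims = F.normalize(video_feat, dim=-1) @ bank.t()
    idx = sims.topk(k, dim=-1).indices                         # most similar videos
    ctx = bank[idx]                                            # (B, k, dim)
    seq = torch.cat([video_feat.unsqueeze(1), ctx], dim=1)
    return fuser(seq)[:, 0]                                    # context-augmented feature

def contrastive(anchor, positive, temp=0.07):
    logits = F.normalize(anchor, dim=-1) @ F.normalize(positive, dim=-1).t() / temp
    return F.cross_entropy(logits, torch.arange(anchor.size(0)))

video_feat = torch.randn(8, dim)                               # fused multimodal video features
loss = contrastive(video_feat, global_context(video_feat))     # global cue as supervision
loss.backward()
print("global contrastive loss", loss.item())
```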
ISBN (Print): 9798350353006
In this paper, we make the first attempt at achieving cross-modal (i.e., image-to-events) adaptation for event-based object recognition without accessing any labeled source image data, owing to privacy and commercial issues. Tackling this novel problem is non-trivial due to the novelty of event cameras and the distinct modality gap between images and events. In particular, as only the source model is available, a hurdle is how to extract the knowledge from the source model using only the unlabeled target event data while achieving knowledge transfer. To this end, we propose a novel framework, dubbed EventDance, for this unsupervised source-free cross-modal adaptation problem. Importantly, inspired by event-to-video reconstruction methods, we propose a reconstruction-based modality bridging (RMB) module, which reconstructs intensity frames from events in a self-supervised manner. This makes it possible to build surrogate images to extract the knowledge (i.e., labels) from the source model. We then propose a multi-representation knowledge adaptation (MKA) module that transfers the knowledge to target models learning events with multiple representation types, fully exploring the spatiotemporal information of events. The two modules connecting the source and target models are mutually updated so as to achieve the best performance. Experiments on three benchmark datasets with two adaptation settings show that EventDance is on par with prior methods that utilize the source data.
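A condensed sketch of the source-free, cross-modal adaptation loop: surrogate intensity frames are reconstructed from event voxel grids, the frozen source image model provides pseudo-labels on them, and a target model is trained directly on the event representation. The reconstructor, the voxel-grid representation, and the tiny classifiers are placeholder assumptions, and the self-supervised training of the bridging module is omitted here.

```python
# Sketch: surrogate frames from events -> pseudo-labels from the frozen source model.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_classes = 10
reconstructor = nn.Conv2d(5, 3, 3, padding=1)          # events (5 time bins) -> surrogate frame
source_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, n_classes))
for p in source_model.parameters():
    p.requires_grad_(False)                             # only the source *model* is available

target_model = nn.Sequential(nn.Flatten(), nn.Linear(5 * 32 * 32, n_classes))
opt = torch.optim.Adam(target_model.parameters(), lr=1e-3)

event_voxels = torch.rand(16, 5, 32, 32)                # unlabeled target event data
for _ in range(5):
    with torch.no_grad():
        surrogate = reconstructor(event_voxels)         # bridge events to the image modality
        pseudo = source_model(surrogate).argmax(dim=-1) # extract knowledge as pseudo-labels
    loss = F.cross_entropy(target_model(event_voxels), pseudo)
    opt.zero_grad(); loss.backward(); opt.step()
print("adaptation loss", loss.item())
```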