Search Results - Inner Mongolia University Library

Search query: "Any field = IEEE/CVF Conference on Computer Vision and Pattern Recognition"

23,241 records in total; showing records 241-250


Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Iurada, Leonardo; Ciccone, Marco; Tommasi, Tatiana
Affiliation: Politecn Torino, Turin, Italy

ISBN: (print) 9798350353006

Recent advances in neural network pruning have shown how it is possible to reduce the computational costs and memory demands of deep learning models before training. We focus on this framework and propose a new pruning at initialization algorithm that leverages the Neural Tangent Kernel (NTK) theory to align the training dynamics of the sparse network with that of the dense one. Specifically, we show how the usually neglected data-dependent component in the NTK's spectrum can be taken into account by providing an analytical upper bound to the NTK's trace obtained by decomposing neural networks into individual paths. This leads to our Path eXclusion (PX), a foresight pruning method designed to preserve the parameters that mostly influence the NTK's trace. PX is able to find lottery tickets (i.e. good paths) even at high sparsity levels and largely reduces the need for additional training. When applied to pre-trained models it extracts subnetworks directly usable for several downstream tasks, resulting in performance comparable to those of the dense counterpart but with substantial cost and computational savings. Code available at: https://***/iurada/px-ntk-pruning

Keywords: Efficient computer vision; Neural Network Pruning; Neural Tangent Kernel; Pruning-at-Initialization

Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yu, Jiazuo; Zhuge, Yunzhi; Zhang, Lu; Hu, Ping; Wang, Dong; Lu, Huchuan; He, You
Affiliations: Dalian Univ Technol, Dalian, Peoples R China; Univ Elect Sci & Technol China, Chengdu, Peoples R China; Tsinghua Univ, Beijing, Peoples R China

ISBN: (print) 9798350353006

Continual learning can empower vision-language models to continuously acquire new knowledge, without the need for access to the entire historical dataset. However, mitigating the performance degradation in large-scale models is non-trivial due to (i) parameter shifts throughout life-long learning and (ii) significant computational burdens associated with full-model tuning. In this work, we present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models. Our approach involves the dynamic expansion of a pre-trained CLIP model, through the integration of Mixture-of-Experts (MoE) adapters in response to new tasks. To preserve the zero-shot recognition capability of vision-language models, we further introduce a Distribution Discriminative Auto-Selector (DDAS) that automatically routes in-distribution and out-of-distribution inputs to the MoE Adapter and the original CLIP, respectively. Through extensive experiments across various settings, our proposed method consistently outperforms previous state-of-the-art approaches while concurrently reducing parameter training burdens by 60%. Our code is available at https://***/JiazuoYu/MoE-Adapters4CL

Keywords: class incremental learning; continual learning; task incremental learning; Vision-Language Model

YOLO-World: Real-Time Open-Vocabulary Object Detection

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Cheng, Tianheng; Song, Lin; Ge, Yixiao; Liu, Wenyu; Wang, Xinggang; Shan, Ying
Affiliations: Tencent AI Lab, Shenzhen, Guangdong, Peoples R China; Tencent PCG, ARC Lab, Shenzhen, Guangdong, Peoples R China; Huazhong Univ Sci & Technol, Sch EIC, Wuhan, Hubei, Peoples R China

ISBN: (print) 9798350353006

The You Only Look Once (YOLO) series of detectors have established themselves as efficient and practical tools. However, their reliance on predefined and trained object categories limits their applicability in open scenarios. Addressing this limitation, we introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities through vision-language modeling and pre-training on large-scale datasets. Specifically, we propose a new Re-parameterizable Vision-Language Path Aggregation Network (RepVL-PAN) and region-text contrastive loss to facilitate the interaction between visual and linguistic information. Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency. On the challenging LVIS dataset, YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed. Furthermore, the fine-tuned YOLO-World achieves remarkable performance on several downstream tasks, including object detection and open-vocabulary instance segmentation. Code and models are available at: https://***/AILab-CVC/YOLO-World.

Keywords: Object detection; open-vocabulary object detection; real-time object detection; vision-language modeling; vision-language pre-training; YOLO

Towards Understanding and Improving Adversarial Robustness of Vision Transformers

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Jain, Samyak; Dutta, Tanima
Affiliation: Indian Inst Technol BHU Varanasi, Varanasi, Uttar Pradesh, India

ISBN: (print) 9798350353006

Recent literature has demonstrated that vision transformers (ViTs) exhibit superior performance compared to convolutional neural networks (CNNs). The majority of recent research on adversarial robustness, however, has predominantly focused on CNNs. In this work, we bridge this gap by analyzing the effectiveness of existing attacks on ViTs. We demonstrate that due to the softmax computations in every attention block in ViTs, they are inherently vulnerable to floating point underflow errors. This can lead to a gradient masking effect resulting in suboptimal attack strength of well-known attacks, like PGD, Carlini and Wagner (CW) and GAMA. Motivated by this, we propose the Adaptive Attention Scaling (AAS) attack, which can automatically find the optimal scaling factors of pre-softmax outputs using gradient-based optimization. We show that the proposed simple strategy can be incorporated with any existing adversarial attacks as well as adversarial training methods and achieves improved performance. On ViT-B16, we demonstrate an improved attack strength of up to 2.2% on CIFAR10 and up to 2.9% on CIFAR100 by incorporating the proposed AAS attack with state-of-the-art single attack methods like the GAMA attack. Further, we utilise the proposed AAS attack for every few epochs in existing adversarial training methods, which we term Adaptive Attention Scaling Adversarial Training (AAS-AT). On incorporating AAS-AT with existing methods, we outperform them on ViTs by 1.3-3.5% on CIFAR10. We observe improved performance on ImageNet-100 as well.
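
The underflow effect this abstract describes can be illustrated with a minimal sketch. It is not the authors' AAS attack: the logits and the fixed scaling factor are hypothetical values chosen for illustration, and the code uses Python's 64-bit floats (lower-precision formats underflow at much smaller gaps). It shows how a softmax output can underflow to exactly zero, so the corresponding gradient term vanishes, and how scaling the pre-softmax values brings it back.

```python
import math

def softmax(logits):
    """Numerically stable softmax (subtracts the max before exponentiating)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_diag_grad(probs):
    """Diagonal of the softmax Jacobian: d p_i / d z_i = p_i * (1 - p_i)."""
    return [p * (1.0 - p) for p in probs]

# Hypothetical pre-softmax attention scores with a very large gap.
logits = [0.0, -800.0]

# exp(-800) underflows to 0.0, so the second probability is exactly zero
# and both diagonal gradient terms vanish (a form of gradient masking).
probs = softmax(logits)
grads = softmax_diag_grad(probs)

# Hypothetical fixed scaling factor applied before the softmax; the paper
# instead searches for scaling factors with gradient-based optimization.
scale = 0.01
scaled_probs = softmax([scale * z for z in logits])
scaled_grads = softmax_diag_grad(scaled_probs)

print("unscaled:", probs, grads)
print("scaled:  ", scaled_probs, scaled_grads)
```

With the scaled inputs both probabilities are nonzero, so the gradient carries usable signal again.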

Keywords: adversarial robustness; Vision Transformers

DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Scarpellini, Gianluca; Fiorini, Stefano; Giuliari, Francesco; Morerio, Pietro; Del Bue, Alessio
Affiliation: Ist Italiano Tecnol IIT, Pattern Anal & Comp Vis PAVIS, Genoa, Italy

ISBN: (print) 9798350353006

Reassembly tasks play a fundamental role in many fields and multiple approaches exist to solve specific reassembly problems. In this context, we posit that a general unified model can effectively address them all, irrespective of the input data type (images, 3D, etc.). We introduce DiffAssemble, a Graph Neural Network (GNN)-based architecture that learns to solve reassembly tasks using a diffusion model formulation. Our method treats the elements of a set, whether pieces of 2D patches or 3D object fragments, as nodes of a spatial graph. Training is performed by introducing noise into the position and rotation of the elements and iteratively denoising them to reconstruct the coherent initial pose. DiffAssemble achieves state-of-the-art (SOTA) results in most 2D and 3D reassembly tasks and is the first learning-based approach that solves 2D puzzles for both rotation and translation. Furthermore, we highlight its remarkable reduction in run-time, performing 11 times faster than the quickest optimization-based method for puzzle solving. Code available at https://***/IITPAVIS/DiffAssemble

Keywords: diffusion model; graph neural network; puzzle reassembly

Attentive Illumination Decomposition Model for Multi-Illuminant White Balancing

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Kim, Dongyoung; Kim, Jinwoo; Yu, Junsang; Kim, Seon Joo
Affiliations: Yonsei Univ, Seoul, South Korea; Samsung Adv Inst Technol, Suwon, South Korea

ISBN: (print) 9798350353006

White balance (WB) algorithms in many commercial cameras assume single and uniform illumination, leading to undesirable results when multiple lighting sources with different chromaticities exist in the scene. Prior research on multi-illuminant WB typically predicts illumination at the pixel level without fully grasping the scene's actual lighting conditions, including the number and color of light sources. This often results in unnatural outcomes lacking in overall consistency. To handle this problem, we present a deep white balancing model that leverages slot attention, where each slot is in charge of representing individual illuminants. This design enables the model to generate chromaticities and weight maps for individual illuminants, which are then fused to compose the final illumination map. Furthermore, we propose the centroid-matching loss, which regulates the activation of each slot based on the color range, thereby helping the model separate illumination more effectively. Our method achieves state-of-the-art performance on both single- and multi-illuminant WB benchmarks, and also offers additional information such as the number of illuminants in the scene and their chromaticity. This capability allows for illumination editing, an application not feasible with prior methods.

Keywords: Low level vision; Photography; White Balancing

NTIRE 2024 Challenge on Stereo Image Super-Resolution: Methods and Results

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wang, Longguang; Guo, Yulan; Li, Juncheng; Liu, Hongda; Zhao, Yang; Wang, Yingqian; Jin, Zhi; Gu, Shuhang; Timofte, Radu
Affiliations: Aviation University of Air Force; Sun Yat-sen University, The Shenzhen Campus of Sun Yat-sen University, China; National University of Defense Technology, China; Shanghai University, China; University of Electronic Science and Technology of China, China; Computer Vision Lab, University of Würzburg, Germany

ISBN: (print) 9798350365474

This paper summarizes the 3rd NTIRE challenge on stereo image super-resolution (SR) with a focus on new solutions and results. The task of this challenge is to super-resolve a low-resolution stereo image pair to a high-resolution one with a magnification factor of ×4 under a limited computational budget. Compared with single image SR, the main difficulty of this challenge lies in how to exploit additional information in another viewpoint and how to maintain stereo consistency in the results. This challenge has 2 tracks, including one track on bicubic degradation and one track on real degradations. In total, 108 and 70 participants were successfully registered for each track, respectively. In the test phase, 14 and 13 teams successfully submitted valid results with PSNR (RGB) scores better than the baseline. This challenge establishes a new benchmark for stereo image SR.

Keywords: Stereocenters

Compositional Chain-of-Thought Prompting for Large Multimodal Models

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Mitra, Chancharik; Huang, Brandon; Darrell, Trevor; Herzig, Roei
Affiliation: Univ Calif Berkeley, Berkeley, CA 94720, USA

ISBN: (print) 9798350353006

The combination of strong visual backbones and Large Language Model (LLM) reasoning has led to Large Multimodal Models (LMMs) becoming the current standard for a wide range of vision and language (VL) tasks. However, recent research has shown that even the most advanced LMMs still struggle to capture aspects of compositional visual reasoning, such as attributes and relationships between objects. One solution is to utilize scene graphs (SGs), a formalization of objects and their relations and attributes that has been extensively used as a bridge between the visual and textual domains. Yet, scene graph data requires scene graph annotations, which are expensive to collect and thus not easily scalable. Moreover, finetuning an LMM based on SG data can lead to catastrophic forgetting of the pretraining objective. To overcome this, inspired by chain-of-thought methods, we propose Compositional Chain-of-Thought (CCoT), a novel zero-shot Chain-of-Thought prompting method that utilizes SG representations in order to extract compositional knowledge from an LMM. Specifically, we first generate an SG using the LMM, and then use that SG in the prompt to produce a response. Through extensive experiments, we find that the proposed CCoT approach not only improves LMM performance on several vision and language (VL) compositional benchmarks but also improves the performance of several popular LMMs on general multimodal benchmarks, without the need for fine-tuning or annotated ground-truth SGs. Code: https://***/chancharikmitra/CCoT.

Keywords: Compositionality; Large Multimodal Models; Multimodality; Prompting; Scene Graphs; Vision & Language

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Lu, Jiasen; Clark, Christopher; Lee, Sangho; Zhang, Zichen; Khosla, Savya; Marten, Ryan; Hoiem, Derek; Kembhavi, Aniruddha
Affiliations: Allen Inst AI, Seattle, WA 98103, USA; Univ Illinois, Urbana, IL, USA; Univ Washington, Seattle, WA 98195, USA

ISBN: (print) 9798350353006

We present UNIFIED-IO 2, the first autoregressive multimodal model that is capable of understanding and generating image, text, audio, and action. To unify different modalities, we tokenize inputs and outputs - images, text, audio, action, bounding boxes etc., into a shared semantic space and then process them with a single encoder-decoder transformer model. Since training with such diverse modalities is challenging, we propose various architectural improvements to stabilize model training. We train our model from scratch on a large multimodal pre-training corpus from diverse sources with a multimodal mixture of denoisers objective. To learn an expansive set of skills, such as following multimodal instructions, we construct and finetune on an ensemble of 120 datasets with prompts and augmentations. With a single unified model, UNIFIED-IO 2 achieves state-of-the-art performance on the GRIT benchmark and strong results in more than 35 benchmarks, including image generation and understanding, natural language understanding, video and audio understanding, and robotic manipulation. We release all our models to the research community.

Keywords: Semantics

Fooling Polarization-based Vision using Locally Controllable Polarizing Projection

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Li, Zhuoxiao; Zhong, Zhihang; Nobuhara, Shohei; Nishino, Ko; Zheng, Yinqiang
Affiliations: Univ Tokyo, Tokyo, Japan; Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China; Kyoto Univ, Kyoto, Japan

ISBN: (print) 9798350353006

Polarization is a fundamental property of light that encodes abundant information regarding surface shape, material, illumination and viewing geometry. The computer vision community has witnessed a blossoming of polarization-based vision applications, such as reflection removal, shape-from-polarization (SfP), transparent object segmentation and color constancy, partially due to the emergence of single-chip mono/color polarization sensors that make polarization data acquisition easier than ever. However, is polarization-based vision vulnerable to adversarial attacks? If so, is it possible to realize these adversarial attacks in the physical world, without being perceived by human eyes? In this paper, we warn the community of the vulnerability of polarization-based vision, which can be more serious than RGB-based vision. By adapting a commercial LCD projector, we achieve locally controllable polarizing projection, which is successfully utilized to fool state-of-the-art polarization-based vision algorithms for glass segmentation and SfP. Compared with existing physical attacks on RGB-based vision, which always suffer from the trade-off between attack efficacy and eye conceivability, the adversarial attackers based on polarizing projection are contact-free and visually imperceptible, since naked human eyes can rarely perceive the difference between viciously manipulated polarizing light and ordinary illumination. This poses unprecedented risks to polarization-based vision, for which due attention should be paid and countermeasures considered.



Address: No. 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
