Search results - Inner Mongolia University Library

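Help

Field codes:

T=Title (book title, article title), A=Author (responsible party), K=Subject term, P=Publication name, PU=Publisher name, O=Institution (author affiliation, degree-granting institution, patent applicant), L=Chinese Library Classification number, C=Subject classification number, U=All fields, Y=Year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016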
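
A query in the syntax above can also be assembled programmatically, which helps keep operators uppercase and correctly spaced. The following is a minimal sketch only: the helper names `term`, `combine`, and `year_range` are hypothetical and are not part of the library's system; only the field codes and operators are taken from the help text.

```python
# Hypothetical helpers for composing query strings in the documented syntax.
# Field codes and operators come from the help text; the function names are
# illustrative assumptions, not an API offered by the catalogue.

FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value term after validating the field code."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def combine(operator: str, *parts: str, group: bool = False) -> str:
    """Join parts with an uppercase operator padded by one space per side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    joined = f" {operator} ".join(parts)
    # Parentheses are added only on request, as in the documented examples.
    return f"({joined})" if group else joined


def year_range(start: int, end: int) -> str:
    """Return a Y=start-end term."""
    return term("Y", f"{start}-{end}")


# Rebuild the first documented example from its parts.
subjects = combine("OR", term("K", "图书馆学"), term("K", "情报学"), group=True)
query = combine("AND", subjects, term("A", "范并思"), year_range(1982, 2016))
print(query)
```

Rejecting lowercase operators in `combine` mirrors the rule that operators must be uppercase; whether the catalogue itself rejects or silently ignores lowercase operators is not stated in the help text.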

Refine search results

Document type

22,772 conference papers
112 journal articles
23 books

Collection scope

22,906 electronic documents
1 print holding

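Subject classification

13,399 items: Engineering
- 10,880 items: Computer Science and Technology...
- 3,450 items: Software Engineering
- 2,430 items: Mechanical Engineering
- 1,721 items: Optical Engineering
- 1,011 items: Control Science and Engineering
- 998 items: Electrical Engineering
- 761 items: Information and Communication Engineering
- 393 items: Instrument Science and Technology
- 337 items: Bioengineering
- 257 items: Biomedical Engineering (degrees may be conferred in...
- 215 items: Electronic Science and Technology (degrees may...
- 113 items: Chemical Engineering and Technology
- 112 items: Safety Science and Engineering
- 98 items: Surveying and Mapping Science and Technology
- 92 items: Transportation Engineering
- 86 items: Architecture
- 82 items: Civil Engineering
3,362 items: Medicine
- 3,348 items: Clinical Medicine
- 79 items: Basic Medicine (degrees may be conferred in medicine...
3,250 items: Science
- 1,953 items: Physics
- 1,664 items: Mathematics
- 567 items: Statistics (degrees may be conferred in science, ...
- 484 items: Biology
- 245 items: Systems Science
- 109 items: Chemistry
506 items: Management
- 299 items: Library, Information and Archives Manag...
- 219 items: Management Science and Engineering (degrees may...
- 75 items: Business Administration
252 items: Art
- 252 items: Design (degrees may be conferred in art...
62 items: Law
- 59 items: Sociology
40 items: Agriculture
25 items: Education
19 items: Economics
11 items: Military Science
3 items: Literature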

Subject terms

10,127 items: computer vision
4,025 items: pattern recognit...
2,900 items: training
1,958 items: computational mo...
1,793 items: cameras
1,759 items: visualization
1,485 items: shape
1,466 items: image segmentati...
1,447 items: feature extracti...
1,412 items: three-dimensiona...
1,288 items: robustness
1,169 items: computer archite...
1,144 items: layout
1,142 items: computer science
1,134 items: semantics
1,071 items: object detection
1,043 items: conferences
1,009 items: benchmark testin...
967 items: codes
810 items: face recognition

Institutions

135 items: univ sci & techn...
118 items: univ chinese aca...
118 items: chinese univ hon...
110 items: carnegie mellon ...
99 items: tsinghua univers...
99 items: microsoft resear...
94 items: swiss fed inst t...
92 items: zhejiang univ pe...
82 items: university of sc...
81 items: zhejiang univers...
77 items: shanghai ai lab ...
77 items: university of ch...
72 items: shanghai jiao to...
68 items: microsoft res as...
65 items: national laborat...
65 items: alibaba grp peop...
64 items: tsinghua univ pe...
63 items: adobe research
60 items: peking univ peop...
59 items: peng cheng labor...

Authors

78 items: van gool luc
72 items: timofte radu
63 items: zhang lei
45 items: luc van gool
40 items: yang yi
37 items: loy chen change
33 items: xiaoou tang
33 items: li stan z.
33 items: qi tian
32 items: sun jian
31 items: liu yang
31 items: li fei-fei
30 items: chen chen
30 items: tian qi
30 items: pascal fua
29 items: darrell trevor
28 items: ying shan
27 items: li xin
27 items: vasconcelos nuno
27 items: hanqing lu

Language

22,845 items: English
35 items: Other
20 items: Chinese
5 items: Turkish
2 items: Japanese

Search query: "Any field=1994 IEEE Computer-Society Conference on Computer Vision and Pattern Recognition"

22,907 records in total; showing 221-230

HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Turki, Haithem; Agrawal, Vasu; Bulo, Samuel Rota; Porzi, Lorenzo; Kontschieder, Peter; Ramanan, Deva; Zollhofer, Michael; Richardt, Christian
Affiliations: Meta Real Labs, Menlo Pk, CA 94025, USA; Carnegie Mellon Univ, Pittsburgh, PA 15213, USA

ISBN: (print) 9798350353006

Neural radiance fields provide state-of-the-art view synthesis quality but tend to be slow to render. One reason is that they make use of volume rendering, thus requiring many samples (and model queries) per ray at render time. Although this representation is flexible and easy to optimize, most real-world objects can be modeled more efficiently with surfaces instead of volumes, requiring far fewer samples per ray. This observation has spurred considerable progress in surface representations, such as signed distance functions, but these may struggle to model semi-opaque and thin structures. We propose a method, HybridNeRF, that leverages the strengths of both representations by rendering most objects as surfaces while modeling the (typically) small fraction of challenging regions volumetrically. We evaluate HybridNeRF against the challenging Eyeful Tower dataset [38] along with other commonly used view synthesis datasets. When comparing to state-of-the-art baselines, including recent rasterization-based approaches, we improve error rates by 15-30% while achieving real-time framerates (at least 36 FPS) for virtual-reality resolutions (2K×2K). Project page: https://***/hybrid-nerf/.

Keywords: 3D reconstruction; computer vision; machine learning; neural radiance fields; neural rendering; novel view synthesis

PairDETR: Joint Detection and Association of Human Bodies and Faces

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Ali, Ammar; Gaikov, Georgii; Rybalchenko, Denis; Chigorin, Alexander; Laptev, Ivan; Zagoruyko, Sergey
Affiliations: MTS AI, ITMO, Moscow, Russia; MTS AI, Moscow, Russia; VisionLabs, Hyderabad, Telangana, India; MBZUAI, Abu Dhabi, U Arab Emirates; MTS AI, Skoltech, Moscow, Russia

ISBN: (print) 9798350353013; 9798350353006

Image and video analysis requires not only accurate object detection but also the understanding of relationships among detected objects. Common solutions to relation modeling typically resort to stand-alone object detectors followed by non-differentiable post-processing techniques. Recently introduced detection transformers (DETR) perform end-to-end object detection based on a bipartite matching loss. Such methods, however, lack the ability to jointly detect objects and resolve object associations. In this paper, we build on the DETR approach and extend it to the joint detection of objects and their relationships by introducing an approximated bipartite matching. While our method can generalize to an arbitrary number of objects, we here focus on the modeling of object pairs and their relations. In particular, we apply our method PairDETR to the problem of detecting human bodies and faces, and associating them for the same person. Our approach not only eliminates the need for hand-designed post-processing but also achieves excellent results for body-face associations. We evaluate PairDETR on the challenging CrowdHuman and CityPersons datasets and demonstrate a large improvement over the state of the art. Our training code and pre-trained models are available at https://***/mts-ai/pairdetr

Keywords: Association; CityPersons; computer vision; CrowdHuman; DETR; end-to-end; Object detection; Transformers

SocialCounterfactuals: Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Howard, Phillip; Madasu, Avinash; Le, Tiep; Moreno, Gustavo Lujan; Bhiwandiwalla, Anahita; Lal, Vasudev
Affiliations: Intel Labs, Santa Clara, CA 95052, USA

ISBN: (print) 9798350353006

While vision-language models (VLMs) have achieved remarkable performance improvements recently, there is growing evidence that these models also possess harmful biases with respect to social attributes such as gender and race. Prior studies have primarily focused on probing such bias attributes individually while ignoring biases associated with intersections between social attributes. This could be due to the difficulty of collecting an exhaustive set of image-text pairs for various combinations of social attributes. To address this challenge, we employ text-to-image diffusion models to produce counterfactual examples for probing intersectional social biases at scale. Our approach utilizes Stable Diffusion with cross attention control to produce sets of counterfactual image-text pairs that are highly similar in their depiction of a subject (e.g., a given occupation) while differing only in their depiction of intersectional social attributes (e.g., race & gender). Through our over-generate-then-filter methodology, we produce SocialCounterfactuals, a high-quality dataset containing 171k image-text pairs for probing intersectional biases related to gender, race, and physical characteristics. We conduct extensive experiments to demonstrate the usefulness of our generated dataset for probing and mitigating intersectional social biases in state-of-the-art VLMs.

Keywords: counterfactuals; Fairness; intersectionality; social bias; vision-language models

Context-based and Diversity-driven Specificity in Compositional Zero-Shot Learning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Li, Yun; Liu, Zhe; Chen, Hang; Yao, Lina
Affiliations: CSIROs Data61, Clayton, Vic, Australia; Bytedance Ltd, Beijing, Peoples R China; Snap Inc, Santa Monica, CA, USA

ISBN: (print) 9798350353006

Compositional Zero-Shot Learning (CZSL) aims to recognize unseen attribute-object pairs based on a limited set of observed examples. Current CZSL methodologies, despite their advancements, tend to neglect the distinct specificity levels present in attributes. For instance, given images of sliced strawberries, they may fail to prioritize 'Sliced-Strawberry' over a generic 'Red-Strawberry', despite the former being more informative. They also suffer from ballooning search space when shifting from Close-World (CW) to Open-World (OW) CZSL. To address the issues, we introduce the Context-based and Diversity-driven Specificity learning framework for CZSL (CDS-CZSL). Our framework evaluates the specificity of attributes by considering the diversity of objects they apply to and their related context. This novel approach allows for more accurate predictions by emphasizing specific attribute-object pairs and improves composition filtering in OW-CZSL. We conduct experiments in both CW and OW scenarios, and our model achieves state-of-the-art results across three datasets.

Keywords: compositional zero-shot learning; transfer learning; vision language model

WALT3D: Generating Realistic Training Data from Time-Lapse Imagery for Reconstructing Dynamic Objects under Occlusion

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Khiem Vuong; Reddy, N. Dinesh; Tamburo, Robert; Narasimhan, Srinivasa G.
Affiliations: Carnegie Mellon Univ, Pittsburgh, PA 15213, USA; Amazon, Seattle, WA, USA

ISBN: (print) 9798350353006

Current methods for 2D and 3D object understanding struggle with severe occlusions in busy urban environments, partly due to the lack of large-scale labeled groundtruth annotations for learning occlusion. In this work, we introduce a novel framework for automatically generating a large, realistic dataset of dynamic objects under occlusions using freely available time-lapse imagery. By leveraging off-the-shelf 2D (bounding box, segmentation, keypoint) and 3D (pose, shape) predictions as pseudo-groundtruth, unoccluded 3D objects are identified automatically and composited into the background in a clip-art style, ensuring realistic appearances and physically accurate occlusion configurations. The resulting clip-art image with pseudo-groundtruth enables efficient training of object reconstruction methods that are robust to occlusions. Our method demonstrates significant improvements in both 2D and 3D reconstruction, particularly in scenarios with heavily occluded objects like vehicles and people in urban scenes.

Keywords: 3D from single images; computer vision

Action Scene Graphs for Long-Form Understanding of Egocentric Videos

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Rodin, Ivan; Furnari, Antonino; Min, Kyle; Tripathi, Subarna; Farinella, Giovanni Maria
Affiliations: Univ Catania, Catania, Italy; Intel Labs, Hillsboro, OR, USA

ISBN: (print) 9798350353006

We present Egocentric Action Scene Graphs (EASGs), a new representation for long-form understanding of egocentric videos. EASGs extend standard manually-annotated representations of egocentric videos, such as verb-noun action labels, by providing a temporally evolving graph-based description of the actions performed by the camera wearer, including interacted objects, their relationships, and how actions unfold in time. Through a novel annotation procedure, we extend the Ego4D dataset adding manually labeled Egocentric Action Scene Graphs which offer a rich set of annotations for long-form egocentric video understanding. We hence define the EASG generation task and provide a baseline approach, establishing preliminary benchmarks. Experiments on two downstream tasks, action anticipation and activity summarization, highlight the effectiveness of EASGs for long-form egocentric video understanding. We will release the dataset and code to replicate experiments and annotations. The code is available at https://***/fpv-iplab/EASG.

Keywords: egocentric vision; long-form video understanding; scene graphs

Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Xu, Jingyao; Lu, Yuetong; Li, Yandong; Lu, Siyang; Wang, Dongdong; Wei, Xiang
Affiliations: Beijing Jiaotong Univ, Beijing, Peoples R China; Google Res, Mountain View, CA, USA; Univ Cent Florida, Orlando, FL 32816, USA

ISBN: (print) 9798350353006

Diffusion models (DMs) embark on a new era of generative modeling and offer more opportunities for efficiently generating high-quality and realistic data samples. However, their widespread use has also brought forth new challenges in model security, which motivates the creation of more effective adversarial attackers on DMs to understand their vulnerability. We propose CAAT, a simple but generic and efficient approach that does not require costly training to effectively fool latent diffusion models (LDMs). The approach is based on the observation that cross-attention layers exhibit higher sensitivity to gradient change, allowing for leveraging subtle perturbations on published images to significantly corrupt the generated images. We show that a subtle perturbation on an image can significantly impact the cross-attention layers, thus changing the mapping between text and image during the fine-tuning of customized diffusion models. Extensive experiments demonstrate that CAAT is compatible with diverse diffusion models and outperforms baseline attack methods in a more effective (more noise) and efficient (twice as fast as Anti-DreamBooth and Mist) manner.

Keywords: Adversarial Attack; computer vision

Towards 3D Vision with Low-Cost Single-Photon Cameras

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Mu, Fangzhou; Sifferman, Carter; Jungerman, Sacha; Li, Yiquan; Han, Mark; Gleicher, Michael; Gupta, Mohit; Li, Yin
Affiliations: Univ Wisconsin, Madison, WI 53706, USA

ISBN: (print) 9798350353013; 9798350353006

We present a method for reconstructing 3D shape of arbitrary Lambertian objects based on measurements by miniature, energy-efficient, low-cost single-photon cameras. These cameras, operating as time-resolved image sensors, illuminate the scene with a very fast pulse of diffuse light and record the shape of that pulse as it returns back from the scene at a high temporal resolution. We propose to model this image formation process, account for its non-idealities, and adapt neural rendering to reconstruct 3D geometry from a set of spatially distributed sensors with known poses. We show that our approach can successfully recover complex 3D shapes from simulated data. We further demonstrate 3D object reconstruction from real-world captures, utilizing measurements from a commodity proximity sensor. Our work draws a connection between image-based modeling and active range scanning, and offers a step towards 3D vision with single-photon cameras. Our project webpage is at https://***/towards_3d_vision/.

Probing the 3D Awareness of Visual Foundation Models

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: El Banani, Mohamed; Raj, Amit; Maninis, Kevis-Kokitsi; Kar, Abhishek; Li, Yuanzhen; Rubinstein, Michael; Sun, Deqing; Guibas, Leonidas; Johnson, Justin; Jampani, Varun
Affiliations: Univ Michigan, Ann Arbor, MI 48109, USA; Google, Mountain View, CA 94043, USA; Stability AI, London, ON, Canada

ISBN: (print) 9798350353006

Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. Not only can recent models generalize to arbitrary images for their training task, their intermediate representations are useful for other visual tasks such as detection and segmentation. Given that such models can classify, delineate, and localize objects in 2D, we ask whether they also represent their 3D structure? In this work, we analyze the 3D awareness of visual foundation models. We posit that 3D awareness implies that representations (1) encode the 3D structure of the scene and (2) consistently represent the surface across views. We conduct a series of experiments using task-specific probes and zero-shot inference procedures on frozen features. Our experiments reveal several limitations of the current models. Our code and analysis can be found at https://***/mbanani/probe3d.

Keywords: 3D Awareness; 3D vision; Foundation Models; Representation Learning

Learning to Predict Activity Progress by Self-Supervised Video Alignment

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Donahue, Gerard; Elhamifar, Ehsan
Affiliations: Northwestern Univ, Boston, MA 02115, USA

ISBN: (print) 9798350353006

In this paper, we tackle the problem of self-supervised video alignment and activity progress prediction using in-the-wild videos. Our proposed self-supervised representation learning method carefully addresses different action orderings, redundant actions, and background frames to generate improved video representations compared to previous methods. Our model generalizes temporal cycle-consistency learning to allow for more flexibility in determining cycle-consistent neighbors. More specifically, to handle repeated actions, we propose a multi-neighbor cycle consistency and a multi-cycle-back regression loss by finding multiple soft nearest neighbors using a Gaussian Mixture Model. To handle background and redundant frames, we introduce a context-dependent drop function in our framework, discouraging the alignment of droppable frames. On the other hand, to learn from videos of multiple activities jointly, we propose a multi-head cross-task network, allowing us to embed a video and estimate progress without knowing its activity label. Experiments on multiple datasets show that our method outperforms the state-of-the-art for video alignment and progress prediction.

Keywords: computer vision; in the wild; procedural learning; progress; progress prediction; representation learning; self-supervised; self-supervised representation learning; unconstrained; unconstrained videos; Video Alignment; video understanding

235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021