Search results - Inner Mongolia University Library

Search query: "Any field = IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2000"

19,489 records in total; showing records 4691-4700

One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Chen, Minghao; Fu, Jianlong; Ling, Haibin
Affiliations: SUNY Stony Brook, Stony Brook, NY 11794, USA; Microsoft Res Asia, Beijing, Peoples R China; Microsoft, Beijing, Peoples R China

ISBN: (print) 9781665445092

Despite remarkable progress achieved, most neural architecture search (NAS) methods focus on searching for one single accurate and robust architecture. To further build models with better generalization capability and performance, model ensemble is usually adopted and performs better than stand-alone models. Inspired by the merits of model ensemble, we propose to search for multiple diverse models simultaneously as an alternative way to find powerful models. Searching for ensembles is non-trivial and has two key challenges: enlarged search space and potentially more complexity for the searched model. In this paper, we propose a one-shot neural ensemble architecture search (NEAS) solution that addresses the two challenges. For the first challenge, we introduce a novel diversity-based metric to guide search space shrinking, considering both the potentiality and diversity of candidate operators. For the second challenge, we enable a new search dimension to learn layer sharing among different models for efficiency purposes. The experiments on ImageNet clearly demonstrate that our solution can improve the supernet's capacity of ranking ensemble architectures, and further lead to better search results. The discovered architectures achieve superior performance compared with state-of-the-art models such as the MobileNetV3 and EfficientNet families under aligned settings. Moreover, we evaluate the generalization ability and robustness of our searched architecture on the COCO detection benchmark and achieve a 3.1% improvement on AP compared with MobileNetV3. Codes and models are available here.

Keywords: Computer vision; Codes; Computer architecture; Benchmark testing; Extraterrestrial measurements; Robustness; Complexity theory

Dogfight: Detecting Drones from Drones Videos

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Ashraf, Muhammad Waseem; Sultani, Waqas; Shah, Mubarak
Affiliations: Informat Technol Univ, Intelligent Machines Lab, Lahore, Punjab, Pakistan; Univ Cent Florida, Ctr Res Comp Vis, Orlando, FL 32816, USA

ISBN: (print) 9781665445092

As airborne vehicles are becoming more autonomous and ubiquitous, it has become vital to develop the capability to detect the objects in their surroundings. This paper attempts to address the problem of drone detection from other flying drones. The erratic movement of the source and target drones, small size, arbitrary shape, large intensity variations, and occlusion make this problem quite challenging. In this scenario, region-proposal based methods are not able to capture sufficient discriminative foreground-background information. Also, due to the extremely small size and complex motion of the source and target drones, feature aggregation based methods are unable to perform well. To handle this, instead of using region-proposal based methods, we propose to use a two-stage segmentation-based approach employing spatio-temporal attention cues. During the first stage, given the overlapping frame regions, detailed contextual information is captured over convolution feature maps using pyramid pooling. After that, pixel- and channel-wise attention is enforced on the feature maps to ensure accurate drone localization. In the second stage, first-stage detections are verified and new probable drone locations are explored. To discover new drone locations, motion boundaries are used. This is followed by tracking candidate drone detections for a few frames, cuboid formation, extraction of the 3D convolution feature map, and drone detection within each cuboid. The proposed approach is evaluated on two publicly available drone detection datasets and outperforms several competitive baselines.

Keywords: Visualization; Computer vision; Shape; Convolution; Tracking; Motion segmentation; Feature extraction

Open-Set Representation Learning through Combinatorial Embedding

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Geeho Kim; Junoh Kang; Bohyung Han
Affiliations: Computer Vision Laboratory, ECE, Seoul National University; IPAI, Seoul National University

Visual recognition tasks are often limited to dealing with a small subset of classes simply because the labels for the remaining classes are unavailable. We are interested in identifying novel concepts in a dataset through representation learning based on both labeled and unlabeled examples, and extending the horizon of recognition to both known and novel classes. To address this challenging task, we propose a combinatorial learning approach, which naturally clusters the examples in unseen classes using the compositional knowledge given by multiple supervised meta-classifiers on heterogeneous label spaces. The representations given by the combinatorial embedding are made more robust by unsupervised pairwise relation learning. The proposed algorithm discovers novel concepts via a joint optimization for enhancing the discriminativeness of unseen classes as well as learning the representations of known classes generalizable to novel ones. Our extensive experiments demonstrate remarkable performance gains by the proposed approach on public datasets for image retrieval and image categorization with novel class discovery.

Action Scene Graphs for Long-Form Understanding of Egocentric Videos

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Ivan Rodin; Antonino Furnari; Kyle Min; Subarna Tripathi; Giovanni Maria Farinella
Affiliations: University of Catania; Intel Labs

ISBN: (digital) 9798350353006

ISBN: (print) 9798350353013

We present Egocentric Action Scene Graphs (EASGs), a new representation for long-form understanding of egocentric videos. EASGs extend standard manually-annotated representations of egocentric videos, such as verb-noun action labels, by providing a temporally evolving graph-based description of the actions performed by the camera wearer, including interacted objects, their relationships, and how actions unfold in time. Through a novel annotation procedure, we extend the Ego4D dataset, adding manually labeled Egocentric Action Scene Graphs which offer a rich set of annotations for long-form egocentric video understanding. We hence define the EASG generation task and provide a baseline approach, establishing preliminary benchmarks. Experiments on two downstream tasks, action anticipation and activity summarization, highlight the effectiveness of EASGs for long-form egocentric video understanding. We will release the dataset and code to replicate experiments and annotations. The code is available at https://***/fpv-iplab/EASG.

Keywords: Computer vision; Codes; Annotations; Manuals; Benchmark testing; Cameras; Pattern recognition

Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Kelvin C.K. Chan; Yang Zhao; Xuhui Jia; Ming-Hsuan Yang; Huisheng Wang
Affiliation: Google

ISBN: (digital) 9798350353006

ISBN: (print) 9798350353013

In subject-driven text-to-image synthesis, the synthesis process tends to be heavily influenced by the reference images provided by users, often overlooking crucial attributes detailed in the text prompt. In this work, we propose Subject-Agnostic Guidance (SAG), a simple yet effective solution to remedy the problem. We show that through constructing a subject-agnostic condition and applying our proposed dual classifier-free guidance, one could obtain outputs consistent with both the given subject and input text prompts. We validate the efficacy of our approach through both optimization-based and encoder-based methods. Additionally, we demonstrate its applicability in second-order customization methods, where an encoder-based model is fine-tuned with DreamBooth. Our approach is conceptually simple and requires only minimal code modifications, but leads to substantial quality improvements, as evidenced by our evaluations and user studies.

Keywords: Training; Computer vision; Codes; Image synthesis; Text to image; Pattern recognition

How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Chelani, Kunal; Kahl, Fredrik; Sattler, Torsten
Affiliations: Chalmers Univ Technol, Gothenburg, Sweden; Czech Tech Univ, Prague, Czech Republic

ISBN: (print) 9781665445092

Visual localization is the problem of estimating the camera pose of a given image with respect to a known scene. Visual localization algorithms are a fundamental building block in advanced computer vision applications, including Mixed and Virtual Reality systems. Many algorithms used in practice represent the scene through a Structure-from-Motion (SfM) point cloud and use 2D-3D matches between a query image and the 3D points for camera pose estimation. As recently shown, image details can be accurately recovered from SfM point clouds by translating renderings of the sparse point clouds to images. To address the resulting potential privacy risks for user-generated content, it was recently proposed to lift point clouds to line clouds by replacing 3D points by randomly oriented 3D lines passing through these points. The resulting representation is unintelligible to humans and effectively prevents point cloud-to-image translation. This paper shows that a significant amount of information about the 3D scene geometry is preserved in these line clouds, allowing us to (approximately) recover the 3D point positions and thus to (approximately) recover image content. Our approach is based on the observation that the closest points between lines can yield a good approximation to the original 3D points.

Keywords: Location awareness; Cloud computing; Visualization; Computer vision; Privacy; Three-dimensional displays; User-generated content
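The abstract above rests on a geometric primitive: the closest points between two 3D lines. The following is a minimal sketch of that primitive only, not the authors' recovery method; the function name `closest_points_between_lines`, the tuple-based vectors, and the `eps` parallelism tolerance are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two 3-vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def closest_points_between_lines(
    p1: Sequence[float],
    d1: Sequence[float],
    p2: Sequence[float],
    d2: Sequence[float],
    eps: float = 1e-12,  # hypothetical tolerance for near-parallel lines
) -> Optional[Tuple[Vec3, Vec3]]:
    """Closest points on the lines p1 + s*d1 and p2 + t*d2.

    Returns (point_on_line_1, point_on_line_2), or None when the lines
    are parallel (or a direction is zero), since the closest pair is
    then not unique.
    """
    w0 = (p1[0] - p2[0], p1[1] - p2[1], p1[2] - p2[2])
    a = _dot(d1, d1)
    b = _dot(d1, d2)
    c = _dot(d2, d2)
    d = _dot(d1, w0)
    e = _dot(d2, w0)

    # denom = |d1|^2 |d2|^2 sin^2(angle); zero means parallel lines.
    denom = a * c - b * b
    if denom <= eps * a * c:
        return None

    # Solve the 2x2 normal equations of min |w0 + s*d1 - t*d2|^2.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    c1 = (p1[0] + s * d1[0], p1[1] + s * d1[1], p1[2] + s * d1[2])
    c2 = (p2[0] + t * d2[0], p2[1] + t * d2[1], p2[2] + t * d2[2])
    return c1, c2


if __name__ == "__main__":
    # Two skew lines: the x-axis and a line parallel to the y-axis at z = 1.
    pair = closest_points_between_lines(
        (0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
        (0.0, 1.0, 1.0), (0.0, 1.0, 0.0),
    )
    print(pair)
```

How the paper selects neighboring lines and turns these pairwise estimates into point positions is not stated in the abstract, so that part is left out here.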

Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Haurum, Joakim Bruslund; Moeslund, Thomas B.
Affiliation: Aalborg Univ, Visual Anal & Percept VAP Lab, Aalborg, Denmark

ISBN: (print) 9781665445092

Perhaps surprisingly, sewerage infrastructure is one of the most costly infrastructures in modern society. Sewer pipes are manually inspected to determine whether the pipes are defective. However, this process is limited by the number of qualified inspectors and the time it takes to inspect a pipe. Automatization of this process is therefore of high interest. So far, the success of computer vision approaches for sewer defect classification has been limited when compared to the success in other fields, mainly due to the lack of public datasets. To this end, in this work we present a large, novel, and publicly available multi-label classification dataset for image-based sewer defect classification called Sewer-ML. The Sewer-ML dataset consists of 1.3 million images annotated by professional sewer inspectors from three different utility companies across nine years. Together with the dataset, we also present a benchmark algorithm and a novel metric for assessing performance. The benchmark algorithm is a result of evaluating 12 state-of-the-art algorithms, six from the sewer defect classification domain and six from the multi-label classification domain, and combining the best performing algorithms. The novel metric is a class-importance weighted F2 score, F2CIW, reflecting the economic impact of each class, used together with the normal pipe F1 score, F1Normal. The benchmark algorithm achieves an F2CIW score of 55.11% and F1Normal score of 90.94%, leaving ample room for improvement on the Sewer-ML dataset. The code, models, and dataset are available at the project page http://***/sewer-ml

Keywords: Measurement; Economics; Computer vision; Codes; Biological system modeling; Benchmark testing; Inspection
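The abstract above describes its metric only as a "class-importance weighted F2 score". As a rough illustration, the sketch below computes the standard F-beta score from precision and recall and then takes a weight-normalized mean over classes. The normalized weighted mean, the function names, and the example class names and weights are all assumptions; the abstract does not give the exact definition of F2CIW.

```python
from typing import Dict, Tuple


def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score; beta = 2 weights recall more than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def weighted_f_beta(
    per_class: Dict[str, Tuple[float, float]],  # class -> (precision, recall)
    weights: Dict[str, float],                  # class -> importance weight (assumed)
    beta: float = 2.0,
) -> float:
    """Weight-normalized mean of per-class F-beta scores (assumed form)."""
    total = sum(weights[c] for c in per_class)
    if total == 0.0:
        raise ValueError("class weights must not sum to zero")
    return sum(
        weights[c] * f_beta(p, r, beta) for c, (p, r) in per_class.items()
    ) / total


if __name__ == "__main__":
    # Hypothetical defect classes and weights, for illustration only.
    scores = {"crack": (0.5, 0.5), "root": (1.0, 1.0)}
    importance = {"crack": 1.0, "root": 3.0}
    print(weighted_f_beta(scores, importance))
```

With beta = 2, a missed defect (low recall) lowers the score more than a false alarm does, which fits an inspection setting where overlooking a defect is the costlier error.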

MAFA: Managing False Negatives for Vision-Language Pre-Training

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Jaeseok Byun; Dohoon Kim; Taesup Moon
Affiliations: Department of ECE, Seoul National University; Department of ASRI/INMC/IPAI/AIIS, Seoul National University

ISBN: (digital) 9798350353006

ISBN: (print) 9798350353013

We consider a critical issue of false negatives in Vision-Language Pre-training (VLP), a challenge that arises from the inherent many-to-many correspondence of image-text pairs in large-scale web-crawled datasets. The presence of false negatives can impede achieving optimal performance and even lead to a significant performance drop. To address this challenge, we propose MAFA (MAnaging FAlse negatives), which consists of two pivotal components building upon the recently developed GRouped mIni-baTch sampling (GRIT) strategy: 1) an efficient connection mining process that identifies and converts false negatives into positives, and 2) label smoothing for the image-text contrastive (ITC) loss. Our comprehensive experiments verify the effectiveness of MAFA across multiple downstream tasks, emphasizing the crucial role of addressing false negatives in VLP, potentially even surpassing the importance of addressing false positives. In addition, the compatibility of MAFA with the recent BLIP-family model is also demonstrated. Code is available at https://***/jaeseokbyun/MAFA.

Keywords: Computer vision; Smoothing methods; Codes; Computational modeling; Buildings; Pattern recognition

Hierarchical Temporal Transformer for 3D Hand Pose Estimation and Action Recognition from Egocentric RGB Videos

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yilin Wen; Hao Pan; Lei Yang; Jia Pan; Taku Komura; Wenping Wang
Affiliations: The University of Hong Kong; Microsoft Research Asia; TransGP; Texas A&M University

Understanding dynamic hand motions and actions from egocentric RGB videos is a fundamental yet challenging task due to self-occlusion and ambiguity. To address occlusion and ambiguity, we develop a transformer-based framework to exploit temporal information for robust estimation. Noticing the different temporal granularity of and the semantic correlation between hand pose estimation and action recognition, we build a network hierarchy with two cascaded transformer encoders, where the first one exploits the short-term temporal cue for hand pose estimation, and the latter aggregates per-frame pose and object information over a longer time span to recognize the action. Our approach achieves competitive results on two first-person hand action benchmarks, namely FPHA and H2O. Extensive ablation studies verify our design choices.

Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Huang, Zhicheng; Zeng, Zhaoyang; Huang, Yupan; Liu, Bei; Fu, Dongmei; Fu, Jianlong
Affiliations: Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing, Peoples R China; Beijing Engn Res Ctr Ind Spectrum Imaging, Beijing, Peoples R China; Sun Yat Sen Univ, Guangzhou, Peoples R China; Microsoft Res Asia, Beijing, Peoples R China

ISBN: (print) 9781665445092

We study joint learning of Convolutional Neural Network (CNN) and Transformer for vision-language pre-training (VLPT), which aims to learn cross-modal alignments from millions of image-text pairs. State-of-the-art approaches extract salient image regions and align regions with words step-by-step. As region-based visual features usually represent parts of an image, it is challenging for existing vision-language models to fully understand the semantics from paired natural languages. In this paper, we propose SOHO ("Seeing Out of tHe bOx"), which takes a whole image as input and learns vision-language representation in an end-to-end manner. SOHO does not require bounding box annotations, which enables inference 10 times faster than region-based approaches. In particular, SOHO learns to extract comprehensive yet compact image features through a visual dictionary (VD) that facilitates cross-modal understanding. VD is designed to represent consistent visual abstractions of similar semantics. It is updated on-the-fly and utilized in our proposed pre-training task Masked Visual Modeling (MVM). We conduct experiments on four well-established vision-language tasks by following standard VLPT settings. In particular, SOHO achieves absolute gains of 2.0% R@1 score on MSCOCO text retrieval 5k test split, 1.5% accuracy on NLVR2 test-P split, and 6.7% accuracy on SNLI-VE test split, respectively.

Keywords: Visualization; Dictionaries; Annotations; Semantics; Transforms; Feature extraction; Transformers
