Search Results - Inner Mongolia University Library

Search query: "Any field = IEEE-Computer-Society Conference on Computer Vision and Pattern Recognition Workshops"

8,966 records in total; showing records 511-520


UIGR: Unified Interactive Garment Retrieval

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Han, Xiao; He, Sen; Zhang, Li; Song, Yi-Zhe; Xiang, Tao

Affiliations: Univ Surrey, CVSSP, Guildford, Surrey, England; iFlyTek Surrey Joint Res Ctr Artificial Intellige, Guildford, Surrey, England; Fudan Univ, Sch Data Sci, Shanghai, Peoples R China

ISBN: (digital) 9781665487399

ISBN: (print) 9781665487399

Interactive garment retrieval (IGR) aims to retrieve a target garment image based on a reference garment image along with user feedback on what to change on the reference garment. Two IGR tasks have been studied extensively: text-guided garment retrieval (TGR) and visually compatible garment retrieval (VCR). The user feedback for the former indicates what semantic attributes to change with the garment category preserved, while the category is the only thing to be changed explicitly for the latter, with an implicit requirement on style preservation. Despite the similarity between these two tasks and the practical need for an efficient system tackling both, they have never been unified and modeled jointly. In this paper, we propose a Unified Interactive Garment Retrieval (UIGR) framework to unify TGR and VCR. To this end, we first contribute a large-scale benchmark suited for both problems. We further propose a strong baseline architecture to integrate TGR and VCR in one model. Extensive experiments suggest that unifying two tasks in one framework is not only more efficient by requiring a single model only, it also leads to better performance. Code and datasets are available at GitHub.

Keywords: Computational modeling; Clothing; Semantics; Computer architecture; Benchmark testing; Multitasking; Pattern recognition

H-Net: Unsupervised Attention-based Stereo Depth Estimation Leveraging Epipolar Geometry

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Huang, Baoru; Zheng, Jian-Qing; Giannarou, Stamatia; Elson, Daniel S.

Affiliations: Imperial Coll London, Hamlyn Ctr Robot Surg, London, England; Univ Oxford, Kennedy Inst Rheumatol, Oxford, England; Univ Oxford, Big Data Inst, Oxford, England

ISBN: (digital) 9781665487399

ISBN: (print) 9781665487399

Depth estimation from a stereo image pair has become one of the most explored applications in computer vision, with most previous methods relying on fully supervised learning settings. However, due to the difficulty in acquiring accurate and scalable ground truth data, the training of fully supervised methods is challenging. As an alternative, self-supervised methods are becoming more popular to mitigate this challenge. In this paper, we introduce the H-Net, a deep-learning framework for unsupervised stereo depth estimation that leverages epipolar geometry to refine stereo matching. For the first time, a Siamese autoencoder architecture is used for depth estimation which allows mutual information between rectified stereo images to be extracted. To enforce the epipolar constraint, the mutual epipolar attention mechanism has been designed which gives more emphasis to correspondences of features that lie on the same epipolar line while learning mutual information between the input stereo pair. Stereo correspondences are further enhanced by incorporating semantic information to the proposed attention mechanism. More specifically, the optimal transport algorithm is used to suppress attention and eliminate outliers in areas not visible in both cameras. Extensive experiments on KITTI2015 and Cityscapes show that the proposed modules are able to improve the performance of the unsupervised stereo depth estimation methods while closing the gap with the fully supervised approaches.

Keywords: Geometry; Training; Computer vision; Supervised learning; Semantics; Estimation; Computer architecture

Simple and Efficient Architectures for Semantic Segmentation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Mehta, Dushyant; Skliar, Andrii; Ben Yahia, Haitam; Borse, Shubhankar; Porikli, Fatih; Habibian, Amirhossein; Blankevoort, Tijmen

Affiliations: Qualcomm AI Res, San Diego, CA 92121, USA

ISBN: (digital) 9781665487399

ISBN: (print) 9781665487399

Though state-of-the-art architectures for semantic segmentation, such as HRNet, demonstrate impressive accuracy, the complexity arising from their salient design choices hinders a range of model acceleration tools, and they further make use of operations that are inefficient on current hardware. This paper demonstrates that a simple encoder-decoder architecture with a ResNet-like backbone and a small multi-scale head performs on par with or better than complex semantic segmentation architectures such as HRNet, FANet and DDRNets. Naively applying deep backbones designed for Image Classification to the task of Semantic Segmentation leads to sub-par results, owing to a much smaller effective receptive field of these backbones. Implicit among the various design choices put forth in works like HRNet, DDRNet, and FANet are networks with a large effective receptive field. It is natural to ask if a simple encoder-decoder architecture would compare favorably if composed of backbones that have a larger effective receptive field, though without the use of inefficient operations like dilated convolutions. We show that with minor and inexpensive modifications to ResNets, enlarging the receptive field, very simple and competitive baselines can be created for Semantic Segmentation. We present a family of such simple architectures for desktop as well as mobile targets, which match or exceed the performance of complex models on the Cityscapes dataset. We hope that our work provides simple yet effective baselines for practitioners to develop efficient semantic segmentation models. The model definitions and pre-trained weights are available at https://***/Qualcomm-AI-research/FFNet.

Keywords: Image segmentation; Head; Semantics; Graphics processing units; Computer architecture; Hardware; Distance measurement

Beyond AUROC & co. for evaluating out-of-distribution detection performance

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023

Authors: Humblot-Renaux, Galadrielle; Escalera, Sergio; Moeslund, Thomas B.

Affiliations: Aalborg University, Visual Analysis and Perception Lab, Denmark; Universitat Autònoma de Barcelona, Computer Vision Center, Spain; Universitat de Barcelona, Dept. of Mathematics and Informatics, Spain

ISBN: (print) 9798350302493

While there has been a growing research interest in developing out-of-distribution (OOD) detection methods, there has been comparatively little discussion around how these methods should be evaluated. Given their relevance for safe(r) AI, it is important to examine whether the basis for comparing OOD detection methods is consistent with practical needs. In this work, we take a closer look at the go-to metrics for evaluating OOD detection, and question the approach of exclusively reducing OOD detection to a binary classification task with little consideration for the detection threshold. We illustrate the limitations of current metrics (AUROC & its friends) and propose a new metric - Area Under the Threshold Curve (AUTC), which explicitly penalizes poor separation between ID and OOD samples. Scripts and data are available at https://***/glhr/beyond-auroc © 2023 IEEE.


Making the V in Text-VQA Matter

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023

Authors: Hegde, Shamanthak; Jahagirdar, Soumya; Gangisetty, Shankar

Affiliations: KLE Technological University, Hubballi, India; CVIT, IIIT Hyderabad, Hyderabad, India; IIIT Hyderabad, Hyderabad, India

ISBN: (print) 9798350302493

Text-based VQA aims at answering questions by reading the text present in the images. It requires a large amount of scene-text relationship understanding compared to the VQA task. Recent studies have shown that the question-answer pairs in the dataset are more focused on the text present in the image, while less importance is given to visual features and some questions do not require understanding the image. The models trained on this dataset predict biased answers due to the lack of understanding of visual context. For example, in questions like "What is written on the signboard?", the answer predicted by the model is always "STOP", which makes the model ignore the image. To address these issues, we propose a method to learn visual features (making V matter in TextVQA) along with the OCR features and question features, using the VQA dataset as external knowledge for Text-based VQA. Specifically, we combine the TextVQA dataset and VQA dataset and train the model on this combined dataset. Such a simple, yet effective approach increases the understanding and correlation between the image features and text present in the image, which helps in the better answering of questions. We further test the model on different datasets and compare their qualitative and quantitative results. © 2023 IEEE.

Keywords: Computer vision

Masked Vision Transformers for Hyperspectral Image Classification

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023

Authors: Scheibenreif, Linus; Mommert, Michael; Borth, Damian

Affiliations: University of St. Gallen, AIML Lab, School of Computer Science, Switzerland

ISBN: (print) 9798350302493

Transformer architectures have become state-of-the-art models in computer vision and natural language processing. To a significant degree, their success can be attributed to self-supervised pre-training on large-scale unlabeled datasets. This work investigates the use of self-supervised masked image reconstruction to advance transformer models for hyperspectral remote sensing imagery. To facilitate self-supervised pre-training, we build a large dataset of unlabeled hyperspectral observations from the EnMAP satellite and systematically investigate modifications of the vision transformer architecture to optimally leverage the characteristics of hyperspectral data. We find significant improvements in accuracy on different land cover classification tasks over both standard vision and sequence transformers using (i) blockwise patch embeddings, (ii) spatial-spectral self-attention, (iii) spectral positional embeddings and (iv) masked self-supervised pre-training. The resulting model outperforms standard transformer architectures by +5% accuracy on a labeled subset of our EnMAP data and by +15% on the Houston2018 hyperspectral dataset, making it competitive with a strong 3D convolutional neural network baseline. In an ablation study on label-efficiency based on the Houston2018 dataset, self-supervised pre-training significantly improves transformer accuracy when little labeled training data is available. The self-supervised model outperforms randomly initialized transformers and the 3D convolutional neural network by +7-8% when only 0.1-10% of the training labels are available. © 2023 IEEE.

Keywords: Embeddings

Impact of Pseudo Depth on Open World Object Segmentation with Minimal User Guidance

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023

Authors: Schön, Robin; Ludwig, Katja; Lienhart, Rainer

Affiliations: University of Augsburg, Machine Learning and Computer Vision, Germany

ISBN: (print) 9798350302493

Pseudo depth maps are depth map predictions which are used as ground truth during training. In this paper we leverage pseudo depth maps in order to segment objects of classes that have never been seen during training. This renders our object segmentation task an open world task. The pseudo depth maps are generated using pretrained networks, which have either been trained with the full intention to generalize to downstream tasks (LeRes and MiDaS), or which have been trained in an unsupervised fashion on video sequences (MonodepthV2). In order to tell our network which object to segment, we provide the network with a single click on the object's surface on the pseudo depth map of the image as input. We test our approach on two different scenarios: one without the RGB image and one where the RGB image is part of the input. Our results demonstrate a considerably better generalization performance from seen to unseen object types when depth is used. On the Semantic Boundaries Dataset we achieve an improvement from 61.57 to 69.79 IoU score on unseen classes, when only using half of the training classes during training and performing the segmentation on depth maps only. © 2023 IEEE.

Keywords: Semantics
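
The IoU score quoted in the abstract above is the standard intersection-over-union between a predicted and a ground-truth segmentation mask (the figures 61.57 and 69.79 appear to be on a 0-100 scale). The following is a minimal sketch of that metric only, not of the authors' evaluation code; the function name `mask_iou`, the nested-list mask format, and the toy masks are assumptions made for illustration.

```python
def mask_iou(pred, target):
    """Intersection over union of two equally sized binary masks.

    Masks are nested lists of 0/1 values (an assumed format for this sketch).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Convention assumed here: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy example: 2 pixels overlap, 4 pixels are covered by at least one mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(mask_iou(pred, target))  # 2 / 4 = 0.5
```

Multiplying the returned value by 100 gives a score on the same scale as the numbers in the abstract.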


Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024

ISBN: (print) 9798350353006

The proceedings contain 2715 papers. The topics discussed include: revisiting adversarial training at scale; SPIDeRS: structured polarization for invisible depth and reflectance sensing; MA-LMM: memory-augmented large multimodal model for long-term video understanding; geometrically-driven aggregation for zero-shot 3D point cloud understanding; TextCraftor: your text encoder can be image quality controller; ViLa-MIL: dual-scale vision-language multiple instance learning for whole slide image classification; HumanNorm: learning normal diffusion model for high-quality and realistic 3D human generation; an empirical study of scaling law for scene text recognition; improving image restoration through removing degradations in textual representations; and steganographic passport: an owner and user verifiable credential for deep model IP protection without retraining.


Few-Shot Image Classification Benchmarks are Too Far From Reality: Build Back Better with Semantic Task Sampling

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Bennequin, Etienne; Tami, Myriam; Toubhans, Antoine; Hudelot, Celine

Affiliations: Univ Paris Saclay, Cent Supelec, Gif Sur Yvette, France; Sicara, Paris, France

ISBN: (print) 9781665487399

Every day, a new method is published to tackle Few-Shot Image Classification, showing better and better performances on academic benchmarks. Nevertheless, we observe that these current benchmarks do not accurately represent the real industrial use cases that we encountered. In this work, through both qualitative and quantitative studies, we expose that the widely used benchmark tieredImageNet is strongly biased towards tasks composed of very semantically dissimilar classes e.g. bathtub, cabbage, pizza, schipperke, and cardoon. This makes tieredImageNet (and similar benchmarks) irrelevant to evaluate the ability of a model to solve real-life use cases usually involving more fine-grained classification. We mitigate this bias using semantic information about the classes of tieredImageNet and generate an improved, balanced benchmark. Going further, we also introduce a new benchmark for Few-Shot Image Classification using the Danish Fungi 2020 dataset. This benchmark proposes a wide variety of evaluation tasks with various fine-graininess. Moreover, this benchmark includes many-way tasks (e.g. composed of 100 classes), which is a challenging setting yet very common in industrial applications. Our experiments bring out the correlation between the difficulty of a task and the semantic similarity between its classes, as well as a heavy performance drop of state-of-the-art methods on many-way few-shot classification, raising questions about the scaling abilities of these methods. We hope that our work will encourage the community to further question the quality of standard evaluation processes and their relevance to real-life applications.

Keywords: Fungi; Training; Limiting; Shape; Semantics; Benchmark testing; Pattern recognition

Deep Normalized Cross-Modal Hashing with Bi-Direction Relation Reasoning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Sun, Changchang; Latapie, Hugo; Liu, Gaowen; Yan, Yan

Affiliations: IIT, Dept Comp Sci, Chicago, IL 60616, USA; Cisco Res, Emerging Technol & Incubat, San Jose, CA, USA

ISBN: (digital) 9781665487399

ISBN: (print) 9781665487399

Due to the continuous growth of large-scale multi-modal data and increasing requirements for retrieval speed, deep cross-modal hashing has gained increasing attention recently. Most existing studies take a similarity matrix as supervision to optimize their models, and the inner product between continuous surrogates of hash codes is utilized to depict the similarity in the Hamming space. However, all of them merely consider the relevant information to build the similarity matrix, ignoring the contribution of the irrelevant one, i.e., the categories that samples do not belong to. Therefore, they cannot effectively alleviate the effect of dissimilar samples. Moreover, due to the modality distribution difference, directly utilizing continuous surrogates of hash codes to calculate similarity may induce suboptimal retrieval performance. To tackle these issues, in this paper, we propose a novel deep normalized cross-modal hashing scheme with bi-direction relation reasoning, named Bi NCMH. Specifically, we build the multi-level semantic similarity matrix by considering bi-direction relation, i.e., consistent and inconsistent relation. It can hence holistically characterize relations among instances. Besides, we execute feature normalization on continuous surrogates of hash codes to eliminate the deviation caused by the modality gap, which further reduces the negative impact of binarization on retrieval performance. Extensive experiments on two cross-modal benchmark datasets demonstrate the superiority of our model over several state-of-the-art baselines.

Keywords: Computer vision; Codes; Conferences; Computational modeling; Semantics; Bidirectional control; Benchmark testing
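
The abstract above uses the inner product of hash-code surrogates as a stand-in for similarity in Hamming space. For exact binary codes in {-1, +1}^K this rests on a well-known identity: Hamming distance = (K - inner product) / 2. The sketch below checks that identity on toy codes and shows how normalizing continuous surrogates removes scale differences before the inner product is taken. It is an illustration only, not the Bi NCMH implementation; all function names and vectors are hypothetical, and L2 normalization is an assumption, since the abstract does not say which normalization is used.

```python
import math


def hamming_distance(a, b):
    """Number of positions at which two codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Scale a vector to unit length (assumed normalization for this sketch)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


# Binary codes in {-1, +1}^K: Hamming distance equals (K - <b1, b2>) / 2.
b1 = [1, -1, 1, 1]
b2 = [1, 1, -1, 1]
K = len(b1)
print(hamming_distance(b1, b2))          # 2
print((K - inner_product(b1, b2)) // 2)  # 2

# Continuous surrogates pointing in the same direction but with different
# scales, e.g. as two modalities might produce (toy values).
u = [0.9, -0.2, 0.4, 0.7]
v = [1.8, -0.4, 0.8, 1.4]
raw_similarity = inner_product(u, v)
cosine_similarity = inner_product(l2_normalize(u), l2_normalize(v))
print(raw_similarity, cosine_similarity)
```

After normalization the inner product is a cosine similarity bounded by 1 in magnitude, so it depends only on direction and not on the magnitude of each modality's outputs.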
