ISBN (Digital): 9798350365474
ISBN (Print): 9798350365481
Federated Class-Incremental Learning (FCIL) focuses on continually transferring previously learned knowledge to new classes in dynamic Federated Learning (FL). However, existing methods do not consider the trustworthiness of FCIL, i.e., improving continual utility, privacy, and efficiency simultaneously, which is strongly affected by catastrophic forgetting and data heterogeneity among clients. To address this issue, we propose FedProK (Federated Prototypical Feature Knowledge Transfer), which leverages prototypical features as a novel representation of knowledge to perform spatial-temporal knowledge transfer. Specifically, FedProK consists of two components: (1) a feature translation procedure on the client side that performs temporal knowledge transfer from the learned classes, and (2) prototypical knowledge fusion on the server side that performs spatial knowledge transfer among clients. Extensive experiments conducted in both synchronous and asynchronous settings demonstrate that FedProK outperforms other state-of-the-art methods from all three perspectives of trustworthiness, validating its effectiveness in selectively transferring spatial-temporal knowledge.
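As a concrete illustration of prototypical feature knowledge, the sketch below (assumed, not the authors' code; the function names are hypothetical) computes a mean feature vector per class on each client and fuses the per-class prototypes on the server by sample-count weighting.

```python
# Illustrative sketch: class prototypes as mean feature vectors per class on each
# client, fused on the server by sample-count weighting.
import torch

def compute_prototypes(features: torch.Tensor, labels: torch.Tensor) -> dict:
    """Mean feature vector per class observed on one client."""
    protos = {}
    for c in labels.unique().tolist():
        protos[c] = features[labels == c].mean(dim=0)
    return protos

def fuse_prototypes(client_protos: list, client_counts: list) -> dict:
    """Server-side fusion: weighted average of each class prototype across clients."""
    fused, weights = {}, {}
    for protos, counts in zip(client_protos, client_counts):
        for c, p in protos.items():
            w = counts[c]                      # number of samples of class c on this client
            fused[c] = fused.get(c, 0) + w * p
            weights[c] = weights.get(c, 0) + w
    return {c: fused[c] / weights[c] for c in fused}
```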
ISBN (Digital): 9798350365474
ISBN (Print): 9798350365481
Ambiguity is ubiquitous in human communication. Previous approaches in Human-Robot Interaction (HRI) have often relied on predefined interaction templates, leading to reduced performance in realistic and open-ended scenarios. To address these issues, we present a large-scale dataset, InViG, for interactive visual grounding under language ambiguity. Our dataset comprises over 520K images accompanied by open-ended, goal-oriented disambiguation dialogues, encompassing millions of object instances and corresponding question-answer pairs. Leveraging the InViG dataset, we conduct extensive studies and propose a set of baseline solutions for end-to-end interactive visual disambiguation and grounding, achieving a 45.6% success rate during validation. To the best of our knowledge, InViG is the first large-scale dataset for open-ended interactive visual grounding, presenting a practical yet highly challenging benchmark for ambiguity-aware HRI. Code and datasets are available at: https://***.
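To make the interactive disambiguation setting concrete, here is a hedged sketch of a possible control loop (the grounder, questioner, and answerer callables and the thresholds are illustrative assumptions, not the InViG baselines): the system keeps asking clarification questions until it is confident about a single target object or a turn budget is exhausted.

```python
# Hypothetical ask-until-confident loop for interactive visual disambiguation.
def interactive_grounding(image, instruction, grounder, questioner, answerer,
                          conf_threshold=0.8, max_turns=5):
    dialogue = [instruction]
    for _ in range(max_turns):
        box, confidence = grounder(image, dialogue)   # best candidate box + score
        if confidence >= conf_threshold:
            return box, dialogue                      # unambiguous enough: stop early
        question = questioner(image, dialogue)        # goal-oriented clarification question
        answer = answerer(image, question)            # human (or simulated) reply
        dialogue += [question, answer]
    return grounder(image, dialogue)[0], dialogue     # best guess after the turn budget
```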
ISBN (Digital): 9798350365474
ISBN (Print): 9798350365481
This paper presents a method to optimize Gaussian splatting with a limited number of images while avoiding overfitting. Representing a 3D scene by combining numerous Gaussian splats has yielded outstanding visual quality. However, it tends to overfit the training views when only a few images are available. To address this issue, we employ an adjusted depth map as a geometric reference, derived from a pre-trained monocular depth estimation model and subsequently aligned with the sparse structure-from-motion points. We regularize the optimization of 3D Gaussian splatting with the adjusted depth and an additional unsupervised smoothness constraint, thereby effectively reducing floating artifacts. Our method is mainly validated on the NeRF-LLFF dataset with varying numbers of images; we run multiple experiments with randomly selected training images and report the average to ensure fairness. Our approach produces more robust geometry than the original method, which relies solely on images.
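A minimal sketch of the two ingredients described above, under assumed shapes and names (not the paper's code): a per-image scale and shift that aligns the monocular depth map to the sparse SfM depths, and a depth regularization term combined with a simple smoothness penalty.

```python
# Align a monocular depth map to sparse SfM depths, then use it as a regularizer.
import torch
import torch.nn.functional as F

def align_depth(mono_depth, sfm_depth, sfm_mask):
    """Least-squares fit of scale a and shift b so that a*mono + b matches the SfM points."""
    d = mono_depth[sfm_mask]
    z = sfm_depth[sfm_mask]
    A = torch.stack([d, torch.ones_like(d)], dim=1)       # (N, 2)
    sol = torch.linalg.lstsq(A, z.unsqueeze(1)).solution  # (2, 1)
    a, b = sol[0, 0], sol[1, 0]
    return a * mono_depth + b

def depth_losses(rendered_depth, adjusted_depth, w_smooth=0.1):
    """L1 depth regularization plus a simple unsupervised smoothness term."""
    reg = F.l1_loss(rendered_depth, adjusted_depth)
    smooth = (rendered_depth[:, 1:] - rendered_depth[:, :-1]).abs().mean() + \
             (rendered_depth[1:, :] - rendered_depth[:-1, :]).abs().mean()
    return reg + w_smooth * smooth
```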
ISBN (Digital): 9798350365474
ISBN (Print): 9798350365481
Segment Anything Models (SAM) have made significant advancements in image segmentation, allowing users to segment target portions of an image with a single click (i.e., a user prompt). Given its broad applications, the robustness of SAM against adversarial attacks is a critical concern. While recent works have explored adversarial attacks against a pre-defined prompt/click, their threat model is not yet realistic: (1) they often assume the user-click position is known to the attacker (point-based attack), and (2) they often operate under a white-box setting with limited transferability. In this paper, we propose a more practical region-level attack in which the attacker does not need to know the precise user prompt. The attack remains effective no matter where the user clicks on the target object in the image, hiding the object from SAM. In addition, by adapting a spectrum transformation method, we make the attack more transferable under a black-box setting. Both controlled experiments and tests against real-world SAM services confirm its effectiveness.
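The region-level objective can be pictured with the hedged PGD-style sketch below (the sam_predict callable, step sizes, and loss are illustrative assumptions, not the paper's implementation): at every step a click is sampled anywhere inside the target region, and the perturbation is updated to suppress the mask predicted for that click, so the object stays hidden regardless of where the user actually clicks.

```python
# Region-level attack sketch: random click points inside the target region.
import torch

def region_attack(image, region_mask, sam_predict, eps=8/255, alpha=1/255, steps=100):
    delta = torch.zeros_like(image, requires_grad=True)
    ys, xs = torch.nonzero(region_mask, as_tuple=True)       # pixels of the target object
    for _ in range(steps):
        i = torch.randint(len(xs), (1,)).item()
        point = (xs[i].item(), ys[i].item())                  # random click in the region
        mask_logits = sam_predict(image + delta, point)       # assumed (H, W) mask logits
        loss = mask_logits[region_mask].mean()                # push logits down inside region
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
            delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()
```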
ISBN (Digital): 9798350365474
ISBN (Print): 9798350365481
With recent advances in image and video diffusion models for content creation, a plethora of techniques have been proposed for customizing their generated content. In particular, manipulating the cross-attention layers of Text-to-Image (T2I) diffusion models has shown great promise in controlling the shape and location of objects in the scene. Transferring image-editing techniques to the video domain, however, is extremely challenging as object motion and temporal consistency are difficult to capture accurately. In this work, we take a first look at the role of cross-attention in Text-to-Video (T2V) diffusion models for zero-shot video editing. While one-shot models have shown potential in controlling motion and camera movement, we demonstrate zero-shot control over object shape, position and movement in T2V models. We show that despite the limitations of current T2V models, cross-attention guidance can be a promising approach for editing videos. Code: https://***/sam-motamed/***
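As a hedged sketch of cross-attention guidance (the attention shapes, token index, and spatial mask are assumptions, not the paper's method), the attention a chosen text token receives can be amplified inside a target region and suppressed elsewhere, steering where the corresponding object appears across frames.

```python
# Re-weight the cross-attention a target text token receives within a spatial mask.
import torch

def guide_cross_attention(attn, token_idx, spatial_mask, boost=2.0, suppress=0.1):
    """attn: (frames, h*w, tokens) attention probabilities; spatial_mask: (h*w,) in [0, 1]."""
    attn = attn.clone()
    scale = suppress + (boost - suppress) * spatial_mask      # high inside the mask, low outside
    attn[..., token_idx] = attn[..., token_idx] * scale
    return attn / attn.sum(dim=-1, keepdim=True)              # renormalize over text tokens
```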
ISBN (Digital): 9798350365474
ISBN (Print): 9798350365481
Traditional crop disease diagnosis, reliant on expert visual observation, is expensive, time-consuming, and prone to error. While Convolutional Neural Networks (CNNs) offer promising alternatives, their high resource demands limit their accessibility to farmers, particularly those in resource-constrained settings. Lightweight models that operate on resource-limited devices without network access are crucial to closing this gap. Quantization offers a promising approach for building such lightweight CNNs for crop disease detection, but the quality of quantized models often suffers. This paper proposes a Similarity-Preserving Quantization (SPQ) method that converts high-precision CNNs into lower-precision models while maintaining similar feature representations: SPQ ensures that similar crop images produce equivalent activation patterns in both the original and quantized models. Experimental evaluation using MobileNetV2 and ResNet-50 demonstrates that SPQ improves throughput, inference speed, and memory footprint by more than 3x while preserving detection performance.
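A minimal sketch of a similarity-preserving objective of this kind (assumed, not the paper's exact formulation): pairwise activation similarities within a batch from the quantized model are pushed to match those from the full-precision model.

```python
# Similarity-preserving loss between full-precision and quantized activations.
import torch
import torch.nn.functional as F

def similarity_matrix(feats: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between all pairs of samples in the batch."""
    f = F.normalize(feats.flatten(1), dim=1)   # (B, D)
    return f @ f.t()                           # (B, B)

def sp_loss(fp_feats: torch.Tensor, q_feats: torch.Tensor) -> torch.Tensor:
    """Match the quantized model's pairwise similarities to the full-precision ones."""
    return F.mse_loss(similarity_matrix(q_feats), similarity_matrix(fp_feats))
```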
ISBN (Digital): 9798350365474
ISBN (Print): 9798350365481
Recently, weakly supervised video anomaly detection (WS-VAD) has emerged as a contemporary research direction for identifying anomalous events, such as violence and nudity, in videos using only video-level labels. However, this task poses substantial challenges, including handling imbalanced modality information and consistently distinguishing normal from abnormal features. In this paper, we address these challenges and propose a multi-modal WS-VAD framework to accurately detect anomalies such as violence and nudity. Within the proposed framework, we introduce a new fusion mechanism, the Cross-modal Fusion Adapter (CFA), which dynamically selects and enhances audio-visual features that are highly relevant to the visual modality. Additionally, we introduce Hyperbolic Lorentzian Graph Attention (HLGAtt) to effectively capture the hierarchical relationships between normal and abnormal representations, thereby improving feature separation. Through extensive experiments, we demonstrate that the proposed model achieves state-of-the-art results on benchmark datasets for violence and nudity detection.
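To illustrate the idea of modality-aware fusion in isolation (a hedged sketch with assumed dimensions and names, not the paper's CFA), audio features can be gated by their learned relevance to the visual features of the same segment before the two streams are fused.

```python
# Gate audio features by their relevance to the visual stream, then fuse.
import torch
import torch.nn as nn

class CrossModalGate(nn.Module):
    def __init__(self, dim_v, dim_a, dim):
        super().__init__()
        self.proj_v = nn.Linear(dim_v, dim)
        self.proj_a = nn.Linear(dim_a, dim)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, v, a):                      # v: (T, dim_v), a: (T, dim_a) per-segment features
        v, a = self.proj_v(v), self.proj_a(a)
        g = self.gate(torch.cat([v, a], dim=-1))  # per-segment relevance of the audio stream
        return v + g * a                          # fused audio-visual representation
```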
ISBN (Digital): 9798350365474
ISBN (Print): 9798350365481
Audio-visual zero-shot learning methods commonly build on features extracted from pre-trained models, e.g. video or audio classification models. However, existing benchmarks predate the popularization of large multi-modal models, such as CLIP and CLAP. In this work, we explore such large pre-trained models to obtain features, i.e. CLIP for visual features and CLAP for audio features. Furthermore, the CLIP and CLAP text encoders provide class label embeddings which are combined to boost the performance of the system. We propose a simple yet effective model that only relies on feed-forward neural networks, exploiting the strong generalization capabilities of the new audio, visual and textual features. Our framework achieves state-of-the-art performance on VGGSound-GZSL^cls, UCF-GZSL^cls, and ActivityNet-GZSL^cls with our new features. Code and data available at: https://***/dkurzend/ClipClap-GZSL.
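Assuming precomputed CLIP visual, CLAP audio, and combined text-encoder class embeddings (a sketch of the general setup, not the authors' architecture), a purely feed-forward head could look like this: the concatenated audio-visual features are mapped into the class-embedding space and scored against every class by cosine similarity.

```python
# Feed-forward head over precomputed CLIP/CLAP features and class label embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AVZeroShotHead(nn.Module):
    def __init__(self, dim_v, dim_a, dim_cls, hidden=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim_v + dim_a, hidden), nn.ReLU(),
            nn.Linear(hidden, dim_cls),
        )

    def forward(self, clip_feat, clap_feat, class_embeds):
        x = self.mlp(torch.cat([clip_feat, clap_feat], dim=-1))  # (B, dim_cls)
        x = F.normalize(x, dim=-1)
        c = F.normalize(class_embeds, dim=-1)                    # (C, dim_cls) class embeddings
        return x @ c.t()                                         # (B, C) similarity logits
```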
ISBN (Digital): 9798350365474
ISBN (Print): 9798350365481
Video editing methods based on diffusion models that rely solely on a text prompt for the edit are hindered by the limited expressive power of text prompts. Incorporating a reference target image as a visual guide therefore becomes desirable for precise control over the edit. Moreover, most existing methods struggle to accurately edit a video when the shape and size of the object in the target image differ from those of the source object. To address these challenges, we propose "GenVideo", which edits videos by leveraging target-image-aware T2I models. Our approach handles edits with target objects of varying shapes and sizes while maintaining the temporal consistency of the edit using our novel target- and shape-aware InvEdit masks. Furthermore, we propose a novel target-image-aware latent noise correction strategy during inference to improve the temporal consistency of the edits. Experimental analyses indicate that GenVideo can effectively handle edits with objects of varying shapes where existing approaches fail.
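One way to picture a mask-guided latent correction of this flavor (a hedged sketch under assumed tensor shapes, not the GenVideo implementation) is to keep the source video's latents outside the edit mask at each denoising step, so unedited regions and their temporal behavior are preserved.

```python
# Blend edited and source latents using an edit mask at a given denoising step.
import torch

def correct_latents(edited_latents, source_latents, edit_mask):
    """edited/source latents: (frames, C, H, W); edit_mask: (frames, 1, H, W) in [0, 1]."""
    return edit_mask * edited_latents + (1.0 - edit_mask) * source_latents
```

In such a scheme, the correction would be applied after every denoising step so that only the masked region evolves toward the target while the rest of the video follows the source latents.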
ISBN (Digital): 9798350365474
ISBN (Print): 9798350365481
We study a limited-label problem and present a novel approach to Single-Positive Multi-label Learning. In the multi-label learning setting, a model learns to predict multiple labels or categories for a single input image. This contrasts with standard multi-class image classification, where the task is to predict a single label from many possible labels for an image. Single-Positive Multi-label Learning specifically considers learning to predict multiple labels when there is only one annotation per image in the training data. Multi-label learning is a more natural task than single-label learning because real-world data often involves instances belonging to multiple categories simultaneously; however, most computer vision datasets contain single labels due to the inherent complexity and cost of collecting multiple high-quality annotations per image. We propose a novel approach called Vision-Language Pseudo-Labeling (VLPL), which uses a vision-language model, CLIP, to suggest strong positive and negative pseudo-labels. Experimental results demonstrate the effectiveness of the proposed approach. Our code and data will be made publicly available at https://***/mvrl/VLPL.
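A hedged sketch of how such pseudo-labels could be harvested and used (thresholds and names are assumptions, not the paper's exact recipe): CLIP image-label similarities select strong positive and negative pseudo-labels, and entries in between are ignored when computing the multi-label loss.

```python
# Pseudo-label selection from CLIP similarities and a masked multi-label loss.
import torch
import torch.nn.functional as F

def pseudo_label_loss(logits, image_embeds, label_embeds, observed_pos,
                      pos_thresh=0.3, neg_thresh=0.1):
    """logits: (B, C) classifier outputs; embeds are L2-normalized CLIP features;
    observed_pos: (B, C) with a single 1 per row marking the known positive label."""
    sim = image_embeds @ label_embeds.t()              # (B, C) cosine similarities
    pos = (sim > pos_thresh) | observed_pos.bool()     # strong positives + known positive
    neg = sim < neg_thresh                             # strong negatives
    target = pos.float()
    keep = (pos | neg).float()                         # ignore uncertain labels
    loss = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    return (loss * keep).sum() / keep.sum().clamp(min=1.0)
```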