Search Results - Inner Mongolia University Library

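Help

Field codes:

T=Title (book title, article title), A=Author (responsible party), K=Subject term, P=Publication name, PU=Publisher name, O=Organization (author affiliation, degree-granting institution, patent applicant), L=Chinese Library Classification number, C=Subject classification number, U=All fields, Y=Year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016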
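
The rules above can be illustrated with a small Python sketch that assembles an expression string in this syntax. The helper names (`term`, `join`, `group`) are hypothetical and are not part of the library's system; the sketch only builds the text of a query such as Example 1 and does not send it anywhere.

```python
# Field codes and operators copied from the help text; everything else
# (function names, error messages) is an illustrative assumption.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value term, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def join(operator: str, *parts: str) -> str:
    """Join parts with an uppercase operator and one space on each side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    return f" {operator} ".join(parts)


def group(expr: str) -> str:
    """Wrap a sub-expression in parentheses."""
    return f"({expr})"


# Rebuild Example 1 from the help text.
query = join(
    "AND",
    group(join("OR", term("K", "图书馆学"), term("K", "情报学"))),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Because `join` accepts only the three uppercase operators, a lowercase `and` is rejected before it can reach the search box, which mirrors the note that operators must be uppercase and surrounded by spaces.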

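Refine search results

Document type

4,477 conference papers
9 journal articles
5 books

Collection scope

4,491 electronic documents
0 print holdings

Subject classification

2,329 papers: Engineering
- 1,912 papers: Computer Science and Technology...
- 541 papers: Software Engineering
- 417 papers: Mechanical Engineering
- 327 papers: Optical Engineering
- 269 papers: Control Science and Engineering
- 216 papers: Instrument Science and Technology
- 117 papers: Information and Communication Engineering
- 99 papers: Electrical Engineering
- 79 papers: Bioengineering
- 50 papers: Biomedical Engineering (may confer...
- 34 papers: Electronic Science and Technology (may...
- 25 papers: Safety Science and Engineering
- 21 papers: Chemical Engineering and Technology
- 16 papers: Architecture
- 15 papers: Transportation Engineering
- 14 papers: Civil Engineering
489 papers: Science
- 327 papers: Physics
- 194 papers: Mathematics
- 83 papers: Biology
- 79 papers: Statistics (may confer Science, ...
- 23 papers: Systems Science
- 18 papers: Chemistry
206 papers: Arts
- 206 papers: Design (may confer Arts...
67 papers: Management
- 48 papers: Library, Information and Archives Manag...
- 19 papers: Management Science and Engineering (may...
- 10 papers: Business Administration
45 papers: Medicine
- 45 papers: Clinical Medicine
- 13 papers: Basic Medicine (may confer Medicine...
- 11 papers: Pharmacy (may confer Medicine, Sci...
20 papers: Law
- 18 papers: Sociology
7 papers: Agriculture
4 papers: Education
1 paper: Economics
1 paper: Literature
1 paper: Military Science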

Subject

1,834 papers: computer vision
890 papers: conferences
696 papers: pattern recognit...
656 papers: training
472 papers: cameras
381 papers: feature extracti...
375 papers: computational mo...
341 papers: visualization
314 papers: computer archite...
285 papers: image segmentati...
259 papers: face recognition
231 papers: object detection
230 papers: robustness
208 papers: shape
193 papers: three-dimensiona...
184 papers: humans
176 papers: neural networks
169 papers: semantics
166 papers: computer science
157 papers: benchmark testin...

Institution

21 papers: swiss fed inst t...
19 papers: swiss fed inst t...
18 papers: university of sc...
17 papers: univ sci & techn...
17 papers: carnegie mellon ...
15 papers: institute for co...
14 papers: tsinghua univers...
13 papers: computer vision ...
13 papers: tsinghua univ pe...
13 papers: stanford univ st...
12 papers: harbin inst tech...
12 papers: mit cambridge ma...
12 papers: sun yat sen univ...
12 papers: carnegie mellon ...
11 papers: chinese univ hon...
11 papers: megvii technol p...
11 papers: chinese acad sci...
10 papers: comp vis ctr bar...
10 papers: univ modena & re...
10 papers: beihang univ peo...

Author

57 papers: timofte radu
20 papers: luc van gool
20 papers: radu timofte
17 papers: horst bischof
16 papers: van gool luc
15 papers: sergio escalera
12 papers: zhigang zhu
12 papers: li stan z.
12 papers: chen wei-ting
12 papers: bischof horst
12 papers: lei lei
11 papers: fan haoqiang
11 papers: sun jian
11 papers: marcos v. conde
11 papers: lei zhen
10 papers: escalera sergio
10 papers: cucchiara rita
10 papers: zhang lei
10 papers: angel d. sappa
10 papers: liu shuaicheng

Language

4,486 papers: English
4 papers: Chinese
1 paper: Other

Search query: "Any field=2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2013"

4,491 records in total; records 91-100 are shown below.

Beyond the Screen: Evaluating Deepfake Detectors under Moire Pattern Effects

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Tariq, Razaib; Heo, Minji; Woo, Simon S.; Tariq, Shahroz

Affiliations: Sungkyunkwan Univ Seoul South Korea; CSIROs Data61 Eveleigh Australia

ISBN: (print) 9798350365474

The detection of deepfakes is crucial for mitigating the societal impact of falsified video content. Despite the development of various algorithms for this purpose, challenges arise for detectors in real-world scenarios, especially when users capture deepfake content from screens and upload it online or when detectors operate on external devices like smartphones, requiring the capture of potential deepfakes through the camera for evaluation. A significant challenge in these scenarios is the presence of Moire patterns, which degrade image quality and complicate conventional classification methods, notably deep neural networks (DNNs). However, the impact of Moire patterns on the effectiveness of deepfake detection systems has not been adequately explored. This study aims to investigate how capturing deepfake videos via digital screen cameras affects the accuracy of detection mechanisms. We introduced the Moire patterns by capturing the display of a monitor using a smartphone camera and conducted empirical evaluations using four widely recognized datasets: CelebDF, DFD, DFDC, and FF++. We compare the performance of twelve SOTA detectors on deepfake videos captured under the influence of Moire patterns. Our findings reveal a performance decrease of up to 33.1 and 31.3 percentage points for image- and video-based detectors. Therefore, highlighting the challenges posed by Moire patterns and other naturally induced artifacts is critical for improving the effectiveness of real-world deepfake detection efforts. To facilitate further research, we will release the Moire pattern impact version of CelebDF, DFD, DFDC, and FF++ datasets with this paper. Our code is available here: https: //***/Razaib-Tariq/deepmoire

Keywords: Deepfake; Fake Media; Media Forensics; Moire Pattern

SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wang, Haoxiang; Vasu, Pavan Kumar Anasosalu; Faghri, Fartash; Vemulapalli, Raviteja; Farajtabar, Mehrdad; Mehta, Sachin; Rastegari, Mohammad; Tuzel, Oncel; Pouransari, Hadi

Affiliations: Apple Cupertino CA 95014 USA; Univ Illinois Urbana IL 61801 USA

ISBN: (print) 9798350365474

The landscape of publicly available vision foundation models (VFMs), such as CLIP and Segment Anything Model (SAM), is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their pre-training objectives. For instance, CLIP excels in semantic understanding, while SAM specializes in spatial understanding for segmentation. In this work, we introduce a simple recipe to efficiently merge VFMs into a unified model that absorbs their expertise. Our method integrates techniques of multi-task learning, continual learning, and distillation. Further, it demands significantly less computational cost compared to traditional multi-task training from scratch, and it only needs a small fraction of the pre-training datasets that were initially used to train individual models. By applying our method to SAM and CLIP, we obtain SAM-CLIP: a unified model that combines the capabilities of SAM and CLIP into a single vision transformer. Compared with deploying SAM and CLIP independently, our merged model, SAM-CLIP, reduces storage and compute costs for inference, making it well-suited for edge device applications. We show that SAM-CLIP not only retains the foundational strengths of SAM and CLIP, but also introduces synergistic functionalities, notably in zero-shot semantic segmentation, where SAM-CLIP establishes new state-of-the-art results on 5 benchmarks. It outperforms previous models that are specifically designed for this task by a large margin, including +6.8% and +5.9% mean IoU improvement on Pascal-VOC and COCO-Stuff datasets, respectively.

Keywords: CLIP; Foundation Model; Model Merging; Segmentation

Event-Based Eye Tracking. AIS 2024 Challenge Survey

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wang, Zuowen; Gao, Chang; Wu, Zongwei; Conde, Marcos V.; Timofte, Radu; Liu, Shih-Chii; Chen, Qinyu; Zha, Zheng-jun; Zhai, Wei; Han, Han; Liao, Bohao; Wu, Yuliang; Wan, Zengyu; Wang, Zhong; Cao, Yang; Tan, Ganchao; Chen, Jinze; Pei, Yan Ru; Bruers, Sasskia; Crouzet, Sebastien; McLelland, Douglas; Coenen, Oliver; Zhang, Baoheng; Gao, Yizhao; Li, Jingyuan; So, Hayden Kwok-Hay; Bich, Philippe; Boretti, Chiara; Prono, Luciano; Lica, Mircea; Dinucu-Jianu, David; Griu, Catalin; Lin, Xiaopeng; Ren, Hongwei; Cheng, Bojun; Zhang, Xinan; Vial, Valentin; Yezzi, Anthony; Tsai, James

Affiliations: Univ Zurich Inst Neuroinformat Zurich Switzerland; Swiss Fed Inst Technol Zurich Switzerland; Delft Univ Technol Delft Netherlands; Univ Wurzburg Wurzburg Germany; Leiden Univ Leiden Netherlands; Univ Sci & Technol China Hefei Anhui Peoples R China; Brainchip Inc Laguna Hills CA USA; Univ Hong Kong Hong Kong Peoples R China; Politecn Torino Turin Italy; Hong Kong Univ Sci & Technol Guangzhou Guangzhou Guangdong Peoples R China; Georgia Inst Technol Atlanta GA 30332 USA

ISBN: (print) 9798350365474

This survey reviews the AIS 2024 Event-Based Eye Tracking (EET) Challenge. The task of the challenge focuses on processing eye movement recorded with event cameras and predicting the pupil center of the eye. The challenge emphasizes efficient eye tracking with event cameras to achieve good task accuracy and efficiency trade-off. During the challenge period, 38 participants registered for the Kaggle competition, and 8 teams submitted a challenge factsheet. The novel and diverse methods from the submitted factsheets are reviewed and analyzed in this survey to advance future event-based eye tracking research.

Keywords: computer vision; dynamic vision sensor; event camera; eye tracking

Vim4Path: Self-Supervised Vision Mamba for Histopathology Images

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Nasiri-Sarvi, Ali; Vincent Quoc-Huy Trinh; Rivaz, Hassan; Hosseini, Mahdi S.

Affiliations: Concordia Univ Dept Comp Sci & Software Engn CSSE Montreal PQ Canada; Univ Montreal Inst Res Immunol & Canc Montreal PQ Canada; Concordia Univ Dept Elect & Comp Engn ECE Montreal PQ Canada

ISBN: (print) 9798350365474

Representation learning from Gigapixel Whole Slide Images (WSI) poses a significant challenge in computational pathology due to the complicated nature of tissue structures and the scarcity of labeled data. Multi-instance learning methods have addressed this challenge, leveraging image patches to classify slides utilizing pretrained models using Self-Supervised Learning (SSL) approaches. The performance of both SSL and MIL methods relies on the architecture of the feature encoder. This paper proposes leveraging the Vision Mamba (Vim) architecture, inspired by state space models, within the DINO framework for representation learning in computational pathology. We evaluate the performance of Vim against Vision Transformers (ViT) on the Camelyon16 dataset for both patch-level and slide-level classification. Our findings highlight Vim's enhanced performance compared to ViT, particularly at smaller scales, where Vim achieves an 8.21 increase in ROC AUC for models of similar size. An explainability analysis further highlights Vim's capabilities, which reveals that Vim uniquely emulates the pathologist workflow, unlike ViT. This alignment with human expert analysis highlights Vim's potential in practical diagnostic settings and contributes significantly to developing effective representation-learning algorithms in computational pathology. We release the codes and pretrained weights at https: //***/AtlasAnalyticsLab/Vim4Path.

Keywords: Cancer Diagnosis; Computational Pathology; Self-Supervised Learning; Vision Mamba; Whole Slide Image

AAPL: Adding Attributes to Prompt Learning for Vision-Language Models

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Kim, Gahyeon; Kim, Sohee; Lee, Seokju

Affiliations: Korea Inst Energy Technol KENTECH Naju South Korea

ISBN: (print) 9798350365474

Recent advances in large pre-trained vision-language models have demonstrated remarkable performance on zero-shot downstream tasks. Building upon this, recent studies, such as CoOp and CoCoOp, have proposed the use of prompt learning, where context within a prompt is replaced with learnable vectors, leading to significant improvements over manually crafted prompts. However, the performance improvement for unseen classes is still marginal, and to tackle this problem, data augmentation has been frequently used in traditional zero-shot learning techniques. Through our experiments, we have identified important issues in CoOp and CoCoOp: the context learned through traditional image augmentation is biased toward seen classes, negatively impacting generalization to unseen classes. To address this problem, we propose adversarial token embedding to disentangle low-level visual augmentation features from high-level class information when inducing bias in learnable prompts. Through our novel mechanism called "Adding Attributes to Prompt Learning", AAPL, we guide the learnable context to effectively extract text features by focusing on high-level features for unseen classes. We have conducted experiments across 11 datasets, and overall, AAPL shows favorable performances compared to the existing methods in few-shot learning, zero-shot learning, cross-dataset, and domain generalization tasks.

Keywords: prompt learning; vision language model; VLMs

Co-designing a Sub-millisecond Latency Event-based Eye Tracking System with Submanifold Sparse CNN

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhang, Baoheng; Gao, Yizhao; Li, Jingyuan; So, Hayden Kwok-Hay

Affiliations: Univ Hong Kong Hong Kong Peoples R China

ISBN: (print) 9798350365474

Eye-tracking technology is integral to numerous consumer electronics applications, particularly in the realm of virtual and augmented reality (VR/AR). These applications demand solutions that excel in three crucial aspects: low-latency, low-power consumption, and precision. Yet, achieving optimal performance across all these fronts presents a formidable challenge, necessitating a balance between sophisticated algorithms and efficient backend hardware implementations. In this study, we tackle this challenge through a synergistic software/hardware co-design of the system with an event camera. Leveraging the inherent sparsity of event-based input data, we integrate a novel sparse FPGA dataflow accelerator customized for submanifold sparse convolution neural networks (SCNN). The SCNN implemented on the accelerator can efficiently extract the embedding feature vector from each representation of event slices by only processing the non-zero activations. Subsequently, these vectors undergo further processing by a gated recurrent unit (GRU) and a fully connected layer on the host CPU to generate the eye centers. Deployment and evaluation of our system reveal outstanding performance metrics. On the Event-based Eye-Tracking-AIS2024 dataset, our system achieves 81% p5 accuracy, 99.5% p10 accuracy, and 3.71 Mean Euclidean Distance with 0.7 ms latency while only consuming 2.29 mJ per inference. Notably, our solution opens up opportunities for future eye-tracking systems. Code is available at https://***/CASRHKU/ESDA/tree/eye_tracking.

Keywords: dynamic vision sensor; event camera; event-based vision; eye tracking; FPGA; hardware-software codesign; sparse processing

IrrNet: Advancing Irrigation Mapping with Incremental Patch Size Training on Remote Sensing Imagery

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Hoque, Oishee Bintey; Swarup, Samarth; Adiga, Abhijin; Nouwakpo, Sayjro Kossi; Marathe, Madhav

Affiliations: Univ Virginia Dept Comp Sci Charlottesville VA 22903 USA; Univ Virginia Biocomplex Inst Charlottesville VA 22903 USA; ARS USDA Kimberly ID USA

ISBN: (print) 9798350365474

Irrigation mapping plays a crucial role in effective water management, essential for preserving both water quality and quantity, and is key to mitigating the global issue of water scarcity. The complexity of agricultural fields, adorned with diverse irrigation practices, especially when multiple systems coexist in close quarters, poses a unique challenge. This complexity is further compounded by the nature of Landsat's remote sensing data, where each pixel is rich with densely packed information, complicating the task of accurate irrigation mapping. In this study, we introduce an innovative approach that employs a progressive training method, which strategically increases patch sizes throughout the training process, utilizing datasets from Landsat 5 and 7, labeled with the WRLU dataset for precise labeling. This initial focus allows the model to capture detailed features, progressively shifting to broader, more general features as the patch size enlarges. Remarkably, our method enhances the performance of existing state-of-the-art models by approximately 20%. Furthermore, our analysis delves into the significance of incorporating various spectral bands into the model, assessing their impact on performance. The findings reveal that additional bands are instrumental in enabling the model to discern finer details more effectively. This work sets a new standard for leveraging remote sensing imagery in irrigation mapping.

Keywords: computer vision; Irrigation Mapping; Patch Scaling; Remote Sensing; Segmentation; Transfer learning

SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Perera, Shehan; Navard, Pouyan; Yilmaz, Alper

Affiliations: Ohio State Univ Photogrammetr Comp Vis Lab Columbus OH 43210 USA

ISBN: (print) 9798350365474

The adoption of Vision Transformers (ViTs) based architectures represents a significant advancement in 3D Medical Image (MI) segmentation, surpassing traditional Convolutional Neural Network (CNN) models by enhancing global contextual understanding. While this paradigm shift has significantly enhanced 3D segmentation performance, state-of-the-art architectures require extremely large and complex architectures with large scale computing resources for training and deployment. Furthermore, in the context of limited datasets, often encountered in medical imaging, larger models can present hurdles in both model generalization and convergence. In response to these challenges and to demonstrate that lightweight models are a valuable area of research in 3D medical imaging, we present SegFormer3D, a hierarchical Transformer that calculates attention across multiscale volumetric features. Additionally, SegFormer3D avoids complex decoders and uses an all-MLP decoder to aggregate local and global attention features to produce highly accurate segmentation masks. The proposed memory efficient Transformer preserves the performance characteristics of a significantly larger model in a compact design. SegFormer3D democratizes deep learning for 3D medical image segmentation by offering a model with 33x fewer parameters and a 13x reduction in GFLOPS compared to the current state-of-the-art (SOTA). We benchmark SegFormer3D against the current SOTA models on three widely used datasets Synapse, BRaTs, and ACDC, achieving competitive results. Code: https://***/OSUPCVLab/***

Keywords: 3D Medical Image Segmentation; ACDC; Attention; BraTs; Deep Learning; Efficient Attention; Segmentation; Synapse; Transformers; Vision Transformers

Internal Diverse Image Completion

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Alkobi, Noa; Shaham, Tamar Rott; Michaeli, Tomer

Affiliations: Technion Haifa Israel MIT Technion Haifa Israel

ISBN: (print) 9798350302493

Image completion is widely used in photo restoration and editing applications, e.g. for object removal. Recently, there has been a surge of research on generating diverse completions for missing regions. However, existing methods require large training sets from a specific domain of interest, and often fail on general-content images. In this paper, we propose a diverse completion method that does not require a training set and can thus treat arbitrary images from any domain. Our internal diverse completion (IDC) approach draws inspiration from recent single-image generative models that are trained on multiple scales of a single image, adapting them to the extreme setting in which only a small portion of the image is available for training. We illustrate the strength of IDC on several datasets, using both user studies and quantitative comparisons.

Keywords: computer vision

Multi-Modal Fusion of Event and RGB for Monocular Depth Estimation Using a Unified Transformer-based Architecture

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Devulapally, Anusha; Khan, Md Fahim Faysal; Advani, Siddharth; Narayanan, Vijaykrishnan

Affiliations: Penn State Univ University Pk PA 16802 USA; Samsung Elect Amer Ridgefield Pk NJ USA

ISBN: (print) 9798350365474

In the field of robotics and autonomous navigation, accurate pixel-level depth estimation has gained significant importance. Event cameras, or dynamic vision sensors, capture asynchronous changes in brightness at the pixel level, offering benefits such as high temporal resolution, no motion blur, and a wide dynamic range. However, unlike traditional cameras that measure absolute intensity, event cameras lack the ability to provide scene context. Efficiently combining the advantages of both asynchronous events and synchronous RGB images to enhance depth estimation remains a challenge. In our study, we introduce a unified transformer that combines both event and RGB modalities to achieve precise depth prediction. In contrast to individual transformers for input modalities, a unified transformer model captures inter-modal dependencies and uses self-attention to enhance event-RGB contextual interactions. This approach exceeds the performance of recurrent neural network (RNN) methods used in state-of-the-art models. To encode the temporal information from events, convLSTMs are used before the transformer to improve depth estimation. Our proposed architecture outperforms the existing approaches in terms of absolute mean depth error, achieving state-of-the-art results in most cases. Additionally, the performance is also seen in other metrics like RMSE, absolute relative difference and depth thresholds compared to the existing approaches. The source code is available at: https://***/anusha-devulapally/ER-F2D.

Keywords: Event Cameras; Monocular Depth Estimation; Multi-Modal Fusion; Vision Transformer

No. 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021