ISBN: 9798350365474 (print)
Neural Radiance Fields (NeRFs) have emerged as a standard framework for representing 3D scenes and objects, introducing a novel data type for information exchange and storage. Concurrently, significant progress has been made in multimodal representation learning for text and image data. This paper explores a novel research direction that aims to connect the NeRF modality with other modalities, similar to established methodologies for images and text. To this end, we propose a simple framework that exploits pre-trained models for NeRF representations alongside multimodal models for text and image processing. Our framework learns a bidirectional mapping between NeRF embeddings and those obtained from corresponding images and text. This mapping unlocks several novel and useful applications, including NeRF zero-shot classification and NeRF retrieval from images or text.
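As a rough illustration of this kind of framework, the sketch below learns two small MLPs that map NeRF embeddings into a CLIP-like space and back, trained with a symmetric contrastive loss. It assumes precomputed embeddings from frozen pre-trained encoders; all module names, dimensions, and the random stand-in data are hypothetical, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BidirectionalAdapter(nn.Module):
    """Maps between a NeRF embedding space and a CLIP-like
    image/text embedding space (hypothetical dimensions)."""
    def __init__(self, nerf_dim=1024, clip_dim=768, hidden=2048):
        super().__init__()
        self.nerf_to_clip = nn.Sequential(
            nn.Linear(nerf_dim, hidden), nn.GELU(), nn.Linear(hidden, clip_dim))
        self.clip_to_nerf = nn.Sequential(
            nn.Linear(clip_dim, hidden), nn.GELU(), nn.Linear(hidden, nerf_dim))

def contrastive_loss(a, b, temperature=0.07):
    """Symmetric InfoNCE between two batches of paired embeddings."""
    a = F.normalize(a, dim=-1)
    b = F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0), device=a.device)
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy usage with random stand-ins for frozen encoder outputs.
adapter = BidirectionalAdapter()
nerf_emb = torch.randn(32, 1024)   # from a frozen NeRF encoder
clip_emb = torch.randn(32, 768)    # from a frozen CLIP image/text encoder
loss = (contrastive_loss(adapter.nerf_to_clip(nerf_emb), clip_emb) +
        contrastive_loss(adapter.clip_to_nerf(clip_emb), nerf_emb))
loss.backward()
```

With such a mapping in place, zero-shot NeRF classification reduces to cosine similarity between the mapped NeRF embedding and text embeddings of class prompts, and retrieval to nearest-neighbor search in the shared space.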
ISBN: 9798350365474 (print)
Multi-camera tracking (MCT) plays a crucial role in various computer vision applications. However, accurate tracking of individuals across multiple cameras faces challenges, particularly with identity switches. In this paper, we present an efficient online MCT system that tackles these challenges through online processing. Our system leverages memory-efficient accumulated appearance features to provide stable representations of individuals across cameras and time. By incorporating trajectory validation using hierarchical agglomerative clustering (HAC) in overlapping regions, ID transfers are identified and rectified. Evaluation on the 2024 AI City Challenge Track 1 dataset [39] demonstrates the competitive performance of our system, achieving accurate tracking in both overlapping and non-overlapping camera networks. With a 40.3% HOTA score [29], our system ranked 9th in the challenge. The integration of trajectory validation enhances performance by 8% over the baseline, and the accumulated appearance features further contribute to a 17% improvement.
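A minimal sketch of the trajectory-validation idea, assuming per-tracklet accumulated appearance features are already available; the function name, distance threshold, and majority-vote rule are illustrative, not the paper's implementation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist
from collections import Counter

def validate_trajectories(features, global_ids, distance_threshold=0.4):
    """Cluster tracklet appearance features observed in an overlapping
    region with HAC; flag tracklets whose assigned global ID disagrees
    with the majority ID of their cluster (a likely ID transfer).

    features:   (N, D) array of accumulated appearance features
    global_ids: length-N list of currently assigned global IDs
    """
    dists = pdist(features, metric="cosine")
    tree = linkage(dists, method="average")
    clusters = fcluster(tree, t=distance_threshold, criterion="distance")

    flagged = []
    for c in np.unique(clusters):
        members = np.where(clusters == c)[0]
        majority = Counter(global_ids[i] for i in members).most_common(1)[0][0]
        flagged += [i for i in members if global_ids[i] != majority]
    return flagged  # indices of tracklets to re-associate

# Toy usage: 6 tracklets with random stand-in features.
feats = np.random.rand(6, 128)
print(validate_trajectories(feats, [1, 1, 2, 2, 3, 3]))
```

Flagged tracklets would then be re-associated to the majority identity of their appearance cluster, rectifying the ID transfer online.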
ISBN: 9798350365474 (print)
Neuromorphic cameras feature asynchronous event-based pixel-level processing and are particularly useful for object tracking in dynamic environments. Current approaches for feature extraction and optical flow with high-performing hybrid RGB-events vision systems require large computational models and supervised learning, which impose challenges for embedded vision and require annotated datasets. In this work, we propose ED-DCFNet, a small and efficient (< 72k) unsupervised multi-domain learning framework, which extracts shared event-frame features without requiring annotations, while achieving comparable performance. Furthermore, we introduce an open-sourced event- and frame-based dataset that captures indoor scenes with various lighting and motion-type conditions in realistic scenarios, which can be used for model building and evaluation. The dataset is available at https://***/NBELab/UnsupervisedTracking.
ISBN: 9798350365474 (print)
Pooling layers (e.g., max and average) may overlook important information encoded in the spatial arrangement of pixel intensity and/or feature values. We propose a novel lacunarity pooling layer that aims to capture the spatial heterogeneity of the feature maps by evaluating the variability within local windows. The layer operates at multiple scales, allowing the network to adaptively learn hierarchical features. The lacunarity pooling layer can be seamlessly integrated into any artificial neural network architecture. Experimental results demonstrate the layer's effectiveness in capturing intricate spatial patterns, leading to improved feature extraction capabilities. The proposed approach holds promise in various domains, especially in agricultural image analysis tasks. This work contributes to the evolving landscape of artificial neural network architectures by introducing a novel pooling layer that enriches the representation of spatial features. Our code is publicly available.
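As a sketch of the idea, using the common gliding-box definition of lacunarity, Λ = σ²/μ² + 1, at a single scale (the paper's layer is multi-scale, so this is an approximation, not the exact module), a drop-in PyTorch pooling layer might look like:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LacunarityPool2d(nn.Module):
    """Summarizes each local window by its lacunarity,
    Lambda = var / mean^2 + 1, a measure of spatial heterogeneity.
    Assumes non-negative activations (e.g., after ReLU), as
    lacunarity is defined for non-negative mass distributions."""
    def __init__(self, kernel_size=2, stride=None, eps=1e-6):
        super().__init__()
        self.k = kernel_size
        self.s = stride or kernel_size
        self.eps = eps

    def forward(self, x):
        mean = F.avg_pool2d(x, self.k, self.s)          # window mean
        mean_sq = F.avg_pool2d(x * x, self.k, self.s)   # window E[x^2]
        var = (mean_sq - mean * mean).clamp(min=0.0)    # window variance
        return var / (mean * mean + self.eps) + 1.0

# Drop-in usage in place of nn.MaxPool2d / nn.AvgPool2d:
x = torch.rand(8, 64, 32, 32)
print(LacunarityPool2d(kernel_size=2)(x).shape)  # torch.Size([8, 64, 16, 16])
```

Homogeneous windows yield values near 1, while windows with gappy, heterogeneous structure yield larger values, which is exactly the spatial information max and average pooling discard.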
ISBN: 9798350365474 (print)
We introduce a multimodal vision framework for precision livestock farming, harnessing the power of GroundingDINO, HQSAM, and ViTPose models. This integrated suite enables comprehensive behavioral analytics from video data without invasive animal tagging. GroundingDINO generates accurate bounding boxes around livestock, while HQSAM segments individual animals within these boxes. ViTPose estimates key body points, facilitating posture and movement analysis. Demonstrated on a sheep dataset with grazing, running, sitting, standing, and walking activities, our framework extracts invaluable insights: activity and grazing patterns, interaction dynamics, and detailed postural evaluations. Applicable across species and video resolutions, this framework revolutionizes non-invasive livestock monitoring for activity detection, counting, health assessments, and posture analyses. It empowers data-driven farm management, optimizing animal welfare and productivity through AI-powered behavioral understanding.
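The three stages compose into a simple per-frame pipeline. The sketch below is purely illustrative: `detector`, `segmenter`, and `pose_model` stand in for hypothetical thin wrappers around GroundingDINO, HQ-SAM, and ViTPose (the real APIs differ and are not shown), and the grazing rule is a toy threshold heuristic, not the paper's analytics.

```python
from dataclasses import dataclass

@dataclass
class Animal:
    box: tuple        # (x1, y1, x2, y2) from the detection stage
    mask: object      # binary mask from the segmentation stage
    keypoints: list   # (x, y, score) body points from the pose stage

def analyze_frame(frame, detector, segmenter, pose_model, prompt="sheep"):
    """Text-prompted detection -> per-box segmentation -> keypoints."""
    animals = []
    for box in detector(frame, prompt):   # GroundingDINO-style stage
        mask = segmenter(frame, box)      # HQ-SAM-style stage
        keypoints = pose_model(frame, box)  # ViTPose-style stage
        animals.append(Animal(box, mask, keypoints))
    return animals

def classify_activity(track):
    """Toy downstream analytic: head kept below the hip line across
    most frames suggests grazing (illustrative rule only; image y
    grows downward, so a lower head means a larger y value)."""
    head_y = [a.keypoints[0][1] for a in track]
    hip_y = [a.keypoints[-1][1] for a in track]
    head_down = sum(h > b for h, b in zip(head_y, hip_y))
    return "grazing" if head_down > len(track) / 2 else "other"
```

Because each stage is prompt- or box-conditioned, the same skeleton transfers to other species by changing the text prompt rather than retraining a detector.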
Irrigation systems can vary widely in scale, from small-scale subsistence farming to large commercial agriculture (see Fig. 1). The heterogeneity in irrigation practices and systems across different regions adds to th...
ISBN: 9798350365474 (print)
This paper presents novel benchmarks for evaluating vision-language models (VLMs) in zero-shot recognition, focusing on granularity and specificity. Although VLMs excel in tasks like image captioning, they face challenges in open-world settings. Our benchmarks test VLMs' consistency in understanding concepts across semantic granularity levels and their response to varying text specificity. Findings show that VLMs favor moderately fine-grained concepts and struggle with specificity, often misjudging texts that differ from their training data. Extensive evaluations reveal limitations in current VLMs, particularly in distinguishing between correct and subtly incorrect descriptions. While fine-tuning offers some improvements, it doesn't fully address these issues, highlighting the need for VLMs with enhanced generalization capabilities for real-world applications. This study provides insights into VLM limitations and suggests directions for developing more robust models.
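A minimal sketch of the granularity-consistency idea behind such a benchmark, assuming precomputed CLIP-style image and text embeddings; the function names, the fine-to-coarse mapping, and the toy data are illustrative.

```python
import numpy as np

def zero_shot_predict(image_emb, text_embs):
    """Nearest text embedding by cosine similarity (CLIP-style)."""
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    i = image_emb / np.linalg.norm(image_emb)
    return int(np.argmax(t @ i))

def granularity_consistency(image_embs, fine_embs, coarse_embs, parent):
    """Fraction of images whose coarse-level prediction matches the
    parent class of their fine-level prediction; parent[f] maps fine
    class f to its coarse class."""
    hits = 0
    for img in image_embs:
        fine = zero_shot_predict(img, fine_embs)
        coarse = zero_shot_predict(img, coarse_embs)
        hits += (coarse == parent[fine])
    return hits / len(image_embs)

# Toy usage: 10 fine classes grouped into 2 coarse classes,
# with random stand-ins for real embeddings.
rng = np.random.default_rng(0)
imgs = rng.normal(size=(100, 512))
fine = rng.normal(size=(10, 512))
coarse = rng.normal(size=(2, 512))
print(granularity_consistency(imgs, fine, coarse,
                              parent=[f // 5 for f in range(10)]))
```

A perfectly consistent model scores 1.0; the benchmark's finding is that real VLMs fall well short of this when the label hierarchy spans very coarse or very fine concepts.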
ISBN: 9798350365474 (print)
Affordance grounding refers to the task of finding the area of an object with which one can interact. It is a fundamental but challenging task, as a successful solution requires comprehensive understanding of a scene in multiple aspects: detection, localization, and recognition of objects and their parts; the geo-spatial configuration/layout of the scene; 3D shapes and physics; and the functionality and potential interactions of objects and humans. Much of this knowledge is hidden, lying beyond both the image content and the supervised labels available from a limited training set. In this paper, we attempt to improve the generalization capability of current affordance grounding by taking advantage of the rich world, abstract, and human-object-interaction knowledge in pre-trained large-scale vision-language models [40]. On the AGD20K benchmark, our proposed model demonstrates a significant performance gain over competing methods for in-the-wild object affordance grounding. We further demonstrate that it can ground affordances for objects in random Internet images, even when both the objects and the actions are unseen during training.
ISBN: 9798350365474 (print)
The potential for zero-shot generalization in vision-language (V-L) models such as CLIP has spurred their widespread adoption in addressing numerous downstream tasks. Previous methods have employed test-time prompt tuning to adapt the model to unseen domains, but they overlooked the issue of imbalanced class distributions. In this study, we explicitly address this problem by employing class-aware prototype alignment weighted by mean class probabilities obtained for the test sample and filtered augmented views. Additionally, we ensure that the class probabilities are as accurate as possible by performing prototype discrimination using contrastive learning. The combination of alignment and discriminative loss serves as a geometric regularizer, preventing the prompt representation from collapsing onto a single class and effectively bridging the distribution gap between the source and test domains. Our method, named PromptSync, synchronizes the prompts for each test sample on both the text and vision branches of the V-L model. In empirical evaluations on the domain generalization benchmark, our method outperforms previous best methods by 2.33% in overall performance, by 1% in base-to-novel generalization, and by 2.84% in cross-dataset transfer tasks.
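A simplified sketch of the class-aware prototype alignment term, assuming CLIP-style features from the augmented views and text-branch class prototypes; the entropy-based view filtering and the separate contrastive discrimination loss are only approximated here, so this is not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def prototype_alignment_loss(view_feats, text_protos, keep_ratio=0.5, tau=0.07):
    """Class-aware prototype alignment for test-time prompt tuning.

    view_feats:  (V, D) features of the test sample's augmented views
    text_protos: (C, D) class prototypes from the text branch
    """
    view_feats = F.normalize(view_feats, dim=-1)
    text_protos = F.normalize(text_protos, dim=-1)
    probs = (view_feats @ text_protos.t() / tau).softmax(dim=-1)  # (V, C)

    # Keep only confident (low-entropy) augmented views.
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(-1)
    keep = entropy.argsort()[: max(1, int(keep_ratio * len(entropy)))]
    mean_probs = probs[keep].mean(0)                              # (C,)

    # Align the view prototype with the class prototypes, weighted by
    # mean class probabilities so likely classes dominate the pull.
    view_proto = F.normalize(view_feats[keep].mean(0), dim=-1)
    cos_dist = 1 - text_protos @ view_proto                       # (C,)
    return (mean_probs * cos_dist).sum()
```

In test-time tuning, this scalar would be backpropagated into the learnable prompt vectors on both the text and vision branches rather than into the frozen encoders; weighting by the mean class probabilities is what keeps the prompt representation from collapsing onto a single class.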
ISBN: 9798350365474 (print)
Standardized lossy video coding is at the core of almost all real-world video processing pipelines. Rate control is used to enable standard codecs to adapt to different network bandwidth conditions or storage constraints. However, standard video codecs (e.g., H.264) and their rate control modules aim to minimize video distortion w.r.t. human quality assessment. We demonstrate empirically that standard-coded videos severely degrade the performance of deep vision models. To overcome this degradation, this paper presents the first end-to-end learnable deep video codec control that considers both bandwidth constraints and downstream deep vision performance, while adhering to existing standardization. We demonstrate that our approach better preserves downstream deep vision performance than traditional standard video coding.