Transformer neural networks (TNNs) have been widely applied to a diverse range of applications, including natural language processing (NLP), machine translation, and computer vision (CV). Their widespread adoption has...
ISBN: 9798350365474 (digital), 9798350365481 (print)
Labels are the cornerstone of supervised machine learning algorithms. Most visual recognition methods are fully supervised, using bounding boxes or pixel-wise segmentations for object localization. Traditional labeling methods, such as crowd-sourcing, are prohibitive on large datasets due to cost, data privacy concerns, time requirements, and potential errors. To address these issues, we propose a novel annotation framework, Advanced Line Identification and Notation Algorithm (ALINA), which can be used for labeling taxiway datasets that consist of different camera perspectives and variable weather attributes (sunny and cloudy). Additionally, the CIRCular threshoLd pixEl Discovery And Traversal (CIRCLEDAT) algorithm has been proposed, which is an integral step in determining the pixels corresponding to taxiway line markings. Once the pixels are identified, ALINA generates corresponding pixel coordinate annotations on the frame. Using this approach, 60,249 frames from the taxiway dataset AssistTaxi have been labeled. To evaluate the performance, a context-based edge map (CBEM) set was generated manually based on edge features and connectivity. The detection rate after testing the annotated labels against the CBEM set was recorded as 98.45%, attesting to its dependability and effectiveness.
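The abstract does not spell out the CIRCLEDAT procedure. As a rough, hedged illustration of a circular threshold-and-traverse pixel discovery pass, the sketch below grows a set of bright line-marking pixels outward from a seed point; the function name discover_line_pixels, the intensity threshold, and the neighbourhood radius are illustrative assumptions, not the authors' parameters.

```python
# Hedged sketch: a generic threshold-and-traverse pixel discovery pass,
# loosely in the spirit of the CIRCLEDAT step described above. Threshold,
# radius, and seed strategy are illustrative assumptions.
from collections import deque
import numpy as np

def discover_line_pixels(gray, seed, intensity_thresh=200, radius=2):
    """Grow a set of candidate line-marking pixels from a bright seed pixel.

    gray : (H, W) uint8 grayscale frame
    seed : (row, col) starting pixel on the marking
    Returns a list of (row, col) coordinates (the annotation ALINA would emit).
    """
    h, w = gray.shape
    visited = np.zeros((h, w), dtype=bool)
    queue, found = deque([seed]), []
    # Neighbour offsets inside a circular window of the given radius.
    offsets = [(dr, dc) for dr in range(-radius, radius + 1)
               for dc in range(-radius, radius + 1)
               if 0 < dr * dr + dc * dc <= radius * radius]
    while queue:
        r, c = queue.popleft()
        if visited[r, c]:
            continue
        visited[r, c] = True
        if gray[r, c] < intensity_thresh:
            continue  # not bright enough to be a painted marking
        found.append((r, c))
        for dr, dc in offsets:
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not visited[nr, nc]:
                queue.append((nr, nc))
    return found
```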
This paper presents the results of two tracks from the fourth Thermal Image Super-Resolution (TISR) challenge, held at the Perception Beyond the Visible Spectrum (PBVS) 2023 workshop. Track-1 uses the same thermal image dataset as previous challenges, with 951 training images and 50 validation images at each resolution. In this track, two evaluations were conducted: the first consists of generating an SR image from an HR noisy thermal image downsampled by four, and the second consists of generating an SR image from a mid-resolution image and comparing it with its semi-registered HR image (acquired with another camera). The results of Track-1 outperformed those from last year's challenge. On the other hand, Track-2 uses a newly acquired dataset consisting of 160 registered visible and thermal images of the same scenario for training and 30 validation images. This year, more than 150 teams participated in the challenge tracks, demonstrating the community's ongoing interest in this topic.
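As a hedged sketch of the first Track-1 evaluation described above (super-resolving an HR thermal image that has been downsampled by four), the snippet below bicubically downsamples an image, calls a placeholder SR model, and scores the result with PSNR. The model interface and the use of PSNR here are assumptions, not the challenge's official protocol.

```python
# Hedged sketch of a Track-1 style evaluation: downsample HR by x4,
# super-resolve, and compare against the original HR image.
import numpy as np
from PIL import Image

def psnr(ref, est, max_val=255.0):
    """Peak signal-to-noise ratio between two uint8-range images."""
    mse = np.mean((ref.astype(np.float64) - est.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

def evaluate_x4(hr_path, sr_model):
    hr = Image.open(hr_path).convert("L")             # thermal images are single channel
    lr = hr.resize((hr.width // 4, hr.height // 4), Image.BICUBIC)
    sr = sr_model(np.asarray(lr))                     # hypothetical model: LR array -> HR-sized array
    return psnr(np.asarray(hr), sr)
```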
In this paper, we develop a MultiTask Learning (MTL) model to achieve dense predictions for comics panels to, in turn, facilitate the transfer of comics from one publication channel to another by assisting authors in the task of reconfiguring their narratives. Our MTL method can successfully identify the semantic units as well as the embedded notion of 3D in comics panels. This is a significantly challenging problem because comics comprise disparate artistic styles, illustrations, layouts, and object scales that depend on the author's creative process. Typically, dense image-based prediction techniques require a large corpus of data. Finding an automated solution for dense prediction in the comics domain, therefore, becomes more difficult with the lack of ground-truth dense annotations for the comics images. To address these challenges, we develop the following solutions: (i) we leverage a commonly-used strategy known as unsupervised image-to-image translation, which allows us to utilize a large corpus of real-world annotations; (ii) we utilize the results of the translations to develop our multitasking approach that is based on a vision transformer backbone and a domain transferable attention module; (iii) we study the feasibility of integrating our MTL dense-prediction method with an existing retargeting method, thereby reconfiguring comics.
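The sketch below illustrates, under simplifying assumptions, the kind of multi-task dense-prediction model the abstract describes: a shared vision-transformer-style encoder feeding two lightweight dense heads (one for semantic units, one for a depth-like 3D cue). Layer sizes, the naive upsampling heads, and the class name TinyMTLNet are illustrative; the paper's domain transferable attention module is not reproduced.

```python
# Hedged sketch: shared ViT-style encoder with two dense-prediction heads.
import torch
import torch.nn as nn

class TinyMTLNet(nn.Module):
    def __init__(self, img_size=224, patch=16, dim=192, n_classes=10):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        enc_layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=4)
        self.grid = img_size // patch
        # Two task-specific heads over the shared token features.
        self.seg_head = nn.Conv2d(dim, n_classes, kernel_size=1)
        self.depth_head = nn.Conv2d(dim, 1, kernel_size=1)
        self.up = nn.Upsample(scale_factor=patch, mode="bilinear", align_corners=False)

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)   # (B, N, dim)
        tokens = self.encoder(tokens)
        feat = tokens.transpose(1, 2).reshape(x.size(0), -1, self.grid, self.grid)
        return self.up(self.seg_head(feat)), self.up(self.depth_head(feat))
```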
ISBN: 9798350365474 (digital), 9798350365481 (print)
Instance-based semantic segmentation provides detailed per-pixel scene understanding information crucial for both computer vision and robotics applications. However, state-of-the-art approaches such as Mask2Former are computationally expensive, and reducing this computational burden while maintaining high accuracy remains challenging. Knowledge distillation has been regarded as a potential way to compress neural networks, but to date limited work has explored how to apply this to distill information from the output queries of a model such as Mask2Former. In this paper, we match the output queries of the student and teacher models to enable a query-based knowledge distillation scheme. We independently match the teacher and the student to the ground truth and use this to define the teacher-to-student relationship for knowledge distillation. Using this approach we show that it is possible to perform knowledge distillation where the student models can have a lower number of queries and the backbone can be changed from a Transformer architecture to a convolutional neural network architecture. Experiments on two challenging agricultural datasets, sweet pepper (BUP20) and sugar beet (SB20), and Cityscapes demonstrate the efficacy of our approach. Across the three datasets the student models obtain an average absolute performance improvement in AP of 1.8 and 1.9 points for the ResNet-50 and Swin-Tiny backbones, respectively. To the best of our knowledge, this is the first work to propose knowledge distillation schemes for instance semantic segmentation with transformer-based models.
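A minimal sketch of the query-matching idea described above, assuming class-only matching costs: teacher and student queries are each Hungarian-matched to the ground-truth instances, and queries assigned to the same instance become teacher-to-student distillation pairs. The mask-based cost terms and loss weights of the actual method are omitted.

```python
# Hedged sketch: query-based knowledge distillation via independent matching
# of teacher and student queries to the ground truth.
import torch
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment

def match_to_gt(pred_logits, gt_labels):
    """Assign each ground-truth instance to one query via a class-cost matching."""
    prob = pred_logits.softmax(-1)                    # (num_queries, num_classes)
    cost = -prob[:, gt_labels]                        # (num_queries, num_gt)
    q_idx, gt_idx = linear_sum_assignment(cost.detach().cpu().numpy())
    return dict(zip(gt_idx.tolist(), q_idx.tolist())) # gt instance -> query index

def query_kd_loss(student_logits, teacher_logits, gt_labels, temperature=2.0):
    s_map = match_to_gt(student_logits, gt_labels)
    t_map = match_to_gt(teacher_logits, gt_labels)
    losses = []
    for gt in s_map:                                  # pair queries via the shared gt instance
        s_q, t_q = student_logits[s_map[gt]], teacher_logits[t_map[gt]]
        losses.append(F.kl_div(
            F.log_softmax(s_q / temperature, -1),
            F.softmax(t_q / temperature, -1),
            reduction="sum") * temperature ** 2)
    if not losses:
        return student_logits.new_zeros(())
    return torch.stack(losses).mean()
```

Pairing through the shared ground-truth instance, rather than matching student queries to teacher queries directly, is what allows the student to use fewer queries than the teacher.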
ISBN: 9798350365474 (digital), 9798350365481 (print)
The rapid growth of machine learning has spurred legislative initiatives such as "the Right to be Forgotten," allowing users to request data removal. In response, machine unlearning proposes the selective removal of unwanted data without the need for retraining from scratch. While the Neural-Tangent-Kernel (NTK) based unlearning method excels in performance, it suffers from significant computational complexity, especially for large-scale models and datasets. To improve this situation, our work introduces "Fast-NTK," a novel NTK-based unlearning algorithm that significantly reduces the computational complexity by incorporating parameter-efficient fine-tuning methods, such as fine-tuning batch normalization layers in a CNN or visual prompts in a vision transformer. Our experimental results demonstrate scalability to large neural networks and datasets (e.g., 88M parameters and 5k images), surpassing the limitations of previous full-model NTK-based approaches designed for smaller cases (e.g., 8M parameters and 500 images). Notably, our approach maintains performance comparable to the traditional method of retraining on the retain set alone. Fast-NTK can thus enable practical and scalable NTK-based unlearning in deep neural networks.
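The following is a minimal sketch, under stated assumptions, of the computational saving that Fast-NTK exploits: the empirical NTK Gram matrix is built from Jacobians taken only with respect to a small parameter-efficient subset (here, batch-norm affine parameters) rather than all model weights. The unlearning update that consumes this kernel is not reproduced, and the helper names are hypothetical.

```python
# Hedged sketch: empirical NTK restricted to a parameter-efficient subset.
import torch
import torch.nn as nn

def bn_params(model):
    """Collect only the batch-norm affine parameters of a CNN."""
    return [p for m in model.modules() if isinstance(m, nn.BatchNorm2d)
            for p in (m.weight, m.bias)]

def ntk_on_subset(model, inputs, class_idx=0):
    """Empirical NTK K[i, j] = <J_i, J_j> over the BN parameters only."""
    params = bn_params(model)
    jacs = []
    for x in inputs:                                  # one sample at a time
        out = model(x.unsqueeze(0))[0, class_idx]     # scalar output for simplicity
        grads = torch.autograd.grad(out, params)
        jacs.append(torch.cat([g.flatten() for g in grads]))
    J = torch.stack(jacs)                             # (N, n_bn_params)
    return J @ J.T                                    # (N, N) kernel, tiny vs. full-model Jacobians
```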
ISBN: 9798350365474 (digital), 9798350365481 (print)
Automatic Lip-Reading (ALR) requires the recognition of spoken words based on a visual recording of the speaker's lips, without access to the sound. ALR with neuromorphic event-based vision sensors, instead of traditional frame-based cameras, is particularly promising for edge applications due to their high temporal resolution, low power consumption, and robustness. Neuromorphic models, such as Spiking Neural Networks (SNNs), encode information using events and are naturally compatible with such data. The sparse and event-based nature of both the sensor data and SNN activations can be leveraged in an end-to-end neuromorphic hardware pipeline for low-power and low-latency edge applications. However, the accuracy of SNNs is often significantly degraded compared to state-of-the-art non-spiking Artificial Neural Networks (ANNs). In this work, a new SNN model, the Signed Spiking Gated Recurrent Unit (SpikGRU2+), is proposed and used as a task head for event-based ALR. The SNN architecture is as accurate as its ANN equivalent and outperforms the state-of-the-art on the DVS-Lip dataset. Notably, the accuracy is improved by 25% (respectively 4%) compared to the previous state-of-the-art SNN (respectively ANN). In addition, the SNN spike sparsity can be optimized to further reduce the number of operations by up to 22x compared to the ANN while maintaining high accuracy. This work opens up new perspectives for the use of SNNs for accurate and low-power end-to-end neuromorphic gesture recognition. Code is available.
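The abstract does not give the SpikGRU2+ definition; as a loose illustration of a signed spiking gated recurrent unit, the toy cell below maintains a membrane potential with gated recurrence and emits ternary {-1, 0, +1} spikes through a thresholded nonlinearity with a surrogate gradient. All names and constants are illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch: a toy signed-spike gated recurrent cell with a surrogate gradient.
import torch
import torch.nn as nn

class SignedSpike(torch.autograd.Function):
    THRESH = 1.0

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        # Ternary spikes: +1 above threshold, -1 below negative threshold, else 0.
        return (v > SignedSpike.THRESH).float() - (v < -SignedSpike.THRESH).float()

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # Rectangular surrogate gradient around both firing thresholds.
        surrogate = ((v.abs() - SignedSpike.THRESH).abs() < 0.5).float()
        return grad_out * surrogate

class ToySpikingGRUCell(nn.Module):
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.gate = nn.Linear(in_dim + hid_dim, hid_dim)
        self.cand = nn.Linear(in_dim + hid_dim, hid_dim)

    def forward(self, x, state):
        v, s = state                                  # membrane potential, previous spikes
        z = torch.sigmoid(self.gate(torch.cat([x, s], dim=-1)))
        v = z * v + (1 - z) * self.cand(torch.cat([x, s], dim=-1))
        s = SignedSpike.apply(v)
        return s, (v, s)
```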
ISBN: 9798350365474 (digital), 9798350365481 (print)
Recent advancements in image segmentation have focused on enhancing the efficiency of the models to meet the demands of real-time applications, especially on edge devices. However, existing research has primarily concentrated on single-task settings, especially on semantic segmentation, leading to redundant efforts and specialized architectures for different tasks. To address this limitation, we propose a novel architecture for efficient multi-task image segmentation, capable of handling various segmentation tasks without sacrificing efficiency or accuracy. We introduce BiSeNetFormer, which leverages the efficiency of two-stream semantic segmentation architectures and extends them into a mask classification framework. Our approach maintains the efficient spatial and context paths to capture detailed and semantic information, respectively, while leveraging an efficient transformer-based segmentation head that computes the binary masks and class probabilities. By seamlessly supporting multiple tasks, namely semantic and panoptic segmentation, BiSeNetFormer offers a versatile solution for multi-task segmentation. We evaluate our approach on popular datasets, Cityscapes and ADE20K, demonstrating impressive inference speeds while maintaining competitive accuracy compared to state-of-the-art architectures. Our results indicate that BiSeNetFormer represents a significant advancement towards fast, efficient, and multi-task segmentation networks, bridging the gap between model efficiency and task adaptability.
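To make the mask-classification head concrete, the sketch below shows a common formulation (in the style of Mask2Former-like heads) that the transformer-based head described above plausibly follows: learned queries attend to per-pixel features, and each query yields class probabilities plus a binary mask via a dot product with the pixel embeddings. The two-stream spatial/context backbone is stubbed out and all dimensions are illustrative.

```python
# Hedged sketch: a mask-classification segmentation head over fused features.
import torch
import torch.nn as nn

class MaskClassificationHead(nn.Module):
    def __init__(self, dim=128, num_queries=100, num_classes=19):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)
        dec_layer = nn.TransformerDecoderLayer(dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
        self.cls_head = nn.Linear(dim, num_classes + 1)   # +1 for "no object"
        self.mask_embed = nn.Linear(dim, dim)

    def forward(self, pixel_feats):
        # pixel_feats: (B, dim, H, W), e.g. fused spatial + context path features.
        B, C, H, W = pixel_feats.shape
        mem = pixel_feats.flatten(2).transpose(1, 2)               # (B, H*W, dim)
        q = self.decoder(self.queries.weight.expand(B, -1, -1), mem)
        class_logits = self.cls_head(q)                            # (B, Q, K+1)
        masks = torch.einsum("bqc,bchw->bqhw",
                             self.mask_embed(q), pixel_feats)      # (B, Q, H, W)
        return class_logits, masks
```

Because both semantic and panoptic outputs can be assembled from the same per-query class scores and masks, a head of this form supports multiple tasks without task-specific architectures.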
ISBN: 9798350365474 (digital), 9798350365481 (print)
Detecting and segmenting moving objects from a moving monocular camera is challenging in the presence of unknown camera motion, diverse object motions and complex scene structures. Most existing methods rely on a single motion cue to perform motion segmentation, which is usually insufficient when facing different complex environments. While a few recent deep learning based methods are able to combine multiple motion cues to achieve improved accuracy, they depend heavily on vast datasets and extensive annotations, making them less adaptable to new scenarios. To address these limitations, we propose a novel monocular dense segmentation method that achieves state-of-the-art motion segmentation results in a zero-shot manner. The proposed method synergistically combines the strengths of deep learning and geometric model fusion methods by performing geometric model fusion on object proposals. Experiments show that our method achieves competitive results on several motion segmentation datasets and even surpasses some state-of-the-art supervised methods on certain benchmarks, while not being trained on any data. We also present an ablation study to show the effectiveness of combining different geometric models together for motion segmentation, highlighting the value of our geometric model fusion strategy.
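As a hedged sketch of fitting a geometric model on object proposals, the snippet below fits a single fundamental matrix to all correspondences as the camera-motion model and flags a proposal as moving when its points show large epipolar residuals. The thresholds, the single-model choice, and the proposal source are assumptions, not the paper's exact fusion rule (which combines several geometric models).

```python
# Hedged sketch: per-proposal epipolar consistency check under a global
# camera-motion model estimated with RANSAC.
import cv2
import numpy as np

def moving_proposals(pts_prev, pts_curr, proposal_masks, err_thresh=2.0):
    """pts_prev/pts_curr: (N, 2) float32 matched points between two frames;
    proposal_masks: list of (N,) boolean arrays selecting each proposal's points."""
    F, _ = cv2.findFundamentalMat(pts_prev, pts_curr, cv2.FM_RANSAC, 1.0, 0.99)
    moving = []
    for mask in proposal_masks:
        p1 = cv2.convertPointsToHomogeneous(pts_prev[mask]).reshape(-1, 3)
        p2 = cv2.convertPointsToHomogeneous(pts_curr[mask]).reshape(-1, 3)
        lines = p1 @ F.T                                   # epipolar lines in frame 2
        err = np.abs(np.sum(p2 * lines, axis=1))
        err = err / np.linalg.norm(lines[:, :2], axis=1)   # point-to-line distance
        moving.append(np.median(err) > err_thresh)         # large residual => independent motion
    return moving
```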
Existing pedestrian attribute recognition (PAR) algorithms are mainly developed based on a static image. However, the performance is not reliable for images with challenging factors, such as heavy occlusion, motion blur, etc. In this work, we propose to understand human attributes using video frames that can make full use of temporal information. Specifically, we formulate the video-based PAR as a vision-language fusion problem and adopt the pre-trained big model CLIP to extract the feature embeddings of given video frames. To better utilize the semantic information, we take the attribute list as another input and transform the attribute words/phrases into corresponding sentences via split, expand, and prompt. Then, the text encoder of CLIP is utilized for language embedding. The averaged visual tokens and text tokens are concatenated and fed into a fusion Transformer for multi-modal interactive learning. The enhanced tokens are then fed into a classification head for pedestrian attribute prediction. Extensive experiments on a large-scale video-based PAR dataset fully validate the effectiveness of our proposed framework. Both the source code and pre-trained models will be released at https://***/Event-AHU/VTF_PAR.
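A minimal sketch of the pipeline described above, using the Hugging Face CLIP interface as a stand-in: frame features from the CLIP image encoder are averaged over time, concatenated with text embeddings of the attribute sentences, passed through a small fusion transformer, and scored by a per-attribute head. Prompt templates, layer sizes, and the class name VideoPARFusion are illustrative; this is not the released VTF_PAR code.

```python
# Hedged sketch: CLIP-based vision-language fusion for video PAR.
import torch
import torch.nn as nn
from transformers import CLIPModel

class VideoPARFusion(nn.Module):
    def __init__(self, num_attributes, dim=512):
        super().__init__()
        self.clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, 1)                     # one logit per attribute token
        self.num_attributes = num_attributes

    def forward(self, frame_pixels, attr_input_ids):
        # frame_pixels: (T, 3, 224, 224) preprocessed frames of one pedestrian track
        # attr_input_ids: (num_attributes, L) tokenized attribute sentences
        with torch.no_grad():                             # CLIP kept frozen in this sketch
            vis = self.clip.get_image_features(pixel_values=frame_pixels)   # (T, dim)
            txt = self.clip.get_text_features(input_ids=attr_input_ids)     # (A, dim)
        vis_token = vis.mean(dim=0, keepdim=True)         # temporal average of frame features
        tokens = torch.cat([vis_token, txt], dim=0).unsqueeze(0)            # (1, 1+A, dim)
        fused = self.fusion(tokens)
        return self.head(fused[0, 1:]).squeeze(-1)        # (A,) attribute logits
```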