Search Results - Inner Mongolia University Library

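Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = subject classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016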
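
The field codes and Boolean rules above can be assembled programmatically. The following is a minimal Python sketch, not part of the library's system: the helper names `term`, `combine`, and `year_range` are hypothetical and only illustrate the syntax (uppercase operators, one space on each side, parentheses for grouping). It rebuilds the first example query.

```python
# Field codes and operators as listed in the help text; helper names are hypothetical.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Build a single FIELD=value term after checking the field code."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def combine(operator: str, *parts: str, group: bool = False) -> str:
    """Join terms with an uppercase operator padded by one space on each side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    joined = f" {operator} ".join(parts)
    return f"({joined})" if group else joined


def year_range(start: int, end: int) -> str:
    """Build a Y=start-end year restriction."""
    return term("Y", f"{start}-{end}")


# Rebuild the first example: two subject terms OR-ed, then AND-ed with author and years.
subjects = combine("OR", term("K", "图书馆学"), term("K", "情报学"), group=True)
query = combine("AND", subjects, term("A", "范并思"), year_range(1982, 2016))
print(query)
```

Validating the field code and operator before joining catches the two mistakes the help text warns about: lowercase operators and missing spaces.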

Search query: "Any field = IEEE Conference on Computer Vision and Pattern Recognition Workshops"

23,218 records in total; showing records 941-950

Directional Connectivity-based Segmentation of Medical Images

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yang, Ziyun; Farsiu, Sina
Affiliation: Duke Univ, Durham, NC 27706, USA

ISBN: (print) 9798350301298

Anatomical consistency in biomarker segmentation is crucial for many medical image analysis tasks. A promising paradigm for achieving anatomically consistent segmentation via deep networks is incorporating pixel connectivity, a basic concept in digital topology, to model inter-pixel relationships. However, previous works on connectivity modeling have ignored the rich channel-wise directional information in the latent space. In this work, we demonstrate that effective disentanglement of directional sub-space from the shared latent space can significantly enhance the feature representation in the connectivity-based network. To this end, we propose a directional connectivity modeling scheme for segmentation that decouples, tracks, and utilizes the directional information across the network. Experiments on various public medical image segmentation benchmarks show the effectiveness of our model as compared to the state-of-the-art methods. Code is available at https://***/Zyun-Y/DconnNet.

Keywords: Medical and biological vision, cell microscopy

Learning CLIP Guided Visual-Text Fusion Transformer for Video-based Pedestrian Attribute Recognition

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023

Authors: Zhu, Jun; Jin, Jiandong; Yang, Zihan; Wu, Xiaohao; Wang, Xiao
Affiliations: Anhui University, School of Computer Science and Technology, Hefei 230601, China; Anhui University, School of Artificial Intelligence, Hefei 230601, China

ISBN: (print) 9798350302493

Existing pedestrian attribute recognition (PAR) algorithms are mainly developed based on a static image. However, the performance is not reliable for images with challenging factors, such as heavy occlusion, motion blur, etc. In this work, we propose to understand human attributes using video frames that can make full use of temporal information. Specifically, we formulate the video-based PAR as a vision-language fusion problem and adopt pre-trained big models CLIP to extract the feature embeddings of given video frames. To better utilize the semantic information, we take the attribute list as another input and transform the attribute words/phrase into the corresponding sentence via split, expand, and prompt. Then, the text encoder of CLIP is utilized for language embedding. The averaged visual tokens and text tokens are concatenated and fed into a fusion Transformer for multi-modal interactive learning. The enhanced tokens will be fed into a classification head for pedestrian attribute prediction. Extensive experiments on a large-scale video-based PAR dataset fully validated the effectiveness of our proposed framework. Both the source code and pre-trained models will be released at https://***/Event-AHU/VTF-PAR. © 2023 IEEE.

Keywords: Embeddings

PolyFormer: Referring Image Segmentation as Sequential Polygon Generation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Liu, Jiang; Ding, Hui; Cai, Zhaowei; Zhang, Yuting; Satzoda, Ravi Kumar; Mahadevan, Vijay; Manmatha, R.
Affiliations: Johns Hopkins Univ, Baltimore, MD 21218, USA; AWS AI Labs, Pasadena, CA, USA

ISBN: (print) 9798350301298

In this work, instead of directly predicting the pixel-level segmentation masks, the problem of referring image segmentation is formulated as sequential polygon generation, and the predicted polygons can be later converted into segmentation masks. This is enabled by a new sequence-to-sequence framework, Polygon Transformer (PolyFormer), which takes a sequence of image patches and text query tokens as input, and outputs a sequence of polygon vertices autoregressively. For more accurate geometric localization, we propose a regression-based decoder, which predicts the precise floating-point coordinates directly, without any coordinate quantization error. In the experiments, PolyFormer outperforms the prior art by a clear margin, e.g., 5.40% and 4.52% absolute improvements on the challenging RefCOCO+ and RefCOCOg datasets. It also shows strong generalization ability when evaluated on the referring video segmentation task without fine-tuning, e.g., achieving competitive 61.5% J&F on the Ref-DAVIS17 dataset.

Keywords: Vision, language, and reasoning

Efficient Multi-Purpose Cross-Attention Based Image Alignment Block for Edge Devices

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Bilecen, Bahri Batuhan; Fisne, Alparslan; Ayazoglu, Mustafa
Affiliation: Aselsan Res, Ankara, Turkey

ISBN: (print) 9781665487399

Image alignment, also known as image registration, is a critical block used in many computer vision problems. One of the key factors in alignment is efficiency, as inefficient aligners can cause significant overhead to the overall problem. In the literature, there are some blocks that appear to do the alignment operation, although most do not focus on efficiency. Therefore, an image alignment block which can both work in time and/or space and can work on edge devices would be beneficial for almost all networks dealing with multiple images. Given its wide usage and importance, we propose an efficient, cross-attention-based, multi-purpose image alignment block (XABA) suitable to work within edge devices. Using cross-attention, we exploit the relationships between features extracted from images. To make cross-attention feasible for real-time image alignment problems and handle large motions, we provide a pyramidal block based cross-attention scheme. This also captures local relationships besides reducing memory requirements and number of operations. Efficient XABA models achieve real-time requirements of running above 20 FPS performance on NVIDIA Jetson Xavier with 30W power consumption compared to other powerful computers. Used as a sub-block in a larger network, XABA also improves multi-image super-resolution network performance in comparison to other alignment methods.

Keywords: Performance evaluation; Computer vision; Power demand; Image edge detection; Superresolution; Feature extraction; Optical imaging

Light Source Separation and Intrinsic Image Decomposition under AC Illumination

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yoshida, Yusaku; Kawahara, Ryo; Okabe, Takahiro
Affiliation: Kyushu Inst Technol, Dept Artificial Intelligence, 680-4 Kawazu, Iizuka, Fukuoka 8208502, Japan

ISBN: (print) 9798350301298

Artificial light sources are often powered by an electric grid, and then their intensities rapidly oscillate in response to the grid's alternating current (AC). Interestingly, the flickers of scene radiance values due to AC illumination are useful for extracting rich information on a scene of interest. In this paper, we show that the flickers due to AC illumination is useful for intrinsic image decomposition (IID). Our proposed method conducts the light source separation (LSS) followed by the IID under AC illumination. In particular, we reveal the ambiguity in the blind LSS via matrix factorization and the ambiguity in the IID assuming the diffuse reflection model, and then show why and how those ambiguities can be resolved via a physics-based approach. We experimentally confirmed that our method can recover the colors of the light sources, the diffuse reflectance values, and the diffuse and specular intensities (shadings) under each of the light sources, and that the IID under AC illumination is effective for application to auto white balancing.

Keywords: Physics-based vision and shape-from-X

CLIP-Sculptor: Zero-Shot Generation of High-Fidelity and Diverse Shapes from Natural Language

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Sanghi, Aditya; Fu, Rao; Liu, Vivian; Willis, Karl D. D.; Shayani, Hooman; Khasahmadi, Amir H.; Sridhar, Srinath; Ritchie, Daniel
Affiliations: Autodesk Res, San Francisco, CA 94105, USA; Brown Univ, Providence, RI, USA; Columbia Univ, New York, NY, USA

ISBN: (print) 9798350301298

Recent works have demonstrated that natural language can be used to generate and edit 3D shapes. However, these methods generate shapes with limited fidelity and diversity. We introduce CLIP-Sculptor, a method to address these constraints by producing high-fidelity and diverse 3D shapes without the need for (text, shape) pairs during training. CLIP-Sculptor achieves this in a multi-resolution approach that first generates in a low-dimensional latent space and then upscales to a higher resolution for improved shape fidelity. For improved shape diversity, we use a discrete latent space which is modeled using a transformer conditioned on CLIP's image-text embedding space. We also present a novel variant of classifier-free guidance, which improves the accuracy-diversity trade-off. Finally, we perform extensive experiments demonstrating that CLIP-Sculptor outperforms state-of-the-art baselines.

Keywords: Vision + graphics

Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhang, Zhicheng; Wang, Lijuan; Yang, Jufeng
Affiliation: Nankai Univ, Coll Comp Sci, TMCC, Tianjin, Peoples R China

ISBN: (print) 9798350301298

Automatically predicting the emotions of user-generated videos (UGVs) receives increasing interest recently. However, existing methods mainly focus on a few key visual frames, which may limit their capacity to encode the context that depicts the intended emotions. To tackle that, in this paper, we propose a cross-modal temporal erasing network that locates not only keyframes but also context and audio-related information in a weakly-supervised manner. In specific, we first leverage the intra- and inter-modal relationship among different segments to accurately select keyframes. Then, we iteratively erase keyframes to encourage the model to concentrate on the contexts that include complementary information. Extensive experiments on three challenging video emotion benchmarks demonstrate that our method performs favorably against state-of-the-art approaches. The code is released on https://***/nku-zhichengzhang/WECL.

Keywords: Vision applications and systems

GLaMa: Joint Spatial and Frequency Loss for General Image Inpainting

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Lu, Zeyu; Jiang, Junjun; Huang, Junqin; Wu, Gang; Liu, Xianming
Affiliations: Harbin Inst Technol, Harbin, Peoples R China; Beihang Univ, Beijing, Peoples R China

ISBN: (digital) 9781665487399

ISBN: (print) 9781665487399

The purpose of image inpainting is to recover scratches and damaged areas using context information from remaining parts. In recent years, thanks to the resurgence of convolutional neural networks (CNNs), image inpainting task has made great breakthroughs. However, most of the work consider insufficient types of mask, and their performance will drop dramatically when encountering unseen masks. To combat these challenges, we propose a simple yet general method to solve this problem based on the LaMa image inpainting framework [35], dubbed GLaMa. Our proposed GLaMa can better capture different types of missing information by using more types of masks. By incorporating more degraded images in the training phase, we can expect to enhance the robustness of the model with respect to various masks. In order to yield more reasonable results, we further introduce a frequency-based loss in addition to the traditional spatial reconstruction loss and adversarial loss. In particular, we introduce an effective reconstruction loss both in the spatial and frequency domain to reduce the chessboard effect and ripples in the reconstructed image. Extensive experiments demonstrate that our method can boost the performance over the original LaMa method for each type of mask on FFHQ [18], ImageNet [7], Places2 [42] and WikiArt [28] dataset. The proposed GLaMa was ranked first in terms of PSNR, LPIPS [39] and SSIM [34] in the NTIRE 2022 Image Inpainting Challenge Track 1 Unsupervised [27].

Keywords: Training; Computer vision; Frequency-domain analysis; Conferences; Computer architecture; Robustness; Pattern recognition
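
The GLaMa abstract describes a reconstruction loss computed in both the spatial and the frequency domain. The NumPy sketch below illustrates that general idea only; it is not the authors' loss. The L1 distance, the orthonormal 2-D FFT, the `freq_weight` value, and all function names are assumptions, and the adversarial term mentioned in the abstract is omitted.

```python
import numpy as np


def spatial_l1(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute pixel difference (spatial-domain reconstruction term)."""
    return float(np.mean(np.abs(pred - target)))


def frequency_l1(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute difference between 2-D spectra (frequency-domain term)."""
    # FFT over the last two axes (height, width); "ortho" keeps the scale
    # independent of image size.
    pred_f = np.fft.fft2(pred, axes=(-2, -1), norm="ortho")
    target_f = np.fft.fft2(target, axes=(-2, -1), norm="ortho")
    return float(np.mean(np.abs(pred_f - target_f)))


def joint_loss(pred: np.ndarray, target: np.ndarray, freq_weight: float = 0.1) -> float:
    """Spatial term plus a weighted frequency term; the weight is an assumed value."""
    return spatial_l1(pred, target) + freq_weight * frequency_l1(pred, target)


# Usage on random channel-first images of shape (C, H, W).
rng = np.random.default_rng(0)
target = rng.random((3, 8, 8))
pred = target + 0.05 * rng.standard_normal((3, 8, 8))
print(joint_loss(pred, target))
```

Periodic artifacts such as a checkerboard pattern concentrate in a few frequency bins, so a spectral term penalizes them more directly than a pixel-wise term alone.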

GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yin, Da; Gao, Feng; Thattai, Govind; Johnston, Michael; Chang, Kai-Wei
Affiliations: Univ Calif Los Angeles, Los Angeles, CA 90095, USA; Amazon Alexa AI, Lexington, MA, USA

ISBN: (print) 9798350301298

A key goal for the advancement of AI is to develop technologies that serve the needs not just of one group but of all communities regardless of their geographical region. In fact, a significant proportion of knowledge is locally shared by people from certain regions but may not apply equally in other regions because of cultural differences. If a model is unaware of regional characteristics, it may lead to performance disparity across regions and result in bias against underrepresented groups. We propose GIVL, a Geographically Inclusive Vision-and-Language Pre-trained model. There are two attributes of geo-diverse visual concepts which can help to learn geo-diverse knowledge: 1) concepts under similar categories have unique knowledge and visual characteristics, 2) concepts with similar visual features may fall in completely different categories. Motivated by the attributes, we design new pre-training objectives Image-Knowledge Matching (IKM) and Image Edit Checking (IEC) to pre-train GIVL. Compared with similar-size models pre-trained with similar scale of data, GIVL achieves state-of-the-art (SOTA) and more balanced performance on geo-diverse V&L tasks.

Keywords: Vision, language, and reasoning

sRGB Real Noise Synthesizing with Neighboring Correlation-Aware Noise Model

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Fu, Zixuan; Guo, Lanqing; Wen, Bihan
Affiliation: Nanyang Technol Univ, Singapore, Singapore

ISBN: (print) 9798350301298

Modeling and synthesizing real noise in the standard RGB (sRGB) domain is challenging due to the complicated noise distribution. While most of the deep noise generators proposed to synthesize sRGB real noise using an end-to-end trained model, the lack of explicit noise modeling degrades the quality of their synthesized noise. In this work, we propose to model the real noise as not only dependent on the underlying clean image pixel intensity, but also highly correlated to its neighboring noise realization within the local region. Correspondingly, we propose a novel noise synthesizing framework by explicitly learning its neighboring correlation on top of the signal dependency. With the proposed noise model, our framework greatly bridges the distribution gap between synthetic noise and real noise. We show that our generated "real" sRGB noisy images can be used for training supervised deep denoisers, thus to improve their real denoising results with a large margin, comparing to the popular classic denoisers or the deep denoisers that are trained on other sRGB noise generators. The code will be available at https://***/xuan611/sRGB-Real-Noise-Synthesizing.

Keywords: Low-level vision

235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region, China. Postal code: 010021
