Search results - Inner Mongolia University Library

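Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = institution (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = subject classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". Operators must be uppercase, with one space on each side.

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016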
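
The field codes and operator rules above can also be composed programmatically. The following is a minimal sketch in Python, assuming hypothetical helper names (`term`, `any_of`, `all_of`) that are not part of the library's system. It only builds the expression string according to the stated rules (uppercase operators, one space on each side) and does not submit a search.

```python
# Hypothetical helpers for composing an advanced-search expression.
# They only build a string that follows the documented rules;
# nothing is sent to the catalog.

# Field codes listed in the help text.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}


def term(field: str, value: str) -> str:
    """Return one FIELD=value clause, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field}")
    return f"{field}={value}"


def any_of(*clauses: str) -> str:
    """Join clauses with uppercase OR (one space each side), in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def all_of(*clauses: str) -> str:
    """Join clauses with uppercase AND (one space each side)."""
    return " AND ".join(clauses)


# Rebuild the first search example from the help text.
query = all_of(
    any_of(term("K", "图书馆学"), term("K", "情报学")),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Run as written, this prints the same expression as the first search example. Validating field codes up front is a design choice of the sketch, so that a mistyped code fails before a query is assembled.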


Refine search results

Document type

1,927 conference papers
237 books
24 journal articles

Collection scope

2,187 electronic documents
1 print holding


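Subject classification

1,606 records: Engineering
- 1,333 records: Computer Science and Technology...
- 466 records: Software Engineering
- 250 records: Electrical Engineering
- 235 records: Mechanical Engineering
- 186 records: Optical Engineering
- 179 records: Information and Communication Engineering
- 119 records: Control Science and Engineering
- 96 records: Bioengineering
- 62 records: Biomedical Engineering (degree may be conferred in...
- 42 records: Instrument Science and Technology
- 39 records: Electronic Science and Technology (degree may...
- 30 records: Chemical Engineering and Technology
- 21 records: Safety Science and Engineering
- 18 records: Materials Science and Engineering (degree may...
- 15 records: Transportation Engineering
- 13 records: Architecture
444 records: Science
- 257 records: Physics
- 202 records: Mathematics
- 110 records: Biology
- 57 records: Statistics (degree may be conferred in Science, ...
- 22 records: Chemistry
228 records: Medicine
- 200 records: Clinical Medicine
- 26 records: Basic Medicine (degree may be conferred in Medicine...
- 22 records: Special Medicine
137 records: Management
- 83 records: Library, Information and Archives Manag...
- 60 records: Management Science and Engineering (degree may...
- 19 records: Business Administration
27 records: Art
- 27 records: Design (degree may be conferred in Art...
16 records: Agriculture
- 15 records: Crop Science
15 records: Law
- 13 records: Sociology
9 records: Education
7 records: Economics
5 records: Literature
5 records: Military Science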

Topics

320 records: computer vision
286 records: pattern recognit...
166 records: artificial intel...
119 records: feature extracti...
118 records: computer imaging...
101 records: image processing...
82 records: face recognition
68 records: training
61 records: object detection
60 records: image segmentati...
57 records: computer applica...
54 records: deep learning
51 records: robustness
47 records: computer graphic...
46 records: cameras
45 records: visualization
43 records: semantics
38 records: object recogniti...
37 records: shape
36 records: information syst...

Institutions

89 records: univ chinese aca...
67 records: chinese acad sci...
59 records: national laborat...
56 records: chinese acad sci...
50 records: univ chinese aca...
36 records: chinese univ hon...
36 records: university of ch...
31 records: institute of aut...
27 records: chinese acad sci...
25 records: school of artifi...
23 records: univ sci & techn...
22 records: chinese academy ...
18 records: chinese acad sci...
17 records: chinese univ hon...
16 records: chinese acad sci...
16 records: univ chinese aca...
15 records: national laborat...
15 records: computer vision ...
14 records: tsinghua univers...
14 records: department of in...

Authors

32 records: wang xiaogang
29 records: lu hanqing
28 records: tan tieniu
28 records: wang jinqiao
23 records: li stan z.
22 records: pal umapada
21 records: huang kaiqi
21 records: lei zhen
21 records: qiao yu
19 records: tieniu tan
19 records: hu weiming
18 records: tang xiaoou
17 records: xilin chen
15 records: wang liang
15 records: chen xilin
15 records: cheng jian
14 records: liu jing
14 records: tang ming
13 records: xiaoou tang
13 records: shiguang shan

Language

2,167 records: English
19 records: Chinese
7 records: Other
1 record: Turkish

Search query: "Any field = 7th Chinese Conference on Pattern Recognition and Computer Vision"

2,188 records in total; showing records 541-550


Everyday object meets vision-and-language navigation agent via backdoor

Proceedings of the 38th International Conference on Neural Information Processing Systems

Authors: Keji He, Kehan Chen, Jiawang Bai, Yan Huang, Qi Wu, Shu-Tao Xia, Liang Wang

Affiliations: Shandong University; New Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences and School of Artificial Intelligence, University of Chinese Academy of Sciences; Tencent; School of Computer Science, University of Adelaide; Tsinghua Shenzhen International Graduate School, Tsinghua University

ISBN: (print) 9798331314385

Vision-and-Language Navigation (VLN) requires an agent to dynamically explore environments following natural language. The VLN agent, closely integrated into daily lives, poses a substantial threat to the security of privacy and property upon the occurrence of malicious behavior. However, this serious issue has long been overlooked. In this paper, we pioneer the exploration of an object-aware backdoored VLN, achieved by implanting object-aware backdoors during the training phase. Tailored to the unique VLN nature of cross-modality and continuous decision-making, we propose a novel backdoored VLN paradigm: IPR Backdoor. This enables the agent to act in abnormal behavior once encountering the object triggers during language-guided navigation in unseen environments, thereby executing an attack on the target scene. Experiments demonstrate the effectiveness of our method in both physical and digital spaces across different VLN agents, as well as its robustness to various visual and textual variations. Additionally, our method also well ensures navigation performance in normal scenarios with remarkable stealthiness. The code is available at https://***/Chenkehan21/VLN-ATT.


Alleviating Action Hallucination for LLM-based Embodied Agents via Inner and Outer Alignment

Pattern Recognition and Artificial Intelligence (PRAI), International Conference on

Authors: Kanxue Li, Qi Zheng, Yibing Zhan, Chong Zhang, Tianle Zhang, Xu Lin, Chongchong Qi, Lusong Li, Dapeng Tao

Affiliations: Yunnan University, Kunming, China; JD Explore Academy, Beijing, China; Shenzhen University, Shenzhen, China; Yunnan United Vision Technology Co. Ltd, Kunming, China; Yunnan Key Laboratory of Media Convergence, Kunming, China

ISBN: (digital) 9798350350890

ISBN: (print) 9798350350906

Large language models (LLMs) have demonstrated impressive potential in empowering embodied agents, fortifying them with task planning and reasoning capabilities closely akin to humans. However, there remain significant challenges in aligning LLM-based embodied agent actions with the executable and safe action space to reduce hallucination. In this paper, we propose a flexible and resource-efficient framework for aligning the action space of LLM-based embodied agents. The framework employs a parameter-efficient fine-tuning method for inner alignment and a retrieval-based generation approach for outer alignment. Specifically, when the inner aligned model generates an action, the outer alignment employs ROUGE to calculate the similarity between the action and all actions within a safe and valid action space, ultimately selecting the action with the highest similarity as the output. For situations with multiple alternative actions, the outer alignment introduces a policy model, which could be either open-source small LLMs or commercial LLMs, to determine the optimal action based on the current context of the agent's task execution. The retrieval-based outer alignment ensures all actions align with an executable action space, alleviating action hallucination for LLM-based agents and significantly improving the controllability and interpretability of their decision-making process. Through extensive experimentation with the LLaMA-2, Bloomz, and OPT models on the ALFWorld benchmark, we validate the effectiveness and adaptability of our framework.

Keywords: Large language models; Decision making; Process control; Benchmark testing; Aerospace electronics; Controllability; Stability analysis; Planning; Safety; Pattern recognition


FreeStyler: A Free-Form Stylization Method via Multimodal Vector Quantization

12th International Conference on Computational Visual Media, CVM 2024

Authors: Liu, WuQin; Lin, MinXuan; Huang, HaiBin; Ma, ChongYang; Dong, WeiMing

Affiliations: School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 101408, China; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; Kuaishou Technology, Beijing 100085, China

ISBN: (print) 9789819720910

Image stylization refers to the process of transforming an input image into a new one, while retaining its original content but in different styles. However, most existing works only support single-modal guidance, which is not ideal for real-world applications. To tackle this limitation, we propose FreeStyler, a flexible framework for image stylization that is capable of handling various input scenarios. Our approach goes beyond the traditional approach of relying on content and style images to generate a stylized image and supports situations where these references are absent. Specifically, in such cases, FreeStyler allows performing the stylization through text or audio information. The core of FreeStyler is a vector quantized style transfer framework that encodes content and style information into a shared discrete latent feature space, followed by a stylization transformer for style fusion and an image decoder for stylized image reconstruction. To enable free-form stylization, we introduce a novel pseudo-paired token predictor that can estimate tokens from varying input forms without the need for additional text or audio data. Specifically, we leverage Contrastive Language-Image Pre-training (CLIP) as prior knowledge to align discrete representations across different modalities and train the framework using an image and pseudo caption pair provided by Bootstrapping Language-Image Pre-training (BLIP). Through qualitative and quantitative experiments, our method has demonstrated superior performance compared to state-of-the-art stylization methods. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

Keywords: Vector quantization


A Scalable 3D Array Architecture for Accelerating Convolutional Neural Networks

6th International Conference on Cognitive Systems and Signal Processing, ICCSIP 2021

Authors: Ji, Yafei; Wang, Xiang; Zhou, Yangfan; Cheng, Chen; Li, Jiang; Wang, Haoyuan; Wang, Xuguang; Liu, Xin

Affiliations: Chinese Academy of Sciences, 398 Ruoshui Road, Suzhou Industrial Park, Jiangsu, Suzhou, China; Gusu Laboratory of Materials, 388 Ruoshui Road, Suzhou Industrial Park, Jiangsu, Suzhou, China

ISBN: (print) 9789811692468

Convolutional neural network (CNN) is widely used in computer vision and image recognition, and the structure of the CNN becomes more and more complex. The complexity of CNN brings challenges of performance and storage capacity for hardware implementation. To address these challenges, in this paper, we propose a novel 3D array architecture for accelerating CNN. This proposed architecture has several benefits: firstly, the strategy of multilevel caches is employed to improve data reusage, thus reducing the access frequency to external memory; secondly, performance and throughput are balanced among 3D array nodes by using novel workload and weight partitioning schemes; thirdly, computing and transmission are performed simultaneously, resulting in higher parallelism and lower hardware storage requirement; finally, the efficient data mapping strategy is proposed for better scalability of the entire system. The experimental results show that our proposed 3D array architecture can effectively improve the overall computing performance of the system. © 2022, Springer Nature Singapore Pte Ltd.

Keywords: Image recognition


Self-prior Guided Mamba-UNet Networks for Medical Image Super-Resolution

27th International Conference on Pattern Recognition, ICPR 2024

Authors: Ji, Zexin; Zou, Beiji; Kui, Xiaoyan; Vera, Pierre; Ruan, Su

Affiliations: School of Computer Science and Engineering, Central South University, Changsha 410083, China; Hunan Engineering Research Center of Machine Vision and Intelligent Medicine, Central South University, Changsha 410083, China; University of Rouen-Normandy, LITIS - QuantIF UR 4108, Rouen 76000, France; Department of Nuclear Medicine, Henri Becquerel Cancer Center, Rouen, France

ISBN: (print) 9783031781940

In this paper, we propose a self-prior guided Mamba-UNet network (SMamba-UNet) for medical image super-resolution. Existing methods are primarily based on convolutional neural networks (CNNs) or Transformers. CNN-based methods fail to capture long-range dependencies, while Transformer-based approaches face heavy calculation challenges due to their quadratic computational complexity. Recently, State Space Models (SSMs), especially Mamba, have emerged, capable of modeling long-range dependencies with linear computational complexity. Inspired by Mamba, our approach aims to learn the self-prior multi-scale contextual features under Mamba-UNet networks, which may help to super-resolve low-resolution medical images in an efficient way. Specifically, we obtain self-priors by perturbing the brightness inpainting of the input image during network training, which can learn detailed texture and brightness information that is beneficial for super-resolution. Furthermore, we combine Mamba with a UNet network to mine global features at different levels. We also design an improved 2D-Selective-Scan (ISS2D) module to divide image features into different directional sequences to learn long-range dependencies in multiple directions, and adaptively fuse sequence information to enhance super-resolved feature representation. Both qualitative and quantitative experimental results demonstrate that our approach outperforms current state-of-the-art methods on two public medical datasets: the IXI and fastMRI. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Convolutional neural networks


RRSR: Reciprocal Reference-Based Image Super-Resolution with Progressive Feature Alignment and Selection

17th European Conference on Computer Vision (ECCV)

Authors: Zhang, Lin; Li, Xin; He, Dongliang; Li, Fu; Wang, Yili; Zhang, Zhaoxiang

Affiliations: Chinese Acad Sci, Inst Automat, Beijing, Peoples R China; Baidu Inc, Dept Comp Vis Technol VIS, Beijing, Peoples R China; Univ Chinese Acad Sci, Beijing, Peoples R China; Tsinghua Univ, Beijing, Peoples R China; CASIA, Natl Lab Pattern Recognit, Beijing, Peoples R China; UCAS, Sch Future Technol, Beijing, Peoples R China; HKISI CAS, Ctr Artificial Intelligence & Robot, Beijing, Peoples R China

ISBN: (print) 9783031197994; 9783031198007

Reference-based image super-resolution (RefSR) is a promising SR branch and has shown great potential in overcoming the limitations of single image super-resolution. While previous state-of-the-art RefSR methods mainly focus on improving the efficacy and robustness of reference feature transfer, it is generally overlooked that a well reconstructed SR image should enable better SR reconstruction for its similar LR images when it is referred to as. Therefore, in this work, we propose a reciprocal learning framework that can appropriately leverage such a fact to reinforce the learning of a RefSR network. Besides, we deliberately design a progressive feature alignment and selection module for further improving the RefSR task. The newly proposed module aligns reference-input images at multi-scale feature spaces and performs reference-aware feature selection in a progressive manner, thus more precise reference features can be transferred into the input features and the network capability is enhanced. Our reciprocal learning paradigm is model-agnostic and it can be applied to arbitrary RefSR models. We empirically show that multiple recent state-of-the-art RefSR models can be consistently improved with our reciprocal learning paradigm. Furthermore, our proposed model together with the reciprocal learning strategy sets new state-of-the-art performances on multiple benchmarks.

Keywords: Reference-based image super-resolution; Reciprocal learning; Reference-input feature alignment


Multi-view anal fistula disease diagnosis based on local enhancement and global fusion via MRI

7th International Conference on Vision, Image and Signal Processing (ICVISP 2023)

Authors: C. Tan, L. Tian, H. Chen, P. Liao

Affiliations: College of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, People's Republic of China; School of Electronic Engineering and Computer Science, Queen Mary University of London, E1 4NS, UK; National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065, People's Republic of China; The Sixth People's Hospital of Chengdu, Chengdu 610051, People's Republic of China

Anal fistulas often result from cryptoglandular infections or complications of conditions like Crohn's disease. Patients typically present with symptoms such as recurrent perianal abscesses, discharge, and occasional fecal incontinence. Accurate diagnosis is crucial for determining the type of fistula and planning appropriate treatment. Usually, Magnetic Resonance Imaging (MRI) is used for diagnosis. However, a single perspective can become progressively obstructed, potentially resulting in erroneous conclusions and misdiagnosis. Thus, a deep learning method for multi-view anal fistula MRI image fusion, combining convolutional neural networks (CNN) and Transformers, is proposed. This model extracts features from different angles of anal fistula MRI, achieving initial feature extraction through CNN modules. It models the local semantic relationship between the two views using a joint cross-attention mechanism, exploring multi-view feature connections for local regions, and enhancing feature interaction across views. Additionally, a consistency loss is introduced by computing the cosine similarity between features from two views, maximizing the similarity between features, to constrain the feature distances, ensuring global feature consistency calculation across different views, enhancing the feature representation of single-view images. Experimental validation demonstrates that the method proposed outperforms other mainstream models and can achieve precise identification of anal fistula.


Dual-Evidential Learning for Weakly-supervised Temporal Action Localization

17th European Conference on Computer Vision (ECCV)

Authors: Chen, Mengyuan; Gao, Junyu; Yang, Shicai; Xu, Changsheng

Affiliations: Chinese Acad Sci CASIA, Inst Automat, Natl Lab Pattern Recognit NLPR, Beijing, Peoples R China; Univ Chinese Acad Sci UCAS, Sch Artificial Intelligence, Beijing, Peoples R China; Hikvis Res Inst, Hangzhou, Peoples R China; Peng Cheng Lab, Shenzhen, Peoples R China

ISBN: (digital) 9783031197727

ISBN: (print) 9783031197710; 9783031197727

Weakly-supervised temporal action localization (WS-TAL) aims to localize the action instances and recognize their categories with only video-level labels. Despite great progress, existing methods suffer from severe action-background ambiguity, which mainly comes from background noise introduced by aggregation operations and large intra-action variations caused by the task gap between classification and localization. To address this issue, we propose a generalized evidential deep learning (EDL) framework for WS-TAL, called Dual-Evidential Learning for Uncertainty modeling (DELU), which extends the traditional paradigm of EDL to adapt to the weakly-supervised multi-label classification goal. Specifically, targeting at adaptively excluding the undesirable background snippets, we utilize the video-level uncertainty to measure the interference of background noise to video-level prediction. Then, the snippet-level uncertainty is further deduced for progressive learning, which gradually focuses on the entire action instances in an "easy-to-hard" manner. Extensive experiments show that DELU achieves state-of-the-art performance on THUMOS14 and ActivityNet1.2 benchmarks. Our code is available in ***/MengyuanChen21/ECCV2022-DELU.

Keywords: Weakly-supervised temporal action localization; Evidential deep learning; Action-background ambiguity


An Efficient Polyp Detection Framework with Suspicious Targets Assisted Training

4th Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Authors: Zhang, Zhipeng; Xiao, Li; Zhuang, Fuzhen; Ma, Ling; Chang, Yuan; Wang, Yuanyuan; Jiang, Huiqin; He, Qing

Affiliations: Zhengzhou Univ, Henan Inst Adv Technol, Zhengzhou, Peoples R China; Chinese Acad Sci, Inst Comp Technol, Chinese Acad Sci CAS, Key Lab Intelligent Informat Proc, Beijing, Peoples R China; Univ Chinese Acad Sci, Ningbo Huamei Hosp, Ningbo, Peoples R China; Beihang Univ, Inst Artificial Intelligence, Beijing 100191, Peoples R China; Xiamen Inst Data Intelligence, Xiamen, Peoples R China; Zhengzhou Univ, Affiliated Hosp 1, Zhengzhou, Peoples R China

ISBN: (print) 9783030880132; 9783030880125

Automatic polyp detection during colonoscopy is beneficial for reducing the risk of colorectal cancer. However, due to the various shapes and sizes of polyps and the complex structures in the intestinal cavity, some normal tissues may display features similar to actual polyps. As a result, traditional object detection models are easily confused by such suspected target regions and lead to false-positive detection. In this work, we propose a multi-branch spatial attention mechanism based on the one-stage object detection framework, YOLOv4. Our model is further jointly optimized with a top likelihood and similarity to reduce false positives caused by suspected target regions. A similarity loss is further added to identify the suspected targets from real ones. We then introduce a Cross Stage Partial Connection mechanism to reduce the parameters. Our model is evaluated on the private colonic polyp dataset and the public MICCAI 2015 grand challenge dataset including the CVCClinic 2015 and Etis-Larib; both results show that our model improves performance by a large margin and with less computational cost.

Keywords: Polyp detection; Suspected target; Semi-supervised learning


A Self-Attention Based Method for Facial Expression Recognition

7th International Conference on Computing and Artificial Intelligence, ICCAI 2021

Authors: Ling, Xufeng; Liang, Jingxin; Yang, Jie

Affiliations: Shanghai Normal University Tianhua College, AI School, No. 1661 North Sheng Xin Road, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, China

ISBN: (print) 9781450389501

We present a self-attention-based method termed Vision Transformer (ViT) to efficiently classify human facial expressions. Our work can be divided into two contributions. First, the facial expression image is divided into N∗N patches, each of which corresponds to a word, and the whole image is treated as a paragraph composed of n words. Second, we design a learnable module, the ViT, with sequence length of L, latent dimension, and 12 attention layers which are integrated together as a unified framework. We also train the proposed model on the normalized and augmented version of the FER2013plus dataset. We show empirically that ViT has superior performance compared to alternative approaches. © 2021 ACM.

Keywords: Face recognition



Address: 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
