Search results - Inner Mongolia University Library

39th Chinese Control Conference

Authors: Juan Qian Xiaoling Wang Guo-Ping Jiang Housheng Su College of Automation and College of Artificial Intelligence Nanjing University of Posts and Telecommunications and Jiangsu Engineering Lab for IOT Intelligent Robots (IOTRobot) School of Artificial Intelligence and Automation Image Processing and Intelligent Control Key Laboratory of Education Ministry of China Huazhong University of Science and Technology

This paper addresses the robust containment control problem for leader-following multi-agent systems with input saturation and input additive disturbance, where the followers can be informed by multiple leaders. Using the low-and-high gain feedback technique and the high-gain observer approach, a distributed control algorithm for each agent is first designed from the observed output information; sufficient conditions are then provided to guarantee semi-global robust containment of the system. Finally, numerical simulations are given to verify the theoretical results.

Keywords: Containment control; multi-agent system; input saturation; input additive disturbance


Tigc-Net: Transformer-Improved Graph Convolution Network for Spatio-Temporal Prediction

SSRN, 2022

Authors: Chen, Kai Yang, Chunfeng Zhou, Zhengyuan Liu, Yao Ji, Tianjiao Sun, Weiya Chen, Yang School of Cyber Science and Engineering Southeast University Nanjing 210096 China Key Laboratory of Computer Network and Information Integration Southeast University Ministry of Education Nanjing 210096 China The College of Software Engineering Southeast University Nanjing 210096 China Laboratory of Image Science and Technology The School of Computer Science and Engineering Southeast University Nanjing 210096 China Jiangsu Key Laboratory of Molecular and Functional Imaging Department of Radiology Zhongda Hospital Southeast University Nanjing 210009 China Jiangsu Provincial Joint International Research Laboratory of Medical Information Processing School of Computer Science and Engineering Southeast University Nanjing 210096 China NHC Key Laboratory of Medical Virology and Viral Diseases National Institute for Viral Disease Control and Prevention Chinese Center for Disease Control and Prevention Beijing China Beijing Institute of Tracking and Communication Technology Beijing 100094 China

Modeling spatio-temporal sequences is an important yet challenging topic for existing neural networks. Most current spatio-temporal sequence prediction methods either capture features separately in the temporal and spatial dimensions or employ multiple mutually independent local spatio-temporal graphs to represent a spatio-temporal sequence. The first kind of method has difficulty mining complex spatio-temporal correlations, while the second is limited in the accuracy of long-term predictions. To handle these issues, this paper proposes a Transformer-Improved Graph Convolution Network (TIGC-Net) for spatio-temporal prediction. Specifically, the temporal location encoding method is exploited to derive the spatio-temporal characteristics of the sequence using a spatio-temporal feature fusion network. In addition, a spatio-temporal attention network is developed to enhance the spatio-temporal correlation of the sequence, and the dynamic spatial features of the sequence are further extracted through an adaptive graph convolution network. A private dataset and a public dataset are employed to demonstrate the performance of the proposed TIGC-Net. The qualitative and quantitative results show that TIGC-Net extracts dynamic spatio-temporal properties more effectively, enhances the spatio-temporal correlation of sequences, and improves prediction accuracy compared with four state-of-the-art methods. © 2022, The Authors. All rights reserved.

Keywords: Convolution


Partial Attribute-Driven Video Person Re-Identification


International Conference on Tools for Artificial Intelligence (ICTAI)

Authors: Wanru Song Jieying Zheng Yahong Wu Changhong Chen Feng Liu Department of Telecommunication Nanjing Univ. Posts & Telecommun. Nanjing China Jiangsu Key Lab of Image Processing & Image Communications Nanjing Univ. Posts & Telecommun. Nanjing China

Person re-identification has gradually become a hot research topic in many fields, such as security, criminal investigation and video analysis. In this paper, we propose a novel feature extraction framework for video-based person re-identification, namely the partial attribute-driven network (PADNet). The proposed method is based on a deep-learning architecture and incorporates both attribute and identity learning of the pedestrian. Existing attribute research focuses on feature representation at the global level. In contrast, in our work the pedestrian is first automatically partitioned into several body parts. The pedestrian and his/her body parts are then annotated with global and partial attributes, respectively. Finally, we employ a four-branch multi-label network to explore the spatial-temporal cues of videos using these labeled samples. Extensive experiments are conducted on two video-based datasets, PRID2011 and iLIDS-VID. The experimental results demonstrate the superiority and effectiveness of the proposed PADNet over state-of-the-art approaches.


Building a mixed-lingual neural TTS system with only monolingual data

arXiv, 2019

Authors: Xue, Liumeng Song, Wei Xu, Guanghui Xie, Lei Wu, Zhizheng Shaanxi Provincial Key Lab of Speech and Image Information Processing School of Computer Science Northwestern Polytechnical University Xi'an China ***

When deploying a Chinese neural Text-to-Speech (TTS) system, one of the challenges is to synthesize Chinese utterances with embedded English phrases or words. This paper looks into the problem in the encoder-decoder framework when only monolingual data from a target speaker is available. Specifically, we view the problem from two aspects: speaker consistency within an utterance and naturalness. We start the investigation with an average voice model built from multi-speaker monolingual data, i.e., Mandarin and English data. On that basis, we look into speaker embedding for speaker consistency within an utterance and phoneme embedding for naturalness and intelligibility, and study the choice of data for model training. We report the findings and discuss the challenges of building a mixed-lingual TTS system with only monolingual data. Copyright © 2019, The Authors. All rights reserved.

Keywords: Embeddings


Joint DBF and SAO Parallel Filtering Based on Multithread Load Balancing

Journal of Physics: Conference Series, 2021, Vol. 1828, No. 1

Authors: Hao Ma Dong Hu Yi Li Jiangsu Province's Key Lab of Image Procession and Image Communications Nanjing 210003 China Education Ministry's Key Lab of Broadband Wireless Communication and Sensor Network Technology Nanjing 210003 China Education Ministry's Engineering Research Center of Ubiquitous Network and Heath Service Nanjing University of Posts and Telecommunications Nanjing 210003 China

This paper presents a joint parallel loop filtering algorithm based on multi-thread load balancing in HEVC decoding, which implements the parallel processing of deblocking filtering (DBF) and sample adaptive offset (SAO). Because video content is diverse, texture differs across the regions of an image, which leads to different CTU partitions. The number of boundaries to be filtered therefore varies greatly, leaving the computational load unbalanced among threads in parallel processing. To solve this problem, an area division scheme is proposed that divides the image into multiple areas with a similar number of boundaries to be filtered in each. A mapping relationship table is then used to allocate these areas to multiple threads for parallel processing, so as to balance the load among the filtering threads. Finally, caching is used to combine DBF and SAO, reducing the delay between them and improving the overall parallelism of the loop filter. Experimental results show that the performance of the proposed load-balancing joint filtering algorithm is 8.15% higher than that of the previous scheme.
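The area division idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how areas are formed, so the sketch assumes the image is described by a hypothetical list of per-CTU-row boundary counts and splits the rows into contiguous areas with roughly equal totals. The function name `split_into_areas` and the one-area-per-thread mapping table are illustrative assumptions.

```python
from typing import Dict, List


def split_into_areas(boundary_counts: List[int], num_areas: int) -> List[List[int]]:
    """Split row indices into contiguous areas with similar boundary totals.

    Assumes len(boundary_counts) >= num_areas >= 1 (hypothetical input format).
    """
    total = sum(boundary_counts)
    areas: List[List[int]] = []
    current: List[int] = []
    running = 0  # cumulative boundary count over all rows seen so far
    for row, count in enumerate(boundary_counts):
        current.append(row)
        running += count
        rows_left = len(boundary_counts) - row - 1
        areas_left = num_areas - len(areas) - 1
        # Cumulative share that should be covered once this area is closed.
        target = total * (len(areas) + 1) / num_areas
        # Close the area when its share is reached, or when the remaining
        # rows are only just enough to give each remaining area one row.
        if areas_left > 0 and (running >= target or rows_left == areas_left):
            areas.append(current)
            current = []
    areas.append(current)
    return areas


def build_mapping_table(areas: List[List[int]]) -> Dict[int, List[int]]:
    """Map thread index -> rows it filters (one area per thread, an assumption)."""
    return {thread: area for thread, area in enumerate(areas)}


if __name__ == "__main__":
    counts = [10, 2, 3, 9, 1, 5]  # made-up boundary counts per CTU row
    table = build_mapping_table(split_into_areas(counts, 3))
    for thread, rows in table.items():
        print(thread, rows, sum(counts[r] for r in rows))
```

A contiguous split keeps each thread's rows adjacent in memory; a real decoder would also have to respect the filtering dependencies between neighbouring areas, which this sketch ignores.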


Scale Adaptive Block Target Tracking Based on Multi-layer Convolution Features and Kernel Correlation Filter

Journal of Physics: Conference Series, 2021, Vol. 1828, No. 1

Authors: Ting Zhang Dong Hu Jing Zhang Jiangsu Province's Key Lab of Image Procession and Image Communications Nanjing 210003 China Education Ministry's Key Lab of Broadband Wireless Communication and Sensor Network Technology Nanjing 210003 China Education Ministry's Engineering Research Center of Ubiquitous Network and Heath Service Nanjing University of Posts and Telecommunications Nanjing 210003 China

Target tracking is currently a hot research topic in computer vision and is widely used in many research fields. However, due to factors such as occlusion, fast motion, blur and scale variation, tracking methods still need further study. In this paper, we propose a block target tracking method based on multi-layer convolutional features and a kernel correlation filter. Our method divides the tracking process into two parts: target position estimation and target scale estimation. First, we divide the target frame into blocks based on the condition number. Second, we extract features with the convolutional layers and apply them to the kernel correlation filter to obtain the center position of each block target. With the reliability of the different blocks measured by the Barker coefficient, the overall target center position is obtained. Then, an affine transformation is adopted to achieve scale adaptation. The algorithm is evaluated on the public video sequences in OTB-2013. Experimental results demonstrate that the proposed tracking method achieves target scale adaptation and effectively improves tracking accuracy.


Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution

arXiv, 2022

Authors: Li, Wenjie Li, Juncheng Gao, Guangwei Deng, Weihong Zhou, Jiantao Yang, Jian Qi, Guo-Jun The Intelligent Visual Information Perception Laboratory Institute of Advanced Technology Nanjing University of Posts and Telecommunications Nanjing 210046 China The Provincial Key Laboratory for Computer Information Processing Technology Soochow University Suzhou 215006 China The School of Communication and Information Engineering Shanghai University Shanghai 200444 China Jiangsu Key Laboratory of Image and Video Understanding for Social Safety Nanjing University of Science and Technology Nanjing 210094 China The Pattern Recognition and Intelligent System Laboratory School of Artificial Intelligence Beijing University of Posts and Telecommunications Beijing 100876 China The State Key Laboratory of Internet of Things for Smart City Department of Computer and Information Science Faculty of Science and Technology University of Macau 999078 China The School of Computer Science and Technology Nanjing University of Science and Technology Nanjing 210094 China The Research Center for Industries of the Future The School of Engineering Westlake University Hangzhou 310024 China OPPO Research Seattle WA 98101 United States

Recently, Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks due to their ability to extract global features. However, the need for Transformers to incorporate contextual information in order to extract features dynamically has been neglected. To address this issue, we propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixing CNN and Transformer. Specifically, in the CT Block, we first propose a CNN-based Cross-Scale Information Aggregation Module (CIAM) to enable the model to better focus on potentially helpful information and improve the efficiency of the Transformer phase. Then, we design a novel Cross-receptive Field Guided Transformer (CFGT) to enable the selection of the contextual information required for reconstruction by using a modulated convolutional kernel that understands the current semantic information and exploits the information interaction within different self-attention. Extensive experiments have shown that our proposed CFIN can effectively reconstruct images using contextual information, and that it strikes a good balance between computational cost and model performance as an efficient model. Source code will be available at https://***/IVIPlab/CFIN. Copyright © 2022, The Authors. All rights reserved.

Keywords: Semantics


Exudate detection in fundus images via convolutional neural network


14th International Forum of Digital TV and Wireless Multimedia communication, IFTC 2017

Authors: Li, Guo Zheng, Shibao Li, Xinzhe Shanghai Key Labs of Digital Media Processing and Transmission Institute of Image Communication and Network Engineering Shanghai Jiao Tong University Shanghai 200240 China

ISBN: (print) 9789811081071

Exudate detection in fundus images is an important task in screening people for diabetic retinopathy. In this paper, a convolutional neural network (CNN) is used to detect the exudates in fundus images. An auxiliary loss for classification is designed to better train the CNN architecture. In addition, we use a boosted training method to improve and speed up the CNN training. The trained model has been evaluated on our own annotated dataset and three publicly available databases, obtaining AUCs of 0.98, 0.96, 0.94, and 0.91, respectively. © 2018, Springer Nature Singapore Pte Ltd.

Keywords: Convolution


Improving Mandarin End-to-End Speech Synthesis by Self-Attention and Learnable Gaussian Bias


IEEE Workshop on Automatic Speech Recognition and Understanding

Authors: Fengyu Yang Shan Yang Pengcheng Zhu Pengju Yan Lei Xie Shaanxi Provincial Key Laboratory of Speech and Image Information Processing School of Computer Science Northwestern Polytechnical University Xian China Tongdun AI Lab

ISBN: (digital) 9781728103068

ISBN: (print) 9781728103075

Compared with conventional speech synthesis, end-to-end speech synthesis has achieved much better naturalness with a simpler system-building pipeline. An end-to-end framework can generate natural speech directly from characters for English. For other languages such as Chinese, however, recent studies have indicated that extra engineered features, e.g., word boundaries and prosody boundaries, are still needed for model robustness and naturalness, which makes the front-end pipeline as complicated as the traditional approach. To maintain the naturalness of generated speech while discarding as much language-specific expertise as possible in Mandarin TTS, we introduce a novel self-attention based encoder with learnable Gaussian bias in Tacotron. We evaluate different systems with and without complex prosody information, and the results show that the proposed approach can generate stable and natural speech with minimal language-dependent front-end modules.
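The abstract does not give the form of the Gaussian bias, so the following is only a sketch of one common formulation, in which a penalty proportional to the squared distance between positions is subtracted from the attention scores before the softmax. The function name, the use of a single shared `sigma`, and treating `sigma` as a plain argument rather than a trained parameter are all assumptions made for illustration.

```python
import math
from typing import List


def gaussian_biased_attention(scores: List[List[float]], sigma: float) -> List[List[float]]:
    """Row-wise softmax of scores[i][j] - (i - j)**2 / (2 * sigma**2).

    `scores` stands in for the scaled query-key dot products; `sigma`
    controls how strongly each position favours its neighbours. In a
    trained model sigma would be learned; here it is a fixed input.
    """
    weights: List[List[float]] = []
    for i, row in enumerate(scores):
        biased = [s - (i - j) ** 2 / (2.0 * sigma ** 2) for j, s in enumerate(row)]
        peak = max(biased)  # subtract the maximum for numerical stability
        exps = [math.exp(b - peak) for b in biased]
        norm = sum(exps)
        weights.append([e / norm for e in exps])
    return weights


if __name__ == "__main__":
    uniform = [[0.0] * 5 for _ in range(5)]  # no content preference at all
    for row in gaussian_biased_attention(uniform, sigma=1.0):
        print([round(w, 3) for w in row])
```

With uniform scores the output shows the effect of the bias alone: each position attends most strongly to itself and progressively less to more distant positions, and a larger `sigma` flattens the weights back toward plain self-attention.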

Keywords: Decoding; Speech synthesis; Pipelines; Robustness; Spectrogram; Natural languages


A Siamese Network Tracking Algorithm Based on Hierarchical Attention Mechanism

Journal of Physics: Conference Series, 2021, Vol. 1828, No. 1

Authors: Hu Zhang Dong Hu Yingcan Qiu Jiangsu Province's Key Lab of Image Procession and Image Communications Nanjing 210003 China Nanjing University of Posts and Telecommunications Nanjing 210003 China Education Ministry's Key Lab of Broadband Wireless Communication and Sensor Network Technology Nanjing 210003 China Education Ministry's Engineering Research Center of Ubiquitous Network and Heath Service Nanjing 210003 China

This paper proposes a Siamese network tracking algorithm based on a hierarchical attention mechanism. To obtain more robust target tracking results, features from different layers are fused. During feature extraction, an attention mechanism is used to recalibrate the feature map, and the AdaBoost algorithm is used to weight the target feature map, which improves the reliability of the response map. In addition, the Inception module is introduced, which not only increases the width of the network and the adaptability of the Siamese network to scale, but also reduces the number of parameters and speeds up network training. Experimental results show that this method can effectively mitigate the impact of background clutter and improve tracking accuracy.
