Search Results - Inner Mongolia University Library

17th IEEE International Conference on Signal Processing, ICSP 2024

Authors: Cao, Xinlong; Li, Jiaojiao; Li, Yuanqing; Yan, Jiapeng; Zheng, Zhenxing

Affiliations: Xi'an University of Posts & Telecommunications, School of Communications and Information Engineering, School of Artificial Intelligence, Xi'an 710121, China; Xi'an Key Laboratory of Image Processing Technology and Applications for Public Security, Xi'an 710121, China; International Joint Research Center for Wireless Communication and Information Processing Technology of Shaanxi Province, Xi'an 710121, China

ISBN: (print) 9798350387384

Aurora spectral image lossless compression has seen significant advancements in recent years. However, most compression algorithms are based on traditional image compression techniques, focusing solely on spectral and spatial correlations without considering temporal correlations. To further enhance compression performance, this paper proposes a Transformer-based point-by-point prediction algorithm for auroral spectral images, which simultaneously uses spatial, spectral and temporal contexts for prediction. Our prediction network consists of two parts: an encoder and a decoder. The encoder is composed of the encoding unit of the original Transformer and is used to extract features from spatial, spectral and temporal contexts. The decoder consists of fully connected layers and is used for prediction. Experimental results show that the average bitrate of this method is reduced by 0.179 bpp compared with the JPEG2000 algorithm, and the average bitrate is reduced by 0.054 bpp compared with the online DPCM algorithm. © 2024 IEEE.
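
As a rough illustration of prediction-based lossless coding with joint contexts, the sketch below predicts each sample from a spatial neighbor (the previous pixel in the row), a spectral neighbor (the same pixel in the previous band), and a temporal neighbor (the same pixel in the previous frame), and keeps only the integer residual. The averaging predictor, the function names, and the `frames[t][band][x]` layout are assumptions made for illustration. The paper's predictor is a Transformer encoder with a fully connected decoder, which is not reproduced here, and no entropy coder is included.

```python
def predict(frames, t, b, x):
    """Average the available causal neighbors (hypothetical stand-in predictor)."""
    context = []
    if x > 0:
        context.append(frames[t][b][x - 1])   # spatial neighbor
    if b > 0:
        context.append(frames[t][b - 1][x])   # spectral neighbor
    if t > 0:
        context.append(frames[t - 1][b][x])   # temporal neighbor
    return sum(context) // len(context) if context else 0


def encode(frames):
    """Return integer residuals (actual minus predicted) in raster order."""
    residuals = []
    for t, frame in enumerate(frames):
        for b, band in enumerate(frame):
            for x, value in enumerate(band):
                residuals.append(value - predict(frames, t, b, x))
    return residuals


def decode(residuals, n_frames, n_bands, width):
    """Rebuild every sample exactly by re-running the same predictor."""
    frames = [[[0] * width for _ in range(n_bands)] for _ in range(n_frames)]
    stream = iter(residuals)
    for t in range(n_frames):
        for b in range(n_bands):
            for x in range(width):
                # Only already-decoded samples are read by predict().
                frames[t][b][x] = predict(frames, t, b, x) + next(stream)
    return frames
```

Because the decoder reads only samples it has already reconstructed, decoding the residuals reproduces the input exactly. A better predictor yields smaller residuals, which is what lowers the bitrate once the residuals are entropy coded.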

Keywords: Image compression

C2P-Net: Comprehensive Depth Map to Planar Depth Conversion for Room Layout Estimation

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, vol. PP, no. 7, pp. PP

Authors: Zhang, Weidong; Zhou, Mengjie; Cheng, Jiyu; Liu, Ying; Zhang, Wei

Affiliations: Xi'an University of Posts & Telecommunications, School of Communications and Information Engineering, China; Xi'an Key Laboratory of Image Processing Technology and Applications for Public Security, China; Shandong University, School of Control Science and Engineering, China; Ministry of Education Key Laboratory of Machine Intelligence and System Control, China

Room layout estimation seeks to infer the overall spatial configuration of indoor scenes using perspective or panoramic images. As the layout is determined by the dominant indoor planes, this problem inherently requires the reconstruction of these planes. Some studies reconstruct indoor planes from perspective images by learning pixel-level or instance-level plane parameters. However, directly learning these parameters has the problems of susceptibility to occlusions and position dependency. In this paper, we introduce the Comprehensive depth map to Planar depth (C2P) conversion, which reformulates planar depth reconstruction into the prediction of a comprehensive depth map and planar visibility confidence. Based on the parametric representation of planar depth we propose, the C2P conversion is applicable to both panoramic and perspective images. Accordingly, we present an effective framework for room layout estimation that jointly learns the comprehensive depth map and planar visibility confidence. Due to the differentiability of the C2P conversion, our network autonomously learns planar visibility confidence by constraining the estimated plane parameters and reconstructed planar depth map. We further propose a novel approach for 3D layout generation through sequential planar depth map integration. Experimental results demonstrate the superiority of our method across all evaluated panoramic and perspective datasets. © 1979-2012 IEEE.

Keywords: Image-Text Matching; Medical Report Generation; Sample-graph Consistency; Self-boosting framework

Transformer-based Lossless Compression of Aurora Spectral Images: A Spatial-Temporal-Spectral Joint Approach

International Conference on Signal Processing Proceedings (ICSP)

Authors: Xinlong Cao; Jiaojiao Li; Yuanqing Li; Jiapeng Yan; Zhenxing Zheng

Affiliations: School of Communications and Information Engineering & School of Artificial Intelligence, Xi'an University of Posts & Telecommunications, Xi'an, China; Xi'an Key Laboratory of Image Processing Technology and Applications for Public Security, Xi'an, China; International Joint Research Center for Wireless Communication and Information Processing Technology of Shaanxi Province, Xi'an, China

ISBN: (digital) 9798350387384

ISBN: (print) 9798350387391

Keywords: Ion radiation effects; Image coding; Correlation; Signal processing algorithms; Transformers; Prediction algorithms; Feature extraction; Decoding; Magnetosphere; Compression algorithms

MGTN: Multi-scale Graph Transformer Network for 3D Point Cloud Semantic Segmentation

IEEE Visual Communications and Image Processing (VCIP)

Authors: Da Ai; Siyu Qin; Zihe Nie; Hui Yuan; Ying Liu

Affiliations: Xi'an Key Laboratory of Image Processing Technology and Applications for Public Security, Xi'an University of Posts and Telecommunications, Xi'an, China; School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an, China; School of Control Science and Engineering, Shandong University, Jinan, China

ISBN: (digital) 9798331529543

ISBN: (print) 9798331529550

The structural similarity of point clouds presents challenges in accurately recognizing and segmenting semantic information at the demarcation points of complex scenes or objects. In this study, we propose a multi-scale graph transformer network (MGTN) for 3D point cloud semantic segmentation. First, a multi-scale graph convolution (MSG-Conv) is devised to address the limitations faced by existing methods when extracting local and global features of point cloud data with varying densities simultaneously. Subsequently, we employ a graph-transformer (G-T) module to enhance edge details and spatial position information in the point cloud, thereby improving recognition accuracy for small objects and confusing elements such as columns and beams. Extensive testing on the ShapeNet parts and S3DIS datasets was conducted to demonstrate the effectiveness of MGTN. Compared to the baseline network DGCNN, our proposed MGTN achieves substantial performance improvements, as evidenced by increases in mIoU of 1.5% and 18.5% on the ShapeNet parts and S3DIS datasets, respectively. Additionally, MGTN outperforms the recent CFSA-Net by 2.3% and 3.4% on OA and mIoU, respectively.
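
The two metrics quoted above, overall accuracy (OA) and mean intersection over union (mIoU), can be sketched from a confusion matrix as follows. This is a minimal sketch with hypothetical function names, assuming integer class labels per point and a plain per-class average. The exact averaging protocol used in the paper's evaluation is not stated in the abstract and may differ.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts points with true label t predicted as label p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm):
    """Fraction of all points whose predicted label equals the true label."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average of TP / (TP + FP + FN) over the classes that occur."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(len(cm))) - tp
        fn = sum(cm[c]) - tp
        denom = tp + fp + fn
        if denom:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)
```

OA is dominated by frequent classes, while mIoU weights every class equally, which is why the two figures are usually reported together.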

Keywords: Point cloud compression; Three-dimensional displays; Visual communication; Convolution; Semantic segmentation; Semantics; Transformers; Feature extraction; Data mining; Testing

A Survey of Text Detection Algorithms in Images Based on Deep Learning

4th International Conference on Natural Language Processing, ICNLP 2022

Authors: Li, Linna; Hu, Cuicui; Liu, Ying

Affiliations: Xi'an University of Posts and Telecommunications, Network Public Opinion Monitoring and Analysis Center, Shaanxi, Xi'an 710121, China; Xi'an University of Posts and Telecommunications, College Key Laboratory of Electronic Information Application Technology for Scene Investigation, Ministry of Public Security, Center for Image and Information Processing, Shaanxi, Xi'an 710121, China

ISBN: (digital) 9781665495448

ISBN: (print) 9781665495448

Text detection is now used across many industries, including banking, education, criminal investigation, and network public opinion monitoring. However, traditional text detection depends largely on manually designed features, which is time-consuming, laborious, and inaccurate. With the emergence of deep learning, text detection has moved from static documents to the more practical setting of natural scenes. To address challenges such as complex backgrounds, improper acquisition, and multilingual text in natural scene images, various improved neural network structures have been constructed, and the performance of text detection in various scenes has improved significantly. In this paper, deep-learning-based text detection methods for different scenes are summarized in detail. According to the text objects being detected, the existing methods are classified into two categories, top-down and bottom-up, and the logical structure, advantages, and disadvantages of the various algorithms are summarized. In addition, the commonly used datasets and performance evaluation metrics are analyzed and explained, practical applications of text detection in different fields are introduced, and future research and development trends are indicated. © 2022 IEEE.

Keywords: Deep learning

Topic Embedded Representation Enhanced Variational Wasserstein Autoencoder for Text Modeling

5th IEEE International Conference on Electronics Technology, ICET 2022

Authors: Xiang, Zheng; Liu, Xiaoming; Yang, Guan; Liu, Yang

Affiliations: Zhongyuan University of Technology, School of Computer Science, Zhengzhou, China; Henan Key Laboratory on Public Opinion Intelligent Analysis, Zhengzhou, China; Key Laboratory of Text Processing and Image Understanding, Zhengzhou, China; Xidian University, State Key Laboratory of Integrated Services Networks, Xi'an, China; Shandong University, Key Lab of Cryptologic Technology and Information Security, Ministry of Education, Xi'an, China

ISBN: (print) 9781665485081

The Variational Autoencoder (VAE) is now popular in text modeling and language generation tasks, which require attention to the diversity of the generated results. Existing models are insufficient in capturing the built-in relationships between topic representations and sequential words. At the same time, there is a large mismatch between the commonly used simple Gaussian prior and the actual complex distribution of language. To address these problems, we introduce a hybrid Wasserstein Autoencoder (WAE) with Topic Embedded Representation (TER) for text modeling. TER is obtained through an embedding-based topic model and can capture the dependencies and semantic similarities between topics and words. With the help of TER, the learned latent variable carries rich semantic knowledge and is easier to explain and control. Our experiments show that our method is competitive with other VAEs in text modeling. © 2022 IEEE.

Keywords: Semantics

Face Inpainting Combining Structured Forest Edge Information and Gated Convolution

3rd International Conference on Natural Language Processing, ICNLP 2021

Authors: Wang, Fuping; Li, Wenlou; Liu, Ying; Gong, Yanchao; Gao, Ziming; Lu, Jin

Affiliations: Center for Image and Information Processing, Xi'an University of Posts and Telecommunications, College Key Laboratory of Electronic Information Application Technology for Scene Investigation, Ministry of Public Security, Shaanxi, Xi'an 710121, China; Center for Image and Information Processing, Xi'an University of Posts and Telecommunications, Shaanxi, Xi'an 710121, China

ISBN: (print) 9781665414111

For face inpainting under arbitrarily shaped occlusion, existing methods tend to produce blurred and distorted edges in the inpainting results. In this paper, an algorithm for face inpainting combining structured forest edge information and gated convolution is proposed. First, the edge contour of the occluded area is reconstructed from prior face knowledge to constrain the inpainting process. Second, because gated convolution can extract accurate local features when some pixels are missing, a gated-convolution-based Generative Adversarial Network (GAN) for face inpainting is designed. The model consists of two parts: an edge connection network and an image inpainting network. The edge connection network automatically completes and connects the missing edge image. The image inpainting network takes the completed edge image as guidance and combines it with the occluded image to repair the missing face area. Experimental results show that, compared with other methods, the proposed algorithm recovers more precise details and achieves better visual quality. © 2021 IEEE.
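
The gating idea can be sketched in one dimension: a feature branch and a gate branch are convolved over the same input, and the sigmoid of the gate scales each feature value between 0 and 1. This is a minimal sketch under stated assumptions: the function names are hypothetical, `tanh` is an arbitrary choice of feature activation, the kernels would be learned in practice, and the paper's 2-D GAN architecture is not reproduced.

```python
import math


def conv1d(signal, kernel):
    """Valid-mode 1-D correlation (no padding)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def gated_conv1d(signal, feature_kernel, gate_kernel):
    """Scale each activated feature by a sigmoid gate in (0, 1)."""
    features = conv1d(signal, feature_kernel)
    gates = conv1d(signal, gate_kernel)
    return [math.tanh(f) * (1.0 / (1.0 + math.exp(-g)))
            for f, g in zip(features, gates)]
```

A gate value near 0 suppresses the feature at that position, which lets a trained network down-weight locations dominated by missing pixels instead of treating every input position as equally valid.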

Keywords: Convolution

Person Re-identification Algorithm Based on Siamese LSTM

3rd International Conference on Natural Language Processing, ICNLP 2021

Authors: Li, Daxiang; Meng, Rui; Liu, Ying

Affiliations: Xi'an University of Posts and Telecommunications, Ministry of Public Security Key Laboratory of Electronic Information Application Technology for Scene Investigation, Xi'an University of Posts and Telecommunications, School of Telecommunication and Information Engineering, Xi'an 710121, China; Xi'an University of Posts and Telecommunications, College Key Laboratory of Electronic Information Application Technology for Scene Investigation, Ministry of Public Security, Center for Image and Information Processing, Xi'an 710121, China

ISBN: (print) 9781665414111

In this paper, we propose a new Siamese long short-term memory (SLSTM) network model to address the loss of recognition accuracy caused by occlusion in pedestrian images. The backbone network is composed of two modules, a CNN and an LSTM. To exploit the local detail information of pedestrian images, the image is first divided into blocks and the features of each sub-block are extracted; the LSTM module then learns the dependencies between the block features and produces the feature representation of the pedestrian image through memory coding. At the same time, a new loss function is designed by combining a new triplet verification loss with the Softmax recognition loss to reduce intra-class differences and increase inter-class differences. Comparative experiments were carried out on three pedestrian benchmark datasets: PRID-2011, Market-1501, and CUHK03. The experimental results show that the proposed model reduces the impact of occlusion on re-identification and outperforms other state-of-the-art methods. © 2021 IEEE.
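
The idea of combining a verification-style loss with a Softmax recognition loss can be sketched as below. The abstract does not give the formulation of the paper's new triplet verification loss, so a standard margin-based triplet loss is swapped in here. The function names, the `weight` and `margin` parameters, and the simple weighted sum are all assumptions for illustration.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (stand-in for the paper's variant)."""
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)


def softmax_cross_entropy(logits, label):
    """Negative log-probability of the true identity, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def combined_loss(anchor, positive, negative, logits, label,
                  weight=1.0, margin=1.0):
    """Recognition loss plus a weighted verification (triplet) loss."""
    return (softmax_cross_entropy(logits, label)
            + weight * triplet_loss(anchor, positive, negative, margin))
```

The triplet term pulls features of the same identity together and pushes different identities apart by at least the margin, while the cross-entropy term keeps the features discriminative for identity classification.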

Keywords: Long short-term memory

A comprehensive survey on shadow removal from document images: datasets, methods, and opportunities

Vicinagearth, 2025, vol. 2, no. 1, pp. 1-18

Authors: Wang, Bingshu; Li, Changping; Zou, Wenbin; Zhang, Yongjun; Chen, Xuhang; Chen, C.L. Philip

Affiliations: School of Software, Northwestern Polytechnical University, Xi'an, China; Guangdong Provincial Key Laboratory of Intelligent Information Processing & Shenzhen Key Laboratory of Media Security, Shenzhen University, Shenzhen, China; Guangdong Key Laboratory of Intelligent Information Processing, College of Electronics and Information Engineering, Shenzhen University, Shenzhen, China; State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, China (Yongjun Zhang); School of Computer Science and Engineering, Huizhou University, Huizhou, China; School of Computer Science and Engineering, South China University of Technology and Pazhou Lab, Guangzhou, China

With the rapid development of document digitization, people have become accustomed to capturing and processing documents using electronic devices such as smartphones. However, the captured document images often suffer from issues like shadows and noise due to environmental factors, which can affect their readability. To improve the quality of captured document images, researchers have proposed a series of models or frameworks and applied them in distinct scenarios such as image enhancement, and document information extraction. In this paper, we primarily focus on shadow removal methods and open-source datasets. We concentrate on recent advancements in this area, first organizing and analyzing nine available datasets. Then, the methods are categorized into conventional methods and neural network-based methods. Conventional methods use manually designed features and include shadow map-based approaches and illumination-based approaches. Neural network-based methods automatically generate features from data and are divided into single-stage approaches and multi-stage approaches. We detail representative algorithms and briefly describe some typical techniques. Finally, we analyze and discuss experimental results, identifying the limitations of datasets and methods. Future research directions are discussed, and nine suggestions for shadow removal from document images are proposed. To our knowledge, this is the first survey of shadow removal methods and related datasets from document images.

A Densely Connected Face Super-Resolution Network Based on Attention Mechanism

15th IEEE Conference on Industrial Electronics and Applications, ICIEA 2020

Authors: Liu, Ying; Dong, Zhanlong; Pang Lim, Keng; Ling, Nam

Affiliations: Ministry of Public Security Key Laboratory of Electronic Information Application Technology for Scene Investigation, Xi'an, Shaanxi, China; Xi'an University of Posts and Telecommunications, Center for Image and Information Processing, Xi'an, China

ISBN: (print) 9781728151694

Face super-resolution reconstruction is a cost-effective way to obtain high-resolution face images from their low-resolution counterparts. It is also known as face hallucination. To obtain clearer texture details, this paper proposes a densely connected super-resolution algorithm based on an attention mechanism, consisting of feature extraction and image reconstruction. By integrating the channel and spatial domain information of the feature map, the Multi Attention Domain Module (MADM) is proposed: features are weighted and recombined by analyzing the relationship between the channels and the spatial information of the feature maps. The features of different layers are fused using dense connections. Experimental results show that the proposed algorithm improves PSNR by up to 0.5 dB and that the reconstructed face images have clearer texture details than those of existing algorithms. © 2020 IEEE.
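
For context on the PSNR figure, the metric can be sketched as below. This is a minimal sketch assuming 8-bit images flattened to lists of pixel values. The function name and the `max_value` default are illustrative, and the abstract does not state which channel or color space the paper evaluates on.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - x) ** 2
              for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of 0.5 dB corresponds to a reduction in mean squared error of roughly 11%.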

Keywords: Textures
