Room layout estimation seeks to infer the overall spatial configuration of indoor scenes using perspective or panoramic images. As the layout is determined by the dominant indoor planes, this problem inherently requir...
ISBN (digital): 9798350387384
ISBN (print): 9798350387391
Lossless compression of aurora spectral images has advanced significantly in recent years. However, most compression algorithms build on traditional image compression techniques and exploit only spectral and spatial correlations, ignoring temporal correlations. To further improve compression performance, this paper proposes a Transformer-based point-by-point prediction algorithm for aurora spectral images that uses spatial, spectral, and temporal contexts simultaneously. The prediction network consists of an encoder and a decoder. The encoder is built from the encoding units of the original Transformer and extracts features from the spatial, spectral, and temporal contexts; the decoder consists of fully connected layers and produces the prediction. Experimental results show that the method reduces the average bitrate by 0.179 bpp relative to the JPEG2000 algorithm and by 0.054 bpp relative to the online DPCM algorithm.
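The point-by-point scheme described in this abstract can be illustrated with a minimal sketch. It is not the paper's network: a plain integer mean of up to three causal neighbours stands in for the Transformer encoder and fully connected decoder, and the data layout (frames indexed by time, rows as the spatial axis, columns as the spectral axis) is an assumption. All function names are hypothetical. The sketch only shows why prediction from already-coded samples keeps the scheme lossless and how a bitrate in bits per pixel (bpp) can be estimated from the residuals.

```python
import math
from collections import Counter


def causal_context(frames, t, row, col):
    """Collect already-coded neighbours of sample (t, row, col).

    Assumed layout: frames[t][row][col], rows spatial, columns spectral.
    """
    context = []
    if row > 0:
        context.append(frames[t][row - 1][col])  # spatial neighbour
    if col > 0:
        context.append(frames[t][row][col - 1])  # spectral neighbour
    if t > 0:
        context.append(frames[t - 1][row][col])  # temporal neighbour
    return context


def predict(context):
    """Placeholder predictor (integer mean); the paper uses a Transformer."""
    if not context:
        return 0
    return sum(context) // len(context)


def prediction_residuals(frames):
    """Encoder side: residual = sample - prediction, in raster order."""
    residuals = []
    for t, frame in enumerate(frames):
        for row, line in enumerate(frame):
            for col, value in enumerate(line):
                guess = predict(causal_context(frames, t, row, col))
                residuals.append(value - guess)
    return residuals


def reconstruct(residuals, n_frames, n_rows, n_cols):
    """Decoder side: rebuild samples from residuals in the same order."""
    frames = [[[0] * n_cols for _ in range(n_rows)] for _ in range(n_frames)]
    stream = iter(residuals)
    for t in range(n_frames):
        for row in range(n_rows):
            for col in range(n_cols):
                guess = predict(causal_context(frames, t, row, col))
                frames[t][row][col] = next(stream) + guess
    return frames


def entropy_bpp(symbols):
    """Zeroth-order entropy of the symbols, in bits per sample."""
    total = len(symbols)
    counts = Counter(symbols)
    return -sum(n / total * math.log2(n / total) for n in counts.values())
```

Because the decoder sees exactly the same causal neighbours as the encoder, `reconstruct(prediction_residuals(frames), ...)` returns the original samples. A better predictor lowers `entropy_bpp` of the residuals, which is the quantity a comparison in bpp reflects.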
ISBN (digital): 9798331529543
ISBN (print): 9798331529550
The structural similarity of point clouds makes it difficult to accurately recognize and segment semantic information at the boundaries of complex scenes or objects. In this study, we propose a multi-scale graph transformer network (MGTN) for 3D point cloud semantic segmentation. First, a multi-scale graph convolution (MSG-Conv) is devised to address the limitations of existing methods when simultaneously extracting local and global features from point cloud data of varying density. Next, a graph-transformer (G-T) module enhances edge details and spatial position information in the point cloud, improving recognition accuracy for small objects and easily confused elements such as columns and beams. Extensive experiments on the ShapeNet parts and S3DIS datasets demonstrate the effectiveness of MGTN. Compared with the baseline network DGCNN, MGTN improves mIoU by 1.5% and 18.5% on the ShapeNet parts and S3DIS datasets, respectively. Additionally, MGTN outperforms the recent CFSA-Net by 2.3% in OA and 3.4% in mIoU.
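The two metrics quoted in this abstract, overall accuracy (OA) and mean intersection over union (mIoU), follow standard definitions, sketched below for per-point integer labels. The function names are hypothetical, and skipping classes that appear in neither the ground truth nor the prediction is an assumed convention; benchmark protocols can differ on that point.

```python
def overall_accuracy(true_labels, pred_labels):
    """Fraction of points whose predicted label matches the ground truth."""
    correct = sum(t == p for t, p in zip(true_labels, pred_labels))
    return correct / len(true_labels)


def mean_iou(true_labels, pred_labels, num_classes):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    pairs = list(zip(true_labels, pred_labels))
    ious = []
    for cls in range(num_classes):
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        union = tp + fp + fn
        if union == 0:
            continue  # assumed convention: ignore classes absent from both
        ious.append(tp / union)
    return sum(ious) / len(ious)
```

OA rewards getting the large classes right, while mIoU weights every class equally, which is why gains on small or easily confused classes show up more clearly in mIoU.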
Text detection is now used in many industries, such as banking, education, criminal investigation, and online public opinion monitoring. However, the traditional way of text detection is largely dependent on the...
Variational autoencoders (VAEs) are now popular in modeling and language generation tasks, where the diversity of generated results matters. The existing models are insufficient in capturing the built-i...
For face inpainting under arbitrarily shaped occlusion, existing methods tend to produce blurred edges and distorted results. In this paper, an algorithm for face inpainting combining structure...
Authors: Li, Daxiang; Meng, Rui; Liu, Ying
Xi'an University of Posts and Telecommunications, Ministry of Public Security Key Laboratory of Electronic Information Application Technology for Scene Investigation
Xi'an University of Posts and Telecommunications, School of Telecommunication and Information Engineering, Xi'an 710121, China
Xi'an University of Posts and Telecommunications, College Key Laboratory of Electronic Information Application Technology for Scene Investigation, Ministry of Public Security, Center for Image and Information Processing, Xi'an 710121, China
In this paper, we propose a new Siamese long short-term memory (SLSTM) network model to address the loss of recognition accuracy caused by occlusion in pedestrian images. The backbone network is compos...
With the rapid development of document digitization, people have become accustomed to capturing and processing documents with electronic devices such as smartphones. However, captured document images often suffer from shadows and noise caused by environmental factors, which can reduce their readability. To improve the quality of captured document images, researchers have proposed a range of models and frameworks and applied them in scenarios such as image enhancement and document information extraction. This paper focuses on shadow removal methods and open-source datasets, concentrating on recent advances in the area. We first organize and analyze nine available datasets. We then categorize the methods into conventional methods and neural network-based methods. Conventional methods use manually designed features and include shadow map-based approaches and illumination-based approaches. Neural network-based methods learn features automatically from data and are divided into single-stage approaches and multi-stage approaches. We detail representative algorithms and briefly describe some typical techniques. Finally, we analyze and discuss experimental results, identifying the limitations of existing datasets and methods. We also discuss future research directions and propose nine suggestions for shadow removal from document images. To our knowledge, this is the first survey of shadow removal methods and related datasets for document images.
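As a rough illustration of what a conventional, hand-designed approach can look like, the sketch below estimates a background (shading) map with a local maximum filter and divides the image by it. This is a generic background-normalization idea chosen for brevity, not a specific method from the survey; the function names, the window radius, and the assumption of dark ink on light paper in a grayscale image are all hypothetical choices.

```python
def local_max(image, radius):
    """Background estimate: the brightest value in each pixel's window.

    Assumes dark ink on light paper, so the window maximum approximates
    the (possibly shadowed) paper brightness at that location.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            result[r][c] = max(
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            )
    return result


def normalize_shading(image, radius=1, white=255):
    """Divide each pixel by its background estimate and rescale to white."""
    background = local_max(image, radius)
    output = []
    for r, line in enumerate(image):
        row = []
        for c, pixel in enumerate(line):
            level = max(1, background[r][c])  # avoid division by zero
            row.append(min(white, (white * pixel + level // 2) // level))
        output.append(row)
    return output
```

In a uniformly shadowed region the paper is mapped back to white and the ink keeps its contrast relative to the paper. A small window leaves visible artifacts at sharp shadow borders, one of the limitations that motivates more elaborate methods.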
Super-resolution reconstruction of human faces is a cost-effective way to obtain a high-resolution image from the corresponding low-resolution face. It is also known as face hallucination. In order to obtain clearer texture ...