ISAR image registration is an important part of attitude estimation for space targets, and its accuracy directly affects the result of attitude estimation. In this paper, we analyze the model of ISAR image registration ...
In the realm of deep learning, the traditional approach has been to train specialized models for individual tasks, which, although effective, is resource-intensive. The advent of large, universal models has mitigated ...
In the contemporary age of advanced technology, image processing plays a vital role as an efficient and indispensable tool across industries, enterprises, monitoring systems, and various applications. This study focus...
Automated text-to-image (image synthesis) and image-to-text (image captioning) generation are two of the most challenging and cutting-edge fields of study in computer vision (CV) in conjunction with Natural Language P...
ISBN: 9781728198354 (print)
Light field (LF) rescaling is indispensable in accommodating different LF image resolutions for different applications. Unlike most recent studies, which only perform learned LF upscaling from a predefined downscaling method, we propose a novel LF rescaling framework that jointly optimizes learned LF downscaling and upscaling as a combined task. Specifically, our light field rescaling network (LFRN) simultaneously extracts features from different 2D subspaces of LF data (e.g., spatial-angular and epipolar subspaces) to fully handle 4D LF image information. Our newly designed attention fusion module (AFM) adaptively combines these two data features based on learnable embedding weights. Owing to the joint optimization of the learned LF downscaling and upscaling tasks, our LFRN method achieves a significant performance gain in both objective and subjective visual quality compared to conventional predefined downscaling followed by learned LF upscaling.
Based on Kinect motion capture technology, this paper develops auxiliary teaching software for Tai Chi practice, designed around the movement requirements of beginners. The teaching system shows students the standard Tai Chi moves. A Kinect-based motion capture system is established to measure movements in a standardized way. The collected data are filtered, denoised, and stored to form a standard motion database. While a subject performs Tai Chi, the captured movement is displayed as color images in the interface window. The skeleton data are aligned with the color image, and the key joints in the image are identified. After the two sets of data are collected, the Dynamic Time Warping (DTW) method is used to solve the corresponding frame matching problem. The rotation information of the joints is then assigned to the corresponding Unity3D human body model. Experiments show that the method, which combines motion capture, DTW, and other techniques, works effectively with the Unity3D simulation software for practice and scoring.
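The frame matching step in the abstract above can be illustrated with a textbook DTW implementation. This is only a minimal sketch and not the authors' code: the function name `dtw_path`, the use of scalar per-frame features, and the absolute-difference distance are all assumptions made for illustration. In practice each frame would presumably be a vector of joint positions or rotations, with a suitable vector distance passed in as `dist`.

```python
import math


def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Align sequences a and b with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs matching frame i of a to frame j of b.
    The name, signature, and default distance are illustrative only.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )

    # Backtrack from the end to recover the frame-to-frame matching.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Hypothetical data: a reference motion and a slower performance of it.
reference = [0, 1, 2, 3]
performance = [0, 1, 1, 2, 3]
total_cost, matching = dtw_path(reference, performance)
```

Here `matching` pairs each reference frame with one or more performance frames, so a learner who holds a pose longer than the reference is not penalized for the timing difference alone. The accumulated cost could then serve as one input to a score, although the abstract does not say how scoring is computed.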
ISBN: 9798350307184 (print)
Image matching is a fundamental and critical task in various visual applications, such as Simultaneous Localization and Mapping (SLAM) and image retrieval, which require accurate pose estimation. However, most existing methods ignore the occlusion relations between objects caused by camera motion and scene structure. In this paper, we propose Occ²Net, a novel image matching method that models occlusion relations using 3D occupancy and infers matching points in occluded regions. Thanks to the inductive bias encoded in the Occupancy Estimation (OE) module, the method greatly simplifies bootstrapping of a multi-view consistent 3D representation that can then integrate information from multiple views. Together with an Occlusion-Aware (OA) module, it incorporates attention layers and rotation alignment to enable matching between occluded and visible points. We evaluate our method on both real-world and simulated datasets and demonstrate its superior performance over state-of-the-art methods on several metrics, especially in occlusion scenarios.
ISBN: 9781728198354 (print)
Low-light image enhancement aims to improve human perception of images taken in the dark, or the effectiveness of computer vision tasks on such images. Low-light images usually lack much of their visual information. To tackle this problem, we propose a general Low-light Image Enhancement Transformer Network (LLIEFormer) with a degraded restoration model in this paper. LLIEFormer combines the strength of the Transformer in extracting global information with that of convolutional neural networks in capturing local details. We conduct extensive experiments on various low-illumination enhancement datasets, including PairL1.6K and FiveK, to demonstrate the effectiveness of our method. The results show that our LLIEFormer has better performance and wider applicability than other advanced methods. Our code will be available at https://***/xunpengyi/LLIEFormer.
Approximate computing has become a widely recognized method for designing energy-efficient arithmetic architectures in the context of error-tolerant applications. This paper presents the design and analysis of a 4-bit...
ISBN: 9781665493468 (print)
Image completion with large-scale free-form missing regions is one of the most challenging tasks for the computer vision community. While researchers pursue better solutions, drawbacks such as pattern unawareness, blurry textures, and structure distortion remain noticeable and thus leave room for improvement. To overcome these challenges, we propose a new StyleGAN-based image completion network, Spectral Hint GAN (SH-GAN), in which a carefully designed spectral processing module, the Spectral Hint Unit, is introduced. We also propose two novel 2D spectral processing strategies, Heterogeneous Filtering and Gaussian Split, that fit modern deep learning models well and may be extended to other tasks. Through comprehensive experiments, we demonstrate that our model reaches FID scores of 3.4134 and 7.0277 on the benchmark datasets FFHQ and Places2, respectively, and therefore outperforms prior works and sets a new state of the art. We also show the effectiveness of our design via ablation studies, which indicate that the aforementioned challenges, i.e., pattern unawareness, blurry textures, and structure distortion, are noticeably mitigated. Our code will be open-sourced at: https://***/SHI-Labs/SH-GAN.