Traditional image coding standards are typically optimized with a focus on human perception, which conflicts with the fact that most of the images are now analyzed by machines. To enable a variety of downstream intell...
ISBN (digital): 9783031585357
ISBN (print): 9783031585340; 9783031585357
Images captured by thermal cameras are independent of lighting conditions. However, thermal face photos can be difficult for human examiners to identify. Facial recognition technology enables automatic identification or verification of individuals in digital images or in frames extracted from video sequences. Facial recognition systems employ multiple methods, but they typically compare the features extracted from a given image with those stored in a database. This technology is applied in areas such as access control and identification systems. Notably, facial features can exhibit characteristics that remain unique to an individual throughout their lifetime. This paper colorizes thermal facial images into the visible spectrum using CycleGAN. Among the many GAN variants, CycleGAN suits this application because it translates, or maps, an image from one domain into another. CycleGAN aims to learn the relationship between two distinct image collections originating from separate domains, each with its own styles, textures, or visual attributes. The RGB-GAN proposed in this paper is a red, green, and blue channel generative adversarial network: each channel network independently takes images in the thermal format and colorizes them in its own channel domain, and the three generated channel results are merged into one RGB-colored image. One more generator network identifies and compares the result with the original visible colored image to give feedback to the network. Finally, after the model is trained, the classification task identifies the correct person out of a group of persons when a thermal image is given as input. The output includes the face recognition accuracy of the generated images and a comparative analysis with protocols and state-of-the-art techniques.
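The channel-wise merging step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the three "generators" are hypothetical placeholder functions (simple affine maps) standing in for trained per-channel networks, and the names `make_placeholder_generator` and `colorize` are invented for this example.

```python
import numpy as np


def make_placeholder_generator(gain, bias):
    """Return a stand-in for a trained per-channel generator (hypothetical)."""
    def generate(thermal):
        # thermal: 2-D float array in [0, 1]; the output has the same shape.
        return np.clip(gain * thermal + bias, 0.0, 1.0)
    return generate


def colorize(thermal, gen_r, gen_g, gen_b):
    """Run three independent channel generators and merge them into H x W x 3."""
    if thermal.ndim != 2:
        raise ValueError("expected a single-channel 2-D thermal image")
    channels = [gen(thermal) for gen in (gen_r, gen_g, gen_b)]
    # Stack along a new last axis so the result is a standard RGB layout.
    return np.stack(channels, axis=-1)


# Toy 4 x 4 "thermal image" with intensities from 0 to 1.
thermal = np.linspace(0.0, 1.0, 16).reshape(4, 4)
rgb = colorize(
    thermal,
    make_placeholder_generator(1.0, 0.0),   # red channel (assumed mapping)
    make_placeholder_generator(0.5, 0.25),  # green channel (assumed mapping)
    make_placeholder_generator(-1.0, 1.0),  # blue channel (assumed mapping)
)
print(rgb.shape)
```

In a real system each placeholder would be replaced by a trained network, and the merged image would be passed to the comparison and recognition stages the abstract mentions.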
Authors: Shalupriya, M.; Indira, K.P.
Saveetha University, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Department of Electronics and Communication Engineering, Chennai, India
Saveetha University, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Department of Nanobiomaterials, Chennai, India
In micro turning, tool placement, wear, and breakage are crucial as they affect precision and surface finish. Flank wear is problematic since it impacts dimensional accuracy, and current methods that rely on CNC encod...
This study proposes a water quality detection system that combines image processing and Convolutional Neural Network (CNN) models to accurately identify and classify water quality based on visual features. The quality...
ISBN (digital): 9781665450928
ISBN (print): 9781665450928
One of the most important pieces of information needed during unmanned aerial vehicle (UAV) operations concerns the platform's location and its environment. Such platforms mostly use GNSS signals outdoors. However, in indoor areas where GNSS signals cannot be received, or in situations where the signals are jammed, location information cannot be obtained from these signals. For that reason, alternative navigation systems have become crucial. One of the most preferred navigation technologies is the visual simultaneous localization and mapping (vSLAM) method, performed using RGB cameras on the UAVs. In this study, an open monocular image dataset called AG-Mono was created and published online to test the performance of vSLAM algorithms. The dataset was recorded at three different exposure times using a handheld platform, and it includes video sequences at 640x480 image resolution. The experimental area where the images were captured is a closed corridor measuring 16.5 x 4.5 meters with four sharp corners.
Entropy modeling plays an important role in estimating the rates of latent representations and optimizing the rate-distortion performance for learned image compression. Autoregression modules are demonstrated to elimi...
High spatial resolution is necessary for several applications such as visual inspection. However, the conflict between resolution and image distance limits the applications of image devices. In this paper, a super-res...
Synthetic Aperture Radar (SAR) is an active and coherent imaging system that utilizes radio waves to illuminate the Earth's surface, generating complex-valued images of the ground. It is widely employed in environ...
CNN-based end-to-end learned image compression methods have already achieved a significant improvement in coding efficiency. Moreover, with the capability of modeling long-range global correlation, the tr...
ISBN (print): 9789819913534; 9789819913541
Medical Visual Question Answering (Med-VQA) is a task in the field of Artificial Intelligence in which a medical image is given together with a related question, and the goal is to provide an accurate answer to that question. It involves the integration of computer vision, natural language processing, and medical domain knowledge. Furthermore, incorporating medical knowledge in Med-VQA can improve the reasoning ability and the accuracy of the answers. While knowledge-enhanced Visual Question Answering (VQA) in the general domain has been widely researched, medical VQA requires further examination due to its unique features. In this paper, we gather and analyze the current publicly accessible Med-VQA datasets with external knowledge. We also critically review the key technologies that combine knowledge with Med-VQA tasks, in terms of their advancements and limitations. Finally, we discuss the existing challenges and future directions for Med-VQA.