ISBN:
(Print) 9781510672895; 9781510672888
This paper describes an approach to solving the problem of fast pattern recognition with image co-ordinate detection and measurement under undefined noise and signal conditions. An analysis of the use of the W-transform method as a basis for image comparison algorithms was carried out, and noise-robust image comparison algorithms were developed.
The dominant method of processing sonar data is using image-based representations, requiring the preprocessing of image data on autonomous systems. We propose an alternative data processing method for remote sensing a...
ISBN:
(Print) 9781510679344; 9781510679351
The image compression field is witnessing a paradigm shift thanks to the rise of neural network-based models. In this context, the JPEG committee is in the process of standardizing the first learning-based image compression standard, known as JPEG AI. While most of the research to date has focused on image coding for humans, JPEG AI plans to address both human and machine vision by presenting several non-normative decoders addressing multiple image processing and computer vision tasks in addition to standard reconstruction. While the impact of conventional image compression on computer vision tasks has already been addressed, no study has been conducted to assess the impact of learning-based image compression on such tasks. In this paper, the impact of learning-based image compression, including JPEG AI, on computer vision tasks is reviewed and discussed, mainly focusing on the image classification task along with object detection and segmentation. The study evaluates the impact of compression with JPEG AI on various computer vision models and shows the superiority of JPEG AI over conventional and other learning-based compression models.
Face recognition has become an advanced area in the field of recognition and has various applications especially in biometrics, forensic investigations, smart advertising, national database for identity cards etc. Var...
ISBN:
(Print) 9781510679344; 9781510679351
Evaluating the visual quality of distorted images is a crucial task in the field of image compression, as artifacts may significantly impair the appeal and fidelity of images, thereby reducing the user experience. As assessing the visual quality of images through subjective visual quality experiments is often not feasible, objective image quality metrics are considered very attractive alternatives. Traditional objective image quality metrics, such as the Peak Signal-to-Noise Ratio and Structural Similarity Index, have long been used to assess compression artifacts. However, due to the complexity of human perception, estimated objective visual quality scores often diverge from their subjective counterparts. Recent advancements in deep learning have led to the development of learning-based metrics that promise to estimate the perceived visual quality of images with better accuracy. While learning-based methods have demonstrated enhanced performance compared to conventional methods on a number of datasets, their generalization performance across different quality ranges and artifacts has not been assessed yet. This paper presents a benchmarking study of conventional and learning-based objective image quality metrics, focusing solely on image compression artifacts. The experimental framework includes five source images with various contents compressed with five legacy and recent compression algorithms at four different quality levels, specifically focusing on the high-quality range. Results indicate that, in many cases, learning-based metrics present a higher correlation with human visual perception when compared to conventional methods, highlighting the potential of integrating such metrics in the development and refinement of image compression techniques.
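Of the conventional metrics named in this abstract, PSNR is simple enough to illustrate directly. A minimal sketch in plain Python — the flat pixel lists and 8-bit peak value below are illustrative assumptions, not data from the paper:

```python
import math

def psnr(original, distorted, max_val=255.0):
    """Peak Signal-to-Noise Ratio between two equal-size images,
    given here as flat lists of pixel intensities."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return float("inf")  # identical images: infinite PSNR
    return 10.0 * math.log10(max_val ** 2 / mse)

# Tiny illustrative 4-pixel "images"
ref = [100, 120, 130, 140]
deg = [102, 118, 131, 138]
print(round(psnr(ref, deg), 2))  # → 43.01
```

Higher values indicate less distortion; the divergence from subjective scores discussed above arises because this purely pixel-wise error ignores perceptual effects such as masking.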
ISBN:
(Print) 9781665482370
Image inpainting fills in the missing or corrupted pixels of an image realistically enough that the result cannot be distinguished by the human eye. Deep learning is widely used in image inpainting and exhibits better performance than classical inpainting methods, but it requires substantial processing resources and long training times. In this paper, we propose an autoencoder architecture that outperforms other deep learning techniques in the literature while having lower processing and time complexity.
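The autoencoder itself is not described in enough detail here to reproduce, but the task can be illustrated with one of the classical baselines the abstract alludes to: filling masked pixels by iterative neighbor averaging (simple diffusion). The grid, mask, and iteration count are illustrative assumptions:

```python
def inpaint(img, mask, iters=50):
    """Fill masked pixels by repeated averaging of their 4-neighbors
    (a simple diffusion baseline, not the paper's autoencoder)."""
    h, w = len(img), len(img[0])
    img = [row[:] for row in img]  # work on a copy
    for _ in range(iters):
        for y in range(h):
            for x in range(w):
                if mask[y][x]:  # 1 marks a missing pixel
                    nbrs = [img[y + dy][x + dx]
                            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
                            if 0 <= y + dy < h and 0 <= x + dx < w]
                    img[y][x] = sum(nbrs) / len(nbrs)
    return img

grid = [[10, 10, 10], [10, 0, 10], [10, 10, 10]]   # center pixel corrupted
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(inpaint(grid, mask)[1][1])  # → 10.0
```

Diffusion works for small smooth holes but cannot hallucinate texture or structure, which is where learning-based methods such as the proposed autoencoder gain their advantage.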
Image recognition technology, as an important branch in the field of artificial intelligence, aims to process and analyze images to recognize objects in images. The aim of this paper is to summarize the general approa...
ISBN:
(Print) 9798350344868; 9798350344851
In this study, we present a novel method for pinpointing landmarks in x-ray images, which simultaneously offers computational efficiency and localization precision. Our method leverages a cyclic coordinate-guided strategy that requires fewer model parameters and lower computational costs than traditional heatmap-based supervised methods. This is crucial for medical imaging applications, where imaging devices often have limited computational resources yet require high-precision landmark localization. Our methodology involves a two-stage process that employs cyclic inference to optimize landmark localization. In the first stage, non-uniform sampling is used to capture the multi-scale features of landmarks. This is followed by a second stage in which cyclic training fine-tunes the landmark coordinates towards their optimal positions. Our results indicate that the two-stage process achieves localization performance competitive with state-of-the-art methods, with the added benefits of lower computational overhead and a smaller parameter count. Additionally, a global block was developed to capture the global position information of landmarks, and experiments confirmed its effectiveness in enhancing the model's landmark localization accuracy. We validated our method using two publicly available datasets, and the source code for our experiments is available on GitHub: https://***/switch626/***.
ISBN:
(Print) 9781510679344; 9781510679351
Current video coding standards, including H.264/AVC, HEVC, and VVC, utilize the discrete cosine transform (DCT) and discrete sine transform (DST) to decorrelate the intra-prediction residuals. However, these transforms often face challenges in effectively decorrelating signals with complex, non-smooth, and non-periodic structures. Even in smooth areas, an abrupt transition (due to noise or prediction artifacts) can limit their effectiveness. This paper presents a novel block-adaptive separable path graph-based transform (GBT) that is particularly adept at handling such signals. The new method adaptively modifies the block size and learns the GBT to enhance performance. The GBT is learned in an online scenario using sequential K-means clustering, where each available block size has K clusters and K GBT kernels. This approach allows the GBT to be dynamically learned for the current block based on previously reconstructed areas with the same block size and similar characteristics. Our evaluation, integrating this method with H.264/AVC intra-coding tools, shows significant improvement over the traditional H.264/AVC DCT in processing high-resolution natural images.
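The sequential (online) K-means step mentioned above can be sketched in plain Python. The sample data, first-k seeding, and shrinking step size below are illustrative assumptions, not details from the paper:

```python
def sequential_kmeans(samples, k):
    """Online K-means: centroids are updated one sample at a time, which
    suits a codec adapting to blocks as they are reconstructed."""
    centroids = [list(s) for s in samples[:k]]  # seed with the first k samples
    counts = [1] * k
    for s in samples[k:]:
        # assign the sample to the nearest centroid (squared Euclidean distance)
        j = min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(s, centroids[c])))
        counts[j] += 1
        eta = 1.0 / counts[j]  # shrinking step size → centroid = running mean
        centroids[j] = [c + eta * (a - c) for c, a in zip(centroids[j], s)]
    return centroids

data = [(0.0, 0.0), (10.0, 10.0), (0.2, 0.0), (9.8, 10.0)]
print(sequential_kmeans(data, 2))
```

With the 1/n step size each centroid is exactly the running mean of its assigned samples, so clusters (and hence GBT kernels in the paper's setting) can track local signal statistics without revisiting past blocks.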
ISBN:
(Print) 9781510679344; 9781510679351
The use of DNA molecules as a storage medium has recently been proposed as a solution to the exponentially increasing demand for data storage, achieving lower energy consumption while offering better lifespan and higher information density. The nucleotides composing the molecules can be regarded as quaternary symbols, but constraints are generally imposed to avoid sequences prone to errors during synthesis, archival, and sequencing. While the majority of previous works in the field have proposed methods for translating general binary data into nucleotides, others have presented algorithms tailored to specific data types such as images, or have joined source and channel coding into a single process. This paper proposes and evaluates a method that integrates DNA Fountain codes with state-of-the-art compression techniques, targeting the storage of images and three-dimensional point clouds. Results demonstrate that the proposed method outperforms previous techniques for coding images directly into DNA, putting forward a first benchmark for the coding of point clouds.
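To make the quaternary-symbol view concrete, here is an illustrative sketch mapping two bits per nucleotide while breaking long homopolymer runs, one of the constraints mentioned above. The rotation rule and `max_run` threshold are assumptions for illustration; the sketch is not directly reversible, and DNA Fountain in particular screens candidate sequences rather than substituting symbols:

```python
def bits_to_dna(bits, max_run=3):
    """Map a bit string to nucleotides (2 bits per symbol), rotating to the
    next alphabet letter whenever a homopolymer run would exceed max_run."""
    alphabet = "ACGT"
    out = []
    run = 1  # length of the identical-symbol run at the end of `out`
    for i in range(0, len(bits), 2):
        sym = alphabet[int(bits[i:i + 2], 2)]  # 00→A, 01→C, 10→G, 11→T
        if out and sym == out[-1]:
            run += 1
            if run > max_run:
                # break the run by substituting the next letter (lossy!)
                sym = alphabet[(alphabet.index(sym) + 1) % 4]
                run = 1
        else:
            run = 1
        out.append(sym)
    return "".join(out)

print(bits_to_dna("00000000"))  # → AAAC (fourth A rotated to C)
```

Constraints like this one, together with GC-content limits, are why practical binary-to-nucleotide mappings cost more than the raw 2 bits per symbol.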