ISBN: (print) 9798331541378
Quantum Machine Learning (QML) leverages quantum mechanical properties to enhance computational capabilities. With its emergence, there is a need to integrate QML models into machine learning pipelines for real-life applications such as image processing. While standalone programs exist to demonstrate the performance of QML models, a well-defined model workflow is noticeably absent. We present piQture, an open-source Python and Qiskit-based library that streamlines the implementation and training of QML models tailored for image processing. Its design and structure prioritize usability for users who are familiar with classical machine learning but have no prior QML experience.
ISBN: (print) 9798350350920
The security of color images is challenged by illegal copying and distribution over various communication networks. This work proposes a novel color image encryption scheme based on chaos and deoxyribonucleic acid (DNA) coding. First, a novel four-dimensional chaotic system is constructed and analyzed. Second, the DNA coding rules are controlled by a chaos-based pseudorandom sequence to substitute the image pixels, and column and row rotations are designed to scramble the image pixels. Finally, the DNA matrix coded from the color image is divided into blocks, which are combined with the pseudorandom sequence through XOR and addition operations. The security of the scheme is evaluated in simulation and compared with other schemes. The comparison shows that the proposed scheme has higher key sensitivity, lower correlation between adjacent pixels, and higher NPCR and information entropy. The proposed image encryption scheme is therefore more resistant to statistical attacks and more efficient, while remaining highly robust.
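The substitution step described above can be illustrated with a minimal sketch. It is not the authors' scheme: a one-dimensional logistic map stands in for the four-dimensional chaotic system, the row/column rotation and block-wise addition are omitted, and the function names (`logistic_stream`, `encode_byte`, `decode_bases`, `encrypt`, `decrypt`) are hypothetical. It only shows how a keystream can both mask a pixel with XOR and select one of the eight complementary DNA coding rules.

```python
# The eight DNA coding rules that respect base pairing (A-T, C-G):
# the character at index v encodes the 2-bit value v.
RULES = ["ACGT", "AGCT", "CATG", "CTAG", "GATC", "GTAC", "TCGA", "TGCA"]


def logistic_stream(x0, n, r=3.99):
    """Return n pseudorandom bytes from a logistic map.

    Assumption: this map is only a stand-in for the paper's 4D system
    and is not suitable for real cryptographic use.
    """
    x = x0
    out = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)
    return out


def encode_byte(value, rule):
    """Encode one byte as four nucleotides under the given rule index."""
    table = RULES[rule]
    return "".join(table[(value >> shift) & 0b11] for shift in (6, 4, 2, 0))


def decode_bases(bases, rule):
    """Invert encode_byte: four nucleotides back to one byte."""
    table = RULES[rule]
    value = 0
    for base in bases:
        value = (value << 2) | table.index(base)
    return value


def encrypt(pixels, x0):
    """XOR each pixel with a key byte, then DNA-encode it with a key-chosen rule."""
    ks = logistic_stream(x0, 2 * len(pixels))
    return [
        encode_byte(p ^ ks[2 * i], ks[2 * i + 1] % 8)
        for i, p in enumerate(pixels)
    ]


def decrypt(strands, x0):
    """Regenerate the keystream from the same seed and undo both steps."""
    ks = logistic_stream(x0, 2 * len(strands))
    return [
        decode_bases(s, ks[2 * i + 1] % 8) ^ ks[2 * i]
        for i, s in enumerate(strands)
    ]
```

Calling `decrypt(encrypt(pixels, x0), x0)` with the same seed `x0` returns the original pixel values, since both sides regenerate an identical keystream.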
ISBN: (print) 9798350350920
Hyperspectral image (HSI) classification is valuable in remote sensing due to its rich spectral and spatial information. In the last decade, deep learning methods, especially Convolutional Neural Networks (CNNs), have revolutionized HSI classification by extracting intangible semantic features and maintaining the spatial structure during feature extraction. However, the efficacy of these techniques can be constrained by the limited availability of labeled samples in HSI data. To address the issue of small-sample HSI classification, a Lightweight Multiscale Feature Fusion Network (L-MFFN) is introduced. The Multiscale Feature Extraction Module (MFEM) and the enhanced Spectral-Spatial Attention Module (SSAM) are designed and combined in L-MFFN, optimizing the use of deep and shallow features. This integration improves the extraction and fusion of multiscale spectral-spatial features, enhancing classification performance. The proposed model demonstrates state-of-the-art performance across two HSI datasets and stands out in situations with limited labeled samples, highlighting its capability to effectively tackle the challenge of small-sample HSI classification.
ISBN: (print) 9798350350920
Most recent coverless image steganography (CIS) methods are based on robust mapping rules. However, because the mapping between secret information and hash sequences has limited expressive power, it is challenging to further improve the hiding capacity of coverless information hiding. We therefore propose a robust coverless steganography scheme based on controllable semantic camouflage. In this scheme, the sender transforms stego images into camouflage images through a generative hiding process that changes the image semantics. The receiver performs a revealing process to restore the stego images. Secret information is not involved in either hiding or revealing, so the scheme resists steganalysis effectively. Experiments show remarkable performance, particularly in resilience to noise, where the scheme significantly surpasses other methods and achieves a successful extraction rate of up to 100%.
ISBN: (print) 9798350348798; 9798350348804
Durian is a renowned fruit in Southeast Asia, particularly in Malaysia, Thailand, and Indonesia. During its season, manually classifying a large stock of durians by grade and quality becomes a laborious task for sellers. This project aims to streamline the process by employing image processing techniques to classify durians into defect and non-defect categories. The methodology involves image collection and image filtering analysis using Gaussian and median filters, followed by Canny edge detection to identify the durian region. Subsequently, classification algorithms based on pixel connection are deployed to distinguish between defect and non-defect durians. The obtained results reveal comparable accuracy and precision rates for both defect and non-defect durian images, standing at 87% and 75%, respectively. This project successfully demonstrates the feasibility of automating durian classification through image processing methods.
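Two of the building blocks named above can be sketched in plain Python. This is an illustrative sketch, not the project's implementation: it assumes a 3x3 median window and reads "pixel connection" as 4-connected component analysis on a binary mask, and the helper names `median_filter3` and `count_components` are hypothetical. Gaussian filtering and Canny edge detection are left out.

```python
from collections import deque
from statistics import median


def median_filter3(img):
    """Apply a 3x3 median filter to a 2D list of grey levels.

    Border pixels are copied unchanged (an assumption made for brevity).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


def count_components(mask):
    """Count 4-connected regions of truthy pixels with breadth-first search."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            count += 1  # found an unvisited foreground pixel: new region
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count
```

In a pipeline of this kind, the number or size of connected regions inside the detected fruit outline could serve as a simple defect indicator; any threshold for that decision would have to be chosen from data.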
ISBN: (print) 9798350352368
The accurate classification of fresh fruit bunch (FFB) ripeness is crucial for optimizing oil quality and yield in the palm oil industry. Traditional manual inspection methods are labor-intensive, subjective, and prone to errors, motivating the exploration of automated solutions. This paper examines the potential of vision language models (VLMs), including LLaVA 1.5, Yi-VL, and PaliGemma, to automate and enhance FFB ripeness assessment. The models were evaluated on their ability to classify ripeness stages and on the accuracy of the generated descriptive text, using metrics such as BLEU and ROUGE scores. Yi-VL achieved the highest descriptive accuracy, with a ROUGE-L score of 93.14. However, it processes only 0.18 samples per second, which is slower than PaliGemma (0.53 samples/second). PaliGemma's throughput is 194.44% higher than Yi-VL's, making it better suited for real-time applications despite its lower accuracy (ROUGE-L: 26.15). LLaVA 1.5 offers a balance between accuracy (ROUGE-L: 82.16) and efficiency (0.22 samples/second). This research highlights the trade-offs between different VLMs for FFB ripeness assessment and demonstrates their potential to revolutionize the agriculture industry. Future work may focus on optimizing model performance and deploying these technologies in real-world scenarios.
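The 194.44% figure follows directly from the two reported throughputs. The short check below uses only the numbers given in the abstract; the function name `relative_gain_percent` is a hypothetical helper.

```python
def relative_gain_percent(baseline, candidate):
    """Percentage by which candidate exceeds baseline."""
    return (candidate - baseline) / baseline * 100.0


# Throughput in samples/second, as reported in the abstract.
throughput = {"Yi-VL": 0.18, "LLaVA 1.5": 0.22, "PaliGemma": 0.53}

gain = relative_gain_percent(throughput["Yi-VL"], throughput["PaliGemma"])
print(f"PaliGemma vs. Yi-VL: {gain:.2f}% higher throughput")
```

Equivalently, (0.53 - 0.18) / 0.18 is about 1.9444, so PaliGemma processes a little under three times as many samples per second as Yi-VL.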
ISBN: (print) 9798350349405; 9798350349399
The incorporation of LiDAR technology into some high-end smartphones has unlocked numerous possibilities across various applications, including photography, image restoration, and augmented reality. In this paper, we introduce a novel direction that harnesses LiDAR depth maps to enhance the compression of the corresponding RGB camera images. To the best of our knowledge, this is the first exploration of this research direction. Specifically, we propose a Transformer-based learned image compression system that achieves variable-rate compression with a single model while using the LiDAR depth map as supplementary information for both encoding and decoding. Experimental results demonstrate that integrating LiDAR yields an average PSNR gain of 0.83 dB and an average bitrate reduction of 16% compared with the same system without LiDAR.
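For readers less familiar with the metric, the sketch below shows the standard PSNR definition for 8-bit images and what a 0.83 dB gain means in terms of mean squared error. It is not taken from the paper; the helper names `psnr` and `mse_ratio_for_gain` are hypothetical, and images are represented as flat lists of pixel values for simplicity.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


# A 0.83 dB gain corresponds to roughly a 17% reduction in MSE.
print(f"{mse_ratio_for_gain(0.83):.3f}")
```

Because PSNR is logarithmic in the error, a gain that looks small in decibels still corresponds to a noticeable drop in mean squared error at the same bitrate.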
This article presents an algorithm for determining reference brightness correction coefficients to improve image quality. The algorithm utilizes a combination of statistical analysis and image processing techniques to...
ISBN: (print) 9798350349405; 9798350349399
This work presents a seminal approach for synthesizing images from WiFi Channel State Information (CSI) in through-wall scenarios. Leveraging the strengths of WiFi, such as cost-effectiveness, illumination invariance, and wall-penetrating capabilities, our approach enables visual monitoring of indoor environments beyond room boundaries and without the need for cameras. More generally, it improves the interpretability of WiFi CSI by unlocking the option to perform image-based downstream tasks, e.g., visual activity recognition. To achieve this cross-modal translation from WiFi CSI to images, we rely on a multimodal Variational Autoencoder (VAE) adapted to the specifics of our problem. We extensively evaluate the proposed methodology through an ablation study on architecture configuration and a quantitative and qualitative assessment of the reconstructed images. Our results demonstrate the viability of our method and highlight its potential for practical applications.
ISBN: (print) 9798350344868; 9798350344851
This paper proposes a novel logo image recognition approach incorporating a localization technique based on reinforcement learning. Logo recognition is an image classification task identifying a brand in an image. As the size and position of a logo vary widely from image to image, it is necessary to determine its position for accurate recognition. However, because there is no annotation for the position coordinates, it is impossible to train and infer the location of the logo in the image. Therefore, we propose a deep reinforcement learning localization method for logo recognition (RL-LOGO). It utilizes deep reinforcement learning to identify a logo region in images without annotations of the positions, thereby improving classification accuracy. We demonstrated a significant improvement in accuracy compared with existing methods in several published benchmarks. Specifically, we achieved an 18-point accuracy improvement over competitive methods on the complex dataset Logo-2K+. This demonstrates that the proposed method is a promising approach to logo recognition in real-world applications.