Electrohydrodynamic (EHD) printing is an additive manufacturing technique capable of producing microscale and nanoscale structures for biomedical, aerospace, and electronic applications. To realize stable printing at its full resolution, in-process monitoring of jetting behavior and optimization of the printing process are necessary. Various machine-vision control schemes have been developed for EHD printing. However, in-line machine-vision systems are currently limited because only limited information can be captured in situ for quality assurance and process optimization. In this article, we present a machine learning-embedded machine-vision control scheme that characterizes jetting and recognizes printing quality using only low-resolution observations of the Taylor cone. An innovative approach is introduced to identify and measure cone-jet behavior from low-fidelity image data at various applied voltage levels, stand-off distances, and printing speeds. The scaling law between voltages and line widths enables quality prediction of the final printed patterns. A voting ensemble composed of k-nearest neighbor (KNN), classification and regression tree (CART), random forest, logistic regression, gradient boosting classifier, and bagging models was employed with optimized hyperparameters to classify jets according to their applied voltages, achieving 88.43% accuracy on new experimental data. These findings demonstrate that it is possible to analyze jetting status and predict high-resolution pattern dimensions using low-fidelity data. Voltage analysis based on the in situ data provides additional insight into system stability and can be used to establish error functions for future advanced control schemes.
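The six-model voting ensemble described above can be sketched with scikit-learn; the synthetic features and default hyperparameters here are illustrative placeholders, not the authors' tuned configuration or jet-image data.

```python
# Illustrative voting ensemble (KNN, CART, random forest, logistic regression,
# gradient boosting, bagging) classifying stand-in features by voltage class.
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for jet-image features labeled by applied voltage level
X, y = make_classification(n_samples=400, n_features=10, n_classes=3,
                           n_informative=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier()),
        ("cart", DecisionTreeClassifier(random_state=0)),
        ("rf", RandomForestClassifier(random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("bag", BaggingClassifier(random_state=0)),
    ],
    voting="hard",  # majority vote over the six base classifiers
)
ensemble.fit(X_tr, y_tr)
print(round(ensemble.score(X_te, y_te), 2))
```

Hard voting takes the majority label across the six base classifiers; in practice each model's hyperparameters would be optimized before ensembling, as the paper describes.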
Bananas, renowned for their delightful flavor, exceptional nutritional value, and digestibility, are among the most widely consumed fruits globally. The advent of advanced image processing, computer vision, and deep learning (DL) techniques has revolutionized agricultural diagnostics, offering innovative and automated solutions for detecting and classifying fruit varieties. Despite significant progress in DL, the accurate classification of banana varieties remains challenging, particularly due to the difficulty of identifying subtle features at early developmental stages. To address these challenges, this study presents a novel hybrid framework that integrates the Vision Transformer (ViT) model, for global semantic feature representation, with the robust classification capabilities of support vector machines (SVMs). The proposed framework was rigorously evaluated on two datasets: the four-class BananaImageBD and the six-class BananaSet. To mitigate data imbalance issues, a robust evaluation strategy was employed, resulting in a remarkable classification accuracy rate (CAR) of 99.86% ± 0.099 for BananaSet and 99.70% ± 0.17 for BananaImageBD, surpassing traditional methods by a margin of 1.77%. The ViT model, leveraging self-supervised and semi-supervised learning mechanisms, demonstrated exceptional promise in extracting nuanced features critical for agricultural applications. By combining ViT features with cutting-edge machine learning classifiers, the proposed system establishes a new…
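The hybrid ViT-feature/SVM pipeline can be sketched as follows; random vectors stand in for ViT [CLS] embeddings here (in practice they would come from a pretrained Vision Transformer applied to banana images), and the injected class signal is purely synthetic.

```python
# Minimal sketch: a frozen ViT supplies global feature embeddings, an SVM
# performs the final classification, and cross-validation gives a robust
# mean ± std accuracy estimate as in the study's evaluation strategy.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, dim = 6, 768              # six BananaSet classes; ViT-Base embedding size
X = rng.normal(size=(300, dim))      # placeholder "ViT features"
y = rng.integers(0, n_classes, size=300)
X += np.eye(n_classes)[y] @ rng.normal(size=(n_classes, dim))  # class-dependent shift

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
scores = cross_val_score(clf, X, y, cv=5)    # accuracy across five folds
print(scores.mean().round(3), scores.std().round(3))
```

Reporting the fold mean with its standard deviation mirrors the paper's "accuracy ± spread" style of result.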
In various fields such as medical imaging, object detection, and video surveillance, multi-view natural language query systems utilize image data to provide a more comprehensive perspective, allowing users to intuitiv...
Artificial intelligence technologies have made rapid progress and achieved superior performance on various tasks in the past few years, including but not limited to classification, detection, image generation, and data processing. In particular, the very recently emerged Sora has demonstrated an exceptional ability for text-to-video generation, producing videos up to one minute long with impressive quality. It offers huge potential for many new applications across industries, especially social interaction in intelligent vehicles. The emergence of innovative intelligent vehicle applications has given rise to novel requirements for social and human-vehicle interaction within the associated contexts, where Sora and social vision could play an important role. In this perspective, we present a new social interaction framework based on Sora and parallel intelligence in intelligent vehicles and provide a novel perspective for conducting new social and human-vehicle interaction in the context of intelligent vehicles.
Image processing is a fundamental task in computer vision, which aims at enhancing image quality and extracting essential features for subsequent vision applications. Traditionally, task-specific models are developed for individual tasks, and designing such models requires distinct expertise. Building upon the success of large language models (LLMs) in natural language processing (NLP), there is a similar trend in computer vision, which focuses on developing large-scale models through pretraining and in-context learning. This paradigm shift reduces the reliance on task-specific models, yielding a powerful unified model to deal with various tasks. However, these advances have predominantly concentrated on high-level vision tasks, with less attention paid to low-level vision tasks. To address this issue, we propose a universal model for general image processing that covers image restoration, image enhancement, image feature extraction tasks, etc. Our proposed framework, named PromptGIP, unifies these diverse image processing tasks within a universal framework. Inspired by NLP question answering (QA) techniques, we employ a visual prompting question answering paradigm. Specifically, we treat the input-output image pair as a structured question-answer sentence, thereby reprogramming the image processing task as a prompting QA problem. PromptGIP can undertake diverse cross-domain tasks using provided visual prompts, eliminating the need for task-specific finetuning. Capable of handling up to 15 different image processing tasks, PromptGIP represents a versatile and adaptive approach to general image processing. Code will be available at https://***/lyh-18/PromptGIP. Copyright 2024 by the author(s).
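The "question-answer sentence" idea can be sketched in a few lines; arrays stand in for images, and the 2x2 grid layout below is an illustrative assumption, not PromptGIP's actual tokenization.

```python
# Hedged sketch of visual prompting QA: an input-output example pair is
# concatenated with a new query image into one structured "sentence", and
# the model's job is to fill in the masked answer slot.
import numpy as np

h, w = 32, 32
prompt_q = np.random.rand(h, w)     # example input (e.g., degraded image)
prompt_a = np.random.rand(h, w)     # example output (e.g., restored image)
query = np.random.rand(h, w)        # new image for the same task
answer_slot = np.zeros((h, w))      # region the model must predict

# Structured question-answer "sentence": [prompt pair | query + masked answer]
sentence = np.block([[prompt_q, prompt_a],
                     [query, answer_slot]])
print(sentence.shape)
```

Swapping in a different prompt pair retargets the same model to a different task, which is what removes the need for task-specific finetuning.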
Traditional thresholding methods are widely used to extract objects of interest from image backgrounds in various practical applications. However, these methods often face challenges in complex scenes due to poor uniformity, noise, and low contrast. To overcome these limitations, this paper proposes a peak-weaken Otsu method (PWOTSU) that improves the segmentation performance of the Otsu method for automatically extracting objects in complex scenes. The proposed approach uses a set of cross parameters as weights for the Otsu criterion function to adaptively weaken the between-class variance at the peak of the histogram. This ensures that an appropriate threshold value is always obtained for images with different types of histogram distribution. The improved criterion function has the advantage of obtaining a more accurate threshold value without the need for additional parameters, making it easily applicable to various practical applications. Experimental results demonstrate that the proposed method effectively improves the segmentation accuracy and robustness compared to the standard Otsu method and its modifications, as evidenced by qualitative and quantitative evaluations.
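The criterion being modified is the classic Otsu between-class variance; a minimal NumPy version is below. The optional `weights` argument shows where a PWOTSU-style method would weaken the variance near the histogram peak, but the specific weighting is an illustrative stand-in, not the published cross-parameter formula.

```python
import numpy as np

def otsu_threshold(hist, weights=None):
    """Return the threshold maximizing (optionally weighted) between-class variance."""
    p = hist / hist.sum()                  # normalized histogram
    levels = np.arange(len(p))
    omega = np.cumsum(p)                   # class-0 probability up to threshold t
    mu = np.cumsum(p * levels)             # first moment up to threshold t
    mu_t = mu[-1]                          # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b)       # endpoints where a class is empty
    if weights is not None:
        sigma_b = sigma_b * weights        # e.g., down-weight near the peak
    return int(np.argmax(sigma_b))

# Bimodal synthetic histogram with modes near gray levels 60 and 180
x = np.arange(256)
hist = 500 * np.exp(-((x - 60) ** 2) / 200.0) + 300 * np.exp(-((x - 180) ** 2) / 400.0)
t = otsu_threshold(hist)
print(t)  # lands between the two modes
```

On a clean bimodal histogram like this, the unweighted criterion already works; the weighted variant matters for the skewed, peaked histograms of complex scenes that the paper targets.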
Precision agriculture has recently gained significant importance in computer vision technologies. Various processes in the agricultural production cycle, from planting to harvesting, can be carried out automatical...
ISBN:
(Print) 9798350318920; 9798350318937
The deep learning field is converging towards the use of general foundation models that can be easily adapted for diverse tasks. While this paradigm shift has become common practice within the field of natural language processing, progress has been slower in computer vision. In this paper, we attempt to address this issue by investigating the transferability of various state-of-the-art foundation models to medical image classification tasks. Specifically, we evaluate the performance of five foundation models, namely SAM, SEEM, DINOv2, BLIP, and OpenCLIP, across four well-established medical imaging datasets. We explore different training settings to fully harness the potential of these models. Our study shows mixed results. DINOv2 consistently outperforms the standard practice of ImageNet pretraining. However, the other foundation models failed to consistently beat this established baseline, indicating limitations in their transferability to medical image classification tasks.
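One common "training setting" for this kind of transferability study is a linear probe on frozen backbone features; the sketch below uses random vectors as stand-ins for DINOv2-style embeddings of medical images, with a synthetic class signal injected for illustration.

```python
# Linear probe sketch: the foundation model stays frozen; only a linear
# classifier is trained on its fixed feature vectors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 384))      # placeholder frozen-backbone features
labels = rng.integers(0, 2, size=200)    # e.g., pathology present / absent
feats[labels == 1] += 0.5                # inject a weak synthetic class signal

probe = LogisticRegression(max_iter=2000).fit(feats[:150], labels[:150])
print(round(probe.score(feats[150:], labels[150:]), 2))
```

Comparing such probes across backbones (and against full finetuning) is what separates models like DINOv2 from the rest in studies of this kind.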
As a foundational task in image processing, rain removal from a single image has always been an important and challenging problem. Due to the lack of real rain images and corresponding clean images, most rain removal networks are trained on synthetic datasets, which makes the output images unsatisfactory in practical applications. In this work, we propose a new feature decoupling network for unsupervised single-image rain removal. Its purpose is to decompose the rain image into two distinguishable layers: a clean image layer and a rain layer. To fully decouple the features of different attributes, we use contrastive learning to constrain this process. Specifically, similar image patches are pulled together as positive samples, while rain-layer patches are pushed away as negative samples. We exploit not only the inherent self-similarity within a sample but also the mutual exclusion between the two layers, so as to better distinguish the rain layer from the clean image. We implicitly constrain the embedding of different samples in the deep feature space to better promote rain-streak removal and image restoration. Our method achieves a PSNR of 25.80 on Test100, surpassing other unsupervised methods.
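The pull-together/push-apart constraint described above is typically implemented with an InfoNCE-style loss; the sketch below uses random vectors as placeholders for the network's deep patch embeddings.

```python
# InfoNCE-style contrastive loss for one anchor patch: clean-layer patches act
# as positives (pulled together), rain-layer patches as negatives (pushed away).
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """-log( exp(s+/tau) / (exp(s+/tau) + sum_j exp(s-_j/tau)) )"""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()                 # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

rng = np.random.default_rng(0)
clean_a, clean_b = rng.normal(size=64), rng.normal(size=64)
rain = [rng.normal(size=64) for _ in range(8)]   # negative (rain-layer) embeddings

# A positive pair that is more similar to the anchor yields a lower loss,
# which is exactly the gradient signal that decouples the two layers.
loss_far = info_nce(clean_a, clean_b, rain)
loss_near = info_nce(clean_a, clean_a + 0.05 * clean_b, rain)
print(loss_near < loss_far)
```

Minimizing this loss over many patches drives clean-layer embeddings together and away from rain-layer embeddings, which is the decoupling the network relies on.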
Ensuring the reliability and safety of electrical equipment is essential for industrial and residential applications. Traditional fault diagnosis methods involving physical inspections are time-consuming and ineffective for early fault detection. Infrared (IR) thermography offers a non-invasive and efficient solution by identifying anomalies in temperature profiles. This review explores thermal vision-based fault diagnosis techniques, including region of interest (ROI) segmentation, image pre-processing, and fault diagnosis algorithms, with a focus on deep learning approaches. The study highlights the effectiveness of machine learning models in enhancing fault detection accuracy while identifying challenges such as environmental variations, data inconsistencies, and system integration issues. The review discusses the role of real-time applications, wireless technologies, and AI-based automation in improving fault detection. Research gaps are identified, and future directions are proposed to enhance efficiency, reliability, and industrial adoption.