In this paper, the 3D space imaging model of machine vision is constructed. Starting from the traditional machine vision image processing algorithm flow, the image denoising process and target tracking process are opt...
ISBN: (digital) 9783031581816
ISBN: (print) 9783031581809; 9783031581816
Explainable Deep Learning has gained significant attention in the field of artificial intelligence (AI), particularly in domains such as medical imaging, where accurate and interpretable machine learning models are crucial for effective diagnosis and treatment planning. Grad-CAM is a baseline technique that highlights the regions of an image that matter most to a deep learning model's decision, increasing interpretability and trust in the results. It is applied in many computer vision (CV) tasks, such as classification and explanation. This study explores the principles of Explainable Deep Learning and its relevance to medical imaging, discusses various explainability techniques and their limitations, and examines medical imaging applications of Grad-CAM. The findings highlight the potential of Explainable Deep Learning and Grad-CAM to improve the accuracy and interpretability of deep learning models in medical imaging. The code is available at (https://***/ beasthunter758/GradEML).
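The core Grad-CAM computation can be sketched in a few lines. In the standard formulation, each feature map of a chosen convolutional layer is weighted by the spatial average of the class-score gradient with respect to that map, the weighted maps are summed, and a ReLU keeps only the positively contributing regions. The sketch below assumes the feature maps and gradients have already been extracted from a network; the function name `grad_cam` and the nested-list layout are illustrative and are not taken from the GradEML code.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap.

    activations: list of K feature maps, each an H x W nested list.
    gradients:   list of K maps of the same shape holding the gradient of
                 the class score with respect to each activation.
    """
    # Channel weights: global average of each gradient map.
    weights = []
    for grad_map in gradients:
        total = sum(sum(row) for row in grad_map)
        count = len(grad_map) * len(grad_map[0])
        weights.append(total / count)

    height, width = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]

    # Weighted sum of the feature maps.
    for weight, act_map in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * act_map[i][j]

    # ReLU: keep only regions that push the class score up.
    return [[max(0.0, value) for value in row] for row in heatmap]
```

In practice the resulting map is usually upsampled to the input resolution and overlaid on the image; that step is omitted here.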
Infrared-visible image fusion combines complementary information from both modalities, enhancing scene perception in applications such as surveillance and autonomous driving. However, existing deep learning-based meth...
ISBN: (print) 9798350318920; 9798350318937
Large-scale models trained on extensive datasets have emerged as the preferred approach due to their high generalizability across various tasks. In-context learning (ICL), a popular strategy in natural language processing, applies such models to different tasks by providing instructive prompts without updating model parameters. This idea is now being explored in computer vision, where an input-output image pair (called an in-context pair) is supplied to the model together with a query image as a prompt that exemplifies the desired output. The efficacy of visual ICL often depends on the quality of the prompts. We thus introduce a method called Instruct Me More (InMeMo), which augments in-context pairs with a learnable perturbation (prompt), to explore its potential. Our experiments on mainstream tasks reveal that InMeMo surpasses the current state-of-the-art performance. Specifically, compared to the baseline without a learnable prompt, InMeMo boosts mIoU scores by 7.35 and 15.13 for foreground segmentation and single object detection tasks, respectively. Our findings suggest that InMeMo offers a versatile and efficient way to enhance the performance of visual ICL with lightweight training. Code is available at https://***/Jackieam/InMeMo.
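For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how mean intersection-over-union is commonly computed for binary foreground masks. It is not the InMeMo evaluation code: the function names, the flattened 0/1 mask representation, and the convention of scoring two empty masks as 1.0 are assumptions made for illustration.

```python
def iou(pred, target):
    """Intersection-over-union of two binary masks given as flat 0/1 lists."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over a list of (prediction, ground truth) mask pairs."""
    scores = [iou(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)
```

Scores computed this way lie in [0, 1]; papers often report them multiplied by 100.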
The rapid growth of computer vision-based applications, including smart cities and autonomous driving, has created a pressing demand for efficient 360° image compression and computer vision analytics. In most circums...
ISBN: (digital) 9781394173358
ISBN: (print) 9781394173327
Machine Learning Applications: a practical resource on the importance of machine learning and deep learning applications in various technologies and real-world situations. Machine Learning Applications discusses methodological advancements in machine learning and deep learning; presents applications in image processing, including face and vehicle detection, image classification, object detection, and image segmentation; delivers real-world applications in healthcare for identifying and diagnosing diseases, such as creating smart health records and medical imaging diagnosis; and provides real-world examples, case studies, use cases, and techniques to enable the reader’s active learning. Composed of 13 chapters, this book also introduces real-world applications of machine and deep learning in blockchain technology, cyber security, and climate change. AI and robotic applications in mechanical design are also discussed, including robot-assisted surgeries, security, and space exploration. The book describes the importance of each subject area and details why they matter from a societal and human perspective. Edited by two highly qualified academics and contributed to by established thought leaders in their respective fields, Machine Learning Applications includes information on: content-based medical image retrieval (CBMIR), covering face and vehicle detection, multi-resolution and multisource analysis, manifold and image processing, and morphological processing; smart medicine, including machine learning and artificial intelligence in medicine, risk identification, tailored interventions, and association rules; AI and robotics applications for transportation and infrastructure (e.g., autonomous cars and smart cities), along with global warming and climate change; and identifying diseases and diagnosis, drug discovery and manufacturing, medical imaging diagnosis, personalized medicine, and smart health records. With its practical approach to the subject, Ma
When an underwater camera captures aerial targets, the received light undergoes refraction at the water-air interface. In particular, calm water compresses the image, while turbulent water causes nonlinear distortion in the captured images. However, existing methods for correcting water-to-air distortion often yield images that still show distortion or overall shifts. To address this issue, we propose a multi-strategy hybrid framework to process image sequences effectively, particularly for high-precision applications. Our framework includes a spatiotemporal crossover block to transform and merge features, effectively addressing the template-free problem. Additionally, we introduce an enhancement network to produce a high-quality template in the first stage and a histogram template method to maintain high chromaticity and reduce template noise in the correction stage. Furthermore, our framework incorporates a new registration scheme to facilitate sequence transfer and processing. Compared to existing algorithms, our approach achieves a high restoration level in terms of morphology and color for publicly available image sequences. (c) 2024 Optica Publishing Group. All rights, including for text and data mining (TDM), Artificial Intelligence (AI) training, and similar technologies, are reserved.
ISBN: (print) 9783031288661; 9783031288678
When the color of a moving object is close to that of the background, the accuracy of moving object recognition suffers. A moving object recognition method based on machine vision is therefore designed. To reduce distortion at image edge positions, the moving object is calibrated and corrected visually. To keep the influence of noise within a controllable range, the full-information mobile monitoring image is enhanced while preserving image details. The edge features obtained from the view and the template are computed using moments, and their similarity is obtained. The contour features of the monitored moving target are then extracted based on machine vision. The background region is segmented, and the center-point information of the moving object's trajectory, such as speed and direction, is used to determine whether the trajectory represents an abnormal event. The proposed method is tested on the INRIA dataset and the Vehicle Reld dataset, and the results show that it improves accuracy and recall and has good detection performance.
ISBN: (print) 9798350376258
This paper presents a comprehensive examination of innovative strategies aimed at enhancing machine vision technology, particularly in the context of energy efficiency and processing speed, critical factors for applications like facial recognition. The study focuses on three distinct approaches: an optimized two-dimensional convolution algorithm, a novel Field-Programmable Gate Array (FPGA) implementation, and advancements in multichannel meta-imagers. Firstly, the paper discusses an optimized algorithm for two-dimensional convolutions, a fundamental operation in machine vision. This advanced algorithm significantly reduces computational complexity. For instance, in executing a two-dimensional 3×3 cyclic convolution, the proposed method reduces the number of necessary multiplications from 81 to merely 13, offering a substantial improvement in efficiency. Secondly, the paper explores an innovative FPGA implementation of the two-dimensional convolution algorithm. This implementation is designed to minimize the use of shift registers, multipliers, and adders. As a result, it utilizes fewer Look-Up Tables (LUTs), leading to energy and time savings in executing the convolution process. The paper details the architecture of this FPGA-based approach and its implications for energy consumption and processing speed in machine vision applications. Finally, the paper introduces a novel technique called the Avg-Topk method, addressing a critical challenge in the pooling layer of convolutional neural networks. This method combines the benefits of average pooling with the advantages of max pooling, aiming to enhance the accuracy of the pooling layer without compromising on efficiency. The Avg-Topk method represents a significant step forward in optimizing the pooling process within machine vision systems. In summary, this paper delves into groundbreaking methods to improve the speed and energy efficiency of machine vision systems, offering valuable insights and potential solution
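The baseline figure of 81 multiplications for a 3×3 cyclic convolution can be checked with a direct implementation. The sketch below is only the naive reference computation, not the paper's optimized 13-multiplication algorithm, which the abstract does not describe; the function name `cyclic_conv2d` and the explicit multiplication counter are illustrative assumptions.

```python
def cyclic_conv2d(x, h):
    """Naive 2D cyclic (circular) convolution of two n x n nested lists.

    Returns the output matrix and the number of scalar multiplications
    performed, so the cost of the direct method can be inspected.
    """
    n = len(x)
    out = [[0] * n for _ in range(n)]
    multiplications = 0
    for i in range(n):
        for j in range(n):
            for p in range(n):
                for q in range(n):
                    # Indices wrap around modulo n, which makes the
                    # convolution cyclic rather than linear.
                    out[i][j] += h[p][q] * x[(i - p) % n][(j - q) % n]
                    multiplications += 1
    return out, multiplications
```

For n = 3 the four nested loops run 3^4 = 81 times, one multiplication per iteration, which matches the baseline count quoted in the abstract.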
Crops and weeds compete continuously for the same resources, which may decrease crop yields by up to 31% and increase the cost of agricultural inputs by up to 22% of cultivation. Weeds further impact crop production, and their detection is crucial for effective crop management. In this research, we targeted common weeds of cotton fields, specifically i) Digitaria sanguinalis (L.) Scop, ii) Amaranthus retroflexus L., iii) Acalypha australis L., iv) Cephalanoplos segetum, and v) Chenopodium album L. Image processing techniques such as grayscale conversion, binarization, and Gaussian and morphological filters were utilized. These methods are based on machine vision and facilitate rapid and straightforward weed detection by segmenting, scrutinizing, and comparing input images. Plant height and area were measured over the first 32 days after cotton planting and fitted to derive a growth law as a function of planting days, which was used to distinguish cotton from weeds. We conducted recognition experiments by dividing images into four quadrants and categorizing weeds as either inter-row or intra-row. Inter-row planting information was used to identify inter-row weeds, and leaf pixel area and circularity were used to identify intra-row weeds, which reduced the algorithm's running time and improved real-time performance. The experimental results indicated that the inter-row weed recognition rate was 89.4%, with an average processing time of 102 ms. For intra-row weeds, the recognition rate was 84.6%, and the overall recognition rate for cotton was 85.0%, with a mean processing time of 437 ms. Furthermore, the present research underscores recent advancements such as machine vision and high-resolution imaging, which have significantly improved the accuracy of automated weed identification in cotton fields while acknowledging ongoing challen
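The two intra-row features named above, leaf pixel area and circularity, can be illustrated with a short sketch. The abstract does not define circularity, so the common shape-factor definition 4πA/P² (equal to 1 for a perfect circle and smaller for elongated or ragged shapes) is assumed here; the function names and the fixed-threshold binarization are likewise illustrative and are not the authors' pipeline.

```python
import math


def binarize(gray, threshold):
    """Turn a grayscale image (nested list) into a 0/1 mask."""
    return [[1 if value >= threshold else 0 for value in row] for row in gray]


def pixel_area(mask):
    """Count foreground pixels in a 0/1 mask."""
    return sum(sum(row) for row in mask)


def circularity(area, perimeter):
    """Assumed definition: 4 * pi * area / perimeter squared."""
    return 4 * math.pi * area / perimeter ** 2
```

Under this definition a square scores π/4 ≈ 0.785, so a threshold on circularity can separate compact, rounded leaves from narrow grass-like ones; the perimeter itself would come from a contour-tracing step that is not shown.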