Handheld ultrasound devices are becoming more prevalent in point-of-care ultrasound (POCUS) workflows. However, these devices are computationally constrained, which makes it difficult to apply advances in deep learning methodology in real time on mobile POCUS devices. In this work, we explore the feasibility of running MimickNet, a deep learning clinical post-processing model, on Tensor Processing Units (TPUs), hardware designed for deep learning operations that can run on only 2 watts of power at 1.8 V with a form factor of 10 mm x 15 mm. We show that real-time deep learning based post-processing is feasible at 20 to 120 FPS for 1472x160 to 224x224 axial sample x B-mode scan line configurations. We refer to the TPU-based model as MimickNet Mobile. MimickNet Mobile achieves outputs nearly identical to the original MimickNet, with a structural similarity index measurement (SSIM) of 0.98 +/- 0.001 and a mean squared error (MSE) of 0.0001 +/- 0.0 over our test set of 588 frames, consisting of 168 phantom frames and 420 prospectively acquired human liver frames. We also investigate the latency of other common mobile architectures, such as separable convolution. Finally, we investigate the distribution of model parameter error when quantizing MimickNet float32 weights to MimickNet Mobile int8 weights. This work demonstrates that real-time POCUS deep learning image enhancement is feasible using TPUs. Future ultrasound device manufacturers can consider incorporating a TPU for the added flexibility of supporting several deep learning architectures without compromising on power management and form factor.
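The abstract above mentions studying the parameter error introduced when float32 weights are quantized to int8, but it does not say which quantization scheme was used. The following is a minimal sketch of one common option, symmetric per-tensor quantization, applied to synthetic weights. The function names, the Gaussian weight distribution, and the scheme itself are assumptions made for illustration and are not taken from the MimickNet work.

```python
import random
import statistics


def quantize_int8(weights):
    """Symmetric per-tensor quantization (assumed scheme): floats -> ints in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 codes back to approximate float values."""
    return [q * scale for q in quantized]


def quantization_errors(weights):
    """Return per-weight error (restored - original) and the scale that was used."""
    quantized, scale = quantize_int8(weights)
    restored = dequantize(quantized, scale)
    return [r - w for r, w in zip(restored, weights)], scale


# Synthetic stand-in for a layer's float32 weights (hypothetical distribution).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10_000)]
errors, scale = quantization_errors(weights)

print("scale:", scale)
print("mean error:", statistics.fmean(errors))
print("error std dev:", statistics.pstdev(errors))
print("max |error|:", max(abs(e) for e in errors))
```

With round-to-nearest, each weight's error is bounded by half the scale, so the shape of the error distribution is governed mainly by the largest-magnitude weight in the tensor.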
Before export, fruit should be classified to improve quality, meet customer requirements and increase product value. This article proposes a method to identify defects on the surface of tomato skin using image processing techniques combined with deep learning models. The identification method includes the following main steps: (i) data collection (images of tomatoes: green, ripe, diseased, scratched), (ii) image labeling, (iii) dataset division, (iv) model training, (v) model selection and use. Using a Faster R-CNN model combined with ResNet-101, and testing with YOLOv5, the method identifies and classifies tomatoes that meet or fail export requirements with high accuracy (95.3%) and meets real-time requirements.
Interactive robots are intelligent auxiliary tools that can monitor sports events in real time and provide entertainment information services. This article studies the application of entertainment interactive robots based on deep learning in the referee assistance mode of sports competitions. The system first uses a camera to capture real-time images of volleyball matches, then preprocesses the images using image processing algorithms, and uses deep learning algorithms to recognize and track the ball and players in the images. By training the model, the system can accurately determine key information such as player actions. Deep learning technology is used to train interactive entertainment robots to identify and analyze key decision events in games. Through image recognition, action analysis and rule matching, the robot monitors the game in real time and identifies possible errors in the referee's decisions. The system generates penalty results based on the penalty rules and competition rules, and presents them to the referee and audience through display screens or sound prompts. In experimental verification, the volleyball match referee judgment assistance system based on image processing and deep learning performed excellently in terms of accuracy and speed. Compared to manual refereeing, the system can identify and track the ball and players more quickly, reduce the possibility of misjudgments, and improve the fairness of the game.
ISBN (digital): 9798350391770
ISBN (print): 9798350391787
Deep learning models have been a huge success in image recognition and can therefore be used for text generation. In the field of imaging science, captioning images and videos is regarded as an intellectually difficult job. VGG (Visual Geometry Group) is a standard deep Convolutional Neural Network (CNN) architecture with multiple layers; this work focuses on integrating a CNN for image feature extraction. Building on this underlying method, a second model is needed for caption generation. Here a Recurrent Neural Network (RNN) is used to generate captions from the extracted features. Long Short-Term Memory (LSTM) models, based on RNNs, and Bidirectional Encoder Representations from Transformers (BERT), based on Transformers, have been prominent in ensuring accurate results. The Flickr8k dataset is used, which provides a variety of information useful for model training. By testing on validation data with evaluation metrics, we analyze the effectiveness of different models at creating consistent and descriptive headlines. We extend our inquiry to title generation using transformer models, and also explore learning techniques for real-time title generation and delivery, using the OpenCV library available in Python to capture the camera output and display it on screen. The results show that the LSTM is the best model for captioning, with an accuracy of 65.07% after 300 epochs, while the BERT model reaches an accuracy of 31% after 2 epochs. The findings of this study not only contribute to advancing subtitle enhancement methodologies but also broaden the potential applications of deep learning techniques in this domain.
The incorporation of distributed deep learning for medical image processing in cloud settings is the subject of this study. The findings demonstrate the high viability and significant performance advantages of cloud-based distributed systems, notably significant processing time savings, outstanding diagnostic accuracy, and improved scalability. The consequences for security and privacy are discussed, with a focus on effective safeguards for private medical information. There is a gap in the literature on resource and cost-effectiveness optimization tactics for cloud-based systems. Future research should concentrate on resource optimization tactics for economic sustainability, study emerging security risks and privacy techniques, and incorporate real-world implementations in order to advance this topic. This study informs the use of distributed deep learning in cloud-based medical image processing and adds to the body of knowledge in healthcare technology.
The medical image is a set of all organizations, institutions, and resources whose primary goal is to improve health. The extensive growth of medical data increases the utility of machine learning and deep learning in...
Recent advances in camera design and imaging technology have enabled the capture of high-quality images using smartphones. However, due to the limited dynamic range of digital cameras, photographs captured in environments with highly imbalanced lighting are often of poor quality. To address this issue, most devices capture multi-exposure frames and then use a multi-exposure fusion method to merge those frames into a final fused image. Nevertheless, most traditional and current deep learning approaches are unsuitable for real-time applications on mobile devices due to their heavy computational and memory requirements. We propose MobileMEF, a new method for multi-exposure fusion based on an encoder-decoder deep learning architecture with efficient building blocks tailored for mobile devices. This efficient design makes MobileMEF capable of processing 4K resolution images in less than 2 s on mid-range smartphones. MobileMEF outperforms state-of-the-art techniques on full-reference quality measures and computational efficiency (runtime and memory usage), making it ideal for real-time applications on hardware-constrained devices. Our code is available at: https://***/LucasKirsten/MobileMEF.
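To make the idea of multi-exposure fusion concrete, here is a minimal sketch of a classical per-pixel weighted fusion, in which each frame is weighted by how close a pixel is to mid-gray. This is not the MobileMEF encoder-decoder architecture; it is a simple hand-written baseline, and the function names, the `sigma` value, and the tiny 2x2 frames are assumptions chosen for illustration.

```python
import math


def well_exposedness(value, sigma=0.2):
    """Gaussian weight that peaks at mid-gray (0.5) for intensities in [0, 1]."""
    return math.exp(-((value - 0.5) ** 2) / (2 * sigma ** 2))


def fuse_exposures(frames, sigma=0.2, eps=1e-12):
    """Fuse same-sized grayscale frames (lists of rows) using per-pixel weights."""
    height, width = len(frames[0]), len(frames[0][0])
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            pixels = [frame[y][x] for frame in frames]
            # eps keeps the normalization safe if every weight underflows to zero.
            weights = [well_exposedness(p, sigma) + eps for p in pixels]
            total = sum(weights)
            row.append(sum(w * p for w, p in zip(weights, pixels)) / total)
        fused.append(row)
    return fused


# Hypothetical under- and over-exposed 2x2 frames with intensities in [0, 1].
under = [[0.02, 0.10], [0.45, 0.05]]
over = [[0.55, 0.95], [0.98, 0.60]]
fused = fuse_exposures([under, over])
for row in fused:
    print([round(v, 3) for v in row])
```

Each fused pixel is a convex combination of the input pixels, so it always lies between the darkest and brightest exposure at that location and is pulled toward whichever frame is better exposed there.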
ISBN (print): 9781510626577
The proceedings contain 27 papers. The topics discussed include: fast multi-modal reuse: co-occurrence pre-trained deep learning models; deep learning for fast super-resolution reconstruction from multiple images; an efficient algorithm for fast block matching motion estimation using an adaptive threshold scheme; low exposure image frame generation algorithms for feature extraction and classification; parallel image and video self-recovery scheme with high recovery capability; learning optimal actions with imperfect images; CNN classification based on global and local features; Kalman-based motion estimation in video surveillance systems for safety applications; and recent advances in integrated photonic-electronic technologies for high-speed processing and communication circuits for light-based transducers.
ISBN (digital): 9798350319019
ISBN (print): 9798350319026
One of the most important occupations in India is agriculture. Among all crops, cotton is one of the most important and is crucial to the agricultural economy of the country. In India, 40-50 million people work in the cotton trade and processing, while six million farmers directly depend on the crop. Cotton leaf disease has grown in importance over the last few decades, resulting in losses to crops, farming operations, and financial resources. To address this problem, we first need to acquire different images of cotton plants. We can then use image processing techniques to analyze dead leaf images and extract features such as color, texture, and other characteristics with the assistance of a deep CNN model. In addition to being less expensive and more straightforward, automatic disease detection supports machine vision, which offers image-based automated process control and inspection. To properly train the algorithm, we will use a dataset of approximately 1752 images (approximately 440 in each class) classified into different categories according to the diseases. The model will be developed using tools available in Anaconda, such as Jupyter Notebook and Spyder. The results of this project will demonstrate whether using it in real-time applications is feasible and whether traditional or manual disease and pest identification could benefit from the use of IT-based solutions.
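The dataset figures above (about 1752 images at roughly 440 per class) suggest four classes. The following is a minimal sketch of a class-balanced (stratified) train/validation split over such a dataset. The abstract does not describe how the data were split, so the 80/20 ratio, the placeholder class names, and the exact count of 438 images per class are all assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split sample indices into train/validation sets, preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        val.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return train, val


# Placeholder class names; 4 classes x 438 images = 1752 (assumed even split).
class_names = ("class_a", "class_b", "class_c", "class_d")
labels = [name for name in class_names for _ in range(438)]

train_idx, val_idx = stratified_split(labels)
print("train:", len(train_idx), "validation:", len(val_idx))
```

Splitting within each class keeps the validation set balanced, which matters for a small dataset where a purely random split could leave one disease category under-represented.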
The coffee industry contributes to the economic restructuring of many countries, often associated with a closed process from production to consumption. The green coffee bean grading standard provided by the Specialty Coffee Association (SCA) is one of the best methods for grading coffee beans. Traditionally, the assessment of quality and classification of coffee beans relies on visual examination, which demands significant time and effort and is prone to inaccuracy. Deep learning technology, characterized by precision, velocity, and veracity, can be adopted to reduce human labor and improve the productivity, quality, and efficiency of these tasks. Therefore, this paper aims to address these issues by implementing deep learning to classify coffee bean quality in real time and integrating the system with a cloud-based solution. First, image processing and data augmentation techniques are employed to handle the coffee bean image data. Subsequently, the model is trained using YOLOv8, a framework for object recognition, and OpenCV, an open-source image processing technology, to classify coffee beans. Finally, an application is developed for real-time video and image-streaming coffee bean recognition using React Native, NodeJS, and Python. The experimental results provide empirical evidence that our system enhances accuracy and efficiency in classifying coffee bean quality across nine distinct varieties of coffee beans, with the time required reduced to a mere 1 to 3 seconds. Our system can be a useful solution for coffee producers, processors, and traders without relying on stationary equipment, especially in large farms or warehouses.