With advances in satellite image processing technology, remote sensing images are becoming more widely used in real-world scenes. However, due to the limitations of current remote sensing imaging technology and the influence of the external environment, the resolution of remote sensing images often fails to meet application requirements. To obtain high-resolution remote sensing images, image super-resolution methods are increasingly being applied to their recovery and reconstruction. These methods can overcome the current limitations of remote sensing image acquisition systems and acquisition environments, addressing poor image quality, blurred regions of interest, and the need for efficient image reconstruction, a research topic of significant relevance to image processing. In recent years, image super-resolution methods have made tremendous progress, driven by the continuous development of deep learning algorithms. In this paper, we provide a comprehensive overview and analysis of deep-learning-based image super-resolution methods. Specifically, we first introduce the research background and details of image super-resolution techniques. Second, we present important work on remote sensing image super-resolution, including training and testing datasets, image quality and model performance evaluation methods, model design principles, and related applications. Finally, we point out some existing problems and future directions in the field of remote sensing image super-resolution.
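The abstract mentions image quality evaluation methods without naming them. As one illustration, peak signal-to-noise ratio (PSNR) is a metric widely used to compare a reconstructed image with a reference; whether the survey covers it is an assumption here. The sketch below is a minimal pure-Python version in which the function name `psnr` and the nested-list image representation are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized grayscale images.

    Images are given as lists of rows of pixel values (an illustrative
    representation; real pipelines would typically use arrays).
    """
    squared_errors = [
        (r - s) ** 2
        for ref_row, rec_row in zip(reference, reconstructed)
        for r, s in zip(ref_row, rec_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR indicates a reconstruction closer to the reference, although it does not always track perceived quality.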
Industry 4.0 and recent progress in deep learning make it possible to solve problems that traditional methods could not. This is the case for anomaly detection, which has received particular attention from the machine learning community and has led to the use of generative adversarial networks (GANs). In this work, we propose using intermediate patches at the inference step, after a WGAN training procedure suited to highly imbalanced datasets, to make anomaly detection possible on full-size Printed Circuit Board Assembly (PCBA) images. We show that our technique can be used to support or replace current industrial image processing algorithms and can save time for industry.
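The idea of scoring patches rather than the whole image can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: `score_fn` is a hypothetical stand-in for an anomaly score derived from a trained WGAN, and the patch size, stride, and threshold are arbitrary parameters introduced for the example.

```python
def iter_patches(image, patch, stride):
    """Yield (top, left, patch_rows) for every full patch in a grayscale image.

    The image is a list of rows of pixel values (illustrative representation).
    """
    height, width = len(image), len(image[0])
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            rows = [row[left:left + patch] for row in image[top:top + patch]]
            yield top, left, rows


def flag_anomalous_patches(image, score_fn, patch, stride, threshold):
    """Return the (top, left) corners of patches whose score exceeds threshold.

    score_fn is a hypothetical callable mapping a patch to an anomaly score.
    """
    return [
        (top, left)
        for top, left, rows in iter_patches(image, patch, stride)
        if score_fn(rows) > threshold
    ]


def mean_intensity(rows):
    """Toy score used only for demonstration: the mean pixel value."""
    return sum(sum(row) for row in rows) / (len(rows) * len(rows[0]))


# Toy 4x4 image with a bright 2x2 block in the bottom-right corner.
toy_image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
flagged = flag_anomalous_patches(toy_image, mean_intensity, patch=2, stride=2, threshold=0.5)
```

Working on patches keeps the input size to the scoring model fixed, so a full-size image can be processed without resizing it.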
Deep learning is a branch of machine learning that gains power and flexibility by learning to represent real-world concepts and relations in terms of simpler concepts. We use deep learning fundamentals in this paper because the large amounts of data available support innovation, and we include deep neural networks because they offer high accuracy at lower computational cost. Natural Language Processing (NLP) and Generative Adversarial Networks (GANs) are methods that each contribute to text generation. Although they are two different technologies, they serve a common goal, since text generation plays a very important role in smart translation and dialogue systems. This review paper presents a model centered on text generation, with the aim of bringing together the different approaches to such a model. To address the problems of unnecessarily long texts and unsatisfactory feedback, NLP is used for text generation, and GANs are used for text generation models, image generation, and related tasks. Finally, the aim is to reduce time complexity and improve speed and efficiency, because learning about a problem plays a vital role in education for enhancing features.
ISBN: (print) 9781510635807
This paper presents the real-time implementation of deep neural networks on smartphone platforms to detect and classify diabetic retinopathy from eye fundus images. This implementation extends a previously reported implementation by considering all five stages of diabetic retinopathy. Two deep neural networks are first trained using transfer learning on fundus images from the EyePACS and APTOS datasets: one detects four stages, and the other further classifies the last stage into two more stages. It is then shown how these trained networks are turned into a smartphone app, in both Android and iOS versions, to process images captured by smartphone cameras in real time. The app is designed so that fundus images can be captured and processed in real time by smartphones fitted with commercially available lens attachments. The developed real-time smartphone app provides a cost-effective and widely accessible approach for conducting first-pass diabetic retinopathy eye exams in remote clinics or areas with limited access to fundus cameras and ophthalmologists.
Today, the number of vehicles using roads, including highways and single carriageways, is increasing. Road structure safety monitoring system that is safe for road users and also important to ensure long-term vehicle...
Because of the ongoing COVID-19 pandemic's impact on public health and safety, there is an immediate need for creative measures to limit transmission of the virus. In this paper, we present a computer-vision-based system for detecting social distancing violations and mask-wearing compliance in crowded public spaces. The system uses a combination of deep learning algorithms and image processing techniques to analyze camera feeds and identify violations in real time. We describe the architecture of the system, which includes a camera network, edge devices for image processing and analysis, and a central server for data management and reporting. We also evaluate the accuracy and efficiency of the system using a dataset of simulated crowd scenarios and real-world tests in public spaces.
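The abstract does not describe how violations are computed. One common approach, assumed here purely for illustration, is to take the centroid of each detected person and flag every pair closer than a minimum distance. The function name, the centroid input, and the distance units below are hypothetical; a real system would also need to map pixel coordinates to ground-plane distances.

```python
from itertools import combinations
from math import hypot


def find_distance_violations(centroids, min_distance):
    """Return index pairs (i, j) of people standing closer than min_distance.

    centroids is a list of (x, y) positions, assumed to already be expressed
    in the same units as min_distance (for example, metres on the ground plane).
    """
    violations = []
    for (i, (x1, y1)), (j, (x2, y2)) in combinations(enumerate(centroids), 2):
        if hypot(x1 - x2, y1 - y2) < min_distance:
            violations.append((i, j))
    return violations


# Three people: the first two are 1 unit apart, the third is far away.
pairs = find_distance_violations([(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)], min_distance=2.0)
```

The pairwise check is quadratic in the number of detections, which is usually acceptable for the number of people visible in a single camera frame.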
Background: In the past decade, deep learning has revolutionized medical image processing. This technique may advance laparoscopic surgery. The study objective was to evaluate whether deep learning networks accurately analyze videos of laparoscopic procedures. Methods: Medline, Embase, IEEE Xplore, and the Web of Science databases were searched from January 2012 to May 5, 2020. Selected studies tested a deep learning model, specifically convolutional neural networks, for video analysis of laparoscopic surgery. Study characteristics including the dataset source, type of operation, number of videos, and prediction application were compared. A random effects model was used to estimate the pooled sensitivity and specificity of the computer algorithms. Summary receiver operating characteristic curves were calculated by the bivariate model of Reitsma. Results: Thirty-two of the 508 studies identified met the inclusion criteria. Applications included instrument recognition and detection (45%), phase recognition (20%), anatomy recognition and detection (15%), action recognition (13%), surgery time prediction (5%), and gauze recognition (3%). The most commonly tested procedures were cholecystectomy (51%) and gynecological procedures, mainly hysterectomy and myomectomy (26%). A total of 3004 videos were analyzed. Publications in clinical journals increased in 2020 compared to bio-computational ones. Four studies provided enough data to construct 8 contingency tables, enabling calculation of test accuracy with a pooled sensitivity of 0.93 (95% CI 0.85-0.97) and specificity of 0.96 (95% CI 0.84-0.99). Yet the majority of papers had a high risk of bias. Conclusions: Deep learning research holds potential in laparoscopic surgery but is limited in methodologies. Clinicians may advance AI in surgery, specifically by offering standardized visual databases and reporting.
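For readers unfamiliar with the quantities being pooled, the sketch below shows how sensitivity and specificity are obtained from a single 2x2 contingency table. It does not implement the random effects pooling or the bivariate model of Reitsma used in the study, and the counts in the example are made up for illustration rather than taken from any of the reviewed papers.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual positives that the model correctly identifies."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of actual negatives that the model correctly identifies."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical contingency table (illustrative counts only).
table = {"tp": 45, "fn": 5, "tn": 80, "fp": 20}
sens = sensitivity(table["tp"], table["fn"])
spec = specificity(table["tn"], table["fp"])
```

A meta-analysis combines such per-table estimates across studies, weighting them and modelling between-study variation, which is why the pooled values come with confidence intervals.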
ISBN: (digital) 9798350368604
ISBN: (print) 9798350368611
Recently, state space models (SSMs) based on efficient hardware-aware design, such as the Mamba deep learning model, have demonstrated exceptional efficacy in visual feature recognition. However, few studies have explored the potential of this novel architecture for pose estimation tasks. In this paper, we propose ViMPose, a baseline model for human pose estimation based on Vision Mamba. We demonstrate the model's excellent performance in pose estimation from multiple aspects, including model simplicity, inference speed, and lightweight parameters. Specifically, ViMPose employs a new backbone with bidirectional Mamba blocks to extract features from given human instances and uses a lightweight decoder for human pose estimation. Exploiting the scalable capacity and lightweight nature of Vision Mamba, ViMPose achieves high recognition accuracy with fewer parameters, striking a new balance between real-time efficiency and performance. Furthermore, ViMPose exhibits lower memory usage when processing high-resolution image inputs. Experiments on the COCO dataset highlight the ViMPose model's effectiveness and considerable promise for human pose estimation tasks.
Although deep learning has achieved remarkable successes over the past years, few reports have been published about applying deep neural networks to Wireless Sensor Networks (WSNs) for image target recognition, where data, energy, and computation resources are limited. In this work, a Cost-Effective Domain Generalization (CEDG) algorithm is proposed to train an efficient network with minimal labor requirements. CEDG transfers networks from a publicly available source domain to an application-specific target domain through an automatically allocated synthetic domain. The target domain is isolated from parameter tuning and used for model selection and testing only. The target domain differs significantly from the source domain because it has new target categories and consists of low-quality images that are out of focus, low in resolution, low in illumination, and low in photographing angle. The trained network requires about 7 M multiplications per prediction (ResNet-20 requires about 41 M), which is small enough to allow a digital signal processor chip to perform real-time recognition in our WSN. The category-level averaged error on the unseen and unbalanced target domain has been decreased by 41.12%. (c) 2020 Published by Elsevier B.V.
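The abstract does not define the category-level averaged error. A plausible reading, assumed here, is the mean of the per-category error rates, which weights every category equally regardless of how many samples it has. The sketch below follows that assumption; the function name and the toy labels are illustrative.

```python
from collections import defaultdict


def category_averaged_error(labels, predictions):
    """Mean of per-category error rates (assumed definition, see lead-in)."""
    totals = defaultdict(int)
    wrong = defaultdict(int)
    for label, prediction in zip(labels, predictions):
        totals[label] += 1
        if label != prediction:
            wrong[label] += 1
    return sum(wrong[c] / totals[c] for c in totals) / len(totals)


# Unbalanced toy data: four samples of category "a", one of category "b".
labels = ["a", "a", "a", "a", "b"]
predictions = ["a", "a", "a", "a", "a"]
error = category_averaged_error(labels, predictions)
```

On this toy data the overall error rate is 1 in 5, but the category-averaged error is higher because the single missed sample makes up all of category "b". This is why such a metric is informative on unbalanced data.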