This research introduces "Jaddah," an innovative AI-based system for the automated detection of road infrastructure defects using advanced computer vision and machine learning techniques. The system addresses...
Author:
Garcia, Ernest, Emory Univ
Sch Med Dept Radiol & Imaging Sci, 101 Woodruff Circle, Room 1203, Atlanta, GA 30322, USA
Natural language processing (NLP) offers many opportunities in nuclear cardiology. These include converting nuclear cardiology imaging reports into digital, searchable information that can be used as Big Data for machine learning and registries. Another major NLP application, supported by AI, is automatically translating MPI image features directly into nuclear cardiology reports. This review describes the symbiotic relationship between AI and NLP: NLP is being used to facilitate AI applications, and AI techniques are being used to facilitate NLP. The article reviews the fundamentals of NLP and describes various conventional and AI techniques that have been applied in imaging. Key nuclear cardiology applications are reviewed, such as the conversion of MPI free-text reports to digital documents and the direct conversion of MPI images into structured medical reports.
This paper proposes a method based on the contour method to solve the problem of difficulty in measuring the wear state of diamond beaded wire during processing. The edge contour image of the diamond beaded wire was c...
ISBN:
(Print) 9798350395334; 9798350395327
Recent years have seen a rise in the use of image processing, computer vision, and machine learning in medical imaging, offering more accurate diagnoses at lower labor cost while reducing the scope for human error. Dental X-ray images are often challenging and time-consuming to study, which makes diagnosis more arduous. Furthermore, only an experienced clinician can be expected to provide an accurate diagnosis from a two-dimensional X-ray image. Manual investigation of dental diseases and abnormalities is still the most prevalent method in dentistry. This article introduces a novel method to automate the process of obtaining an initial diagnosis from orthopantomogram (OPG) X-rays by using state-of-the-art object detection models, which are currently proving effective in medical image diagnosis. By comparing popular object detection frameworks, we aim to determine the computer vision model that gives the most promising results in accurately and efficiently diagnosing dental abnormalities and identifying treatments from a dental X-ray image.
Machine vision applications are commonly utilised in manufacturing lines as low-cost, high-precision measuring devices. Output facilities can accomplish high production numbers without mistakes thanks to these solutio...
For safety and security reasons, the indoor/outdoor working environments of various industries require the use of many cameras for automated surveillance. In such a context, a major challenge for automated monitoring sy...
Out-of-distribution (OOD) learning presents a major challenge in machine learning, as models must generalize effectively to previously unseen data. This challenge is prevalent in deep learning models, which tend to focus on the most dominant features in images. This narrow focus impedes OOD learning, where critical features are concealed or absent during testing, leading to reduced prediction accuracy. To address this issue, we introduce a novel data augmentation approach termed Dominant Feature Masking (DFM), inspired by human visual holistic processing. DFM strategically conceals and reveals the most prominent features within images, allowing neural networks to capture both dominant and non-dominant attributes simultaneously, thereby enhancing adaptability to OOD data. We evaluated DFM using a novel set of learning challenges termed Versatile Evaluation Benchmark (VEB), which assesses model performance on three distinct tasks: (i) augmented MNIST images to test resilience against diverse transformations; (ii) a novel dataset of unseen image classes to examine performance on new instances within familiar categories; and (iii) a dataset created by DALL-E to challenge class differentiation with artificially mixed features. Our results demonstrate that DFM significantly improves OOD generalization compared to traditional augmentation techniques, achieving marked enhancements across various conditions without compromising in-distribution testing accuracy. These findings underscore the potential of DFM to improve the performance of computer vision systems in various real-world scenarios, making them more robust and adaptable to unexpected data variations. By leveraging VEB, researchers will gain a deeper understanding of their models' generalization performance, ensuring that CNNs are well equipped to handle the complexities of real-world applications. The source code and VEB datasets are available at https://***/Deepvisionary/DFM.
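As a rough illustration of the masking idea, the sketch below hides the top fraction of pixels ranked by a saliency map. The abstract does not say how DFM identifies dominant features, so the `saliency` input, the `fraction` and `fill` parameters, and the function name are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def mask_dominant_features(image, saliency, fraction=0.2, fill=0.0):
    """Return a copy of `image` with its most salient pixels set to `fill`.

    image:    (H, W) or (H, W, C) array.
    saliency: (H, W) array, higher values meaning "more dominant".
              How this map is obtained is an assumption left to the caller.
    fraction: share of pixels to conceal (hypothetical parameter).
    """
    if saliency.shape != image.shape[:2]:
        raise ValueError("saliency must match the image height and width")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")

    masked = image.copy()
    k = int(round(fraction * saliency.size))
    if k == 0:
        return masked

    # Flat indices of the k largest saliency values (order among them is irrelevant).
    flat_idx = np.argpartition(saliency.ravel(), -k)[-k:]
    rows, cols = np.unravel_index(flat_idx, saliency.shape)
    masked[rows, cols] = fill  # for (H, W, C) images this fills every channel
    return masked
```

In an augmentation pipeline, a training loop could alternate between the original and the masked copy of each image, so the network sees samples with and without the dominant region.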
Generative adversarial networks (GANs) have recently become a hot research topic; however, they have been studied since 2014, and a large number of algorithms have been proposed. Nevertheless, few comprehensive studies explain the connections among different GAN variants and how they have evolved. In this paper, we attempt to provide a review of the various GAN methods from the perspectives of algorithms, theory, and applications. First, the motivations, mathematical representations, and structures of most GAN algorithms are introduced in detail, and we compare their commonalities and differences. Second, theoretical issues related to GANs are investigated. Finally, typical applications of GANs in image processing and computer vision, natural language processing, music, speech and audio, the medical field, and data science are discussed.
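For orientation, the original 2014 GAN formulation is usually written as the two-player minimax objective below, with generator $G$, discriminator $D$, data distribution $p_{\text{data}}$, and noise prior $p_z$. This is standard background added here for context; it is not quoted from the review, whose own notation may differ.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator maximizes $V$ by scoring real samples high and generated samples low, while the generator minimizes it by producing samples that the discriminator scores as real.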
Biological vision systems inspire processing methods in computer vision applications. This paper applies insights from vision systems in hardware and presents a pixel-parallel, reconfigurable, layer-based hierarchical architecture for smart image sensors. The architecture aims to bring computation close to the sensor to achieve high acceleration for different machine vision applications while consuming low power. We logically divide the image into multiple regions and perform pixel-level and region-level processing after removing spatiotemporal redundancy. The processors use bio-inspired algorithms to activate the regions that contain a region of interest in the scene. The hierarchical processing breaks with traditional sequential image processing and introduces parallelism for machine vision applications. We also make the hardware design reconfigurable even after fabrication, so that the hardware can be reused for different applications. Simulation results show that the area overhead and power penalty for adding reconfigurable features stay in an acceptable range. We focus on maximizing the operating speed and obtain 800 MHz. In addition, the design saves 84.01% and 96.91% of dynamic power at the first and second stages of the hierarchy by removing redundant information. Furthermore, because high-level reasoning is deployed sequentially only on the selected regions of the image, executing a complex task in real time becomes computationally inexpensive.
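A software analogy may help picture the region-activation step. The sketch below divides a grayscale frame into square blocks and marks a block as active when enough of its pixels changed since the previous frame, which is one simple way to discard temporal redundancy. The block size, both thresholds, and the function name are assumptions for illustration; the paper describes a hardware design whose actual algorithms are not given in the abstract.

```python
import numpy as np


def active_regions(prev_frame, frame, block=8, pixel_thresh=10, region_thresh=0.1):
    """Return a boolean grid marking which blocks changed between two frames.

    prev_frame, frame: 2-D grayscale arrays of identical shape.
    block:             side length of each square region (hypothetical value).
    pixel_thresh:      minimum absolute intensity change for a pixel to count.
    region_thresh:     fraction of changed pixels needed to activate a region.
    """
    if prev_frame.shape != frame.shape or frame.ndim != 2:
        raise ValueError("frames must be 2-D arrays of the same shape")
    h, w = frame.shape
    if h % block or w % block:
        raise ValueError("frame dimensions must be multiples of the block size")

    # Pixel level: flag pixels whose intensity changed noticeably.
    # Cast to a signed type so uint8 subtraction cannot wrap around.
    diff = np.abs(frame.astype(np.int32) - prev_frame.astype(np.int32))
    changed = diff > pixel_thresh

    # Region level: fraction of changed pixels inside each block.
    blocks = changed.reshape(h // block, block, w // block, block)
    changed_fraction = blocks.mean(axis=(1, 3))
    return changed_fraction > region_thresh
```

Only the blocks flagged `True` would then be passed to more expensive processing, mirroring the idea of running high-level reasoning on selected regions.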
Hand gesture recognition plays an important role in developing effective human-machine interfaces (HMIs) that enable direct communication between humans and machines. In real-time scenarios, however, it is difficult to identify the correct hand gesture to control an application while the hands are moving. To address this issue, this work presents a low-cost human-computer interface (HCI) based on a hand gesture recognition system for real-time scenarios. The system consists of six stages: (1) hand detection, (2) gesture segmentation, (3) feature extraction and gesture classification using five pre-trained convolutional neural network (CNN) models and a vision transformer (ViT), (4) building an interactive human-machine interface (HMI), (5) development of a gesture-controlled virtual mouse, and (6) smoothing of the virtual mouse pointer using a Kalman filter. Five pre-trained CNN models (VGG16, VGG19, ResNet50, ResNet101, and Inception-v1) and ViT have been employed to classify hand gesture images. Two multi-class datasets (one public and one custom) have been used to validate the models. Inception-v1 shows significantly better classification performance than the other four CNN models and ViT in terms of accuracy, precision, recall, and F-score. We have also extended the system to control multimedia applications (such as VLC player, an audio player, and the 2D Super-Mario-Bros game) with different customized gesture commands in real-time scenarios. The average speed of the system reaches 25 fps (frames per second), which meets the requirements for real-time use. The proposed gesture control system achieves an average response time in milliseconds for each control, which makes it suitable for real-time use. This prototype will benefit physically disabled people interacting with desktops.
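Stage (6), pointer smoothing, can be pictured with a small constant-velocity Kalman filter over the 2-D pointer position. This is a minimal sketch under stated assumptions: the state layout, the noise variances, and the class name `PointerKalman` are hypothetical, and the time step simply reuses the 25 fps figure reported above. The abstract does not describe the authors' actual filter configuration.

```python
import numpy as np


class PointerKalman:
    """Constant-velocity Kalman filter for a 2-D pointer (illustrative only)."""

    def __init__(self, dt=1 / 25, process_var=1.0, meas_var=25.0):
        # State vector: [x, y, vx, vy]; position advances by velocity * dt.
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        # Only the position is observed.
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.Q = process_var * np.eye(4)  # process noise (assumed value)
        self.R = meas_var * np.eye(2)     # measurement noise (assumed value)
        self.P = 1e3 * np.eye(4)          # large initial uncertainty
        self.x = None                     # state is set by the first measurement

    def update(self, measured_xy):
        """Feed one raw pointer position and return the smoothed position."""
        z = np.asarray(measured_xy, dtype=float)
        if self.x is None:
            self.x = np.array([z[0], z[1], 0.0, 0.0])
            return z.copy()

        # Predict step.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q

        # Update step.
        innovation = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2].copy()
```

Calling `update` once per frame with the detected fingertip position yields a pointer that follows the hand while damping sudden single-frame jumps. Raising `meas_var` relative to `process_var` smooths more at the cost of added lag.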