ISBN (print): 9798350307184
Medical image arbitrary-scale super-resolution (MI-ASSR) has recently gained widespread attention, aiming to super-sample medical volumes at arbitrary scales via a single model. However, existing MI-ASSR methods face two major limitations: (i) reliance on high-resolution (HR) volumes and (ii) limited generalization ability, which restricts their applications in various scenarios. To overcome these limitations, we propose the Cube-based Neural Radiance Field (CuNeRF), a zero-shot MI-ASSR framework able to yield medical images at arbitrary scales and free viewpoints in a continuous domain. Unlike existing MISR methods that only fit the mapping between low-resolution (LR) and HR volumes, CuNeRF focuses on building a continuous volumetric representation from each LR volume without knowledge of the corresponding HR one. This is achieved by the proposed differentiable modules: cube-based sampling, isotropic volume rendering, and cube-based hierarchical rendering. Through extensive experiments on magnetic resonance imaging (MRI) and computed tomography (CT) modalities, we demonstrate that CuNeRF can synthesize high-quality SR medical images, outperforming state-of-the-art MISR methods with better visual verisimilitude and fewer objectionable artifacts. Compared to existing MISR methods, CuNeRF is more applicable in practice.
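The isotropic volume rendering module named above follows the standard NeRF-style emission-absorption integral along a ray. A minimal NumPy sketch of that compositing step (variable names and sample values are illustrative, not CuNeRF's actual implementation):

```python
import numpy as np

def render_ray(sigmas, values, deltas):
    """Composite per-sample densities and intensities along one ray using
    standard emission-absorption volume rendering, as in NeRF-style models."""
    alphas = 1.0 - np.exp(-sigmas * deltas)                          # per-segment opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))   # accumulated transmittance
    weights = trans * alphas                                         # contribution of each sample
    return np.sum(weights * values), weights

# Toy samples along a single ray
sigmas = np.array([0.1, 1.5, 3.0, 0.2])   # densities
values = np.array([0.2, 0.8, 0.9, 0.1])   # intensities at the samples
deltas = np.full(4, 0.25)                 # spacing between samples
intensity, w = render_ray(sigmas, values, deltas)
```

The rendering weights are also what a hierarchical scheme (such as the cube-based hierarchical rendering mentioned above) would use to place a second, finer round of samples.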
Deep image prior (DIP), proposed in recent research, has revealed the inherent ability of convolutional neural networks (CNNs) to capture substantial low-level image statistics priors. This framework efficiently addresses inverse problems in image processing and has found extensive applications in various domains. In this paper, we propose the self-reinforcement deep image prior (SDIP) as an improved version of the original DIP. We observed that the changes in the DIP network's input and output are highly correlated during each iteration. SDIP exploits this observation in a reinforcement-learning manner: the current iteration's output is used by a steering algorithm to update the network input for the next iteration, guiding the algorithm towards improved results. Experimental results across multiple applications demonstrate that the proposed SDIP framework improves upon the original DIP method, especially when the corresponding inverse problem is highly ill-posed.
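The described feedback loop can be sketched in a few lines. The toy below stands in a smoothing operator for the DIP network and uses a simple convex-blend steering rule; both the stand-in "network" and the step size `alpha` are assumptions for illustration, not the paper's steering algorithm:

```python
import numpy as np

def smooth(z):
    """Toy stand-in for the DIP network: a 1-D moving average."""
    return np.convolve(z, np.array([0.25, 0.5, 0.25]), mode="same")

def sdip_like(shape, n_iter=50, alpha=0.3, seed=0):
    """Sketch of the SDIP idea: instead of keeping the network input fixed,
    steer it toward the current output after every iteration."""
    z = np.random.default_rng(seed).normal(size=shape)  # DIP-style random input
    out = smooth(z)
    for _ in range(n_iter):
        out = smooth(z)                     # "network" forward pass
        z = (1 - alpha) * z + alpha * out   # steering update: feed output back
    return out

x = sdip_like((64,))
```

In a real setting, `smooth` would be a CNN trained against the degraded measurement at each iteration; only the input-update line is the SDIP-specific ingredient.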
Vision in the deep sea is attracting increasing interest from many fields, as the deep seafloor represents the largest surface portion on Earth. Unlike common shallow underwater imaging, deep sea imaging requires artificial lighting to illuminate the scene in perpetual darkness. Deep sea images suffer from degradation caused by scattering, attenuation and the effects of artificial light sources, and have a very different appearance from images taken in shallow water or on land. This impairs transferring current vision methods to deep sea applications. Developing adequate algorithms requires data with ground truth in order to evaluate the methods. However, it is practically impossible to capture the same deep sea scene without water or artificial lighting effects. This situation impairs progress in deep sea vision research, where synthesized images with ground truth could be a good solution. Most current methods either render a virtual 3D model or use atmospheric image formation models to make real-world scenes appear as if in shallow water illuminated by sunlight. Currently, there is a lack of image datasets dedicated to deep sea vision evaluation. This paper introduces a pipeline to synthesize deep sea images using existing real-world RGB-D benchmarks, and exemplarily generates deep sea twin datasets for the well-known Middlebury stereo benchmarks. They can be used both for testing underwater stereo matching methods and for training and evaluating underwater image processing algorithms. This work aims towards establishing an image benchmark intended particularly for deep sea vision developments.
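The image formation models mentioned above typically combine per-channel attenuation with backscattered veiling light: I_c = J_c * exp(-beta_c * d) + B_c * (1 - exp(-beta_c * d)). A minimal sketch applying this to an RGB-D image; the coefficients `beta` and veiling light `veil` are illustrative values, not those of the paper's pipeline:

```python
import numpy as np

def synthesize_underwater(rgb, depth,
                          beta=(0.40, 0.12, 0.08),   # red attenuates fastest
                          veil=(0.05, 0.25, 0.35)):  # blue-green veiling light
    """Apply a simple attenuation + backscatter formation model to an
    RGB-D image: I = J * t + B * (1 - t), with t = exp(-beta * depth)."""
    beta, veil = np.asarray(beta), np.asarray(veil)
    t = np.exp(-depth[..., None] * beta)   # per-pixel, per-channel transmission
    return rgb * t + veil * (1.0 - t)

rgb = np.full((4, 4, 3), 0.8)                        # flat gray scene
depth = np.linspace(0.5, 5.0, 16).reshape(4, 4)      # meters from the camera
out = synthesize_underwater(rgb, depth)
```

A deep-sea variant would additionally model the falloff and cone of an artificial light source rather than uniform sunlight, which is the harder part of the paper's pipeline.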
Computer vision is a subfield of artificial intelligence that relies on training computers to obtain a high level of understanding of vision data. A computer vision system aims at identifying objects through the acqui...
ISBN (print): 9783031667428; 9783031667435
Body movements are an essential part of non-verbal communication, as they help to express and interpret human emotions. The potential of Body Emotion Recognition (BER) is immense, as it can provide insights into user preferences, automate real-time exchanges and enable machines to respond to human emotions. BER finds applications in customer service, healthcare, entertainment, emotion-aware robots, and other areas. While facial expression-based techniques are extensively researched, detecting emotions from body movements in the real world presents several challenges, including variations in body posture, occlusions, and background. Recent research has established the efficacy of transformer deep-learning models beyond the language domain for solving video- and image-related problems. A key component of transformers is the self-attention mechanism, which captures relationships among features across different spatial locations, allowing contextual information extraction. In this study, we aim to understand the role of body movements in emotion expression and to explore the use of transformer networks for body emotion recognition. Our method proposes a novel linear projection function for the visual transformer, which transforms 2D joint coordinates into a conventional matrix representation. Using an original method of contextual information learning, the developed approach enables more accurate recognition of emotions by establishing unique correlations between an individual's body motions over time. Our results demonstrate that the self-attention mechanism achieves high accuracy in predicting emotions from body movements, surpassing the performance of other recent deep-learning methods. In addition, the impact of dataset size and frame rate on classification performance is analyzed.
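A linear projection of 2D joint coordinates into transformer tokens can look like the following sketch, where each frame's joints are flattened and multiplied by a learned matrix; the joint count, embedding dimension and the per-frame tokenization are assumptions for illustration, not the paper's exact function:

```python
import numpy as np

def joints_to_tokens(joints, w):
    """Project per-frame 2-D joint coordinates into token embeddings.
    joints: (frames, num_joints, 2); w: (num_joints * 2, embed_dim)."""
    t, j, _ = joints.shape
    flat = joints.reshape(t, j * 2)   # one coordinate vector per frame
    return flat @ w                   # (frames, embed_dim) token sequence

rng = np.random.default_rng(0)
joints = rng.normal(size=(30, 17, 2))          # 30 frames, 17 COCO-style joints
w = rng.normal(size=(34, 64)) / np.sqrt(34)    # assumed embedding dim of 64
tokens = joints_to_tokens(joints, w)
```

The resulting (frames, embed_dim) matrix is the "conventional matrix representation" a standard transformer encoder expects, so self-attention can then relate poses across time.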
Fusarium wilt disease (FWD) caused by Fusarium oxysporum f. sp. ciceris (Padwick) is the most important disease affecting chickpea yield among biotic stresses. Fusarium wilt is a vascular disease that causes permanent ...
ISBN (print): 9781665458221
While Deep Neural Network (DNN) models have transformed machine vision capabilities, their extremely high computational complexity and model sizes present a formidable deployment roadblock for AIoT applications. We show that the complexity-vs-accuracy-vs-communication tradeoffs for such DNN models can be significantly addressed via a novel, lightweight form of "collaborative machine intelligence" that requires only runtime changes to the inference process. In our proposed approach, called ComAI, the DNN pipelines of different vision sensors share intermediate processing state with one another, effectively providing hints about objects located within their mutually overlapping Fields-of-View (FoVs). ComAI uses two novel techniques: (a) a secondary shallow ML model that uses features from early layers of a peer DNN to predict object confidence values in the image, and (b) a pipelined sharing of such confidence values by collaborators, which is then used to bias a reference DNN's outputs. We demonstrate that ComAI (a) can boost the accuracy (recall) of DNN inference by 20-50%, (b) works across heterogeneous DNN models and deployments, and (c) incurs negligible processing and bandwidth overheads compared to non-collaborative baselines.
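The second technique, biasing a reference DNN's outputs with a peer's confidence values, can be illustrated with a simple convex blend; the blend form and the `weight` parameter are assumptions for illustration, the paper's biasing mechanism is more involved:

```python
import numpy as np

def bias_with_peer(local_scores, peer_conf, weight=0.5):
    """Blend a reference DNN's per-object confidence scores with hints from
    a collaborating sensor whose field of view overlaps this one."""
    return (1 - weight) * local_scores + weight * peer_conf

local = np.array([0.30, 0.80, 0.10])  # local detector scores per object
peer = np.array([0.90, 0.75, 0.05])   # peer's confidence for the same objects
fused = bias_with_peer(local, peer)
```

Here the first object, which the local detector would have missed at a 0.5 threshold, is recovered thanks to the peer's strong confidence, which is the recall boost the abstract describes.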
Authors:
Alyami, Jaber (King Abdulaziz Univ: Fac Appl Med Sci, Dept Radiol Sci; King Fahd Med Res Ctr; Smart Med Imaging Res Grp; Ctr Modern Math Sci & its Applicat, Med Imaging & Artificial Intelligence Res Unit; Jeddah 21589, Saudi Arabia)
Radiological image analysis using machine learning has been extensively applied to enhance biopsy diagnosis accuracy and assist radiologists with precise treatment. With improvements in the medical industry and its technology, computer-aided diagnosis (CAD) systems have become essential in detecting early cancer signs in patients that could not be observed physically, without introducing errors. CAD is a detection system that combines artificial-intelligence techniques with image processing applications through computer vision. Several manual procedures for cancer diagnosis, such as CT scans, radiography, and MRI scans, are reported in the state of the art; still, they are costly, time-consuming, and diagnose cancer at late stages. In this research, numerous state-of-the-art approaches to multi-organ detection using clinical practices are evaluated, covering cancer, neurological, psychiatric, cardiovascular and abdominal imaging. Additionally, numerous sound approaches are clustered together and their results are assessed and compared on benchmark datasets. Standard metrics such as accuracy, sensitivity, specificity and false-positive rate are employed to check the validity of the current models reported in the literature. Finally, existing issues are highlighted and possible directions for future work are suggested.
In recent years, the field of image captioning has gained substantial attention, posing a complex challenge that necessitates the integration of computer vision (CV), natural language processing (NLP), and machine lea...
ISBN (digital): 9783031611377
ISBN (print): 9783031611360; 9783031611377
Convolutional neural networks (CNNs) play an important role in an increasing number of image processing tasks. There is an obvious demand to improve their classification performance and efficiency. Current research in this area tends to focus on developing increasingly complex models and algorithms to achieve this end, while research into computer vision techniques and data augmentation tends to be neglected. This paper demonstrates that even a very simple CNN model achieves high performance in surface defect classification on the NEU dataset thanks to image preprocessing and data augmentation. The initial F1-score of 0.9646 without image preprocessing increases to 0.9727 when preprocessing is carried out. The simple CNN then achieves an F1-score of 0.9854 after data augmentation.
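Label-preserving augmentations of the kind that drive such F1 gains can be sketched as follows; the specific transforms (flips and 90-degree rotations) and probabilities are illustrative assumptions, not necessarily the ones used in the paper:

```python
import numpy as np

def augment(img, rng):
    """Apply random label-preserving transforms to a single-channel image:
    horizontal/vertical flips and a random multiple of 90-degree rotation."""
    if rng.random() < 0.5:
        img = img[:, ::-1]          # horizontal flip
    if rng.random() < 0.5:
        img = img[::-1, :]          # vertical flip
    k = int(rng.integers(0, 4))     # 0, 90, 180 or 270 degrees
    return np.rot90(img, k)

rng = np.random.default_rng(0)
img = np.arange(64, dtype=float).reshape(8, 8)          # toy defect patch
batch = [augment(img, rng) for _ in range(4)]           # augmented copies
```

Because surface-defect classes are typically orientation-invariant, these geometric transforms multiply the effective training set size without changing any labels.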