With the advancement of digital imaging in agriculture and crop production, ideas are being adopted for real-time health status. In all parts of the plant, the leaf is a direct indicator of its health status, so the u...
ISBN (digital): 9789464593617
ISBN (print): 9798331519773
Pneumonia, a lung infection typically caused by bacteria, requires swift and accurate diagnosis, especially in critical care. Optical endomicroscopy (OEM) facilitates real-time acquisition of in vivo and in situ optical biopsies, aiding in the quick identification of bacteria. However, the challenge of visually analyzing the vast number of images generated by OEM in real time can delay necessary treatments. To address this, we introduce Back2Seg, a novel approach for the segmentation of bacteria in OEM image sequences. Prior research mainly focused on exploiting bacteria motion or relied on less accurate unsupervised background estimation methods. To improve background estimation, and thus bacteria segmentation, Back2Seg employs a two-stage architecture: one sub-network estimates the background using a Convolutional Neural Network (CNN)-Transformer architecture, and the other is a dual-input network that processes both the original and the estimated background sequences to accurately segment the bacteria. Our experiments demonstrate that Back2Seg effectively integrates the advantages of both supervised and unsupervised learning techniques, showing a 4.62% increase in correlation with annotations over unsupervised models and a 1.05 reduction in root mean squared error (RMSE), outperforming the top supervised approach.
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Snapshot compressive imaging (SCI) is a compressed sensing (CS)-based high-speed imaging modality. Recent efforts have explored recovering the underlying 3D representation from a single SCI image using neural radiance fields (NeRF), yet training time, rendering cost, and reconstruction quality have limited wider adoption. This paper introduces SCI-Gaussian, the first 3D-aware SCI reconstruction based on 3D Gaussian splatting (3D-GS). This method utilizes an explicit 3D representation to achieve efficient and high-quality scene reconstruction. The motivation stems from the highly efficient representation and surprising quality of 3D-GS; however, when applied to an SCI system, 3D-GS encounters difficulties in generating point initialization for explicit Gaussians and in recovering accurate poses from a single SCI measured image. Specifically, we initialize these Gaussians by sampling a coarsely trained NeRF at various hash structures, then model the physical formation of the SCI measurement and jointly optimize Gaussians and camera trajectories with a bundle adjustment formulation during exposure time. Extensive experiments on synthetic and real-world datasets demonstrate that SCI-Gaussian outperforms the state-of-the-art (SOTA) methods, achieving comparable or better results with 10× faster training and 1000× faster rendering than the most recent NeRF-based method.
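The abstract does not spell out the measurement model, so the following is only a minimal sketch of the commonly used SCI image formation, in which each high-speed frame is modulated by its own coding mask and the modulated frames are summed into one snapshot. The function name `sci_measurement` and the array shapes are assumptions made for illustration; this is not SCI-Gaussian's implementation and it omits sensor noise.

```python
import numpy as np


def sci_measurement(frames, masks):
    """Simulate one SCI snapshot (illustrative, noise-free).

    frames: array of shape (T, H, W), the T high-speed frames within one exposure.
    masks:  array of shape (T, H, W), the per-frame modulation masks.
    Returns the (H, W) snapshot: the sum over time of mask * frame.
    """
    frames = np.asarray(frames, dtype=float)
    masks = np.asarray(masks, dtype=float)
    if frames.shape != masks.shape:
        raise ValueError("frames and masks must have the same shape")
    # Element-wise modulation followed by temporal integration on the sensor.
    return (frames * masks).sum(axis=0)


# Tiny example: two 2x2 frames with complementary binary masks.
frames = np.arange(8).reshape(2, 2, 2)
masks = np.array([[[1, 0], [0, 1]],
                  [[0, 1], [1, 0]]])
snapshot = sci_measurement(frames, masks)
```

In a reconstruction pipeline, the same forward model would be applied to rendered frames so that the result can be compared against the captured snapshot.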
In the context of big data analytics, this study examines the use of algorithms based on deep learning for feature extraction. Traditional methods usually have trouble sifting through the complexity and volume of data...
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
In this paper, we address the problem of reconstructing geometry, material and illumination from multi-view images of a scene captured in an unknown environment. Compared to prevalent NeRF-based methods, which are typically hindered by slow training and time-intensive evaluation, our approach capitalizes on a recent advancement in volume rendering: point-based 3D Gaussian Splatting (3DGS). However, the inherent lack of geometric priors in this point-based representation and the material-lighting ambiguity of the radiance field make it intractable for inverse rendering tasks. To overcome this challenge, our method adopts a two-stage approach, which we call InvGS. In the first stage, our method optimizes the original parameters of the initial 3DGS point cloud using flattening and alignment regularization, bringing the Gaussians closer to the object surfaces to reconstruct a more compact geometry. In the second stage, our method jointly optimizes the lighting and material parameters of the 3DGS point cloud through a differentiable rendering process that follows the physically-based rendering equation. Benefiting from the fast rasterization of 3DGS and real-time techniques from physically-based rendering, InvGS can reasonably shade each Gaussian under image-based HDR illumination without compromising real-time performance. We demonstrate that our method is comparable to the state of the art and even outperforms classic and deep learning-based approaches.
ISBN (print): 9789881476890
Video surveillance has drawn much interest for monitoring physical assets, spaces and events over time, both for threat detection and for business and process monitoring. However, the rising number of recorded videos has significantly increased the time and effort required for manual event analysis and video content management. Therefore, automatic moving object detection is of great importance. Nowadays, for storage and transmission purposes, video usually appears in compressed form. Therefore, in this paper, an automatic moving object detection method is proposed for HEVC video. Specifically, the number of bits spent on coding a frame, which can be extracted during encoding or retrieved from an encoded video bit stream, is exploited as the key feature for moving object detection. In addition, the temporal sub-layering feature of HEVC is utilized to reduce the number of frames to be processed, which in turn magnifies the energy of the coded video frames without losing most of the predicted information. A coarse background/foreground mask is then formed based on bit consumption, and it is further refined via post-processing to remove noise and to smooth the mask image. The proposed method achieves encouraging results in detecting slow-moving objects, even with a dynamic background.
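The abstract does not give the thresholding rule or the post-processing filter, so the sketch below is only one plausible reading of the last two steps. It assumes a 2D array of per-block bit counts has already been extracted from the bit stream (the extraction itself is not shown), marks blocks whose bit cost exceeds mean + k × standard deviation as foreground, and then cleans the mask with a 3×3 majority vote. The names `coarse_mask` and `refine_mask`, the parameter `k`, and the majority filter are all illustrative assumptions, not the paper's method.

```python
import numpy as np


def coarse_mask(block_bits, k=1.0):
    """Mark blocks whose bit cost exceeds mean + k * std as foreground (assumed rule)."""
    bits = np.asarray(block_bits, dtype=float)
    threshold = bits.mean() + k * bits.std()
    return bits > threshold


def refine_mask(mask):
    """3x3 majority vote: keep a block as foreground only if at least
    5 of the 9 blocks in its neighbourhood (edge-padded) are foreground."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    padded = np.pad(mask.astype(int), 1, mode="edge")
    votes = np.zeros((h, w), dtype=int)
    for dy in range(3):
        for dx in range(3):
            votes += padded[dy:dy + h, dx:dx + w]
    return votes >= 5


# Synthetic example: a cheap static background, one 3x3 "moving object"
# that costs many bits, and one isolated noisy block.
bits = np.full((8, 8), 10.0)
bits[2:5, 2:5] = 200.0
bits[6, 6] = 200.0
coarse = coarse_mask(bits)
refined = refine_mask(coarse)
```

With this filter the isolated block is discarded while the core of the larger region survives; the corners of the region are also eroded, which is the usual trade-off of a majority vote.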
Despite researchers' interest in the style transfer problem, there is still no foremost method available. Difficulties in problem formalization make a comparison of methods especially complicated. This paper covers twe...
ISBN (print): 9781510633988
Purpose: Surgical training could be improved by automatic detection of workflow steps and similar applications of image processing. A platform to collect and organize tracking and video data would enable rapid development of image processing solutions for surgical training. The purpose of this research is to demonstrate 3D Slicer / PLUS Toolkit as a platform for automatic labelled data collection and model deployment. Methods: We use PLUS and 3D Slicer to collect a labelled dataset of tools interacting with tissues in simulated hernia repair, comprising optical tracking data and video data from a camera. To demonstrate the platform, we train a neural network on this data to automatically identify tissues, and the tracking data is used to identify which tool is in use. The solution is deployed with a custom Slicer module. Results: This platform allowed the collection of 128,548 labelled frames, with 98.5% correctly labelled. A CNN was trained on this data and applied to new data with an accuracy of 98%. With minimal code, this model was deployed in 3D Slicer on real-time data at 30 fps. Conclusion: We found the 3D Slicer and PLUS Toolkit platform to be viable for collecting labelled training data and deploying a solution that combines automatic video processing and optical tool tracking. We designed an accurate proof-of-concept system to identify tissue-tool interactions with a trained CNN and optical tracking.
ISBN (digital): 9798350350067
ISBN (print): 9798350350074
In the context of the Chandrayaan 3 Lunar Mission, this research paper introduces a real-time image retrieval and denoising system powered by autoencoders, designed to tackle the challenge of noisy space imagery. Our system employs autoencoders to extract essential features from the noisy lunar images and subsequently retrieves and integrates similar, noise-free reference images from a comprehensive database. By doing so, it achieves real-time denoising, ensuring that the mission's acquired lunar images are of high quality and thus facilitating more accurate scientific analysis. The paper details the architecture of the autoencoder-based denoising system, its training process using a curated lunar image dataset, and its integration into the mission's image processing pipeline. Experimental results underscore the system's noise reduction capabilities, which enhance the Chandrayaan 3 mission's scientific contributions, advance our understanding of the lunar environment, and support the success of lunar exploration endeavours.
ISBN (digital): 9781510643109
ISBN (print): 9781510643109
Processing hyperspectral image data can be computationally expensive and difficult to employ for real-time applications due to its extensive spatial and spectral information. Further, applications in which computational resources may be limited, such as those requiring artificial intelligence at the edge, can be hindered by the volume of data that is common with airborne hyperspectral imagery. This paper proposes utilizing band selection to down-select the number of spectral bands for a given classification task so that classification can be done at the edge with lower computational complexity. Specifically, we consider popular techniques for band selection and investigate their feasibility for identifying discriminative bands such that classification performance is not drastically hindered. This would greatly benefit applications where time-sensitive solutions are needed to ensure optimal outcomes (for example, defense, natural disaster relief/response, and agriculture). Performance of the proposed approach is measured in terms of classification accuracy and run time.
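The abstract does not name the band selection techniques it evaluates. As a minimal sketch of what task-aware band selection can look like, the example below ranks bands by a per-band Fisher score (between-class scatter divided by within-class scatter) and keeps the top few. The Fisher criterion is a stand-in chosen for illustration, and the names `fisher_scores` and `select_bands` are hypothetical.

```python
import numpy as np


def fisher_scores(pixels, labels):
    """Per-band Fisher score: between-class scatter / within-class scatter.

    pixels: (N, B) array of N labelled spectra with B bands.
    labels: (N,) array of class labels.
    """
    pixels = np.asarray(pixels, dtype=float)
    labels = np.asarray(labels)
    overall_mean = pixels.mean(axis=0)
    between = np.zeros(pixels.shape[1])
    within = np.zeros(pixels.shape[1])
    for c in np.unique(labels):
        cls = pixels[labels == c]
        cls_mean = cls.mean(axis=0)
        between += len(cls) * (cls_mean - overall_mean) ** 2
        within += ((cls - cls_mean) ** 2).sum(axis=0)
    # Small constant avoids division by zero for constant bands.
    return between / (within + 1e-12)


def select_bands(pixels, labels, n_bands):
    """Return the sorted indices of the n_bands highest-scoring bands."""
    scores = fisher_scores(pixels, labels)
    top = np.argsort(scores)[::-1][:n_bands]
    return np.sort(top)


# Synthetic example: 6 noisy bands, of which only bands 2 and 4 separate the classes.
rng = np.random.default_rng(0)
pixels = rng.normal(size=(200, 6))
labels = np.repeat([0, 1], 100)
pixels[labels == 1, 2] += 5.0
pixels[labels == 1, 4] += 5.0
chosen = select_bands(pixels, labels, n_bands=2)
reduced = pixels[:, chosen]  # down-selected data passed on to the classifier
```

A downstream classifier would then be trained on `reduced` only, which is where the savings in computation at the edge would come from.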