An image-processing algorithm for real-time examination of LED light strips is proposed, which enables quick detection of blind LED beads in strips. It is successfully used on a production line to replace manual inspect...
ISBN: (digital) 9781665450928
ISBN: (print) 9781665450928
Some of the most important information needed during unmanned aerial vehicle (UAV) operations concerns the platform's location and its environment. Such platforms mostly rely on GNSS signals outdoors. However, in indoor areas where GNSS signals cannot be received, or in situations where the signals are jammed, location information cannot be obtained from them. For that reason, alternative navigation systems have become crucial. One of the most preferred navigation technologies is visual simultaneous localization and mapping (vSLAM) performed with RGB cameras on the UAV. In this study, an open monocular image dataset called AG-Mono was created and published online to test the performance of vSLAM algorithms. The dataset was recorded at three different exposure times using a handheld platform, and it includes video sequences at 640×480 image resolution. The experimental area where the images were recorded is a closed corridor measuring 16.5 × 4.5 meters with four sharp corners.
Learned image compression (LIC), which uses neural networks to compress images, has experienced significant growth in recent years. The hyperprior-module-based LIC model has achieved higher performance than classical ...
Image denoising is a crucial step in image acquisition and processing that helps improve image quality by removing unwanted noise. In this paper, Gaussian and median filters are used as denoisers and performance...
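The two filter families named in this abstract can be illustrated with a minimal sketch. The listing truncates the abstract, so the paper's actual kernel sizes, noise models, and evaluation metrics are not known here; the function names, the 3×3 window, `sigma=1.0`, and the edge-replication border handling below are all assumptions made for illustration.

```python
import math
import statistics


def _clamped(img, r, c):
    # Edge replication (an assumption): out-of-range indices reuse the nearest border pixel.
    rows, cols = len(img), len(img[0])
    return img[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]


def median_filter(img, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighbourhood."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            window = [
                _clamped(img, r + dr, c + dc)
                for dr in range(-radius, radius + 1)
                for dc in range(-radius, radius + 1)
            ]
            row.append(statistics.median(window))
        out.append(row)
    return out


def gaussian_kernel(radius, sigma):
    """Normalised 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_filter(img, radius=1, sigma=1.0):
    """Separable Gaussian blur: one horizontal pass, then one vertical pass."""
    k = gaussian_kernel(radius, sigma)
    rows, cols = len(img), len(img[0])
    offsets = range(-radius, radius + 1)
    horizontal = [
        [sum(k[i + radius] * _clamped(img, r, c + i) for i in offsets) for c in range(cols)]
        for r in range(rows)
    ]
    return [
        [sum(k[i + radius] * _clamped(horizontal, r + i, c) for i in offsets) for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    # A flat image with a single impulse ("salt") pixel in the middle.
    image = [[10] * 5 for _ in range(5)]
    image[2][2] = 255
    print(median_filter(image)[2][2])    # impulse is rejected by the median
    print(gaussian_filter(image)[2][2])  # impulse is spread out, not removed
```

The toy image shows the usual qualitative difference: the median filter discards an isolated impulse outright, while the Gaussian filter only spreads its energy over the neighbourhood.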
ISBN: (print) 9798350377873; 9798350377866
Given the growing dependence on medical imaging, there is a significant need for automated report generation, which can save radiologists' time and reduce the possibility of diagnostic errors. Existing approaches face various difficulties, including insufficient professionalism, the variety of diseases, and the fluency of the reports. These problems stem from using an encoder-decoder deep learning architecture that establishes only a unidirectional image-to-report relationship and neglects the bidirectional connections between images and reports, making it challenging to capture the intrinsic medical correlations between them. To this end, we propose a novel approach for chest radiology report generation based on multimodal feature fusion. Our method uses textual and visual features taken from chest X-ray images and their real reports. First, we use a vision transformer to extract visual features from the medical images; in parallel, we use the Word2Vec model to extract semantic features from the textual medical reports. Additionally, we employ advanced techniques such as channel attention networks and cross-modal information fusion modules to enhance the quality and coherence of the generated reports. We have evaluated the proposed approach on two publicly available chest X-ray datasets, IU X-ray and NIH. The results show that our approach outperforms state-of-the-art methods, particularly on the ROUGE and BLEU metrics.
ISBN: (print) 9798350384581; 9798350384574
Recent deep learning based visual simultaneous localization and mapping (SLAM) methods have made significant progress. However, how to make full use of visual information and how to better integrate it with an inertial measurement unit (IMU) in visual SLAM remain questions of research value. This paper proposes a novel deep SLAM network with dual visual factors. The basic idea is to integrate both a photometric factor and a re-projection factor into an end-to-end differentiable structure through a multi-factor data association module. We show that the proposed network dynamically learns and adjusts the confidence maps of both visual factors, and that it can be further extended to include IMU factors as well. Extensive experiments validate that the proposed method significantly outperforms state-of-the-art methods on several public datasets, including TartanAir, EuRoC and ETH3D-SLAM. Specifically, when the three factors are dynamically fused together, the absolute trajectory error for the monocular and stereo configurations on the EuRoC dataset is reduced by 45.3% and 36.2%, respectively.
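For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an absolute trajectory error (ATE) and a relative reduction can be computed. It is not the paper's evaluation code: the function names are hypothetical, and the alignment here is translation-only (centroid matching) as a simplifying assumption, whereas benchmark evaluations commonly apply a rigid or similarity alignment before measuring the error.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def ate_rmse(estimated, ground_truth):
    """RMSE of position differences after removing the centroid offset.

    Assumption: poses are already time-associated one-to-one, and only a
    translation is removed (no rotation or scale alignment).
    """
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    ce, cg = centroid(estimated), centroid(ground_truth)
    squared = 0.0
    for e, g in zip(estimated, ground_truth):
        squared += sum(((e[i] - ce[i]) - (g[i] - cg[i])) ** 2 for i in range(3))
    return math.sqrt(squared / len(estimated))


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    estimate = [(0.0, 0.0, 0.0), (1.0, 0.3, 0.0), (2.0, 0.0, 0.0)]
    print(ate_rmse(estimate, truth))
    # Hypothetical error values, chosen only to illustrate the percentage formula.
    print(relative_reduction(1.0, 0.547))
```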
ISBN: (print) 9798350344868; 9798350344851
Scene text image super-resolution has significantly improved the accuracy of scene text recognition. However, many existing methods emphasize performance over efficiency and ignore the practical need for lightweight solutions in deployment scenarios. To address these issues, our work proposes an efficient framework called SGENet to facilitate deployment on resource-limited platforms. SGENet contains two branches: a super-resolution branch and a semantic guidance branch. We apply a lightweight pre-trained recognizer as a semantic extractor to enhance the understanding of text information. Meanwhile, we design a visual-semantic alignment module to achieve bidirectional alignment between image features and semantics, resulting in high-quality prior guidance. We conduct extensive experiments on benchmark dataset, and the proposed SGENet achieves excellent performance at a lower computational cost.
ISBN: (print) 9798350390797; 9789532901351
A bolus covers the patient's skin surface in cancer care to achieve the desired dose distribution with minimal damage to healthy tissue. The existing bolus shaping method is mainly a manual process, which is inaccurate and inefficient. This paper proposes a model retrieval method based on feature skeletons of the model and the model image. Mesh nodes in a bolus model are embedded into a feature space by spectral analysis. Skeletons are formed from features of the model to build a skeleton base. Visual entropies are applied to detect edges in the model image. The edges are then classified into object and background pixels using a spectral clustering method to obtain the contours of the object. The skeleton of the image is compared with the skeletons in the model skeleton base to find the best-matched bolus model using an iterative closest point method. The proposed method is verified in case studies.
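The final matching step, comparing one skeleton against a base of model skeletons with an iterative closest point (ICP) method, can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: skeletons are reduced to small 2-D point sets, the rigid transform is solved in closed form for the planar case, and the names `icp_2d`, `best_match`, and the dictionary-based skeleton base are hypothetical.

```python
import math


def nearest(p, targets):
    """Closest target point to p (brute force)."""
    return min(targets, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    sx, sy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    dx, dy = sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n
    s_cos = sum((p[0] - sx) * (q[0] - dx) + (p[1] - sy) * (q[1] - dy) for p, q in zip(src, dst))
    s_sin = sum((p[0] - sx) * (q[1] - dy) - (p[1] - sy) * (q[0] - dx) for p, q in zip(src, dst))
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)


def transform(points, theta, tx, ty):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(source, target, iterations=20):
    """Align source to target; return the aligned points and mean squared residual."""
    current = list(source)
    for _ in range(iterations):
        matches = [nearest(p, target) for p in current]   # correspondence step
        theta, tx, ty = best_rigid_2d(current, matches)   # alignment step
        current = transform(current, theta, tx, ty)
    residual = sum(
        (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
        for p, q in ((p, nearest(p, target)) for p in current)
    ) / len(current)
    return current, residual


def best_match(image_skeleton, skeleton_base):
    """Name of the base skeleton with the lowest ICP residual (hypothetical helper)."""
    return min(skeleton_base, key=lambda name: icp_2d(image_skeleton, skeleton_base[name])[1])


if __name__ == "__main__":
    l_shape = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (0, 2)]
    line = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
    query = transform(l_shape, 0.1, 0.2, -0.1)  # slightly rotated and shifted copy
    print(best_match(query, {"l_shape": l_shape, "line": line}))
```

ICP only converges to the right alignment from a reasonable starting pose, so a practical system would normally pre-align the skeletons (for example by centroid and principal axis) before running it.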
Image dehazing is a meaningful low-level computer vision task and can be applied to a variety of contexts. In our industrial deployment scenario based on remote sensing (RS) images, the quality of image dehazing direc...
ISBN: (print) 9798350344868; 9798350344851
Synthetic aperture radar (SAR) images are inherently affected by speckle noise. Deep learning-based methods have shown good potential in image denoising tasks. Most deep learning methods for denoising focus on additive Gaussian noise removal. However, SAR images are usually contaminated by non-Gaussian multiplicative speckle noise. In this paper, we propose a novel deep unrolling network named SAR-DURNet to deal with the SAR image despeckling problem. We formulate speckle noise removal as an optimization problem using the prior on the noise distribution, which can be solved iteratively by the half-quadratic splitting (HQS) method. We unroll the iterative process into a trainable deep unrolling network (SAR-DURNet). The parameters of SAR-DURNet are trained end-to-end on a simulated SAR image dataset. Experimental results on simulated test data and real SAR data show that the proposed approach achieves superior results in terms of quantitative performance metrics and the preservation of intricate visual details, compared to several well-known SAR image despeckling methods.
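The abstract does not give the objective function, so the following is only the generic form of half-quadratic splitting, written out to show what "unrolling the iterative process" refers to. The symbols $f$ (a data-fidelity term derived from the speckle distribution), $R$ (a regularizer), $\lambda$, $\mu$, and the auxiliary variable $z$ are assumptions for illustration and are not taken from the paper.

```latex
% Multiplicative speckle model: observed image y, clean image x, speckle n
y = x \odot n

% Generic variational problem (f and R are assumed, not the paper's exact terms)
\min_{x} \; f(x; y) + \lambda R(x)

% HQS introduces an auxiliary variable z and a quadratic coupling penalty
\min_{x, z} \; f(x; y) + \lambda R(z) + \frac{\mu}{2} \lVert x - z \rVert_2^2

% Alternating updates for iteration k
x^{(k+1)} = \arg\min_{x} \; f(x; y) + \frac{\mu}{2} \lVert x - z^{(k)} \rVert_2^2
z^{(k+1)} = \arg\min_{z} \; \lambda R(z) + \frac{\mu}{2} \lVert x^{(k+1)} - z \rVert_2^2
```

In a deep unrolling network, a fixed number of these iterations is typically laid out as network stages, with the $z$-update replaced by a learned module and per-stage parameters such as $\mu$ trained end-to-end.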