A recurring challenge in integrated lithography is subnanoscale misalignment sensing. In widely used moiré-based misalignment sensing schemes, measurement accuracy is restricted by the performance of the image processing scheme. This is also a fundamental problem in Fourier optics that has received extensive attention in science and engineering. This paper proposes a Fourier-attention neural network that achieves real-time misalignment sensing with an accuracy of 0.23 nm, enabled by the network's robustness to system errors and noise. We hope that this strategy can provide an effective solution for various misalignment sensing applications and that the approach can be extended to future problems.
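To make the image-processing bottleneck concrete, the sketch below illustrates the conventional Fourier-based fringe-phase extraction that moiré misalignment schemes typically perform (and that the proposed network replaces). The fringe period, calibration factor, and synthetic signal are assumptions for illustration, not values from the paper.

```python
# Minimal sketch of conventional Fourier-based moire fringe phase extraction.
# The fringe period, nm-per-period calibration, and signal model are hypothetical.
import numpy as np

def misalignment_from_fringe(profile, fringe_period_px, nm_per_fringe_period):
    """Estimate misalignment from a 1D moire fringe intensity profile.

    profile: 1D array of fringe intensities along the measurement axis.
    fringe_period_px: known fringe period in pixels.
    nm_per_fringe_period: displacement (nm) corresponding to one full 2*pi
        fringe phase shift (set by the grating pitch and magnification).
    """
    n = len(profile)
    spectrum = np.fft.rfft(profile - profile.mean())
    freqs = np.fft.rfftfreq(n, d=1.0)
    # Locate the spectral bin closest to the expected fringe frequency.
    k = int(np.argmin(np.abs(freqs - 1.0 / fringe_period_px)))
    phase = np.angle(spectrum[k])                 # fringe phase in radians
    return phase / (2 * np.pi) * nm_per_fringe_period

# Synthetic example: a fringe with a known phase offset plus noise.
x = np.arange(1024)
true_shift_nm = 5.0
period_px, nm_per_period = 64.0, 400.0            # hypothetical calibration
phi = 2 * np.pi * true_shift_nm / nm_per_period
signal = 1.0 + 0.5 * np.cos(2 * np.pi * x / period_px + phi)
signal += np.random.normal(0, 0.02, size=x.size)
print(misalignment_from_fringe(signal, period_px, nm_per_period))  # ~5.0 nm
```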
The emergence of Edge Computing has shifted processing capabilities into proximity with Internet of Things (IoT) data sources, offering solutions for latency- and bandwidth-constrained applications. This shift complements Cloud Computing, especially for real-time data processing. Image processing, particularly image captioning for smart monitoring systems, benefits greatly from this synergy. Image captioning plays a crucial role in understanding visual data. While early methods excelled through encoder-decoder frameworks and attention mechanisms, they often overlooked semantic representations, which are essential for comprehensive image understanding. To address this gap, we introduce the EdgeScan framework, which leverages Edge Computing for image analysis and semantic feature extraction closer to the data sources. EdgeScan integrates visual and semantic features to create more informative and enriched image captions, enhancing captioning accuracy. The EdgeScan image captioning architecture is capable of 1) learning region-specific feature representations of salient image regions and 2) co-embedding visual attention and semantic attributes in one space for feature fusion. This improves the model's ability to interpret and respond to data in a meaningful way, which is particularly valuable for IoT applications that require a deep understanding of the semantics of diverse and constantly changing data for efficient operation. Extensive experiments on the MS-COCO dataset demonstrate the superiority of EdgeScan in both quantitative and qualitative performance, achieving the highest consensus-based image description evaluation (CIDEr) score of 120.9, as well as notable scores of 78.6 for BLEU@1 and 57.7 for the recall-oriented understudy for gisting evaluation (ROUGE) metric, promising advances in IoT-driven image understanding and competitiveness against the state of the art.
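The following is a hedged sketch of the co-embedding idea described above: region-level visual features and semantic-attribute scores are projected into one shared space and fused before captioning. The layer sizes and fusion rule are assumptions, not the authors' exact EdgeScan architecture.

```python
# Hedged sketch of visual-semantic co-embedding and fusion (PyTorch).
import torch
import torch.nn as nn

class VisualSemanticFusion(nn.Module):
    def __init__(self, visual_dim=2048, attr_vocab=1000, embed_dim=512):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, embed_dim)   # region features -> shared space
        self.attr_proj = nn.Linear(attr_vocab, embed_dim)     # attribute scores -> shared space
        self.attn = nn.Linear(embed_dim, 1)                   # attention over image regions

    def forward(self, region_feats, attr_probs):
        # region_feats: (batch, num_regions, visual_dim); attr_probs: (batch, attr_vocab)
        v = torch.tanh(self.visual_proj(region_feats))
        weights = torch.softmax(self.attn(v), dim=1)           # salient-region attention
        attended_visual = (weights * v).sum(dim=1)             # (batch, embed_dim)
        semantic = torch.tanh(self.attr_proj(attr_probs))      # (batch, embed_dim)
        return attended_visual + semantic                      # fused context for the caption decoder

fusion = VisualSemanticFusion()
context = fusion(torch.randn(2, 36, 2048), torch.rand(2, 1000))
print(context.shape)  # torch.Size([2, 512])
```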
Real-time, contact-free temperature monitoring in the low-to-medium range (30 °C-150 °C) has been extensively used in industry and agriculture, and it is usually realized with costly infrared temperature detection equipment. This paper proposes an alternative approach that extracts temperature information in real time from visible-light images of the monitoring target using a convolutional neural network (CNN). A mean-square error of less than 1.119 °C was reached in temperature measurements over the low-to-medium range using the CNN and visible-light images. The angle and imaging distance do not affect temperature detection from visible optical images by the CNN. Moreover, the CNN has a certain illuminance generalization ability and can detect temperature information from images collected under different illuminance conditions that were not used for training. Compared with the conventional machine learning algorithms reported in the recent literature, this real-time, contact-free temperature measurement approach, which does not require any further image processing operations, facilitates temperature monitoring applications in the industrial and civil fields.
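Below is a hedged sketch of a CNN regressor that maps a visible-light image patch to a single temperature value, as described above. The layer configuration and input size are assumptions; the paper's exact network is not reproduced.

```python
# Hedged sketch of a CNN temperature regressor (PyTorch); architecture is hypothetical.
import torch
import torch.nn as nn

class TemperatureCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.regressor = nn.Linear(64, 1)    # single temperature output (deg C)

    def forward(self, x):
        return self.regressor(self.features(x).flatten(1))

model = TemperatureCNN()
batch = torch.randn(4, 3, 128, 128)          # four RGB patches of the target
print(model(batch).shape)                     # torch.Size([4, 1])
# Training would minimize nn.MSELoss() against reference temperatures,
# consistent with the mean-square-error figure reported above.
```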
In the modern food processing industry, which is more complex than in the past, it is important to utilize real-time computer vision for active food processing technology using artificial intelligence. An integrated solution of computer vision and deep learning (DL) provides quality control and optimization of food processing in complex environments with obstacles. In this study, a Coffee Bean Classification Model (CBCM) built with machine learning (ML) showed excellent performance, accurately distinguishing coffee beans while avoiding obstacles and empty spaces inside a rotating roasting machine. CBCM achieved a maximum validation accuracy of 98.44% and a minimum validation loss of 5.40% after the fifth epoch. On a test dataset of 137 samples, CBCM achieved an accuracy of 99.27% and a loss of 2.82%. The developed solution using the CBCM was able to quantify the color change of the coffee beans during roasting.
Blind image deblurring is a challenging image processing problem, and a proper solution has many applications in the real world. The problem is ill-posed, as both the sharp image and the blur kernel are unknown. Traditional methods based on maximum a posteriori (MAP) estimation apply heavy constraints on the latent image or blur kernel to find a solution. However, these constraints are not always effective; moreover, they are very time-consuming. Recently, new approaches based on deep learning have emerged. Methods of this kind suffer from two problems: the need for a large number of images and kernels for training, and the dependency of the result on the training data. In this paper, we propose a multiscale method based on the MAP framework for image motion deblurring. In this method, we represent the blurry image at different scales and segment the image at each scale using k-means clustering. Using image information at dominant edges, guided by the segmented images, the blur kernel is estimated at each scale. The blur kernel at the finest level of the pyramid is estimated from the coarser levels in a coarse-to-fine manner. Unlike existing MAP-based methods, the proposed method does not need mathematically complicated assumptions to estimate the intermediate latent image, so the proposed deblurring runs fast. We evaluated the proposed method and compared it with existing methods. Experimental results on real and synthetic blurry images demonstrate that the proposed scheme produces promising results. The proposed method competes with existing MAP-based methods in reconstructing high-quality sharp images, while its execution time is considerably lower.
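The sketch below illustrates only the coarse-to-fine scaffolding described above: an image pyramid, k-means segmentation at each scale, and selection of dominant edges near segment boundaries to guide kernel estimation. The MAP kernel update itself is problem-specific and is left as a stub; cluster counts, thresholds, and the placeholder kernel are assumptions.

```python
# Hedged sketch of the coarse-to-fine pipeline; the kernel update is stubbed.
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

def segment_scale(img, n_clusters=4):
    """Cluster pixel intensities with k-means and return a label image."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(img.reshape(-1, 1))
    return labels.reshape(img.shape)

def dominant_edge_mask(img, labels, grad_quantile=0.9):
    """Keep strong gradients that also sit on segment boundaries."""
    gx, gy = ndimage.sobel(img, axis=1), ndimage.sobel(img, axis=0)
    grad = np.hypot(gx, gy)
    boundaries = ndimage.morphological_gradient(labels, size=3) > 0
    return (grad > np.quantile(grad, grad_quantile)) & boundaries

def coarse_to_fine_deblur(blurry, n_scales=3):
    pyramid = [blurry]
    for _ in range(n_scales - 1):
        pyramid.append(ndimage.zoom(pyramid[-1], 0.5))
    kernel = np.ones((3, 3)) / 9.0                   # coarse initial kernel
    for img in reversed(pyramid):                    # coarsest -> finest
        labels = segment_scale(img)
        edges = dominant_edge_mask(img, labels)
        # Stub: a MAP kernel update restricted to the selected edge pixels
        # would go here; the edge mask is what the segmentation contributes.
        kernel = ndimage.zoom(kernel, 1.0)           # placeholder refinement
    return kernel, edges

blurry = np.random.rand(128, 128)                    # stand-in blurry image
kernel, edge_mask = coarse_to_fine_deblur(blurry)
print(kernel.shape, edge_mask.mean())
```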
This paper presents a distributed acoustic sensing (DAS) system integrated with artificial intelligence to provide real-time monitoring for fence perimeter and buried-system applications. The DAS system is a Rayleigh backscatter based fibre optic sensing system that has been deployed in two real-world, commercial applications to detect acoustic wave propagation and scattering along perimeter lines and to classify intrusions accurately. Three signal processing methods, which we believe to be novel, are proposed to train filters that automatically select frequency bands from the power spectrum and generate hyper-spectral images from the data gathered by the DAS system, without expert knowledge. The hyper-spectral images are analyzed by a neural network based object detection model. The system achieves 81.8% accuracy on a fence perimeter installation and 60.4% accuracy on a buried-system application in detecting and classifying various intrusion events. The evaluation interval of the integrated DAS framework between event sensing and detection does not exceed 5 s. (c) 2025 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement
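As a rough illustration of the band-selection step described above, the sketch below converts a raw DAS channel into a band-selected spectrogram image for a downstream detector. The selection rule here is a simple energy ranking; the paper trains filters for this step, which is not reproduced. Sampling rate, window length, and band count are assumptions.

```python
# Hedged sketch: raw DAS trace -> band-selected spectrogram "image".
import numpy as np
from scipy import signal

def das_to_band_image(trace, fs=10_000, n_bands=16):
    """trace: 1D acoustic time series from one DAS channel."""
    freqs, times, sxx = signal.spectrogram(trace, fs=fs, nperseg=256)
    band_energy = sxx.sum(axis=1)                        # energy per frequency bin
    top_bands = np.sort(np.argsort(band_energy)[-n_bands:])
    image = sxx[top_bands, :]                            # (n_bands, n_time_frames)
    return freqs[top_bands], image / image.max()

fs = 10_000
t = np.arange(0, 2.0, 1 / fs)
trace = np.sin(2 * np.pi * 120 * t) + 0.3 * np.random.randn(t.size)  # synthetic event
bands, img = das_to_band_image(trace, fs)
print(bands, img.shape)   # selected band centers and the image fed to the detector
```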
Belt conveyors are widely used in multiple industries, including coal, steel, ports, power, metallurgy, and chemicals. One major challenge faced by these industries is belt deviation, which can negatively impact production efficiency and safety. Despite previous research on improving belt edge detection accuracy, there is still a need to prioritize system efficiency and lightweight models for practical industrial applications. To meet this need, a new semantic segmentation network called FastBeltNet has been developed for real-time, highly accurate conveyor belt edge line segmentation with a lightweight design. The network uses a dual-branch structure that combines a shallow spatial branch, which extracts high-resolution spatial information, with a context branch, which extracts deep contextual semantic information. It also incorporates Ghost blocks, Downsample blocks, and Input Injection blocks to reduce computational load, increase processing frame rate, and enhance feature representation. Experimental results show that FastBeltNet performs better than several existing methods in different real-world production settings. Specifically, FastBeltNet achieves 80.49% mIoU, 99.89 FPS processing speed, 895 k parameters, 8.23 GFLOPs, and 430.95 MB peak CUDA memory use, effectively balancing accuracy and speed for industrial production.
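Below is a hedged sketch of the dual-branch idea described above: a shallow high-resolution spatial branch plus a deeper context branch, with a Ghost block used to cut computation. Channel counts, strides, and the fusion step are assumptions, not the published FastBeltNet configuration.

```python
# Hedged sketch of a dual-branch segmentation network with a Ghost block (PyTorch).
import torch
import torch.nn as nn

class GhostBlock(nn.Module):
    """Generate half the channels with a cheap depthwise conv (GhostNet idea)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        primary = out_ch // 2
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary, 3, padding=1), nn.BatchNorm2d(primary), nn.ReLU())
        self.cheap = nn.Sequential(
            nn.Conv2d(primary, out_ch - primary, 3, padding=1, groups=primary),
            nn.BatchNorm2d(out_ch - primary), nn.ReLU())

    def forward(self, x):
        p = self.primary(x)
        return torch.cat([p, self.cheap(p)], dim=1)

class DualBranchSeg(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.spatial = nn.Sequential(                      # shallow, high resolution
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            GhostBlock(32, 64))
        self.context = nn.Sequential(                      # deeper, lower resolution
            nn.Conv2d(3, 32, 3, stride=4, padding=1), nn.ReLU(),
            GhostBlock(32, 64),
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU())
        self.head = nn.Conv2d(128, n_classes, 1)

    def forward(self, x):
        s = self.spatial(x)                                # 1/2 resolution
        c = self.context(x)                                # 1/8 resolution
        c = nn.functional.interpolate(c, size=s.shape[2:], mode="bilinear",
                                      align_corners=False)
        logits = self.head(torch.cat([s, c], dim=1))
        return nn.functional.interpolate(logits, size=x.shape[2:], mode="bilinear",
                                         align_corners=False)

model = DualBranchSeg()
print(model(torch.randn(1, 3, 256, 512)).shape)  # belt-edge logits: (1, 2, 256, 512)
```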
Frequency domain photoacoustic (FDPA) imaging has great potential in clinical settings compared with time-domain photoacoustic imaging due to its reduced cost and small form factor. However, FDPA systems struggle with a lower signal-to-noise ratio, necessitating advanced image reconstruction methods. Most image reconstruction approaches in FDPA imaging are based on analytical or model-based schemes, and much less emphasis has been placed on developing deep-learning-based approaches. In this work, an image translation network was developed to map directly from (complex-valued) sinogram data to the (real-valued) initial pressure rise distribution for FDPA imaging. The architecture is based on a Long Short-Term Memory (LSTM) backbone, taking the concatenated real and imaginary parts of the complex sinogram data as input, followed by a fully connected layer whose output is passed through a convolution and transposed-convolution layer pair. The FDPA-LSTM architecture was compared with direct translational networks based on ResNet, UNet, and AUTOMAP and showed an improvement of about 15% in PSNR and 10% in SSIM with a 150-degree limited-view acquisition angle. FDPA-LSTM was also compared with a post-processing UNet applied to backprojection and Tikhonov-regularized reconstructions. A 20% improvement in PSNR was observed over backprojection with a post-processing UNet, and FDPA-LSTM showed performance similar to Tikhonov regularization with a post-processing UNet, with a 75-fold acceleration. The developed scheme will be very useful for achieving accelerated and accurate frequency domain photoacoustic imaging.
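The sketch below illustrates the direct sinogram-to-image mapping described above: an LSTM over the concatenated real and imaginary sinogram samples, a fully connected layer, then a convolution / transposed-convolution pair. Tensor sizes and layer widths are assumptions, not the exact FDPA-LSTM configuration.

```python
# Hedged sketch of an LSTM-based sinogram-to-image translation network (PyTorch).
import torch
import torch.nn as nn

class SinogramToImage(nn.Module):
    def __init__(self, n_detectors=64, n_freqs=128, img_side=64):
        super().__init__()
        self.img_side = img_side
        # One time step per detector position; features = real + imaginary frequency samples.
        self.lstm = nn.LSTM(input_size=2 * n_freqs, hidden_size=256, batch_first=True)
        self.fc = nn.Linear(256, (img_side // 2) ** 2)
        self.refine = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2))       # upsample back to img_side

    def forward(self, sinogram):
        # sinogram: complex tensor (batch, n_detectors, n_freqs)
        x = torch.cat([sinogram.real, sinogram.imag], dim=-1)
        _, (h, _) = self.lstm(x)                           # final hidden state summarizes the scan
        img = self.fc(h[-1]).view(-1, 1, self.img_side // 2, self.img_side // 2)
        return self.refine(img)                            # (batch, 1, img_side, img_side)

net = SinogramToImage()
fake_sinogram = torch.randn(2, 64, 128, dtype=torch.complex64)
print(net(fake_sinogram).shape)                            # torch.Size([2, 1, 64, 64])
```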
The remarkable increase in published medical imaging datasets for chest X-rays has significantly improved the performance of deep learning techniques for classifying lung diseases efficiently. However, large datasets require special arrangements to make them suitable, accessible, and practically usable in remote clinics and emergency rooms, and they increase computational time and image-processing complexity. This study investigates the efficiency of converting a 2D chest X-ray into a one-dimensional texture representation using descriptive statistics and local binary patterns, enabling feed-forward neural networks to classify lung diseases quickly and cost-effectively. This method bridges diagnostic gaps in healthcare services and improves patient outcomes in remote hospitals and emergency rooms. It could also reinforce the crucial role of technology in advancing healthcare. Using the Guangzhou and PA datasets, our one-dimensional texture representation achieved 99% accuracy on the Guangzhou dataset, with a training time of 10.85 s and a testing time of 0.19 s. On the PA dataset, it achieved 96% accuracy with a training time of 38.14 s and a testing time of 0.17 s, outperforming EfficientNet, EfficientNet-V2-Small, and MobileNet-V3-Small. Therefore, this study suggests that the one-dimensional texture representation is fast and effective for lung disease classification.
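The following is a hedged sketch of the 1D texture representation described above: a uniform local-binary-pattern histogram plus a few descriptive statistics form a short feature vector, which a small feed-forward network classifies. The exact feature set, LBP parameters, and network size used in the study are not reproduced; the dummy data is only for illustration.

```python
# Hedged sketch: X-ray -> 1D texture features (LBP histogram + statistics) -> MLP.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.neural_network import MLPClassifier

def xray_to_1d_features(img, p=8, r=1.0):
    """img: 2D grayscale chest X-ray (uint8 array)."""
    lbp = local_binary_pattern(img, P=p, R=r, method="uniform")
    hist, _ = np.histogram(lbp, bins=p + 2, range=(0, p + 2), density=True)
    stats = np.array([img.mean(), img.std(), np.median(img),
                      img.min(), img.max()])               # descriptive statistics
    return np.concatenate([hist, stats])

# Dummy data standing in for normal vs. pneumonia X-rays.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(40, 64, 64), dtype=np.uint8)
labels = rng.integers(0, 2, size=40)
features = np.stack([xray_to_1d_features(im) for im in images])

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(features, labels)
print(features.shape, clf.score(features, labels))
```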
Dispersion of nanofiller in the polymer matrix is a vital factor that influences the properties of nanocomposites fabricated at the laboratory level. Characterization techniques such as TEM and FESEM, which examine only a small sample area, tend to miss the big picture when analyzing mixing on a larger scale. At the industrial level, conducting such testing under varying experimental conditions is not viable in terms of either cost or time. Through this study, we propose a simple method to examine the extent of dispersion using a simple camera and deep learning (DL) models. For this purpose, an analogous study was performed to examine the sensitivity of the processing techniques and to better understand the findings of our previous articles. A two-component colored solution (oil-water) was used as a proxy for the nanofiller-polymer matrix system. Different processing methods were employed, namely ultrasonication, homogenization, sequential ultrasonication and homogenization, and simultaneous ultrasonication and homogenization. The variation in processing technique significantly affects dispersion, which is attributed to the different mixing mechanisms (turbulent, diffusive, and convective) involved in these techniques. Inferences are drawn by detecting patterns in a large sample size, which highlights that DL models provide a holistic view of real-time observations. They also augment human interpretation by revealing obscure information that can go unnoticed by the human eye.