ISBN: (print) 9798350370058; 9798350370164
We propose a computer vision architecture based on hyperbolic networks, contrastive learning, and knowledge distillation to detect unsafe behavior in energy production and oil & gas plants. Data scarcity poses a significant challenge to developing machine learning applications in industry. Indeed, the data may be incomplete, inconsistent, or biased, making it difficult to develop accurate and reliable models. Insufficient data during the training phase has a direct impact on a model's representation learning capabilities; with the aid of vision Transformers (ViTs), we are able to address data-scarce situations by learning efficient representations of the existing data. We chose ViTs because they incorporate more global information, leading to quantitatively stronger intermediate feature representations. Further, we approached the task with contrastive learning and obtained pairs of similar samples to tackle the limited data availability in our industrial use case. Applying hyperbolic embeddings in the proposed approach helps extract complex relationships in the data. Furthermore, the size of the model makes it suitable for devices with low computational capabilities, such as unmanned robots.
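To make the idea of hyperbolic embeddings concrete, the following is a minimal sketch of the geodesic distance in the Poincaré ball, one common model of hyperbolic space. The abstract does not say which hyperbolic model or curvature the authors use, so the unit-curvature Poincaré ball and the function name `poincare_distance` are assumptions made only for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball.

    Assumption: curvature -1 and points given as equal-length sequences
    of floats with Euclidean norm strictly below 1.
    """
    if len(u) != len(v):
        raise ValueError("points must have the same dimension")
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_dist / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)
```

Distances grow rapidly as points approach the boundary of the ball, which is why such embeddings are often used for hierarchical or tree-like relationships.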
The ever-increasing amount of unstructured data, including text, images, audio, and video, poses a serious challenge to traditional data mining techniques. Machine learning (ML) offers powerful tools and techniques to...
ISBN: (print) 9781665493468
Noisy labels are unavoidable yet troublesome in deep learning because models can easily overfit them. There are many types of label noise, such as symmetric, asymmetric, and instance-dependent noise (IDN), with IDN being the only type that depends on image information. This dependence makes IDN a critical type of label noise to study, given that labelling mistakes are caused in large part by insufficient or ambiguous information about the visual classes present in images. To provide an effective technique for addressing IDN, we present a new graphical modelling approach called InstanceGM that combines discriminative and generative models. The main contributions of InstanceGM are: i) the use of the continuous Bernoulli distribution to train the generative model, offering significant training advantages, and ii) the exploration of a state-of-the-art noisy-label discriminative classifier to generate clean labels from instance-dependent noisy-label samples. InstanceGM is competitive with current noisy-label learning approaches, particularly in IDN benchmarks using synthetic and real-world datasets, where our method shows better accuracy than the competitors in most experiments.
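For contrast with IDN, the simplest noise type named above, symmetric noise, can be simulated without looking at the image at all. The sketch below is not part of InstanceGM; the function name `flip_symmetric` and its parameters are hypothetical, and it assumes the variant of symmetric noise in which a corrupted label is drawn uniformly from the other classes (some benchmarks also allow the true class to be redrawn).

```python
import random


def flip_symmetric(labels, num_classes, noise_rate, seed=None):
    """Return a copy of `labels` with symmetric label noise applied.

    Each label is replaced, with probability `noise_rate`, by a class
    drawn uniformly from the remaining `num_classes - 1` classes.
    The corruption ignores the input image, unlike instance-dependent noise.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes to flip labels")
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # A non-zero offset modulo num_classes never maps y to itself.
            offset = rng.randrange(1, num_classes)
            noisy.append((y + offset) % num_classes)
        else:
            noisy.append(y)
    return noisy
```

Instance-dependent noise would instead make the flip probability a function of each image, which is what makes it harder to model.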
Images captured under low lighting frequently exhibit low brightness, low contrast, and a narrow grayscale range. These features can impair human viewing and severely limit the performance of machine vision systems, particularly when data annotation is involved. These issues motivate this study to examine the effectiveness of advanced fuzzified histogram equalization for image enhancement. A comparative study was conducted on low-light iris images from the MIREIS dataset to evaluate three image enhancement methods: Advanced Fuzzified Histogram Equalization (AFHE), Contrast Limited Adaptive Histogram Equalization (CLAHE), and Fuzzy Contrast Enhancement (FCE). The Gaussian membership functions (GMF) were modified to suit the pixel intensities of the input iris images. The results were compared using the peak signal-to-noise ratio (PSNR) and central processing unit (CPU) time. AFHE showed a better PSNR value of 76.02 dB with a faster CPU time of 4.04 s compared to CLAHE and FCE. Although the PSNR value of HE is slightly lower than CLAHE (0.3%) and FCE (0.7%), AFHE improved the image's quality and brightness, which can help other researchers with the data annotation process. The performance of the proposed methods was validated by comparing them with state-of-the-art methods. The results demonstrated that AFHE, CLAHE, and FCE outperformed other HE, AHE, CLAHE, and fuzzy hybrid HE approaches that were evaluated with PSNR.
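The PSNR metric used in this comparison follows the standard definition, 10 * log10(MAX^2 / MSE). The sketch below assumes 8-bit grayscale images passed as nested lists of pixel rows; the function name `psnr` and the default `max_value=255.0` are illustrative choices, and the abstract does not state which reference image or peak value the authors used.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images.

    Both images are sequences of rows of pixel intensities.
    Returns math.inf when the images are identical (zero error).
    """
    ref = [float(p) for row in reference for p in row]
    enh = [float(p) for row in enhanced for p in row]
    if not ref or len(ref) != len(enh):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, enh)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the enhanced image deviates less from the reference, so the value depends strongly on which reference image is chosen.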
In the next decade, machine vision technology will have an enormous impact on industrial work because of the latest technological advances in this field. These advances are so significant that the use of this technology is now essential. Machine vision is the use of a wide range of technologies and methods to provide imaging-based automated inspection, process control, and robot guidance in an industrial setting. One application of machine vision is the analysis of traffic accidents, where car vision is used to assess the amount of damage to vehicles. In this article, a new method based on image processing and machine learning techniques is presented to improve the accuracy of detecting damaged areas in traffic accidents. Evaluation of the proposed method and comparison with previous works showed that it identifies damaged areas more accurately and has a shorter execution time.
Depth sensing is an essential technology in robotics and many other fields. Many depth sensing (or RGB-D) cameras are available on the market, and selecting the best one for your application can be challenging. In this work, we tested four stereoscopic RGB-D cameras that sense distance by using two images from slightly different views. We empirically compared the four cameras (Intel RealSense D435, Intel RealSense D455, StereoLabs ZED 2, and Luxonis OAK-D Pro) in three scenarios: (i) planar surface perception, (ii) plastic doll perception, (iii) household object perception (YCB dataset). We recorded and evaluated more than 3,000 RGB-D frames for each camera. For table-top robotics scenarios with distances to objects of up to one meter, the best performance is provided by the D435 camera, which perceives with an error under 1 cm in all of the tested scenarios. For longer distances, the other three models perform better, making them more suitable for some mobile robotics applications. OAK-D Pro additionally offers integrated AI modules (e.g., object and human keypoint detection). ZED 2 is overall the best camera, keeping the error under 3 cm even at 4 meters. However, it is not a standalone device and requires a computer with a GPU for depth data acquisition. All data (more than 12,000 RGB-D frames) are publicly available at https://***/rgbd-comparison
Crack detection in civil infrastructure, including roads, bridges, and buildings, is crucial for maintaining structural safety and functionality. Traditional manual inspection methods are time-consuming and prone to errors, highlighting the need for automated solutions. This study evaluates state-of-the-art computer vision techniques from 2013 to 2024 for automatically detecting cracks in both asphalt and concrete surfaces. The study assesses the effectiveness and limitations of image processing, traditional machine learning, and deep learning methods for crack detection. A comparative analysis of commonly used models is presented, utilizing public datasets: SDNET2018, CCIC, and BCD for concrete images, and AigleRN, CFD, CRACK500, and GAPs for asphalt images. Based on the comparison results, advanced deep learning models such as YOLOv5 and U-Net demonstrated superior performance in crack detection for both asphalt and concrete structures, significantly outperforming traditional methods. For concrete crack detection, YOLOv5l performed best on the SDNET2018 dataset, achieving a precision of 97.7%, a recall of 96.7%, and a mAP@.5 of 99.3%, with an inference time of 1.1 ms, making it highly suitable for real-time applications. For asphalt crack detection, U-Net achieved the strongest results, particularly on the GAPs dataset, with a precision of 99.53%, and on the CFD dataset, with a precision of 92.54% and an F1-score of 89.90%. The study also highlights public concrete and asphalt datasets, providing details on methodology, including the number of images, image sizes, and noted noise factors. Additionally, it discusses the impact of data source variability on crack detection methods, showcasing the applications, strengths, and limitations of multi-sensor fusion techniques. Finally, unresolved challenges such as imbalanced datasets, high inference times, and complex network architectures are identified, with suggestions for future
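The precision, recall, and F1 figures quoted above are linked by the standard harmonic-mean definition, so a missing value can be derived from the other two. The sketch below uses hypothetical helper names (`f1_score`, `recall_from_f1`); the derived numbers are arithmetic consequences of the reported figures, not results stated in the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def recall_from_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, given precision P and F1."""
    denominator = 2.0 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall for this precision/F1 pair")
    return f1 * precision / denominator


# YOLOv5l on SDNET2018: precision 97.7%, recall 96.7% (reported above).
yolo_f1 = f1_score(0.977, 0.967)

# U-Net on CFD: precision 92.54%, F1 89.90% (reported above).
unet_recall = recall_from_f1(0.9254, 0.8990)
```

Under these definitions the YOLOv5l figures imply an F1-score of roughly 97.2%, and the U-Net figures on CFD imply a recall of roughly 87.4%.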
Introduction: Esophageal cancer (EC) is a significant global health problem, with an estimated 7th highest incidence and 6th highest mortality rate. Timely diagnosis and treatment are critical for improving patients' outcomes, as over 40% of patients with EC are diagnosed after metastasis. Recent advances in machine learning (ML) techniques, particularly in computer vision, have demonstrated promising applications in medical image processing, assisting clinicians in making more accurate and faster diagnostic decisions. Given the significance of early detection of EC, this systematic review aims to summarize and discuss the current state of research on ML-based methods for the early detection of *** conducted a comprehensive systematic search of five databases (PubMed, Scopus, Web of Science, Wiley, and IEEE) using search terms such as "ML", "Deep Learning (DL)", "Neural Networks (NN)", "Esophagus", "EC" and "Early Detection". After applying inclusion and exclusion criteria, 31 articles were retained for full *** results of this review highlight the potential of ML-based methods in the early detection of EC. The average accuracy of the reviewed methods in the analysis of endoscopic and computed tomography (CT) images of the esophagus was over 89%, indicating a high impact on early detection of EC. Additionally, the highest percentage of clinical images used in the early detection of EC with the use of ML was related to white light imaging (WLI) images. Among all ML techniques, methods based on convolutional neural networks (CNN) achieved higher accuracy and sensitivity in the early detection of EC compared to other *** findings suggest that ML methods may improve accuracy in the early detection of EC, potentially supporting radiologists, endoscopists, and pathologists in diagnosis and treatment planning. However, the current literature is limited, and more studies are needed to investigate the clinical applications of these met
In automatic feeding systems, feeding of characteristic workpieces by mechanical tools causes accuracy and cost difficulties. For this reason, in systems where special workpieces are fed, image processing applications...
On the battlefield, early detection of armored vehicles can have a positive effect because it allows timely and appropriate reactions. The purpose of this study is to develop the algorithm required in the vehicle control system based on car sensor vision, which is necessary to identify and determine the equipment needed to control a military drone. Today, the use of wireless networks, especially inter-vehicle wireless networks, in military applications is inevitable. Therefore, in the first step of this research, a new method is proposed to control and steer unmanned vehicles based on car vision. The proposed method uses recorded images from two 180-degree panoramic cameras with horizontal vision. The simulation results show increased accuracy and reduced implementation cost compared to using LIDAR and RADAR technologies. In the second step, a new approach is introduced to identify four common classes of armored vehicles (tanks, personnel carriers, firing tanks, and military vehicles) that are more likely to be present on battlefields. For this purpose, deep learning, the latest class of image processing methods, is used. The simulation results show that the proposed approach detects armored vehicles with high accuracy in a short time. In the third step, a new method based on queueing theory is proposed to increase the connectivity of wireless networks, and the simulation results show the high efficiency of the method. As a result, accurate and fast detection gives users of the system an advantage.