ISBN (digital): 9783031774324
ISBN (print): 9783031774317; 9783031774324
This article proposes methods for maximising the detection rates of thermal fiducial markers using thermography. By combining image processing techniques with an affordable thermographic camera, the aim is to mitigate the negative effects of thermography and improve the accuracy of marker identification across a variety of mounting and distance conditions. The research identified a range of processing techniques capable of improving thermal marker recognition, offering the potential to surpass previous results. The results highlight the possibility of using low-cost thermographic cameras for this purpose, which could democratise recognition processes and reduce their costs. The methodology validates the proposed approach, providing a robust basis for future improvements in thermal marker detection and promoting the feasibility of practical, low-cost applications in a range of fields.
All animals that can see, including humans, do so in a way that appears instinctive and natural. Creatures use their vision system early in life, often even from birth, to traverse their surroundings, recognize and co...
ISBN (print): 9798350362770; 9798350362763
Existing deep learning methods encounter challenges when confronted with the scene recognition task, particularly in cases with significant intra-category differences, primarily due to limited available data. To better address this problem, in this paper we propose Contrastive Language-Image Pre-Training with DataStd and Similarity networks (CLIP-DS), a novel scene recognition method built upon the CLIP architecture, which utilizes extensive pre-training knowledge. In addition, to comprehensively analyze the similarity sequence provided by CLIP, we enhance it with specialized Similarity networks, comprising a Multi-stream Multilayer Perceptron block. To improve the data space of similarity sequences and ensure stable training, we further propose the DataStd network with learnable parameters. To the best of our knowledge, this work proposes the first dataset for the challenging intra-category difference industrial scene recognition task, encompassing 120k images. The experimental results demonstrate that the proposed CLIP-DS outperforms the other methods, achieving an accuracy of 87.59% with fine-tuning on only 90 images. Ablation studies also demonstrate the effectiveness of the proposed Similarity and DataStd networks and identify the optimized-prompt combination as the most suitable prompt strategy.
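To make the idea of a "similarity sequence" concrete, the following is a minimal sketch, not the authors' implementation. It assumes the sequence is one cosine similarity per text prompt, and it assumes the standardization step is a zero-mean, unit-variance rescaling followed by an affine map; the abstract does not specify either. The names `similarity_sequence`, `standardize`, `scale`, and `shift` are hypothetical, and the toy vectors are made up rather than produced by CLIP.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_sequence(image_embedding, prompt_embeddings):
    """One similarity score per text prompt, in prompt order."""
    return [cosine_similarity(image_embedding, p) for p in prompt_embeddings]


def standardize(sequence, scale=1.0, shift=0.0, eps=1e-6):
    """Rescale to zero mean and unit variance, then apply an affine map.

    scale and shift stand in for learnable parameters (an assumption);
    here they are plain floats and nothing is trained.
    """
    mean = sum(sequence) / len(sequence)
    var = sum((s - mean) ** 2 for s in sequence) / len(sequence)
    std = math.sqrt(var + eps)
    return [scale * (s - mean) / std + shift for s in sequence]


# Toy embeddings, invented for illustration only.
image = [0.2, 0.9, 0.1]
prompts = [[0.1, 1.0, 0.0], [1.0, 0.0, 0.2], [0.3, 0.3, 0.9]]

scores = similarity_sequence(image, prompts)
normalized = standardize(scores)
best = max(range(len(scores)), key=scores.__getitem__)
```

In this sketch `normalized` would be the input to a downstream classifier, while `best` is simply the index of the most similar prompt.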
ISBN (print): 9798331530372; 9798331530365
Image segmentation based on deep learning methods is an important direction in the computer vision field. However, these models over-rely on color features in image segmentation tasks, which leads to poor segmentation in scenes with interference from similar background colors. To solve this problem, this paper improves the U-Net model by combining a gray channel with an attention mechanism. The experimental results show that, compared with the original U-Net model, the average accuracy of the improved U-Net with gray channel attention increases from 81.69% to 82.61%. We also apply this mechanism to U-Net variants such as Attention U-Net and R2U-net, and observe a similar effect. These results verify that the combination of gray channel and attention mechanism can effectively improve the robustness and accuracy of deep learning models when processing color-similar backgrounds in image segmentation tasks. This work has important practical application value and provides a new solution for image segmentation tasks in complex scenes.
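As a rough illustration of what combining a gray channel with attention could look like, the sketch below appends a luminance channel to an RGB image and then gates each channel by a sigmoid of its global mean. This is a simplified stand-in, not the paper's mechanism: the ITU-R BT.601 luma weights, the per-channel gate, and the names `add_gray_channel`, `channel_attention`, `weights`, and `biases` are all assumptions made for the example.

```python
import math

# ITU-R BT.601 luma weights, a common choice for RGB-to-gray conversion.
LUMA = (0.299, 0.587, 0.114)


def add_gray_channel(image):
    """image: rows of (r, g, b) tuples in [0, 1]; returns rows of (r, g, b, gray)."""
    return [
        [(r, g, b, LUMA[0] * r + LUMA[1] * g + LUMA[2] * b) for r, g, b in row]
        for row in image
    ]


def channel_attention(image, weights, biases):
    """Gate each channel by sigmoid(weight * channel_mean + bias).

    weights and biases are hypothetical stand-ins for learned parameters.
    Returns the gated image and the per-channel gates.
    """
    n_channels = len(image[0][0])
    n_pixels = sum(len(row) for row in image)
    # Global average pooling: one mean per channel.
    means = [
        sum(px[c] for row in image for px in row) / n_pixels
        for c in range(n_channels)
    ]
    gates = [
        1.0 / (1.0 + math.exp(-(weights[c] * means[c] + biases[c])))
        for c in range(n_channels)
    ]
    gated = [
        [tuple(px[c] * gates[c] for c in range(n_channels)) for px in row]
        for row in image
    ]
    return gated, gates


# 2x2 toy image: red, green, blue, and white pixels.
image = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],
]
four_channel = add_gray_channel(image)
gated, gates = channel_attention(four_channel, weights=[1.0] * 4, biases=[0.0] * 4)
```

The point of the extra channel is that pixels with similar hue but different brightness become separable on the gray axis; in a real network the gate parameters would be learned so that the model can weight that axis up when color is uninformative.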
The interpretability of multivariate time series anomaly detection is crucial for understanding the reasons behind anomalies, enhancing the usability and credibility of models, and ensuring successful real-world appli...
The recycling and reuse of car parts aligns with current energy conservation and environmental protection trends. Reused car parts may exhibit scratches and dents, necessitating their detection before being reintroduc...
ISBN (digital): 9781665490627
ISBN (print): 9781665490627
Transformer-based architectures have become the common choice in natural language processing and are now achieving SOTA performance in computer vision tasks such as image classification and object detection. However, convolutional methods still hold SOTA performance in many approaches to 3D human pose estimation. Inspired by recent developments in vision transformers, we design a heatmap-free structure using a standard transformer architecture and learnable object queries to model the human joint relations within each frame and then output accurate joint positions and types. We also present a transformer-based pose recognition architecture that requires no greedy algorithm to post-process predicted bones at runtime. In the experiments, we achieve the best performance among methods that directly regress 3D joint positions from a single RGB image, and report results competitive with many 2D-to-3D lifting approaches.
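The core of a query-based, heatmap-free design can be sketched as cross-attention: each query vector attends over image feature tokens, and a linear head maps the attended vector to coordinates. The example below is a deliberately stripped-down illustration under several assumptions: one attention head, no learned key/value projections, fixed rather than trained queries, and a hypothetical 3x2 regression matrix `head`. It is not the architecture from the paper.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, features):
    """Each query attends over all feature tokens (keys = values = features)."""
    dim = len(features[0])
    outputs = []
    for q in queries:
        # Scaled dot-product scores between this query and every token.
        scores = [
            sum(a * b for a, b in zip(q, f)) / math.sqrt(dim) for f in features
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
        )
    return outputs


def regress(vector, head):
    """Linear map from an attended vector to (x, y, z); head is hypothetical."""
    return [sum(w * v for w, v in zip(row, vector)) for row in head]


# Two joint queries and three image feature tokens, all invented toy values.
queries = [[1.0, 0.0], [0.0, 1.0]]
features = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
head = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

attended = cross_attention(queries, features)
joints_3d = [regress(v, head) for v in attended]
```

Because every query yields its coordinates directly, no heatmap decoding or greedy grouping step is needed after the forward pass, which is the property the abstract emphasizes.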
ISBN (print): 9783031705427; 9783031705434
Watermarks in historical manuscripts are figural shapes serving as tokens for provenance research (e.g. scribe identification, dating, papermill attribution, scribe-papermaker relation, trading, etc.) in Humanities such as Musicology. As of today, they come in a variety of formats: digitized handtracings and rubbings, X-ray based imagery and, more recently, thermograms acquired with infrared (IR) cameras - all of which have been made accessible via image databases in libraries or archives like the watermark information system (WZIS). A key use case from a scholar's perspective is the search for similar or even identical watermarks in any digitized data collection. Unsurprisingly, the prerequisite is the availability of a versatile, reliable, and user-friendly tool comprising methods from digital image processing (IP) and pattern recognition (PR). In our paper, we focus on bridging the gap between digitized thermograms of music manuscripts and watermark classification for similarity-based search through (i) a state-of-the-art (SOTA) analysis, (ii) a resulting conceptual design based on well-understood SOTA as well as novel methods, (iii) an easy-to-use implementation, and (iv) an experimental validation as Proof-of-Concept (PoC). The current system performance is characterized using thermograms recently made openly available within the DRACMarkS project as well as WZIS. The experimental results clearly demonstrate success in bridging the existing gap, hence also setting a baseline for an as yet lacking benchmark.
In recent years, stereo matching employing neural networks has emerged as a crucial research direction within the domain of computer vision. Stereoscopic depth estimation hinges on the optimal correspondence between t...
In order to enhance the intelligence of the power system, the functionality of the equipment is assessed in real time through the transmission of images and videos from a remote location. Although the integration of i...