ISBN: (print) 9781510673199; 9781510673182
Augmented reality is a visualization technology that displays information by adding virtual images to the real world. Effective implementation of augmented reality requires recognition of the current scene. Identifying objects in real-time video on computationally limited hardware requires significant effort. One way to solve this problem is to create a hybrid system that, based on machine learning and computer vision technology, processes and analyzes visual data to identify and classify real-world objects. The proposed architecture is based on a combination of the Vuforia augmented system, which provides good performance by balancing prediction accuracy and efficiency. First, the Vuforia neural network architecture allows convenient interaction with AR in Unity and provides initial conditions for detecting 3D objects. The augmented reality construction algorithm is based on the ARCore framework and the OpenGL interface for embedded systems. The system integrates recognition data with an AR platform to display corresponding 3D models, allowing users to interact with them through the functionality of the AR application. This method also involves the development of an enhanced user interface for AR, making the augmented environment more accessible for navigation and control. Experimental research has shown that the proposed method significantly improves the accuracy of object recognition and the ease of working with 3D models in AR.
ISBN: (print) 9781713899921
In the last year alone, a surge of new benchmarks to measure compositional understanding of vision-language models have permeated the machine learning ecosystem. Given an image, these benchmarks probe a model's ability to identify its associated caption amongst a set of compositional distractors. Surprisingly, we find significant biases in all these benchmarks rendering them hackable. This hackability is so dire that blind models with no access to the image outperform state-of-the-art vision-language models. To remedy this rampant vulnerability, we introduce SUGARCREPE, a new benchmark for vision-language compositionality evaluation. We employ large language models, instead of rule-based templates used in previous benchmarks, to generate fluent and sensical hard negatives, and utilize an adversarial refinement mechanism to maximally reduce biases. We re-evaluate state-of-the-art models and recently proposed compositionality inducing strategies, and find that their improvements were hugely overestimated, suggesting that more innovation is needed in this important direction. We release SUGARCREPE and the code for evaluation at: https://***/RAIVNLab/sugar-crepe.
This paper proposes a machine vision system to monitor the tension in the metal sheet on a small-scale roll-to-roll machine. The proposed tension monitoring is applied to a roll-to-roll chemical vapor deposition (R2R-...
Authors: Shalupriya, M.; Indira, K.P.
Saveetha University, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Department of Electronics and Communication Engineering, Chennai, India
Saveetha University, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Department of Nanobiomaterials, Chennai, India
In micro turning, tool placement, wear, and breakage are crucial as they affect precision and surface finish. Flank wear is problematic since it impacts dimensional accuracy, and current methods that rely on CNC encod...
According to the order interest of gray code and machine vision binary key, a complete digital image cryptographic algorithm is presented. A new method based on wavelet transform is proposed. At the same time, gray co...
In this paper, an online monitoring system of welding quality based on machine vision and machine learning was proposed. A high-speed CCD camera was used to monitor the tail end of the molten pool, and the remove smal...
This article proposes a simple and effective method for image subject segmentation. Our research mainly focuses on the characteristics of material images in the experimental platform. Through in-depth research, we hav...
Today, bionic models for vision applications are based on the general information pathways, structure and characteristics of the visual system implemented in intelligent algorithms, mostly based on AI, to improve the resol...
The integration of human-robot interaction (HRI) technologies with industrial automation has become increasingly essential for enhancing productivity and safety in manufacturing environments. In this paper, we propose...
This paper proposes a contour-based method to address the difficulty of measuring the wear state of diamond beaded wire during processing. The edge contour image of the diamond beaded wire was c...