Search Results - Inner Mongolia University Library

Baking Neural Radiance Fields for Real-Time View Synthesis

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, Vol. 47, No. 5, pp. 3310-3321

Authors: Hedman, Peter; Srinivasan, Pratul P.; Mildenhall, Ben; Reiser, Christian; Barron, Jonathan T.; Debevec, Paul. Affiliations: Google Res, Mountain View, CA 94043, USA; Netflix, Los Gatos, CA 95032, USA

Neural volumetric representations such as Neural Radiance Fields (NeRF) have emerged as a compelling technique for learning to represent 3D scenes from images with the goal of rendering photorealistic images of the scene from unobserved viewpoints. However, NeRF's computational requirements are prohibitive for real-time applications: rendering views from a trained NeRF requires querying a multilayer perceptron (MLP) hundreds of times per ray. We present a method to train a NeRF, then precompute and store (i.e., "bake") it as a novel representation called a Sparse Neural Radiance Grid (SNeRG) that enables real-time rendering on commodity hardware. To achieve this, we introduce 1) a reformulation of NeRF's architecture and 2) a sparse voxel grid representation with learned feature vectors. The resulting scene representation retains NeRF's ability to render fine geometric details and view-dependent appearance, is compact (averaging less than 90 MB per scene), and can be rendered in real-time (higher than 30 frames per second on a laptop GPU). Actual screen captures are shown in our video.

Keywords: Rendering (computer graphics); Three-dimensional displays; Real-time systems; Image color analysis; Vectors; Graphics processing units; Image reconstruction; Computer vision; Neural rendering; Real-time rendering; View synthesis
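
To illustrate why the abstract describes per-ray cost as prohibitive, the following minimal sketch composites samples along a single ray using the standard NeRF quadrature (segment opacity 1 - exp(-sigma * delta), weighted by accumulated transmittance). It is not taken from this record: `composite_ray` and its inputs are hypothetical names, and in a trained NeRF each (sigma, color) pair would come from one MLP query, so hundreds of samples per ray mean hundreds of queries.

```python
import math


def composite_ray(sigmas, colors, deltas):
    """Composite per-sample densities and RGB colors along one ray.

    sigmas: non-negative densities at each sample
    colors: (r, g, b) tuples at each sample
    deltas: distances between consecutive samples
    Returns the accumulated (r, g, b) and the remaining transmittance.
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    rgb = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution of this sample
        for c in range(3):
            rgb[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return tuple(rgb), transmittance


if __name__ == "__main__":
    # Toy ray: empty space followed by a dense red sample (made-up values).
    print(composite_ray([0.0, 50.0], [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [0.1, 0.1]))
```

The loop body is cheap; the expense in a real system lies in producing `sigmas` and `colors`, which is the part a precomputed ("baked") representation is meant to avoid.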

Semantic Threads Enabling Image-Text Retrieval via VQA Transformers

1st International Conference on Computational Intelligence for Security, Communication and Sustainable Development, CISCSD 2024

Authors: Noorbhasha, Junnubabu; Guddeti, Rohitha; Lingutla, Satvika; Etikikota, Sujitha; Kumkumkari, Santhosh. Affiliation: Madanapalle Institute of Technology and Science, Computer Science and Technology, Madanapalle, India

ISBN: (Print) 9798350365405

The integration of vision and language has propelled the advancement of artificial intelligence systems. Visual Question Answering (VQA) stands at the nexus of computer vision and natural language processing, enabling machines to comprehend and respond to image-related queries. This paper introduces a novel VQA approach harnessing the capabilities of the BLIP (Bootstrapping Language Image Pretrained) model, a transformer-based architecture esteemed for its natural language understanding prowess. The methodology involves image preprocessing and question translation into a standardized language for efficient processing by BLIP. In particular, the study integrates multilingual support into the VQA framework, facilitating seamless interaction with users across diverse linguistic backgrounds. Through rigorous experimentation, this paper demonstrates the effectiveness of our approach in accurately answering questions in various languages. These findings underscore the robustness and adaptability of the BLIP model in handling multilingual inputs, thereby enhancing accessibility and usability in real-world applications. This research contributes to advancing the state-of-the-art in VQA systems by addressing language barriers and promoting inclusivity in human-machine interaction. © 2024 IEEE.

Keywords: Semantics

Image Segmentation Based on Adaptive Quaternion Anisotropic Gradient for Optical Inspection Applications

Optical Metrology and Inspection for Industrial Applications XI, 2024

Authors: Zelensky, A.; Gapon, N.; Zhdanova, M.; Voronin, V.; Ilukhin, Y.; Gribkov, A. Affiliations: Scientific-Manufacturing Complex «Technological Centre», Zelenograd, Russia; Don State Technical University, Rostov-on-Don, Russia; Center for Cognitive Technology and Machine Vision, Moscow State University of Technology «STANKIN», Moscow, Russia

ISBN: (Print) 9781510682108

Image segmentation is a critical step in many imaging applications, especially optical inspection: detection and recognition of objects, classification, analysis, and identification. The image gradient, as a preprocessing step, is also an essential tool in image processing in many research areas, such as edge detection, segmentation, and inpainting. However, these tools have limitations and lose accuracy because capture devices usually generate low-resolution images, which are often noisy and blurry. It is critical to obtain a useful gradient estimate on noisy color images while preserving sharp edges. In the present paper, we develop a new gradient by integrating the quaternion framework with local polynomial approximation and the intersection of confidence intervals based on anisotropic gradient concepts for color image processing applications. We apply the proposed gradient technique in a modified active contour method to perform automated segmentation for optical inspection applications. Computer simulations on the segmentation dataset for optical inspection applications show that the new adaptive quaternion anisotropic gradient exhibits fewer color artefacts than state-of-the-art techniques. © 2024 SPIE.

Keywords: Polynomial approximation

Indian fake currency detection using image processing and machine learning

International Journal of Information Technology (Singapore), 2024, Vol. 16, No. 8, pp. 4953-4966

Authors: Bandu, Sai Charan Deep Kakileti, Murari Jannu Soloman, Shyam Sunder Baydeti, Nagaraju. Affiliation: Department of Computer Science and Engineering, National Institute of Technology Nagaland, Chumukedima, Nagaland, Dimapur 797103, India

The escalating production of counterfeit notes, facilitated by advancements in color printing and scanning, poses a significant global challenge impacting economies and security. This issue, prevalent in countries like India, has negative ramifications, including the funding of illegal activities and terrorism. Despite efforts such as demonetization in 2016, counterfeits persist, necessitating innovative solutions. The proposed model introduces a fake note detection system utilizing computer vision and machine learning, specifically a Convolutional Neural Network (CNN). The CNN effectively extracts intricate features from input data, showcasing its proficiency in pattern recognition. Notably, the system focuses on individual security features within banknotes, distinguishing it from other approaches that analyze entire note images. The primary goal is swift and accurate detection and reduction of counterfeit circulation, contributing to the overall security of the economy. The proposed model achieved an accuracy of 91.66% across all six security features for the Indian Rs. 500 denomination, 95.25% for the Rs. 200 denomination, and 92.66% for the Rs. 100 denomination. © Bharati Vidyapeeth's Institute of Computer Applications and Management 2024.

Keywords: Convolutional neural network; Counterfeit notes; Image processing; Machine learning; Security features extraction

Road Infrastructure Defect Detection using Yolo8Seg Based Approach

6th IEEE International Conference on Image Processing, Applications and Systems, IPAS 2025

Authors: Alsubaie, Norah A.; Almalki, Ghayda A.; Almutairi, Ghada N.; Alrumaih, Sarah A. Affiliation: Princess Nourah Bint Abdulrahman University, Department of Computer Sciences, Riyadh, Saudi Arabia

ISBN: (Print) 9798331506520

This research introduces "Jaddah," an AI-based system for the automated detection of road infrastructure defects using computer vision and machine learning techniques. The system addresses the limitations of traditional road inspection methods, which are often slow and prone to human error. Jaddah provides a mobile application that detects, classifies, and segments road defects at the pixel level. The model is trained on a comprehensive dataset of high-resolution images. The YOLOv8-seg model is implemented to achieve precise defect localization and segmentation when identifying and categorizing road defects. Performance metrics show 87% mAP50, demonstrating reliable defect detection. These results contribute to improved infrastructure maintenance, enhanced road safety, and greater operational efficiency. © 2025 IEEE.

Keywords: Image enhancement
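
For readers unfamiliar with the mAP50 figure quoted above: it is mean average precision where a prediction counts as correct only if its intersection-over-union (IoU) with a ground-truth region is at least 0.5. The sketch below shows that matching criterion for axis-aligned boxes. It is an illustration only: the record does not say whether the reported score is computed on boxes or masks, and `box_iou` and `matches_at_50` are hypothetical helper names rather than part of any YOLO library.

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_50(pred, truth):
    """True if the prediction would count as a hit at the 0.5 IoU threshold."""
    return box_iou(pred, truth) >= 0.5


if __name__ == "__main__":
    # Made-up boxes: a predicted defect box shifted relative to the label.
    print(box_iou((0, 0, 2, 2), (1, 0, 3, 2)), matches_at_50((0, 0, 2, 2), (1, 0, 3, 2)))
```

Average precision is then computed per class from the ranked hits and misses and averaged over classes, a step omitted here.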

Autonomous Object Detection and Counting using Edge Detection and Image Processing Algorithms

7th International Conference on Trends in Electronics and Informatics, ICOEI 2023

Authors: Patil, Swati B.; Shimpi, Jay Chandrakant; Tanawade, Archana Girish; Chavan, Pranali Gajanan; Tandulkar, Vrushali Shrimant. Affiliations: Information Technology, Vishawakarma Institute of Information Technology, Pune, India; T and T Infra Ltd, Pune, India

ISBN: (Print) 9798350397284

Machine vision applications are commonly utilised in manufacturing lines as low-cost, high-precision measuring devices. Production facilities can achieve high output without mistakes thanks to these solutions, which offer contactless control and measurement. A camera may be used to carry out machine vision tasks including product counting, error checking, and dimension measuring. This study proposes a vision system application that can count inanimate objects. The proposed solution uses Otsu thresholding, Hough transformations, edge detection methods, and other image processing algorithms to accomplish automatic counting regardless of the kind or colour of the product. The system primarily uses one camera. The general idea is to obtain an image with balanced contrast, brightness, and appropriate HSV values. A picture of the items is captured using the camera of an Android device, and different image processing techniques are then applied to the picture. Further, a real-time machine vision programme was deployed using photos taken from an actual experimental setup. The practical experiments conducted have shown that the suggested technique yields quick, precise, and trustworthy results based on a comparative study of various detection techniques. © 2023 IEEE.

Keywords: Cameras
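
The abstract names Otsu thresholding as one of its building blocks. As a minimal sketch of that step alone (not the authors' pipeline), the function below picks the gray level that maximizes between-class variance over a 256-bin histogram. The name `otsu_threshold` and the flat list-of-integers input are assumptions made for illustration; a production system would more likely call an image library's built-in routine.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that maximizes between-class variance.

    pixels: iterable of integer intensities in [0, 255].
    Pixels <= t form one class and pixels > t form the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_low, sum_low = 0, 0  # pixel count and intensity sum of the lower class
    for t in range(256):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue  # lower class still empty
        w_high = total - w_low
        if w_high == 0:
            break  # upper class empty from here on
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


if __name__ == "__main__":
    # Made-up bimodal data: dark background pixels and bright object pixels.
    print(otsu_threshold([10] * 50 + [200] * 50))
```

Binarizing at the returned level separates objects from background, after which connected regions or detected contours can be counted.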

RECOGNITION OF SURFACE CORROSION MORPHOLOGY ON COASTAL ENGINEERING STRUCTURES USING MACHINE VISION TECHNOLOGY

ASME 43rd International Conference on Ocean, Offshore and Arctic Engineering (OMAE)

Authors: Yu, Qifeng; Han, Yudong. Affiliation: Shanghai Maritime Univ, Coll Transport & Commun, Shanghai, Peoples R China

ISBN: (Print) 9780791887806

The harsh marine atmospheric conditions, including high temperatures, humidity, and salt spray, prevalent in the coastal areas of Hainan, pose a significant challenge to the durability of ground facilities and equipment, often resulting in corrosion and functional degradation. Hence, accurate monitoring of corrosion is paramount for maintaining coastal engineering structures. This study leveraged data from the Chinese National Center for Materials Corrosion and Protection Science to extract corrosion morphology features using image analysis techniques. Subsequently, a corrosion identification model was developed using machine learning algorithms. The model's accuracy was validated through various evaluation metrics. The research outcomes have practical applications in the maintenance and management of coastal engineering structures by providing automatic corrosion morphology recognition. This enables maintenance personnel to promptly undertake repair measures, thereby reducing maintenance costs and enhancing structural sustainability.

Keywords: Corrosion morphology; Machine learning techniques; Image processing; YOLO v5; Automatic recognition

Enhancing out-of-distribution learning in computer vision through dominant feature masking

PATTERN ANALYSIS AND APPLICATIONS, 2025, Vol. 28, No. 2, pp. 1-30

Authors: Pilzak, Artem; Thivierge, Jean-Philippe. Affiliations: Univ Ottawa, Sch Psychol, Ottawa, ON K1N 6N5, Canada; Univ Ottawa, Brain & Mind Res Inst, Ottawa, ON K1N 6N5, Canada

Out-of-distribution (OOD) learning presents a major challenge in machine learning as models must effectively generalize to previously unseen data. This challenge is prevalent in deep learning models, which tend to focus on the most dominant features in images. This narrow focus impedes OOD learning, where critical features are concealed or absent during testing, leading to reduced prediction accuracy. To address this issue, we introduce a novel data augmentation approach termed Dominant Feature Masking (DFM), inspired by human visual holistic processing. DFM strategically conceals and reveals the most prominent features within images, allowing neural networks to simultaneously capture both dominant and non-dominant attributes, thereby enhancing adaptability to OOD data. We evaluated DFM using a novel set of learning challenges termed Versatile Evaluation Benchmark (VEB), which assesses model performance on three distinct tasks: (i) augmented MNIST images to test resilience against diverse transformations; (ii) a novel dataset of unseen image classes to examine performance on new instances within familiar categories; and (iii) a dataset created by DALL-E to challenge class differentiation with artificially mixed features. Our results demonstrate that DFM significantly improves OOD generalization compared to traditional augmentation techniques, achieving marked enhancements across various conditions without compromising in-distribution testing accuracy. These findings underscore the potential of DFM to improve the performance of computer vision systems in various real-world scenarios, making them more robust and adaptable to unexpected data variations. By leveraging VEB, researchers will gain a deeper understanding of their models' generalization performance, ensuring that CNNs are well-equipped to handle the complexities of real-world applications. The source code and VEB datasets are available at https://***/Deepvisionary/DFM.

Keywords: Out-of-distribution learning; Convolutional neural networks; Data augmentation; Feature acquisition; Domain generalization

A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, Vol. 35, No. 4, pp. 3313-3332

Authors: Gui, Jie; Sun, Zhenan; Wen, Yonggang; Tao, Dacheng; Ye, Jieping. Affiliations: Southeast Univ, Sch Cyber Sci & Engn, Nanjing 211100, Jiangsu, Peoples R China; Purple Mt Labs, Nanjing 210000, Peoples R China; Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI, USA; Chinese Acad Sci, Ctr Res Intelligent Percept & Comp, Beijing 100190, Peoples R China; Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore; JD Explore Acad, Beijing, Peoples R China; Univ Sydney, Sch Comp Sci, Camperdown, Australia; Beike, Beijing 100085, Peoples R China; Univ Michigan, Ann Arbor, MI 48109, USA

Generative adversarial networks (GANs) have recently become a hot research topic; however, they have been studied since 2014, and a large number of algorithms have been proposed. Nevertheless, few comprehensive studies explain the connections among different GAN variants and how they have evolved. In this paper, we attempt to provide a review of the various GAN methods from the perspectives of algorithms, theory, and applications. First, the motivations, mathematical representations, and structures of most GAN algorithms are introduced in detail, and we compare their commonalities and differences. Second, theoretical issues related to GANs are investigated. Finally, typical applications of GANs in image processing and computer vision, natural language processing, music, speech and audio, the medical field, and data science are discussed.

Keywords: Generators; Generative adversarial networks; Data models; Linear programming; Natural language processing; Machine learning algorithms; Inference algorithms; Deep learning; generative adversarial networks; algorithm; theory; applications

Bio-inspired smart vision sensor: toward a reconfigurable hardware modeling of the hierarchical processing in the brain

JOURNAL OF REAL-TIME IMAGE PROCESSING, 2021, Vol. 18, No. 1, pp. 157-174

Authors: Bhowmik, Pankaj; Pantho, Md Jubaer Hossain; Bobda, Christophe. Affiliation: Univ Florida, Dept Elect & Comp Engn, Gainesville, FL 32611, USA

Biological vision systems inspire processing methods in computer vision applications. This paper applies insights from vision systems to hardware and presents a pixel-parallel, reconfigurable, and layer-based hierarchical architecture for smart image sensors. The architecture aims to bring computation close to the sensor to achieve high acceleration for different machine vision applications while consuming low power. We logically divide the image into multiple regions and perform pixel-level and region-level processing after removing spatiotemporal redundancy. These processors use bio-inspired algorithms to activate the regions that contain a region of interest in the scene. The hierarchical processing breaks with traditional sequential image processing and introduces parallelism for machine vision applications. We also make the hardware design reconfigurable even after fabrication so that the hardware can be reused for different applications. Simulation results show that the area overhead and power penalty for adding reconfigurable features stay in an acceptable range. We prioritize maximizing the operating speed and reach 800 MHz. In addition, the design saves 84.01% and 96.91% of dynamic power at the first and second stages of the hierarchy by removing redundant information. Furthermore, deploying high-level reasoning sequentially only on the selected regions of the image makes it computationally inexpensive to execute a complex task in real time.

Keywords: Biological vision; Pixel-level processing; Reconfigurability; Predictive coding; Attention module; Smart image sensor; FPGA; ASIC
