Search Results - Inner Mongolia University Library

Advanced Techniques for Geospatial Referencing in Online Media Repositories

FUTURE INTERNET, 2024, Vol. 16, No. 3, p. 87

Authors: Warch, Dominik; Stellbauer, Patrick; Neis, Pascal. Mainz Univ Appl Sci, Sch Technol, Dept Appl Informat & Geodesy, D-55128 Mainz, Germany

In the digital transformation era, video media libraries' untapped potential is immense, restricted primarily by their non-machine-readable nature and basic search functionalities limited to standard metadata. This study presents a novel multimodal methodology that utilizes advances in artificial intelligence, including neural networks, computer vision, and natural language processing, to extract and geocode geospatial references from videos. Leveraging the geospatial information from videos enables semantic searches, enhances search relevance, and allows for targeted advertising, particularly on mobile platforms. The methodology involves a comprehensive process, including data acquisition from ARD Mediathek, image and text analysis using advanced machine learning models, and audio and subtitle processing with state-of-the-art linguistic models. Despite challenges like model interpretability and the complexity of geospatial data extraction, this study's findings indicate significant potential for advancing the precision of spatial data analysis within video content, promising to enrich media libraries with more navigable, contextually rich content. This advancement has implications for user engagement, targeted services, and broader urban planning and cultural heritage applications.

Keywords: natural language processing; named entity recognition; geocoding; online media repository; geospatial information extraction; image-to-text; audio-to-text

Real-time detection of road hazards for autonomous vehicle systems

International Image Processing, Applications and Systems Conference (IPAS)

Authors: Máté András Szabó; Anita Keszler; László Tizedes. Machine Perception Research Lab., HUN-REN SZTAKI, Budapest, Hungary

ISBN (digital): 9798331506520

ISBN (print): 9798331506537

Automated detection of road hazards such as speed bumps has become an important area of research due to its potential to improve road safety in autonomous driving. Various techniques have been introduced to detect these hazards using camera vision and artificial intelligence-based image processing methods. However, estimating their distance is still challenging. To address this problem and to satisfy the requirement for real-time on-board data processing, the proposed system has the following properties: (1) high-accuracy road hazard detection by analyzing mono-images and videos with a re-trained YOLO neural network; (2) precise distance measurement utilizing a LiDAR; and (3) efficient local data processing using ROS, implemented on an NVIDIA Jetson AGX Xavier. An important contribution of this paper is introducing multiple classes of road hazards when training the network, instead of focusing only on speed bumps and potholes. Furthermore, we have analyzed different LiDAR technologies (standard rotating and non-repetitive circular scanning) to evaluate and compare their precision and to demonstrate that our method can be successfully applied regardless of the scanning pattern of the LiDAR.

Keywords: Laser radar; Roads; Image processing; Cameras; Data processing; Hazards; Real-time systems; Distance measurement; Sensors; Autonomous vehicles

Development and future of information hiding in image transformation domain: A literature review

4th International Conference on Image Processing and Machine Vision, IPMV 2022

Author: Yang, Yuer. College of Cyber Security, Jinan University, China; School of Economics, Jinan University, China

ISBN (print): 9781450395823

Information hiding technology is a technique for hiding meaningful information in public carrier information. As data elements become more and more important, information hiding technology performs better than traditional encryption and decryption technologies such as single-table substitution and the Vigenère cipher. Because it contains redundant information, the image is a common carrier for information hiding among many carrier types. As one of the primary means of information hiding, image transform-domain information hiding has been widely studied and used in academia and industry. Information hiding in the image transform domain effectively improves security and robustness. This paper mainly introduces the concepts, principles, processes, mainstream algorithms, and applications of image transform-domain information hiding techniques. A possible general direction for its future development is also outlined. © 2022 ACM.

Keywords: image enhancement

Trends in integration of vision and language research: A survey of tasks, datasets, and methods

Journal of Artificial Intelligence Research, 2021, Vol. 71, pp. 1183-1317

Authors: Mogadala, Aditya; Kalimuthu, Marimuthu; Klakow, Dietrich. Saarland Informatics Campus, Saarland University, Saarbrücken 66123, Germany

Interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years. This success can be partly attributed to the advancements made in the sub-fields of AI such as machine learning, computer vision, and natural language processing. Much of the growth in these fields has been made possible with deep learning, a sub-area of machine learning that uses artificial neural networks. This has created significant interest in the integration of vision and language. In this survey, we focus on ten prominent tasks that integrate language and vision, discussing their problem formulations, methods, existing datasets, and evaluation measures, and comparing the results obtained with corresponding state-of-the-art methods. Our efforts go beyond earlier surveys, which are either task-specific or concentrate on only one type of visual content, i.e., image or video. Furthermore, we provide some potential future directions in this field of research, in the anticipation that this survey stimulates innovative thoughts and ideas to address the existing challenges and build new applications. ©2021 AI Access Foundation. All rights reserved.

Keywords: Surveys

An Improved Reversible Data Hiding Scheme Based on Referred Frequencies for a VQ Compressed Index

International Conference on Machine Vision, Image Processing and Imaging Technology (MVIPIT)

Authors: Min-Shiang Hwang; Kanza Fatima; Yu-Lun Wang; Chi-Shiang Chan; Chia-Chun Wu. Affiliations: Department of Computer Science & Information Engineering, Fintech and Blockchain Research Center, Asia University, Taichung, Taiwan; Department of Computer Science & Information Engineering, Asia University, Taichung, Taiwan; Department of Management Information Systems, National Chung Hsing University, Taichung, Taiwan; Department of M-Commerce and Multimedia Applications, Asia University, Taichung, Taiwan; Department of Computer Science and Information Engineering, National Quemoy University, Kinmen, Taiwan

ISBN (digital): 9798331543037

ISBN (print): 9798331543044

Nowadays, we usually compress images before uploading them to social media. However, images on social media can easily be copied, so embedding secret messages in compressed images has become increasingly popular. There are many compression methods, such as Huffman, VQ, ZIP, AMBTC, RAR, and JPEG. In this article, we propose an improved data-hiding method for VQ compression that achieves better capacity and high quality. Experimental results show that our data-hiding approach is practical.

Keywords: Steganography; Image coding; Social networking (online); Vector quantization; Bit rate; Transform coding; Imaging; Indexes

Comparative assessment of common pre-trained CNNs for vision-based surface defect detection of machined components

EXPERT SYSTEMS WITH APPLICATIONS, 2023, Vol. 218, No. 1

Authors: Singh, Swarit Anand; Kumar, Aitha Sudheer; Desai, K. A. Indian Inst Technol Jodhpur, Dept Mech Engn, Jodhpur 342030, Rajasthan, India

Small and Medium Enterprises (SMEs) and Micro, Small, and Medium Enterprises (MSMEs) contemplate integrating machine vision with high-throughput manufacturing lines to ensure a consistent quality of standardized components. Inspection productivity can improve considerably by substituting machine vision for manual activities. Pre-trained Convolutional Neural Networks (CNNs) can facilitate enhanced machine vision capabilities compared to rule-based classical image processing algorithms. However, the non-availability of labeled datasets and a lack of expertise in model development restrict their utility for SMEs and MSMEs. The present work examines the practicality of utilizing publicly available labeled datasets while developing surface defect detection algorithms using pre-trained CNNs, considering case studies of typical machined components: flat washers and tapered rollers. It is shown that publicly available surface defect datasets are ineffective for specific cases such as the machined surfaces of flat washers and tapered rollers. Explicitly labeled image datasets can offer better prediction abilities in such cases. A comparative assessment of common pre-trained CNNs is conducted to identify an appropriate network while developing a surface defect detection framework for machined components. The common pre-trained CNNs VGG-19, GoogLeNet, ResNet-50, EfficientNet-b0, and DenseNet-201, which show prediction abilities for similar classification tasks, have been examined. The pre-trained CNNs developed using explicit image datasets were implemented to segregate defective flat washers and tapered rollers as sample components manufactured by SMEs and MSMEs. The performance assessment was accomplished using parameters estimated from the confusion matrix. It is observed that EfficientNet-b0 outperforms other networks on most parameters, and it can be preferred while developing a surface defect detection algorithm. The outcomes of the present study form the b

Keywords: Machine vision; Surface defect detection; Pre-trained CNNs; Image classification; Labeled datasets
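
The abstract above says the networks were compared using parameters estimated from the confusion matrix, but it does not name them. As a minimal sketch, assuming the usual binary-classification measures (accuracy, precision, recall, F1), such parameters could be computed as follows. The function name and the counts are hypothetical illustrations, not figures from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Derive common measures from the four cells of a binary confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives ("positive" = defective part).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, chosen only to exercise the function.
metrics = binary_metrics(tp=45, fp=5, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Comparing several networks on the same test set then amounts to calling the function once per network and ranking the resulting dictionaries.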

Research on Two-stage Conveying Bar Counting and Dividing System Based on Vision

21st International Symposium on Distributed Computing and Applications for Business Engineering and Science, DCABES 2022

Authors: Chen, Guojun; Wu, Yang. Wuxi Taihu University, Jiangsu Key Laboratory of IoT Application Technology, Wuxi 214064, China; College of Computer Internet of Things Engineering, Wuxi Taihu University, Wuxi 214064, China

ISBN (print): 9781665454629

A vision-based automatic bar counting system for two-stage conveying bars is proposed. The system uses image processing algorithms to solve the counting problems caused by sticking and relative sliding in large stacks of rebar. By improving the splitting machine at the splitting end, a design with two-step splitting and visual verification is adopted to achieve accurate splitting of stacked rebars. The system markedly reduces the installation space required and makes more effective use of the space in the rebar workshop. The system runs fast, is highly accurate, requires only small changes to the production line, and is easy to maintain. It not only meets production needs but also greatly improves the efficiency of the existing counting system. It is suitable for rebar workshops with limited installation space. The experimental results show that the system can count the bars accurately and can accurately realize the steel separation. For bars with a diameter of 8-12 mm, the counting accuracy is over 99.85%, and for bars with a diameter greater than 12 mm, the counting accuracy is over 99.96%. © 2022 IEEE.

Keywords: Image processing

Very Efficient Convolutional Neural Network Based on the Discrete Hirschman Transform

Asilomar Conference on Signals, Systems & Computers

Authors: Weiwei Wang; Victor DeBrunner; Linda S. DeBrunner; Dingli Xue; Hanqing Zhao; Ranran Tao

ISBN (digital): 9798350354058

ISBN (print): 9798350354065

Convolutional Neural Networks (CNNs) play a crucial role in computer vision and machine learning applications, but they are often associated with high computational demands. To tackle this challenge, researchers have turned to the Fast Fourier Transform (FFT) for spectral convolution to help reduce complexity. However, the Discrete Hirschman Transform (DHT) has emerged as a more efficient alternative for performing linear convolutions. In this study, we introduce a novel CNN methodology based on the principles of the DHT. Our experimental results highlight the impressive efficiency of this approach, significantly lowering both computational complexity and processing time. Additionally, we implement the DHT-based method in hardware to validate its performance in real-world applications, demonstrating its effectiveness in practical scenarios.

Keywords: Computer vision; Fast Fourier transforms; Convolution; Image processing; Machine learning; Hardware; Convolutional neural networks; Computational complexity; Low latency communication; Field programmable gate arrays
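
The abstract above refers to spectral convolution, the Fourier-domain baseline that the authors compare against. The sketch below illustrates that baseline only, not the Discrete Hirschman Transform method, which the abstract does not describe in enough detail to reproduce. It zero-pads two sequences, multiplies their transforms, and inverts the product to obtain the linear convolution. A naive O(n^2) DFT stands in for an FFT so the example needs only the standard library; all function names are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (a stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def direct_convolve(a, b):
    """Reference linear convolution computed in the time domain."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def spectral_convolve(a, b):
    """Linear convolution via pointwise multiplication of zero-padded spectra."""
    n = len(a) + len(b) - 1  # padding to this length avoids circular wrap-around
    fa = dft(list(a) + [0.0] * (n - len(a)))
    fb = dft(list(b) + [0.0] * (n - len(b)))
    product = [u * v for u, v in zip(fa, fb)]
    return [v.real for v in dft(product, inverse=True)]


signal = [1.0, 2.0, 3.0, 4.0]
kernel = [1.0, 0.0, -1.0]
direct = direct_convolve(signal, kernel)
spectral = spectral_convolve(signal, kernel)
print(direct)
print([round(v, 6) for v in spectral])
```

The two results agree up to floating-point rounding. With a real FFT the spectral route costs O(n log n) rather than O(n^2), which is the saving the abstract alludes to.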

Exploiting Confidence-Based Model Fusion for Boosting Image Classification Accuracy

International Conference on Image Processing, Computer Vision and Machine Learning (ICICML)

Author: Xinjian Jiang. Department of Computer Science and Technology, Nanjing University, Nanjing, China

In the realm of deep learning, the traditional approach has been to train specialized models for individual tasks, which, although effective, is resource-intensive. The advent of large, universal models has mitigated this issue by offering multitask capabilities, reduced training time, and lower computational costs. However, these generalized models often underperform on specific tasks compared to specialized models. This paper introduces an innovative ensemble approach that integrates specialized and generalized models, specifically focusing on Contrastive Language-Image Pretraining (CLIP) and EfficientNet. This work proposes three fusion strategies (Weighted Voting, Confidence Comparison, and Fully Connected Network Fusion) and evaluates them using the CIFAR-100 dataset. The ensemble model significantly outperforms individual models, achieving an adjusted accuracy of up to 0.848. The paper also introduces a novel evaluation metric, Confidence-Accuracy Correlation, to assess the reliability of model confidence. The findings could revolutionize ensemble learning by making it more adaptive and suited for real-world applications, thereby pushing the boundaries of possibility in artificial intelligence.
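
The abstract names its fusion strategies without defining them. As a rough sketch of how two of them might work, assuming each model outputs a class-probability vector, weighted voting can be read as blending the two vectors before taking the argmax, and confidence comparison as trusting whichever model is more confident in its top class. These readings, the function names, and the example probabilities are assumptions for illustration, not the paper's definitions. The third strategy, Fully Connected Network Fusion, requires training and is omitted.

```python
def argmax(values):
    """Index of the largest element (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def weighted_voting(probs_a, probs_b, weight_a=0.5):
    """Blend two class-probability vectors and return the winning class index."""
    blended = [
        weight_a * pa + (1.0 - weight_a) * pb
        for pa, pb in zip(probs_a, probs_b)
    ]
    return argmax(blended)


def confidence_comparison(probs_a, probs_b):
    """Return the prediction of the model with the higher top-class probability."""
    top_a, top_b = argmax(probs_a), argmax(probs_b)
    return top_a if probs_a[top_a] >= probs_b[top_b] else top_b


# Hypothetical outputs of a generalist and a specialist model on one image.
generalist = [0.40, 0.35, 0.25]
specialist = [0.10, 0.80, 0.10]

vote = weighted_voting(generalist, specialist, weight_a=0.5)
pick = confidence_comparison(generalist, specialist)
print(vote, pick)
```

In this example both strategies select class 1, because the specialist is far more confident than the generalist. Setting `weight_a=1.0` reduces weighted voting to the generalist alone.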

Advances in Ocean-Oriented Underwater Image Processing and Vision Technology

信号处理 (Journal of Signal Processing), 2023, Vol. 39, No. 10, pp. 1748-1763

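Authors: Chen Weiling; Qiu Yanling; Zhao Tiesong; Wei Hong'an; Cheng En. Affiliations: Fujian Key Laboratory for Intelligent Processing and Wireless Transmission of Media Information, Fuzhou University, Fuzhou, Fujian 350108; Fujian Science and Technology Innovation Laboratory for Optoelectronic Information of China (Mindu Innovation Laboratory), Fuzhou, Fujian 350108; Key Laboratory of Underwater Acoustic Communication and Marine Information Technology (Ministry of Education), Xiamen University, Xiamen, Fujian 361005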

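Underwater observation is one of the most direct means of exploring the ocean. Owing to underwater optical and acoustic properties, as well as clutter, aquatic organisms, and other factors, images captured during underwater observation do not always meet observation requirements. How to process, analyze, and apply underwater images effectively is a challenging problem. Although image processing and computer vision have been studied extensively for in-air (atmospheric) environments, differences in imaging principles and application settings mean that algorithms designed for natural in-air images cannot be transferred directly to underwater tasks, while vision applications designed for underwater scenes still suffer from insufficient consideration of the task context and poor generalization. Focusing on optical and acoustic images, the two main means of underwater observation, this paper starts from image characteristics and, for the first time, takes a task-oriented, requirement-driven approach: by reviewing successful cases of underwater image processing and quality assessment in China and abroad, it provides a more complete summary and analysis of working approaches to underwater observation schemes. In addition, the paper examines the development of underwater machine vision applications, discusses in detail the prospects and directions for optimization in related fields, and offers new ideas for breaking through the bottleneck of marine vision applications and building smart ocean systems.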

Keywords: underwater image processing; quality assessment; machine vision; smart ocean
