ISBN (print): 9798400716553
Modern smartphones usually have automatic camera adjustment features that predetermine how images will be processed. Without intervention from the user (e.g., manual adjustment of exposure settings or addition/removal of certain image filters), the predetermined camera settings dictate the look and feel of the images taken. Since higher-end mobile devices tend to produce a more visually appealing style and clearer images, image enhancement on entry-level devices can be performed by transferring the style of a higher-end device to a lower-end one. This paper proposes a learning-based, style-driven image enhancement method for entry-level devices. Using a deep residual style transfer network, we train a model that learns the relationship between images taken with a high-end device and those taken with an entry-level device, creating a filter that can enhance images captured on an entry-level device. Our quantitative and qualitative analyses show that the proposed method can enhance images to match the quality produced by higher-end mobile device cameras.
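The core building block of such deep residual networks is the skip connection: the network learns only the *difference* between the entry-level and high-end styles. A minimal NumPy sketch of that idea (the layer shapes, names, and weights below are illustrative, not the paper's actual architecture):

```python
import numpy as np

def residual_block(x, w1, w2):
    """One residual block: the output adds the input back onto a learned
    transform, so the network only models the style difference."""
    h = np.maximum(0.0, x @ w1)   # ReLU after the first linear map
    return x + h @ w2             # skip connection preserves the input

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))            # 4 "pixels", 16 features each
w1 = rng.standard_normal((16, 16)) * 0.01
w2 = rng.standard_normal((16, 16)) * 0.01
y = residual_block(x, w1, w2)
print(y.shape)                              # (4, 16)
```

Note that with all-zero weights the block reduces to the identity, which is what makes residual stacks easy to train: each block starts near "change nothing" and learns only a small correction.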
Application of artificial intelligence methods in agriculture is gaining research attention, with a focus on improving planting, harvesting, and post-harvesting. Fruit quality recognition is crucial for farmers during harvesting and sorting, for food retailers for quality monitoring, and for consumers for freshness evaluation. However, there is a lack of multi-fruit datasets to support real-time fruit quality evaluation. To address this gap, we present a new dataset of fruit images aimed at evaluating fruit freshness. The dataset contains images of 11 fruits categorized into three freshness classes, and five well-known deep learning models (ShuffleNet, SqueezeNet, EfficientNet, ResNet-18, and MobileNet-v2) were adopted as baselines for fruit quality recognition on the dataset. The study provides a benchmark dataset for the classification task, which could advance research in fruit quality recognition. The dataset is systematically organized and annotated, making it suitable for testing the performance of state-of-the-art methods and new classifiers. The research community in computer vision, machine learning, and pattern recognition could apply it to tasks such as fruit classification and fruit quality recognition. The best classifier was ResNet-18, with an overall accuracy of 99.8%. The study also identified limitations, such as the small size of the dataset, and proposed future work to improve deep learning techniques for fruit quality classification.
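When benchmarking classifiers on a three-class freshness dataset like this, overall accuracy can hide per-class imbalance, so it is worth computing both. A small sketch with hypothetical class labels (the label names below are illustrative, not taken from the dataset):

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Overall accuracy, as reported for the baseline classifiers."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_accuracy(y_true, y_pred):
    """Accuracy split by freshness class, to expose class imbalance."""
    totals, hits = Counter(y_true), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            hits[t] += 1
    return {c: hits[c] / totals[c] for c in totals}

# Toy predictions over three hypothetical freshness classes.
y_true = ["fresh", "fresh", "medium", "rotten", "rotten", "rotten"]
y_pred = ["fresh", "medium", "medium", "rotten", "rotten", "fresh"]
print(accuracy(y_true, y_pred))            # 4/6 correct
print(per_class_accuracy(y_true, y_pred))
```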
For a traditional traffic situational awareness system (TSAS), the "road-side unit (RSU) + cloud-based analysis" structure struggles to meet the demands of rapidly expanding urban areas: the relatively high cost of microwave speed detection modules and the bandwidth requirements of the information systems significantly increase construction costs. With computer vision (CV) and edge computing technologies, traffic situational awareness tasks can instead be integrated into cheaper edge devices (roadside surveillance, RSS), effectively addressing these challenges. In this study, we present a low-cost TSAS built on YOLO v8 and a grey wolf optimizer-long short-term memory (GWO-LSTM) neural network. The proposed system automatically performs vehicle and license plate recognition, speed measurement, and data recording within the field of view of the RSS, and it accurately predicts the future traffic conditions of monitored roads using the recorded information. Experimental results demonstrate that the proposed TSAS achieves a license plate recognition accuracy of 97.7%, a vehicle type recognition accuracy of 98.1%, and a speed measurement error of less than 0.45 km/h, with an R2 of 0.8971 for the GWO-LSTM predictions. The system is sufficiently effective for traffic monitoring and situational awareness tasks, though not for enforcement forensics applications.
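The grey wolf optimizer used to tune the LSTM is a simple population-based metaheuristic: the three best "wolves" (alpha, beta, delta) pull the rest of the pack toward them while an exploration coefficient decays. A minimal sketch, minimizing a test function rather than LSTM hyperparameters (population size, iteration count, and bounds below are arbitrary choices, not the paper's settings):

```python
import numpy as np

def gwo(f, dim, n_wolves=20, iters=200, lo=-5.0, hi=5.0, seed=0):
    """Minimal grey wolf optimizer (GWO) for minimizing f over a box."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, (n_wolves, dim))
    for t in range(iters):
        fit = np.array([f(x) for x in X])
        alpha, beta, delta = X[np.argsort(fit)[:3]]   # three best wolves
        a = 2.0 * (1 - t / iters)                     # decays 2 -> 0
        for i in range(n_wolves):
            cand = []
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                D = np.abs(C * leader - X[i])          # distance to leader
                cand.append(leader - A * D)
            X[i] = np.clip(np.mean(cand, axis=0), lo, hi)
    fit = np.array([f(x) for x in X])
    return X[np.argmin(fit)]

best = gwo(lambda x: np.sum(x ** 2), dim=3)   # sphere function, optimum at 0
print(np.sum(best ** 2))
```

In the paper's setting, f would instead score an LSTM configuration (e.g., via validation loss), with each dimension of a wolf encoding one hyperparameter.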
Subject of study. The study investigated the possibility of using neural network models of second-order visual mechanisms as inputs for neural network classifiers. Second-order visual mechanisms can detect spatial inhomogeneities in the contrast, orientation, and spatial frequency of an image. These mechanisms are traditionally considered one of the stages of early visual processing; their role in the perception of textures has been well studied. Aim of study. The study aimed to investigate whether classifier input modules pretrained to demodulate the spatial modulations of luminance gradients contribute to object and scene classification. Method. Neural network modeling was used as the main method. In the first stage of the study, a set of texture images was generated to train neural network models of second-order visual mechanisms. In the second stage, object and scene samples were prepared, on which classifier networks were trained. Pretrained models of second-order visual mechanisms with fixed weights served as the inputs of these networks. Main results. The second-order information, presented as a map of instantaneous values of the modulation function of contrast, orientation, and spatial frequency, was sufficient for identifying only some of the scene classes. In general, using the values of luminance gradient modulation functions for object classification proved ineffective within the proposed neural network architecture. Thus, the hypothesis that second-order visual filters encode features enabling object identification was not confirmed. This result makes it necessary to test an alternative hypothesis: that the role of second-order filters is limited to constructing saliency maps, the filters acting as windows through which information is received from the first-order filter outputs. Practical significance. The possibility of using second-order models of visual mechanism
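Second-order mechanisms of the kind described are commonly modeled as a filter-rectify-filter cascade: rectifying a fast carrier and then low-pass filtering recovers the slow modulation envelope. A 1-D illustration of that demodulation step (frequencies and kernel size are arbitrary, chosen only so the carrier is much faster than the envelope):

```python
import numpy as np

n = 2048
t = np.arange(n) / n
envelope = 1.0 + 0.5 * np.sin(2 * np.pi * 3 * t)   # slow contrast modulation
carrier = np.sin(2 * np.pi * 200 * t)              # fast luminance carrier
signal = envelope * carrier                        # contrast-modulated input

rectified = np.abs(signal)                         # rectify
kernel = np.ones(64) / 64                          # crude low-pass filter
recovered = np.convolve(rectified, kernel, mode="same")

# The recovered signal tracks the true modulation function up to a
# constant scale factor introduced by rectification.
corr = np.corrcoef(recovered[100:-100], envelope[100:-100])[0, 1]
print(round(corr, 3))
```

A map of such recovered envelope values, computed per location and per carrier orientation/frequency channel, is essentially the "map of instantaneous values of the modulation function" the study fed to its classifiers.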
Image captioning is a relatively recent area at the convergence of computer vision and natural language processing, and is widely used in applications such as multi-modal search, robotics, security, remote sensing, medicine, and visual aid. Image captioning techniques have witnessed a paradigm shift from classical machine-learning-based approaches to contemporary deep-learning-based techniques. In this survey, we present an in-depth investigation of image captioning methodologies using our proposed taxonomy. The study covers several eras of image captioning advancements, including template-based, retrieval-based, and encoder-decoder-based models. We also explore captioning in languages other than English. A thorough review of benchmark image captioning datasets and assessment measures is also provided. The limited effectiveness of real-time image captioning is a severe barrier that prevents its use in sensitive applications such as visual aid, security, and medicine. Another observation from our research is the scarcity of personalized domain datasets, which limits adoption in more advanced settings. Despite influential contributions from several academics, further efforts are required to construct substantially more robust and reliable image captioning models.
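The encoder-decoder models surveyed share one inference-time skeleton: the decoder conditions on the encoded image and the tokens emitted so far, and greedily picks the next word until an end token. A toy sketch of that loop, where a lookup table stands in for a trained decoder (the table, tokens, and caption are all illustrative):

```python
def toy_decoder_step(image_feature, tokens):
    """Stand-in for one decoder step of a trained captioning model:
    maps (image feature, prefix) -> most probable next token.
    This hypothetical transition table is NOT a trained model."""
    table = {
        ("<bos>",): "a",
        ("<bos>", "a"): "dog",
        ("<bos>", "a", "dog"): "running",
        ("<bos>", "a", "dog", "running"): "<eos>",
    }
    return table.get(tuple(tokens), "<eos>")

def greedy_caption(image_feature, max_len=10):
    """Greedy decoding: append the top token until <eos> or max_len."""
    tokens = ["<bos>"]
    while len(tokens) < max_len:
        nxt = toy_decoder_step(image_feature, tokens)
        tokens.append(nxt)
        if nxt == "<eos>":
            break
    return " ".join(tokens[1:-1])   # strip <bos>/<eos>

print(greedy_caption(image_feature=None))   # a dog running
```

Real systems usually replace the greedy choice with beam search, keeping the k best prefixes at each step; the loop structure is otherwise the same.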
In the realm of mechanical machining, tool wear is an unavoidable phenomenon. Monitoring the condition of tool wear is crucial for enhancing machining quality and advancing automation in the manufacturing process. This paper investigates an innovative approach to tool wear monitoring that integrates machine vision with force signal analysis. It relies on a deep residual two-stream convolutional model optimized with the scSE (concurrent spatial and channel squeeze and excitation) attention mechanism (scSE-ResNet-50-TSCNN). The force signals are converted into the corresponding wavelet scale images following wavelet threshold denoising and a continuous wavelet transform. Concurrently, the images undergo processing with contrast-limited adaptive histogram equalization and the structural similarity index method, allowing the selection of the most suitable image inputs. The processed data are then input into the developed scSE-ResNet-50-TSCNN model for precise identification of the tool wear state. To validate the model, the paper employed X850 carbon fibre reinforced polymer and Ti-6Al-4V titanium alloy as laminated experimental materials, conducting a series of tool wear tests while collecting the relevant machining data. The experimental results underscore the model's effectiveness, achieving a recognition accuracy of 93.86%. Compared with alternative models on the identical dataset, the proposed approach performs best, showcasing efficient monitoring capabilities in contrast to single-stream or unoptimized networks. Consequently, it excels in monitoring tool wear status and provides crucial technical support for enhancing machining quality control and advancing intelligent manufacturing.
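The wavelet threshold denoising step applied to the force signals typically uses soft thresholding: wavelet coefficients are shrunk toward zero by a threshold, zeroing the small (mostly noise) coefficients while preserving the large ones. A minimal sketch of that shrinkage rule (the threshold value here is arbitrary; in practice it is often derived from the noise estimate, e.g., a universal threshold):

```python
import numpy as np

def soft_threshold(coeffs, t):
    """Soft thresholding: shrink each coefficient toward zero by t,
    mapping anything with magnitude below t to exactly zero."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

c = np.array([-3.0, -0.2, 0.05, 0.4, 2.5])   # toy wavelet coefficients
print(soft_threshold(c, 0.3))                 # [-2.7  0.  0.  0.1  2.2]
```

In the paper's pipeline this would be applied to the detail coefficients before the continuous wavelet transform produces the scale images fed to the network.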
Traditional machine learning, mainly supervised learning, follows the assumptions of closed-world learning, i.e., for each testing class, a training class is available. Such machine learning models fail to identify classes that were not available at training time, which can be referred to as unseen classes. Open-world machine learning (OWML) is a novel technique that deals with unseen classes. Although OWML has been around for a few years and many significant research works have been carried out in this domain, there is no comprehensive survey of the characteristics, applications, and impact of OWML on the major research areas. In this article, we aim to capture the different dimensions of OWML with respect to traditional machine learning models. We have thoroughly analyzed the existing literature and provide a novel taxonomy of OWML considering its two major application domains: computer vision and natural language processing. We list the available software packages and open datasets in OWML for future researchers. Finally, the article concludes with a set of research gaps, open challenges, and future directions.
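The simplest open-world rejection strategy, and a common baseline in this literature, is to threshold the classifier's confidence: if the top softmax score falls below a cutoff, the input is labeled as belonging to an unseen class rather than forced into a known one. A sketch (the class names, logits, and threshold are illustrative):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())        # subtract max for numerical stability
    return e / e.sum()

def open_world_predict(logits, classes, threshold=0.7):
    """Reject-at-threshold rule: low top-class confidence -> 'unknown'."""
    p = softmax(np.asarray(logits, dtype=float))
    i = int(np.argmax(p))
    return classes[i] if p[i] >= threshold else "unknown"

classes = ["cat", "dog", "car"]
print(open_world_predict([4.0, 0.1, 0.2], classes))   # confident -> cat
print(open_world_predict([1.0, 0.9, 1.1], classes))   # uncertain -> unknown
```

More sophisticated OWML methods replace this flat threshold with calibrated open-set scores or distance-based rejection, but the decision structure (classify or reject) is the same.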
Low-light image enhancement is highly desirable for outdoor image processing and computer vision applications. Research conducted in recent years has shown that images taken in low-light conditions often pose two main...
Models based on the transformer architecture have seen widespread application across fields such as natural language processing (NLP), computer vision, and robotics, with large language models (LLMs) like ChatGPT revolutionizing machine understanding of human language and demonstrating impressive memory and reproduction capabilities. Traditional machine learning algorithms struggle with catastrophic forgetting, which is detrimental to the diverse and generalized abilities required for robotic deployment. This article investigates the Receptance Weighted Key Value (RWKV) framework, known for efficient and effective sequence modeling, and its integration with the decision transformer (DT) and experience replay architectures, focusing on potential performance gains in sequential decision-making and lifelong robotic learning tasks. We introduce the Decision-RWKV (DRWKV) model and conduct extensive experiments using the D4RL database within the OpenAI Gym environment and on the D'Claw platform to assess the DRWKV model's performance in single-task tests and lifelong learning scenarios, showing its ability to handle multiple subtasks efficiently. The code for all algorithms, training, and image rendering in this study is available online (open source).
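The experience replay component mentioned here is conceptually simple: past transitions are stored in a bounded buffer and re-sampled during training so that earlier tasks keep contributing gradients, mitigating catastrophic forgetting. A minimal stdlib sketch (capacity, transition format, and batch size are illustrative, not the paper's settings):

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal experience replay buffer: keeps the most recent
    transitions and re-samples them uniformly for training."""
    def __init__(self, capacity, seed=0):
        self.buf = deque(maxlen=capacity)   # oldest entries fall out
        self.rng = random.Random(seed)

    def add(self, transition):
        self.buf.append(transition)

    def sample(self, batch_size):
        return self.rng.sample(list(self.buf), batch_size)

buf = ReplayBuffer(capacity=100)
for step in range(250):
    buf.add(("state", "action", float(step)))   # toy transitions
print(len(buf.buf))                 # capped at 100 (steps 150..249 remain)
batch = buf.sample(8)
print(len(batch))                   # 8
```

In lifelong learning setups, sampling is often stratified across tasks instead of uniform, so old tasks are not crowded out as the buffer fills with new experience.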
In the current subsea industry, autonomous underwater vehicles (AUVs) are widely used for expeditions and exploration. However, mission duration is limited by battery capacity. To increase endurance, a submerged docking station (DS) is needed to charge the battery and upload the next mission profile. In this letter, deep learning (DL)-aided short-range vision guidance is envisaged for reliable and precise AUV homing. Intelligent control algorithms with efficient DL-based you only look once (YOLO) v5 image processing techniques are used for DS detection and tracking, deployed on an edge computer integrated into an AUV prototype. The developed illuminated DS and the AUV prototype with a high-definition camera were demonstrated in a test tank at a depth of 2 m. The DS dataset comprised 132 images of clear and turbid water, of which 79 were used for training, 40 for validation, and 13 for testing. The results show that, in less-turbid waters, the probability of detecting the DS is 95% at a detection range of 5 m, and the probability of homing toward the DS is CEP 90 with a position error of 5%; in highly turbid waters, the probability of DS detection is 60%, with a position error of up to 25% and a detectable range of 1 m. The proposed embedded hardware is extremely useful for reliable underwater homing applications.
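The CEP 90 figure quoted for homing is a circular error probable: the radius of the circle, centered on the target, that contains 90% of the 2-D position errors. A short sketch of how it is computed from logged errors (the simulated error distribution below is illustrative, not the letter's data):

```python
import numpy as np

def cep(errors_xy, percentile=90):
    """Circular error probable: radius containing `percentile` percent
    of the 2-D position errors around the target."""
    r = np.linalg.norm(np.asarray(errors_xy, dtype=float), axis=1)
    return float(np.percentile(r, percentile))

rng = np.random.default_rng(1)
errors = rng.normal(0.0, 0.1, size=(500, 2))   # simulated homing errors (m)
print(round(cep(errors, 90), 3))
```

For isotropic Gaussian errors of standard deviation sigma, CEP 90 is about 2.15 sigma, which is a quick sanity check on logged trial data.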