Search Results - Inner Mongolia University Library

4th IEEE International Conference on Mobile Networks and Wireless Communications, ICMNWC 2024

Authors: Mahantesh, K. Navyashree, K.S. Nairy, Devika S. Asha, R. Anshitha, B. Bengaluru India SJB Institute of Technology Visvesvaraya Technological University Department of ECE Bengaluru India

ISBN: (print) 9798350352931

Image categorization is a fundamental task in computer vision, with applications in domains such as object recognition, medical imaging, and autonomous systems. Traditional approaches frequently fail to balance accuracy, computing efficiency, and scalability, particularly when dealing with big and complex datasets. This work presents a novel picture classification strategy that combines the Bag of Features (BoF) model with Support Vector Machines (SVM). The BoF model describes images by extracting local visual characteristics (such as SIFT, SURF, or ORB) from image patches and quantizing them into visual words to create a histogram representation. SVM, a powerful machine learning classifier, is used to classify these histograms, utilizing its capacity to handle high-dimensional, sparse data. Experiments using common image classification datasets show that the BoF-SVM system greatly outperforms previous methods, resulting in higher classification accuracy and lower processing costs. Furthermore, it has superior generalization to previously unseen data and is more resistant to noise and picture changes. The suggested BoF-SVM system produces promising results for boosting both accuracy and efficiency in image classification tasks, with room for further optimization in more complicated and diversified applications. © 2024 IEEE.

Keywords: Image classification
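The quantization step that the abstract above describes (local descriptors mapped to visual words, then counted into a histogram) can be sketched in a few lines. This is only an illustration under assumptions: the codebook is given rather than learned, the descriptors are toy 2-D points instead of real SIFT/SURF/ORB vectors, and the names `nearest_word` and `bof_histogram` are hypothetical, not taken from the paper. The resulting histogram is what a classifier such as an SVM would receive as input.

```python
import math

def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(descriptor, codebook[i]))

def bof_histogram(descriptors, codebook):
    """Quantize descriptors into visual words and return an L1-normalized histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)

# Assumed toy data: three visual words and four local descriptors from one image.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]
histogram = bof_histogram(descriptors, codebook)
```

In practice the codebook is usually obtained by clustering descriptors from a training set; that step is omitted here.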

Early Vision on the Focal-Plane with High Dynamic Range Pixels

International Workshop on Compressed Sensing Theory and its applications to Radar, Sonar and Remote Sensing (CoSeRa)

Authors: Marko Jaklin D. García-Lesta P. López V.M. Brea Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS) Universidade de Santiago de Compostela Santiago de Compostela Spain

ISBN: (digital) 9798350365504

ISBN: (print) 9798350365511

This paper introduces a high dynamic range pixel for early vision processing. Early vision is the first stage to subsequently extract semantic information for image processing or video analytics. This paper proposes to bring said processing to the focal plane, next to a high dynamic range image sensor working on the principle of lateral overflow capacitor. This brings the benefits of processing scenes with a wide dynamic range in a power efficient manner. Circuit simulations for edge detection, as an example of early vision processing conveyed in this paper, show that our proposal meets the accuracy typically found in applications like machine vision. Simulations are in XFAB’s XS018 technology.

Keywords: Image sensors; Accuracy; Power demand; Image edge detection; Visual analytics; Multimodal sensors; Semantics; Radar imaging; High dynamic range; Proposals

Robust Approach to Vehicle Detection and Counting with Centroid-based Tracking

5th International Conference on IoT Based Control Networks and Intelligent Systems, ICICNIS 2024

Authors: Parthasarathy, S. Kumar, V.G. Kishore Raj, P. Naveen Brintha, R. Mohana Miruna Joe Amali, S. Thangasankaran, R. K.L.N. College of Engineering Department of EEE Sivagangai India K.L.N. College of Engineering Department of CSE Sivagangai India K.L.N College of Engineering Sivagangai India

ISBN: (print) 9798331518097

This study proposes a robust computer vision-based system for autonomous vehicle identification and tracking, utilizing OpenCV with Python for real-time image processing. To precisely identify cars and bikes, the system examines both individual video frames and the motion between successive frames. To separate moving vehicles, important image processing methods like segmentation, filtering, binarization, and background subtraction are used. In order to facilitate in-depth traffic analysis, the system divides vehicles into two categories: motorbikes and light vehicles. For real-time applications including toll collection, highway surveillance, and traffic monitoring, these methods enable precise vehicle classification, speed prediction, counting, and tracking. The system is perfect for urban traffic control and planning because of its non-intrusive design and strong image processing capabilities. The suggested system's flexible and scalable design provides a workable answer for real-time vehicle monitoring and identification in civilian applications, enhancing traffic safety, streamlining traffic management, and promoting more intelligent urban planning. © 2024 IEEE.

Keywords: Kalman filters
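The record above names centroid-based tracking and counting but does not give the procedure. A minimal sketch of the general idea follows, assuming that vehicle centroids have already been extracted per frame (for example from background-subtracted blobs), that matching is greedy nearest-neighbour within `max_distance`, and that an object is counted when its centroid crosses a horizontal line moving downward. The class `CentroidTracker`, the function `count_crossings`, and all parameter values are hypothetical, not the authors' implementation.

```python
import math

class CentroidTracker:
    """Greedy nearest-centroid tracker; unmatched objects are dropped immediately."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance
        self.next_id = 0
        self.objects = {}  # object id -> last known (x, y)

    def update(self, centroids):
        unmatched = dict(self.objects)
        updated = {}
        for c in centroids:
            best_id, best_dist = None, self.max_distance
            for oid, prev in unmatched.items():
                d = math.dist(c, prev)
                if d <= best_dist:
                    best_id, best_dist = oid, d
            if best_id is None:          # no close previous object: register a new one
                best_id = self.next_id
                self.next_id += 1
            else:                        # reuse the matched id and mark it as taken
                del unmatched[best_id]
            updated[best_id] = c
        self.objects = updated
        return updated

def count_crossings(frames, line_y, tracker):
    """Count distinct tracked ids whose centroid moves from above line_y to on or below it."""
    counted, last_y = set(), {}
    for centroids in frames:
        for oid, (_, y) in tracker.update(centroids).items():
            if oid in last_y and last_y[oid] < line_y <= y:
                counted.add(oid)
            last_y[oid] = y
    return len(counted)

# Assumed toy input: two vehicles moving down the image over three frames.
frames = [
    [(10, 10), (100, 20)],
    [(12, 30), (101, 45)],
    [(14, 55), (102, 70)],
]
total = count_crossings(frames, line_y=50, tracker=CentroidTracker())
```

A production tracker would normally tolerate a few missed detections before dropping an object; that is left out to keep the sketch short.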

Spatial Quality Assessment of Pansharpened Images Based on Gray Level Co-Occurrence Matrix

12th Iranian/2nd International Conference on Machine Vision and Image Processing, MVIP 2022

Authors: Aghapour Maleki, Shiva Ghassemian, Hassan Tarbiat Modares University Image Processing and Information Analysis Laboratory Faculty of Electrical and Computer Engineering Tehran Iran

ISBN: (print) 9781665412162

Assessing the quality of pansharpened images is a critical issue in order to obtain a quantitative score to represent the quality and compare the performance of different fusion methods. Most of the introduced metrics for pansharpened image quality assessment evaluate the spectral content of the image, while in different applications of remote sensing, like detection and identification of image objects, spatial quality has an important role. In the current study, a new index for spatial quality assessment is introduced that extracts the gray level co-occurrence matrix (GLCM) from distorted and reference images and compares the similarities of these features. The Tampere Image Database 2013 (TID2013), which provides reference images and different types of distorted images with subjective scores for each image, is used as the desired database. To solve the high computational complexity of obtaining GLCM features, the fast GLCM method is employed. In this way, 16 different features are extracted. To select the features that have the most consistency with the human visual system (HVS), the forward floating search method is used as a feature selection method and five features are obtained as the final features to form the desired index. Experimental results show the efficiency of the proposed method in determining the spatial quality of fused images compared with that of the available quality assessment metrics. © 2022 IEEE.

Keywords: Remote sensing
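To make the GLCM idea in the abstract above concrete, the sketch below builds a normalized, non-symmetric co-occurrence matrix for one pixel offset and compares a single standard GLCM feature (contrast) between a reference and a distorted image. It is a plain, unoptimized computation, not the fast GLCM method the paper uses, and the abstract does not say which five features form the final index, so contrast is chosen here purely as an assumed example. The toy images and function names are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence counts of gray levels (i, j) at pixel offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, matrix))
    return [[v / total for v in row] for row in matrix]

def contrast(p):
    """GLCM contrast: co-occurrence probabilities weighted by squared level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

# Assumed toy images with 4 gray levels: a textured reference and a flat "distorted" copy.
reference = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
distorted = [[1] * 4 for _ in range(4)]

ref_contrast = contrast(glcm(reference, levels=4))
dist_contrast = contrast(glcm(distorted, levels=4))
contrast_gap = abs(ref_contrast - dist_contrast)
```

A larger gap indicates that the distorted image has lost more of the reference's local gray-level variation along the chosen offset.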

Plant Diseases Detection Using Deep Learning and Machine Vision

3rd IEEE International Conference on Intelligent Techniques in Control, Optimization and Signal Processing, INCOS 2024

Authors: Singh, Nidhi School of Computer Science and Engineering Vellore Institute of Technology Tamil Nadu Vellore 632014 India

ISBN: (print) 9798350361186

This research delves into deep learning and machine vision applications for plant leaf disease detection in agricultural settings, focusing on farm village datasets. Utilizing a blend of authentic farm village data and synthetic data from Generative Adversarial Networks (GANs), three advanced convolutional neural network (CNN) models, VGG16, ResNet50, and InceptionNet V3, are employed with transfer learning. Leveraging transfer learning enhances model performance through fine-tuning pre-trained networks. The study systematically evaluates models based on key metrics like accuracy, precision, recall, and F1 score. Results showcase the methodology's robustness, with ResNet50 emerging as the leading performer at 83.23%, contributing to precision agriculture's advancements with promising implications for sustainable farming and crop yield optimization. © 2024 IEEE.

Keywords: Generative adversarial networks

Scaling Vision-Language Models with Sparse Mixture of Experts

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Authors: Shen, Sheng Yao, Zhewei Li, Chunyuan Darrell, Trevor Keutzer, Kurt He, Yuxiong Univ Calif Berkeley Berkeley CA 94720 USA Microsoft Corp Redmond WA 98052 USA

ISBN: (print) 9798891760615

The field of natural language processing (NLP) has made significant strides in recent years, particularly in the development of large-scale vision-language models (VLMs). These models aim to bridge the gap between text and visual information, enabling a more comprehensive understanding of multimedia data. However, as these models become larger and more complex, they also become more challenging to train and deploy. One approach to addressing this challenge is the use of sparsely-gated mixture-of-experts (MoE) techniques, which divide the model into smaller, specialized submodels that can jointly solve a task. In this paper, we explore the effectiveness of MoE in scaling vision-language models, demonstrating its potential to achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost. Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute and performance when scaling VLMs. We hope our work will inspire further research into the use of MoE for scaling large-scale vision-language models and other multimodal machine learning applications.

Keywords: Benchmarking
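The sparsely-gated routing mentioned in the abstract above can be illustrated with a toy top-k gate: only the k experts with the highest gate scores are evaluated, and their outputs are mixed with softmax weights computed over those k scores. This is a generic sketch, not the paper's architecture. In a real model the gate logits come from a learned router applied to each token, whereas here they are fixed numbers, and the scalar "experts" and the function name `sparse_moe` are assumptions made for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sparse_moe(x, gate_logits, experts, k=2):
    """Evaluate only the top-k experts by gate logit and mix their outputs."""
    top = sorted(range(len(experts)), key=lambda i: gate_logits[i], reverse=True)[:k]
    weights = softmax([gate_logits[i] for i in top])
    output = sum(w * experts[i](x) for w, i in zip(weights, top))
    return output, top

# Assumed toy experts operating on a scalar input.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
output, chosen = sparse_moe(3.0, [0.1, 2.0, 2.0, -1.0], experts, k=2)
```

Because only k of the experts run per input, compute per input stays roughly constant as more experts are added, which is the property that makes the comparison against dense models of equal cost meaningful.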

Design of Vision-guided Gripping System for 6DOF Robots Combined with Dexterous Hands

7th International Conference on Robotics, Control and Automation Engineering, RCAE 2024

Authors: Wang, Chengwen Wan, Guoyang Li, Hanqi Li, Xuna Zheng, Da Teng, Mingyao Anhui University of Engineering Dept. School of Electrical Engineering Wuhu China

ISBN: (print) 9798350355642

For robot application systems that incorporate a dexterous hand, a vision-based robot grasping system is proposed to address the dexterous hand's lack of robustness when grasping objects in a fixed attitude. First, a 6DOF robot grasping system based on machine vision is constructed using a dexterous hand, a depth camera, and a 6DOF collaborative robot, which realizes accurate grasping under vision guidance. Second, to solve the problem of the vision system's poor localization accuracy due to the loss of image information and features caused by image noise, occlusion, and complex backgrounds during image processing, a pooling layer and an attention mechanism are introduced to enhance the feature extraction ability. Moreover, an optimized dexterous hand grasping strategy is proposed through exhaustive grasping action design and analysis, which effectively improves the robustness of the system. The experimental results show that the accuracy of the target detection model reaches 87% through the localization measurement of the experimental objects, which is 2.1% higher than the original method, and the grasping success rate of the robotic system equipped with dexterous hand and depth camera is improved by 3.5%. These results validate the feasibility of the robotic grasping system incorporating dexterous hands in practical applications and significantly enhance the robustness of the system. © 2024 IEEE.

Keywords: Collaborative robots

Automated Detection of Offensive Images and Sarcastic Memes in Social Media Through NLP

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, Vol. 15, No. 7, pp. 1415-1425

Authors: Purnima, Tummala Rao, Ch Koteswara VIT AP Univ Sch Comp Sci Near Vijayawada Amaravati 522237 Andhra Pradesh India

In this digital era, social media is one of the key platforms for collecting customer feedback and reflecting their views on various aspects, including products, services, brands, events, and other topics of interest. However, there is a rise of sarcastic memes on social media, which often convey contrary meaning to the implied sentiments and challenge traditional machine learning identification techniques. The memes, blending text and visuals on social media, are difficult to discern solely from the captions or images, as their humor often relies on subtle contextual cues requiring a nuanced understanding for accurate interpretation. Our study introduces Offensive Images and Sarcastic Memes Detection to address this problem. Our model employs various techniques to identify sarcastic memes and offensive images. The model uses Optical Character Recognition (OCR) and bidirectional long-short term memory (Bi-LSTM) for sarcastic meme detection. For offensive image detection, the model employs Autoencoder LSTM, deep learning models such as DenseNet and MobileNet, and computer vision techniques like Feature Fusion Process (FFP) based on Transfer Learning (TL) with Image Augmentation. The study showcases the effectiveness of the proposed methods in achieving high accuracy in detecting offensive content across different modalities, such as text, memes, and images. Based on tests conducted on real-world datasets, our model has demonstrated an accuracy rate of 92% on the Hateful Memes Challenge dataset. The proposed methodology has also achieved a Testing Accuracy (TA) of 95.7% for DenseNet with transfer learning on the NPDI dataset and 95.12% on the Pornography dataset. Moreover, implementing Transfer Learning with a Feature Fusion Process (FFP) has resulted in a TA of 99.45% for the NPDI dataset and 98.5% for the Pornography dataset.

Keywords: Deep learning; natural language processing; offensive images; sarcastic memes; toxic content detection

Optimal Z-axis Find Algorithm in Ellipsometry Semiconductor Process based on Local Search using Machine Vision

18th International Conference on Future Networks and Communications (FNC) / 20th International Conference on Mobile Systems and Pervasive Computing (MobiSPC) / 13th International Conference on Sustainable Energy Information Technology (SEIT)

Authors: Lee, Jaehyeong Kim, Taeyong Ryu, Sehyeon Ahn, Jungeun Kim, Sungjun Jeong, Jongpil Sungkyunkwan Univ Dept Smart Factory Convergence Suwon 440746 South Korea AIM AI Res Lab Hanam South Korea

With the latest methods, semiconductor wafers are sliced very thin for manufacturing efficiency, and a manufacturing process that stacks various thin films is used. To measure such thin films during the semiconductor manufacturing process, an ellipsometer, a non-destructive optical device, is used. The ellipsometer analyzes the thin film by checking the change in the polarization state of the incident light after the light irradiated onto the wafer surface is reflected from the incident surface. However, thinly sliced wafers are often bent during the manufacturing process, so at industrial sites it has been difficult to measure the thin film efficiently while maintaining an accurate optical state. Accordingly, this study analyzed data based on machine vision images and compared algorithms that efficiently enable precise measurement on bent wafers by using those images and changing the Z axis. Thus, we propose a focus-optimizing algorithm based on machine vision image processing and evaluate the data and features that support it, and we release the data sets and algorithm code that can demonstrate this process in a GitHub repository(1). In addition, the efficiency of these algorithms was interpreted through simulation figures, and through this, an optical system capable of precise measurement that applies a method of efficiently moving the Z-axis is proposed. (c) 2023 The Authors. Published by Elsevier B.V.

Keywords: Wafer stage; Semiconductor; Ellipsometer; Machine vision; Auto-Focusing
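The abstract above describes searching for the Z position that gives the best focus but does not spell out the compared algorithms. As one plausible reading of "local search", the sketch below runs a simple hill climb with step halving over a focus score. Everything here is an assumption: the function `local_search_z`, the Z range, the step sizes, and the synthetic `focus_score` (a real system would compute a sharpness measure from a camera image captured at each Z position).

```python
def local_search_z(focus_score, z_start, step=1.0, min_step=0.01, z_min=0.0, z_max=10.0):
    """Hill-climb along Z: move while the score improves, halve the step when stuck."""
    z = z_start
    best = focus_score(z)
    while step >= min_step:
        moved = False
        for candidate in (z + step, z - step):
            if z_min <= candidate <= z_max:
                score = focus_score(candidate)
                if score > best:
                    z, best, moved = candidate, score, True
                    break
        if not moved:
            step /= 2  # no better neighbour at this resolution: refine the step
    return z

# Assumed synthetic focus curve with a single peak at z = 3.3 (arbitrary units).
def focus_score(z):
    return -(z - 3.3) ** 2

best_z = local_search_z(focus_score, z_start=0.0)
```

This kind of search needs far fewer stage moves than scanning the whole Z range, but it assumes the focus curve has a single peak near the starting point.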

Investigation into the Effect of Batch Size on Batch Normalisation during Inference for Image Colourisation

26th Irish Machine Vision and Image Processing Conference, IMVIP 2024

Authors: Armstrong, David McLaughlin, Niall Wang, Hui Cirdan United Kingdom Queen's University Belfast United Kingdom

ISBN: (print) 9781837242672

The Pix2Pix architecture is widely used for image colourisation. This is the problem of transforming a greyscale image into a realistic colour image. However, the canonical Pix2Pix colourisation model uses batch normalisation during inference, which makes the model output dependent on the other images in the inference batch, and leads to excessive colourfulness in its output. In this work, we analyse the effect of small batch sizes on the colourfulness of the Pix2Pix model output. We propose a method for measuring image colourfulness, allowing us to study the colourisation problem quantitatively. We then propose a method for correcting the output of the batch normalisation layers of the Pix2Pix colourisation model. This reduces its dependence on batch size and enables inference of realistic colour images at small batch sizes. © This is an open access article published by the IET under the Creative Commons Attribution License (http://***/licenses/by/3.0/)

Keywords: III-V semiconductors
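The batch dependence described in the abstract above is easy to reproduce on scalars: when normalisation uses statistics computed from the current batch, the same input is mapped to different outputs depending on its batch-mates. The sketch below shows this and contrasts it with normalising against fixed statistics, which is the usual inference-time behaviour in most frameworks. It is not the correction method the paper proposes (the abstract does not describe it), and the function name and all numbers are assumed for illustration.

```python
import math

def batch_norm(batch, eps=1e-5, fixed_mean=None, fixed_var=None):
    """Normalise a list of scalars with batch statistics, or with fixed ones if given."""
    if fixed_mean is None or fixed_var is None:
        mean = sum(batch) / len(batch)
        var = sum((x - mean) ** 2 for x in batch) / len(batch)
    else:
        mean, var = fixed_mean, fixed_var
    return [(x - mean) / math.sqrt(var + eps) for x in batch]

# The same activation (0.5) placed in two different inference batches.
with_dark_neighbour = batch_norm([0.5, 0.1])[0]
with_bright_neighbour = batch_norm([0.5, 0.9])[0]

# With fixed statistics (assumed values), the output no longer depends on the batch.
fixed_a = batch_norm([0.5, 0.1], fixed_mean=0.5, fixed_var=0.04)[0]
fixed_b = batch_norm([0.5, 0.9], fixed_mean=0.5, fixed_var=0.04)[0]
```

With batch statistics the two results have opposite signs, while the fixed-statistics results are identical, which is why small inference batches can shift a model's output so strongly.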
