Search Results - Inner Mongolia University Library

RASHT: A Partially Reconfigurable Architecture for Efficient Implementation of CNNs

IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2022, Vol. 30, No. 7, pp. 860-868

Authors: Darbani, Paria; Rohbani, Nezam; Beitollahi, Hakem; Lotfi-Kamran, Pejman. Affiliations: Iran Univ Sci & Technol, Sch Comp Engn, Tehran *** Iran; Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran 193955531, Iran

Convolutional neural networks (CNNs) are widely used in machine learning (ML) applications such as image processing. CNNs require heavy computation to achieve high accuracy on many ML tasks, so implementing them efficiently, improving performance with limited resources and without reducing accuracy, is a challenge for ML systems. One architecture for efficient CNN execution is the array-based accelerator, which consists of an array of similar processing elements (PEs). Array accelerators are popular high-performance architectures that exploit parallel computing and data reuse. These accelerators are optimized for a set of CNN layers, not for individual layers. Using the same accelerator dimensions to compute all CNN layers, whose shapes and sizes vary, leads to resource underutilization. We propose a flexible and scalable architecture for array-based accelerators that increases resource utilization by resizing PEs to better match the different shapes of CNN layers. The low-cost partial reconfiguration improves resource utilization and performance, resulting in a 23.2% reduction in the computation time of GoogLeNet compared to state-of-the-art accelerators. The proposed architecture decreases the on-chip memory access rate by 26.5% with no accuracy loss.

Keywords: Computer architecture; Convolutional neural networks; Arrays; Resource management; System-on-chip; Computational modeling; Very large scale integration; Array accelerator; convolutional neural network (CNN); image processing and computer vision; machine learning (ML); reconfigurable hardware
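Note on the underutilization problem described in the abstract above: the effect can be illustrated with a rough tiling model. This is only a sketch under stated assumptions, not the RASHT mapping. The function names, the tile-based cost model, and the example shapes are all hypothetical.

```python
import math

def array_utilization(layer_rows, layer_cols, pe_rows, pe_cols):
    """Fraction of PE slots doing useful work when a layer_rows x layer_cols
    workload is tiled onto a pe_rows x pe_cols array.
    Assumption: each tile occupies the whole array for one pass."""
    tiles = math.ceil(layer_rows / pe_rows) * math.ceil(layer_cols / pe_cols)
    return (layer_rows * layer_cols) / (tiles * pe_rows * pe_cols)

def best_shape(layer_rows, layer_cols, shapes):
    """Pick the candidate array shape with the highest utilization
    (a stand-in for choosing a configuration per layer)."""
    return max(shapes, key=lambda s: array_utilization(layer_rows, layer_cols, *s))

# A 24 x 8 workload wastes most of a fixed 16 x 16 array,
# while a 32 x 8 arrangement of the same 256 PEs fits it better.
candidates = [(16, 16), (8, 32), (32, 8)]
print(array_utilization(24, 8, 16, 16))
print(best_shape(24, 8, candidates))
```

In this toy model the same number of PEs gives different utilization depending on how the array is shaped, which is the mismatch between fixed array dimensions and varying layer shapes that the abstract points to.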

Optimizing Image Classification Using Bag of Features and Support Vector Machines

4th IEEE International Conference on Mobile Networks and Wireless Communications, ICMNWC 2024

Authors: Mahantesh, K.; Navyashree, K.S.; Nairy, Devika S.; Asha, R.; Anshitha, B. Affiliations: Bengaluru, India; SJB Institute of Technology, Visvesvaraya Technological University, Department of ECE, Bengaluru, India

ISBN: (print) 9798350352931

Image categorization is a fundamental task in computer vision, with applications in domains such as object recognition, medical imaging, and autonomous systems. Traditional approaches frequently fail to balance accuracy, computing efficiency, and scalability, particularly when dealing with large and complex datasets. This work presents a novel image classification strategy that combines the Bag of Features (BoF) model with Support Vector Machines (SVM). The BoF model describes images by extracting local visual features (such as SIFT, SURF, or ORB) from image patches and quantizing them into visual words to create a histogram representation. SVM, a powerful machine learning classifier, is used to classify these histograms, exploiting its capacity to handle high-dimensional, sparse data. Experiments using common image classification datasets show that the BoF-SVM system greatly outperforms previous methods, resulting in higher classification accuracy and lower processing costs. Furthermore, it generalizes better to previously unseen data and is more resistant to noise and image changes. The suggested BoF-SVM system produces promising results for boosting both accuracy and efficiency in image classification tasks, with room for further optimization in more complicated and diversified applications. © 2024 IEEE.

Keywords: image classification
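Note on the BoF representation described in the abstract above: the quantization step (local descriptors mapped to visual words, then counted into a histogram) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline. It assumes a codebook is already available (for example from clustering), uses toy 2-D descriptors instead of SIFT/SURF/ORB vectors, and leaves out the SVM training step; all names are hypothetical.

```python
def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor
    (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sqdist(descriptor, codebook[i]))

def bof_histogram(descriptors, codebook):
    """Normalized histogram of visual-word occurrences for one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    # An image with no descriptors maps to an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)

# Toy example: a two-word codebook and four local descriptors.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 1.0), (0.0, 0.0), (2.0, 1.0), (9.0, 9.0)]
print(bof_histogram(descriptors, codebook))
```

The resulting fixed-length histogram is what a classifier such as an SVM would take as input, regardless of how many local descriptors each image produced.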

Spatial Quality Assessment of Pansharpened Images Based on Gray Level Co-Occurrence Matrix

12th Iranian/2nd International Conference on Machine Vision and Image Processing, MVIP 2022

Authors: Aghapour Maleki, Shiva; Ghassemian, Hassan. Affiliation: Tarbiat Modares University, Image Processing and Information Analysis Laboratory, Faculty of Electrical and Computer Engineering, Tehran, Iran

ISBN: (print) 9781665412162

Assessing the quality of pansharpened images is critical for obtaining a quantitative score that represents quality and allows the performance of different fusion methods to be compared. Most metrics introduced for pansharpened image quality assessment evaluate the spectral content of the image, while in remote sensing applications such as detection and identification of image objects, spatial quality plays an important role. In the current study, a new index for spatial quality assessment is introduced that extracts gray level co-occurrence matrix (GLCM) features from distorted and reference images and compares the similarity of these features. The Tampere Image Database 2013 (TID2013), which provides reference images and different types of distorted images with subjective scores for each image, is used as the database. To address the high computational complexity of obtaining GLCM features, the fast GLCM method is employed. In this way, 16 different features are extracted. To select the features most consistent with the human visual system (HVS), the forward floating search method is used for feature selection, and five features are retained to form the final index. Experimental results show the efficiency of the proposed method in determining the spatial quality of fused images compared with available quality assessment metrics. © 2022 IEEE.

Keywords: Remote sensing
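Note on the GLCM mentioned in the abstract above: a co-occurrence matrix counts how often gray level i appears next to gray level j at a fixed pixel offset, and texture features are then computed from those counts. The sketch below is a plain, unoptimized illustration and not the fast GLCM method or the 16-feature set used in the paper. The function names are hypothetical, and contrast is shown only as one commonly used GLCM feature.

```python
def glcm(image, levels, dx=1, dy=0):
    """Co-occurrence counts of gray levels (i, j) for pixel pairs
    separated by the offset (dx, dy). `image` is a list of rows of
    integers in the range [0, levels)."""
    m = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[image[r][c]][image[r2][c2]] += 1
    return m

def contrast(m):
    """GLCM contrast: sum of (i - j)^2 weighted by the normalized counts."""
    total = sum(sum(row) for row in m)
    return sum((i - j) ** 2 * v
               for i, row in enumerate(m)
               for j, v in enumerate(row)) / total

# Toy 2 x 3 image with three gray levels, horizontal neighbor offset.
img = [[0, 0, 1],
       [1, 2, 2]]
m = glcm(img, levels=3)
print(m)
print(contrast(m))
```

Comparing such features between a reference image and a distorted image is the general idea behind a GLCM-based similarity index.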

Impact of Hybrid [CPU-GPU] Architecture on Machine Learning-based Image-to-Image Translation Using HiDT

2024 International Conference on Knowledge Engineering and Communication Systems, ICKECS 2024

Authors: Kantharaju, V.; Chandrashekhar, B.N.; Niranjanamurthy, M.; Murthy, S.V.N. Affiliations: BMS Institute of Technology and Management, Department of AI & ML, Bengaluru, India; Amity University, Amity School of Engineering and Technology, Department of CSE, Bangalore, India; S J C Institute of Technology, Department of CSE, Karnataka, Chikkaballapur, India

ISBN: (digital) 9798350359688

ISBN: (print) 9798350359688

Image-to-image translation is the process of transforming an image from one domain to another, where the goal is to learn the mapping between an input image and an output image. This task has generally been performed using a training set of aligned image pairs on CPU-based architectures with fewer cores, which aim to transfer images from a source domain to a target domain while preserving the content representations, at the cost of longer execution time. Due to its broad range of applications in computer vision and image processing problems, including image synthesis, segmentation, style transfer, restoration, and pose estimation, GPU-based image-to-image translation has attracted growing attention and made enormous progress in recent years. It can be used for a variety of purposes, including photo enhancement, object transformation, season transfer, and collection style transfer. With CPU-only or GPU-only architectures it is difficult to speed up the image processing task, especially when re-rendering the same scene under the various illuminations characteristic of day, night, or dawn. To address this issue, we propose a hybrid CPU-GPU architecture with HiDT technology for performing image translation at high speed. On the hybrid CPU-GPU architecture, it is possible to train a multi-domain image-to-image translation model with HiDT on datasets of variable size consisting of unaligned images without domain labels, when the technology is integrated into an application. The speed of this application can be achieved by using emerging technologies such as pix2pixHD and HiDT on the hybrid architecture, where pix2pixHD is a deep learning-based technique for high-resolution photorealistic image-to-image translation implemented in PyTorch. This article presents the impact of hybrid architecture on machine learning-based image-to-image translation using HiDT. © 2024 IEEE.

Keywords: Training; Knowledge engineering; Image segmentation; Image resolution; Image synthesis; Pose estimation; Lighting

Design of Vision-guided Gripping System for 6DOF Robots Combined with Dexterous Hands

7th International Conference on Robotics, Control and Automation Engineering, RCAE 2024

Authors: Wang, Chengwen; Wan, Guoyang; Li, Hanqi; Li, Xuna; Zheng, Da; Teng, Mingyao. Affiliation: Anhui University of Engineering, Dept. School of Electrical Engineering, Wuhu, China

ISBN: (print) 9798350355642

For robot application systems incorporating a dexterous hand, a vision-based robot grasping system is proposed to address the dexterous hand's lack of robustness when grasping fixed-attitude objects. First, a 6DOF robot grasping system based on machine vision is constructed using a dexterous hand, a depth camera, and a 6DOF collaborative robot, which realizes accurate grasping under vision guidance. Second, to address the vision system's poor localization accuracy, caused by the loss of image information and features due to image noise, occlusion, and complex backgrounds during image processing, a pooling layer and an attention mechanism are introduced to enhance the feature extraction ability. Moreover, an optimized dexterous hand grasping strategy is proposed through exhaustive grasping action design and analysis, which effectively improves the robustness of the system. The experimental results, based on localization measurements of the experimental objects, show that the accuracy of the target detection model reaches 87%, which is 2.1% higher than the original method, and that the grasping success rate of the robotic system equipped with a dexterous hand and depth camera is improved by 3.5%. These results validate the feasibility of the robotic grasping system incorporating dexterous hands in practical applications and show a significant improvement in the robustness of the system. © 2024 IEEE.

Keywords: Collaborative robots

Automated Detection of Offensive Images and Sarcastic Memes in Social Media Through NLP

International Journal of Advanced Computer Science and Applications, 2024, Vol. 15, No. 7, pp. 1415-1425

Authors: Purnima, Tummala; Rao, Ch Koteswara. Affiliation: VIT AP Univ, Sch Comp Sci, Near Vijayawada, Amaravati 522237, Andhra Pradesh, India

In this digital era, social media is one of the key platforms for collecting customer feedback and reflecting customers' views on various aspects, including products, services, brands, events, and other topics of interest. However, there is a rise of sarcastic memes on social media, which often convey a meaning contrary to the implied sentiment and challenge traditional machine learning identification techniques. Memes, which blend text and visuals, are difficult to interpret from the captions or images alone, as their humor often relies on subtle contextual cues that require nuanced understanding. Our study introduces Offensive Images and Sarcastic Memes Detection to address this problem. Our model employs various techniques to identify sarcastic memes and offensive images. The model uses Optical Character Recognition (OCR) and bidirectional long short-term memory (Bi-LSTM) for sarcastic meme detection. For offensive image detection, the model employs Autoencoder LSTM, deep learning models such as DenseNet and MobileNet, and computer vision techniques like the Feature Fusion Process (FFP) based on Transfer Learning (TL) with image augmentation. The study showcases the effectiveness of the proposed methods in achieving high accuracy in detecting offensive content across different modalities, such as text, memes, and images. Based on tests conducted on real-world datasets, our model has demonstrated an accuracy rate of 92% on the Hateful Memes Challenge dataset. The proposed methodology has also achieved a Testing Accuracy (TA) of 95.7% for DenseNet with transfer learning on the NPDI dataset and 95.12% on the Pornography dataset. Moreover, implementing Transfer Learning with a Feature Fusion Process (FFP) has resulted in a TA of 99.45% for the NPDI dataset and 98.5% for the Pornography dataset.

Keywords: Deep learning; natural language processing; offensive images; sarcastic memes; toxic content detection

Red Deer Optimization with Artificial Intelligence Enabled Image Captioning System for Visually Impaired People

Computer Systems Science & Engineering, 2023, Vol. 46, No. 8, pp. 1929-1945

Authors: Anwer Mustafa Hilal; Fadwa Alrowais; Fahd N. Al-Wesabi; Radwa Marzouk. Affiliations: Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, AlKharj, Saudi Arabia; Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia; Department of Computer Science, College of Science & Art at Mahayil, King Khalid University, Mahayil, Saudi Arabia; Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia; Department of Mathematics, Faculty of Science, Cairo University, Giza 12613, Egypt

The problem of producing a natural language description of an image for describing the visual content has gained more attention in natural language processing (NLP) and computer vision (CV). It can be driven by applications like image retrieval or indexing, virtual assistants, image understanding, and support of visually impaired people (VIP). Though the VIP uses other senses, touch and hearing, for recognizing objects and events, the quality of life of those persons is lower than the standard *** image captioning generates captions that will be read aloud to the VIP, thereby realizing matters happening around *** article introduces a Red Deer Optimization with Artificial Intelligence Enabled Image Captioning System (RDOAI-ICS) for Visually Impaired *** presented RDOAI-ICS technique aids in generating image captions for *** presented RDOAI-ICS technique utilizes a neural architectural search network (NASNet) model to produce image ***, the RDOAI-ICS technique uses the radial basis function neural network (RBFNN) method to generate a textual *** enhance the performance of the RDOAI-ICS method, the parameter optimization process takes place using the RDO algorithm for NASNet and the butterfly optimization algorithm (BOA) for the RBFNN model, showing the novelty of the *** experimental evaluation of the RDOAI-ICS method can be tested using a benchmark *** outcomes show the enhancements of the RDOAI-ICS method over other recent image captioning approaches.

Keywords: machine learning; image captioning; visually impaired people; parameter tuning; artificial intelligence; metaheuristics

A Survey on Few-Shot Techniques in the Context of Computer Vision Applications Based on Deep Learning

21st International Conference on Image Analysis and Processing (ICIAP)

Authors: San-Emeterio, Miguel G. Affiliation: Atos Res & Innovat, Madrid 28037, Spain

ISBN: (print) 9783031133244; 9783031133237

This review article on Few-Shot Learning (FSL) techniques focuses on Computer Vision applications based on Deep Convolutional Neural Networks. A general discussion of Few-Shot Learning is given, featuring a context-constrained description, a short list of applications, a description of a couple of commonly used techniques, and a discussion of the most used benchmarks for FSL computer vision applications. In addition, the paper features a few examples of recent publications in which FSL techniques are used for training models in the context of Human Behaviour Analysis and Smart City Environment Safety. These examples give some insight into the performance of state-of-the-art FSL algorithms, what metrics they achieve, and how many samples are needed to achieve them.

Keywords: Few-Shot Learning; Deep Learning; Computer Vision; Human Behaviour Analysis; Smart City Environment Safety

Epidemiological Mucormycosis treatment and diagnosis challenges using the adaptive properties of computer vision techniques based approach: a review

Multimedia Tools and Applications, 2022, Vol. 81, No. 10, pp. 14217-14245

Authors: Nira Kumar, Harekrishna. Affiliation: GLA Univ, Dept Elect & Commun, Mathura 281406, India

Artificial Intelligence, Machine Learning, and Deep Learning are now used extensively, and researchers are considering applying them everywhere. At the same time, the second wave of corona has wreaked havoc in India, with more than 4 lakh cases reported in 24 h. In the meantime, news came of a new deadly fungus, which doctors have named Mucormycosis (Black fungus). This fungus also spread rapidly in many states, leading states to declare the disease an epidemic. It has become very important to find a cure for this life-threatening fungus with the help of today's devices and technology, such as artificial intelligence and data learning. It was found that the CT scan carries much more adequate information and delivers greater evaluation validity than the chest X-ray. The steps of image processing, such as pre-processing and segmentation, were then surveyed, and it was found that the accuracy score for the deep features retrieved from the ResNet50 model with an SVM classifier using the linear kernel function was 94.7%, the highest of all the findings. The Deep Belief Network (DBN) was also studied with respect to how easily it can diagnose a life-threatening infection like this fungus. Finally, a survey explains how computer vision helped in the corona era and how, in the same way, it would help in epidemics like Mucormycosis.

Keywords: Mucormycosis; Computer vision; Black fungus; Artificial intelligence; Deep learning

A survey on multimodal bidirectional machine learning translation of image and natural language processing

Expert Systems with Applications, 2024, Vol. 235

Authors: Nam, Wongyung; Jang, Beakcheol. Affiliation: Yonsei Univ, Grad Sch, 50 Yonsei Ro, Seoul 03722, South Korea

Advances in multimodal machine learning help artificial intelligence more closely resemble human intellect, which perceives the world through multiple modalities. We surveyed state-of-the-art research on bidirectional machine learning translation between images and natural language processing (NLP), modalities that account for a considerable proportion of human life. Recently, with advances in deep learning model architectures and learning methods in the fields of image processing and NLP, considerable progress has been made in multimodal machine learning translation built by integrating the two. Our goal is to explore and summarize state-of-the-art research on multimodal machine learning translation and present a taxonomy for the multimodal bidirectional machine learning translation of image and NLP. Furthermore, we reviewed the evaluation metrics and compared the state-of-the-art approaches that influence this field. We believe that this survey will become a cornerstone of future research by discussing the challenges in multimodal machine learning translation and the direction of future research, based on an understanding of state-of-the-art research in the field.

Keywords: Computer vision and natural language processing; Deep learning; Image captioning; Image synthesis; Machine learning; Multimodal
