Artificial Intelligence is a fast-growing domain that facilitates innovation in various fields of business and manufacturing. This field of machine learning provides the automatic inspection of the manu...
In this article, we present a CMOS image sensor (CIS) for coded-exposure-based compressive focal-stack imaging. The proposed CIS has a pixel design that includes two capacitive trans-impedance amplifiers (CTIAs) and a static random access memory (SRAM), and is capable of per-frame exposure encoding with adjustable spatiotemporal resolutions. A proof-of-concept CIS prototype with a 192 × 192 pixel array is designed and fabricated in a 0.13-μm CMOS process with a pixel size of 12.6 × 12.6 μm². Operating at 30 frames per second (fps), the CIS demonstrates spatiotemporal coded exposure at a maximum rate of 768 masks/frame. The column-wise 10-bit single-slope (SS) analog-to-digital converter (ADC) includes a ramp-slope adaptation feature used for power optimization. During a frame of coded exposure, a linear focal sweep is implemented by a voice-coil motor (VCM) lens mounted in front of the proposed CIS. Through sparse reconstruction of the coded image, a focal stack consisting of a volume of defocused images is used to synthesize the scene depth map. By introducing coded exposure, the proposed on-chip compressive focal-stack imaging approach provides a frame-saving method for passive depth sensing in machine vision and other imaging applications.
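The per-pixel coded-exposure idea in this abstract can be illustrated with a small numpy sketch. All sizes, the random binary mask pattern, and the synthetic focal stack below are illustrative assumptions, not the paper's actual parameters or reconstruction algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 8 sub-exposures (focal positions) within one frame,
# over a tiny 4x4 pixel array (the actual prototype is 192x192).
T, H, W = 8, 4, 4

# Synthetic focal stack: one sub-image per focal position of the sweep.
focal_stack = rng.random((T, H, W))

# Per-pixel binary exposure code (held in the in-pixel SRAM on the sensor).
masks = rng.integers(0, 2, size=(T, H, W))

# Each pixel integrates only the sub-exposures its code enables, producing
# a single compressively coded frame from which the stack is reconstructed.
coded_frame = (masks * focal_stack).sum(axis=0)
```

Sparse reconstruction would then recover the full focal stack from `coded_frame` and the known `masks`; that solver is beyond this sketch.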
This paper discusses the critical importance of precise forecasting in liver disease, as well as the need for early identification and categorization to enable immediate action and personalized treatment strategies. The paper describes a unique strategy for improving liver disease classification using ultrasound image processing. The recommended technique combines an Extreme Learning Machine (ELM), a Convolutional Neural Network (CNN), and Grey Wolf Optimization (GWO) into an integrated model known as CNN-ELM-GWO. The data is provided by Pakistan's Multan Institute of Nuclear Medicine and Radiotherapy and is pre-processed using bilateral and optimal wavelet filtering techniques to increase the dataset's quality. To extract significant visual information, feature extraction employs a deep CNN architecture with six convolutional layers, batch normalization, and max-pooling. The CNN serves as a feature extractor, whereas the ELM serves as the classifier. The GWO algorithm, based on grey wolf searching strategies, refines the CNN and ELM hyperparameters in two stages, progressively boosting the system's classification accuracy. When implemented in Python, CNN-ELM-GWO exceeds traditional machine learning algorithms (MLP, RF, KNN, and NB) in terms of accuracy, precision, recall, and F1-score. The proposed technique achieves an impressive 99.7% accuracy, revealing its potential to significantly enhance liver disease classification from ultrasound images. The CNN-ELM-GWO technique outperforms conventional approaches in liver disease forecasting by a substantial margin of 27.5%, showing its potential to advance medical imaging.
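The ELM stage of such a pipeline is simple enough to sketch: a random, untrained hidden layer followed by output weights solved in closed form. The toy Gaussian features below merely stand in for CNN-extracted features, and the GWO hyperparameter tuning is not shown:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in for CNN feature vectors: 2 classes, 20-dim features.
X = np.vstack([rng.normal(0, 1, (50, 20)), rng.normal(3, 1, (50, 20))])
y = np.array([0] * 50 + [1] * 50)
Y = np.eye(2)[y]                       # one-hot targets

# ELM: a random (never-trained) hidden layer ...
n_hidden = 64
W = rng.normal(size=(20, n_hidden))
b = rng.normal(size=n_hidden)
Hmat = np.tanh(X @ W + b)

# ... and output weights solved in one shot via the pseudoinverse,
# which is what makes ELM training fast compared with backpropagation.
beta = np.linalg.pinv(Hmat) @ Y

pred = (Hmat @ beta).argmax(axis=1)
acc = (pred == y).mean()
```

In the paper's design, GWO would search over quantities such as `n_hidden` and the CNN hyperparameters rather than leaving them fixed as here.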
In the real world, knowledge comes from books and papers. Currently, that information reaches only those with clear vision. In the community, some people suffer either from poor eyesight or blindness. Bra...
ISBN (digital): 9781510662117
ISBN (print): 9781510662100; 9781510662117
In the billions of faces that are shaped by thousands of different cultures and ethnicities, one thing remains universal: the way emotions are expressed. To take the next step in human-machine interactions, a machine must be able to classify facial emotions. Allowing machines to recognize micro-expressions gives them deeper insight into a person's true feelings at any instant, which allows designers to create more empathetic machines that take human emotion into account while making optimal decisions; e.g., such machines will potentially be able to detect dangerous situations, alert caregivers to challenges, and provide appropriate responses. Micro-expressions are involuntary and transient facial expressions capable of revealing genuine emotions. We propose to design and train a set of neural network (NN) models capable of micro-expression recognition in real-time applications. Different NN models are explored and compared in this study to design a hybrid deep learning model combining a convolutional neural network (CNN), a recurrent neural network (RNN, e.g., long short-term memory [LSTM]), and a vision transformer. The CNN extracts spatial features (of a neighborhood within an image), whereas the LSTM summarizes temporal features. In addition, a transformer with an attention mechanism can capture sparse spatial relations within an image or between frames in a video clip. The inputs of the model are short facial videos, while the outputs are the micro-expressions gleaned from the videos. The deep learning models are trained and tested with publicly available facial micro-expression datasets to recognize different micro-expressions (e.g., happiness, fear, anger, surprise, disgust, sadness). The results of our proposed models are compared with those of literature-reported methods tested on the same datasets. The proposed hybrid models perform the best.
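The CNN-then-LSTM division of labor (spatial features per frame, temporal summary across frames) can be sketched end to end in numpy. Everything below is a toy stand-in: a single convolution filter plays the CNN, a hand-rolled LSTM cell plays the RNN, and the transformer branch is omitted:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy video clip: 10 frames of 8x8 grayscale "faces".
T, H, W = 10, 8, 8
clip = rng.random((T, H, W))

# CNN stand-in: one 3x3 filter + ReLU + global average pooling per frame,
# yielding a single spatial feature per frame (a real CNN outputs a vector).
kernel = rng.normal(size=(3, 3))

def conv_pool(frame):
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = (frame[i:i + 3, j:j + 3] * kernel).sum()
    return np.maximum(out, 0).mean()

feats = np.array([[conv_pool(f)] for f in clip])       # (T, 1) sequence

# Minimal LSTM cell summarizing the per-frame features over time.
d_in, d_h = 1, 4
Wx = rng.normal(size=(d_in, 4 * d_h)) * 0.5
Wh = rng.normal(size=(d_h, 4 * d_h)) * 0.5
h, c = np.zeros(d_h), np.zeros(d_h)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
for x in feats:
    g = x @ Wx + h @ Wh
    i_g, f_g, o_g = (sigmoid(g[:d_h]), sigmoid(g[d_h:2 * d_h]),
                     sigmoid(g[2 * d_h:3 * d_h]))
    c = f_g * c + i_g * np.tanh(g[3 * d_h:])
    h = o_g * np.tanh(c)

# h is the temporal summary a classifier head would map to an emotion label.
```

In the actual models, `h` would feed a softmax head over the micro-expression classes, and all weights would be learned rather than random.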
The ability of computer vision systems to detect abnormalities in various medical imaging processes, such as dual-energy X-ray absorptiometry, magnetic resonance imaging (MRI), ultrasonography, and computed tomography, has significantly improved as a result of recent developments in deep learning. Current techniques and algorithms for identifying, categorizing, and detecting diabetic foot ulcers (DFU) are discussed. On small datasets, a variety of techniques based on traditional machine learning and image processing are utilized to find the DFU. These prior works have kept their datasets and algorithms private. Therefore, the need for end-to-end automated systems that can identify DFU of all grades and stages is critical. The study's goals were to create new CNN-based automatic segmentation techniques to separate surrounding skin from DFU on full foot images, because surrounding skin serves as a critical visual cue for evaluating the progression of DFU, and to create reliable and portable deep learning techniques for localizing DFU that can be deployed on mobile devices for remote monitoring. The second goal was to examine the various diabetic foot diseases in accordance with well-known medical categorization schemes. From a computer vision viewpoint, the authors examined the various DFU circumstances, including site, infection, neuropathy, bacterial infection, area, and depth. Machine learning techniques have been utilized in this study to identify key DFU conditions such as ischemia and bacterial infection.
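Segmentation work of this kind is typically scored by mask overlap, most often with the Dice coefficient. The sketch below is a generic illustration of that metric on synthetic masks, not code or data from the study:

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient between two binary masks (1 = ulcer region)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2.0 * inter / total if total else 1.0

# Hypothetical 8x8 masks: a 4x4 ground-truth ulcer and a prediction
# shifted down by one row, so 12 of 16 pixels overlap.
gt = np.zeros((8, 8), dtype=int)
gt[2:6, 2:6] = 1
pred = np.zeros((8, 8), dtype=int)
pred[3:7, 2:6] = 1

score = dice(pred, gt)   # 2*12 / (16 + 16) = 0.75
```

A Dice of 1.0 means perfect overlap with the annotated ulcer boundary; segmentation papers usually report the mean Dice over a held-out test set.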
ISBN (print): 9798350391558; 9798350379990
Object detection is a computer vision method for identifying specific items inside an image or video. The most effective object detection systems make use of machine learning or deep learning. Labeling and counting items in a scene, as well as pinpointing their locations and following their movement, are all possible thanks to object detection's ability to precisely identify and localize objects. For instance, it is easy to recognize circles as a distinct class because of their shared characteristic of being round. Such unique characteristics are used for object class recognition. Facial traits such as skin tone and eye distance are employed, in a manner analogous to fingerprinting, to positively identify a person by their face. The object detection task is typically made much more challenging when the test images are sampled from a distinct data distribution. Many unsupervised domain adaptation approaches have been presented to address the difficulties introduced by the discrepancy between the domains of the training and test data. Cross-domain object detection has many applications, including autonomous driving, because labels can be generated easily for a large number of scenes in video games. Object detection methods can be categorized as either neural network-based or non-neural. This research presents a Superior Attribute Weighted Set for Object Skeleton Detection using ResNet50 (SAWS-OSD-ResNet50). Compared with traditional methods, the proposed model performs better in object detection.
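Localization quality in object detection is conventionally measured by intersection-over-union (IoU) between predicted and ground-truth boxes; a detection counts as correct when IoU clears a threshold (commonly 0.5). This is a generic sketch of the metric, not the SAWS-OSD-ResNet50 evaluation code:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping in half their width: inter = 50,
# union = 150, so IoU = 1/3.
score = iou((0, 0, 10, 10), (5, 0, 15, 10))
```

Under domain shift, IoU-based metrics on the target domain are exactly where the accuracy drop that domain adaptation methods try to close shows up.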
ISBN (print): 1577358872
Recent advances in multimodal learning have resulted in powerful vision-language models whose representations are generalizable across a variety of downstream tasks. Recently, their generalization ability has been further extended by incorporating trainable prompts, borrowed from the natural language processing literature. While such prompt learning techniques have shown impressive results, we identify that these prompts are trained based on global image features, which is limiting in two respects: First, by using global features, these prompts could focus less on the discriminative foreground of the image, resulting in poor generalization to various out-of-distribution test cases. Second, existing work weights all prompts equally, whereas intuitively, prompts should be reweighted according to the semantics of the image. We address these issues as part of our proposed Contextual Prompt Learning (CoPL) framework, capable of aligning the prompts to the localized features of the image. Our key innovations over earlier works include using local image features as part of the prompt learning process and, more crucially, learning to weight these prompts based on local features that are appropriate for the task at hand. This gives us dynamic prompts that are both aligned to local image features and aware of local contextual relationships. Our extensive set of experiments on a variety of standard and few-shot datasets shows that our method produces substantially improved performance compared to current state-of-the-art methods. We also demonstrate both few-shot and out-of-distribution performance to establish the utility of learning dynamic prompts that are aligned to local image features.
In the era of digital imagery, there is great interest in finding new and creative ways to express ourselves and make our images look beautiful. One such fascinating method is cartoonization, a process that transfor...
Computer vision and its technologies are being used in agricultural automation to identify, locate, and track targets for further image processing. Agricultural production has been highly dependent on natural resources such as soil, water, and related natural minerals from the soil. Soil classification is a way of arranging soils that have similar characteristics into groups. Identifying and classifying soils plays a great role in agricultural productivity, as it provides relevant information that helps agricultural experts recommend the type of crop best suited for a specific type of soil. This study concentrated on classifying soil types such as clay soil, loam soil, sandy soil, peat soil, silt soil, and chalk soil. The soil images were collected from the Amhara region at different locations using a Sony digital camera. To reduce image noise due to hand shake, a camera stand was used, and care was taken to avoid other types of noise such as environmental lighting effects and shadow. Once the dataset was collected, preprocessing such as resizing and gamma correction was performed to remove noise from the images, and contrast adjustment was also performed. Experimental research was applied as the general methodology, and the experiment was conducted with two approaches. The first used CNN as an end-to-end classifier, and the second used a hybrid approach with CNN as a feature extractor and SVM as a classifier. When CNN was used as an end-to-end classifier, a classification accuracy of 88% was achieved, whereas the hybrid approach with CNN as feature extractor and SVM as classifier achieved a classification accuracy of 95%. Finally, we conclude that the hybrid approach is better than end-to-end classification for our proposed model.
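The hybrid approach's second stage, an SVM on CNN-extracted features, can be sketched as a linear SVM trained by subgradient descent on the hinge loss. The toy Gaussian "soil features" are illustrative; in practice a library SVM (e.g., scikit-learn's SVC, possibly with an RBF kernel) would be used:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-in for CNN-extracted soil features: two separable classes.
X = np.vstack([rng.normal(-2, 1, (40, 10)), rng.normal(2, 1, (40, 10))])
y = np.array([-1] * 40 + [1] * 40)
n = len(y)

# Linear SVM via subgradient descent on the regularized hinge loss:
#   L = (lam/2)||w||^2 + mean(max(0, 1 - y*(w.x + b)))
w, b = np.zeros(10), 0.0
lr, lam = 0.01, 0.01
for _ in range(200):
    margins = y * (X @ w + b)
    viol = margins < 1                               # margin violators
    grad_w = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / n
    grad_b = -y[viol].sum() / n
    w -= lr * grad_w
    b -= lr * grad_b

acc = (np.sign(X @ w + b) == y).mean()
```

The division of labor mirrors the study: the CNN supplies a discriminative feature space, and the max-margin classifier draws the decision boundary in it.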