Search Results - Inner Mongolia University Library

6th International Conference on Cybernetics, Cognition and Machine Learning Applications, ICCCMLA 2024

Authors: Manju, D. Gollapalli, Bharathi Sudheer Benarji, P. Pooja, Karanam Made, Anil and Ai&ds Hyderabad India Jbiet Department of Cse Hyderabad India Vnr Vignana Jyothi Institute of Engineering and Technology Department of Cse Hyderabad India

ISBN: (print) 9798331505790

Object recognition is a challenging computer vision application that finds wide use in various fields such as autonomous cars, robotics, security tracking and guiding visually impaired individuals. People with visual impairments face limitations in their mobility, making it crucial to rely on technology to assist them. By training our technologies to recognize objects, we can provide guidance to blind individuals when needed. MobileNet SSD is an object detection model that determines the bounding box and category of an object in an input image. Another popular algorithm in object detection is YOLO v4 (You Only Look Once), which has seen significant advancements. The primary objective of this paper is to compare the MobileNet algorithm, which serves as the backbone for Single Shot Detector (SSD), with the YOLO v4 algorithm. YOLO v4 utilizes an intricate Convolutional Neural Network architecture called Darknet. The goal is to determine the best model that can convert images to text and then text to speech for visually impaired individuals, allowing them to live independently. The chosen model will also provide audio responses by converting annotated text into audio and provide the location of objects in the camera's view. The results of image recognition will be communicated to the user through system audio feedback. The models were tested on real-time videos, and the object detection accuracy of YOLO v4 was found to be higher than that of MobileNet SSD. © 2024 IEEE.

Keywords: Convolutional neural networks
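
The abstract above mentions reporting the location of detected objects through audio feedback but does not say how. As a minimal sketch of one way this could work, the function below turns a bounding box into a short phrase that a text-to-speech engine could read aloud. The function name `describe_position`, the box layout `(x_min, y_min, x_max, y_max)` and the split of the frame into thirds are all assumptions made for illustration, not the paper's method.

```python
def describe_position(label, box, frame_width):
    """Build a short spoken-style phrase from a detection.

    box is (x_min, y_min, x_max, y_max) in pixels (assumed layout).
    The frame is split into three equal vertical bands (assumed rule).
    """
    x_center = (box[0] + box[2]) / 2
    third = frame_width / 3
    if x_center < third:
        side = "on the left"
    elif x_center < 2 * third:
        side = "ahead"
    else:
        side = "on the right"
    return f"{label} {side}"


# Hypothetical detection in a 640-pixel-wide frame.
message = describe_position("chair", (500, 120, 620, 400), frame_width=640)
print(message)
```

The returned string would then be passed to whatever text-to-speech component the system uses; that step is left out because the abstract does not name one.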

MAPHIS-Measuring arthropod phenotypes using hierarchical image segmentations

METHODS IN ECOLOGY AND EVOLUTION, 2024, Vol. 15, No. 1, pp. 36-42

Authors: Mraz, Radoslav Stepka, Karel Pekar, Matej Matula, Petr Pekar, Stano Masaryk Univ Fac Informat Dept Visual Comp Brno Czech Republic Masaryk Univ Fac Sci Dept Bot & Zool Brno Czech Republic

1. Animal phenotypic traits are utilised in a variety of studies. Often the traits are measured from images. The processing of a large number of images can be challenging; nevertheless, image analytical applications, based on neural networks, can be an effective tool in automatic trait collection.

2. Our aim was to develop a stand-alone application to effectively segment an arthropod from an image and to recognise individual body parts: namely, head, thorax (or prosoma), abdomen and four pairs of appendages. It is based on a convolutional neural network with U-Net architecture trained on more than a thousand images showing dorsal views of arthropods (mainly of wingless insects and spiders). The segmentation model gave very good results, with the automatically generated segmentation masks usually requiring only slight manual adjustments.

3. The application, named MAPHIS, can further (1) organise and preprocess the images; (2) adjust segmentation masks using a simple graphical editor; and (3) calculate various size, shape, colouration and pattern measures for each body part organised in a hierarchical manner. In addition, a special plug-in function can align body profiles of selected individuals to match a median profile and enable comparison among groups. The usability of the application is shown in three practical examples.

4. The application can be used in a variety of fields where measures of phenotypic diversity are required, such as taxonomy, ecology and evolution (e.g. mimetic similarity). Currently, the application is limited to arthropods, but it can be easily extended to other animal taxa.

Keywords: arachnids; arthropods; convolutional neural networks; hierarchical segmentation; image analysis; insects; machine vision; morphological traits

Fiber-Optic Shape Sensing Using Neural Networks Operating on Multispecklegrams

IEEE SENSORS JOURNAL, 2024, Vol. 24, No. 17, pp. 27532-27540

Authors: Cao, Caroline G. L. Javot, Bernard Bhattarai, Shreeram Bierig, Karin Oreshnikov, Ivan Volchkov, Valentin V. Univ Illinois Dept Ind & Enterprise Syst Engn Champaign IL 61820 USA Univ Illinois Dept Biomed & Translat Sci Champaign IL 61820 USA Max Planck Inst Intelligent Syst D-70569 Stuttgart Germany Max Planck Inst Intelligent Syst D-72076 Tubingen Germany

Application of machine learning techniques on fiber speckle images to infer fiber deformation allows the use of an unmodified multimode fiber to act as a shape sensor. This approach eliminates the need for complex fiber design or construction (e.g., Bragg gratings and time-of-flight). Prior work in shape determination using neural networks trained on a finite number of possible fiber shapes (formulated as a classification task), or trained on a few continuous degrees of freedom, has been limited to reconstruction of fiber shapes only one bend at a time. Furthermore, generalization to shapes that were not used in training is challenging. Our innovative approach improves generalization capabilities, using computer vision-assisted parameterization of the actual fiber shape to provide a ground truth, and multiple specklegrams per fiber shape obtained by controlling the input field. Results from experimenting with several neural network architectures, shape parameterization, number of inputs, and specklegram resolution show that fiber shapes with multiple bends can be accurately predicted. Our approach is able to generalize to new shapes that were not in the training set. This approach of end-to-end training on parameterized ground truth opens new avenues for fiber-optic sensor applications. We publish the datasets used for training and validation, as well as an out-of-distribution (OOD) test set, and encourage interested readers to access these datasets for their own model development.

Keywords: Shape; Sensors; Training; Neural networks; Image reconstruction; Cameras; Sensor phenomena and characterization; machine learning; neural networks; optical fibers; shape measurement; speckle patterns

Interactive Attention AI to Translate Low-Light Photos to Captions for Night Scene Understanding in Women Safety

2nd International Conference on Big Data, Machine Learning, and Applications, BigDML 2021

Authors: Rajagopal, A. Nirmala, V. Vedamanickam, Arun Muthuraj Indian Institute of Technology Madras India PG and Research Department of Physics Queen Mary’s College Tamilnadu Chennai India National Institute of Technology Tiruchirapalli India

ISBN: (print) 9789819934805

There is amazing progress in deep learning-based models for image captioning and low-light image enhancement. For the first time in literature, this paper develops a deep learning model that translates night scenes to sentences, opening new possibilities for AI applications in the safety of visually impaired women. Inspired by image captioning and visual question answering, a novel ‘Interactive Image Captioning’ is developed. A user can make the AI focus on any chosen person of interest by influencing the attention scoring. Attention context vectors are computed from CNN feature vectors and user-provided start words. The encoder–attention–decoder neural network learns to produce captions from low-brightness images. This paper demonstrates how women's safety can be enabled by researching a novel AI capability in the interactive vision–language model for perception of the environment in the night. © 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Keywords: image enhancement

Search for Near-Duplicate Handwritten Documents for Data-Intensive Applications

JOURNAL OF COMPUTER AND SYSTEMS SCIENCES INTERNATIONAL, 2024, Vol. 63, No. 4, pp. 687-694

Authors: Varlamova, K. D. Kaprielova, M. S. Potyashin, I. O. Chekhovich, Yu. V. Antiplagiat Co Moscow Russia Moscow Inst Phys & Technol Dolgoprudnyi 141701 Moscow Oblast Russia Russian Acad Sci Fed Res Ctr Comp Sci & Control Moscow 119333 Russia

The problem of cheating in handwritten academic essays has become more significant over the past few years. One type of cheating involves submitting the same paper, photographed in a different environment (for example, from another angle, in a different light, or in lower quality) or changed by automatic augmentation. The existing methods for detecting near-duplicates are not designed to work on large collections of handwritten documents, which significantly limits their use in practice. A machine learning-based method is presented that enables the detection of near-duplicate handwritten text images among large collections of potential sources. The proposed approach consists of three stages: converting the image into a vector representation, searching for candidates, and then selecting the source of duplication among the candidates. Our method achieved 80% and 59% recall-at-1 with false positive rate of 4.8% and 5.5% on Synthetic and Real data, respectively. The search latency is 5.5 seconds per query for a collection of 10 000 images. The results showed that the developed method is sufficiently robust to solve problems that require checking large collections of handwritten documents for cheating.

Keywords: computer vision; near-duplicate detection; handwritten document analysis; large collections; Russian cursive
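
The three stages named in the abstract (vector representation, candidate search, source selection) can be illustrated with a small sketch. It assumes stage 1 has already produced an embedding vector for each image and shows only stages 2 and 3. Cosine similarity, the function names `find_candidates` and `select_source`, and the 0.9 acceptance threshold are illustrative assumptions; the abstract does not state which similarity measure or decision rule the authors use.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_candidates(query_vec, collection, k=3):
    """Stage 2 (sketch): return the k most similar (doc_id, score) pairs."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in collection.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def select_source(candidates, threshold=0.9):
    """Stage 3 (sketch): accept the best candidate only if it is similar enough."""
    if candidates and candidates[0][1] >= threshold:
        return candidates[0][0]
    return None


# Hypothetical embeddings standing in for the output of stage 1.
collection = {
    "essay_a": [1.0, 0.0, 0.0],
    "essay_b": [0.0, 1.0, 0.0],
    "essay_c": [0.7, 0.7, 0.0],
}
query = [0.98, 0.05, 0.0]

candidates = find_candidates(query, collection, k=2)
source = select_source(candidates)
print(source)
```

A brute-force scan like this is only workable for small collections; a system built for large collections would normally replace `find_candidates` with an approximate nearest-neighbour index.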

The blessing of Depth Anything: An almost unsupervised approach to crop segmentation with depth-informed pseudo labeling

Plant Phenomics (English), 2025, Vol. 7, No. 1, pp. 49-65

Authors: Songliang Cao, Binghui Xu, Wei Zhou, Letian Zhou, Jiafei Zhang, Yuhui Zheng, Weijuan Hu, Zhiguo Han, Hao Lu; National Key Laboratory of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; PhenoTrait Technology Co. Ltd., Beijing 100096, China; MetaPheno Laboratory, Shanghai 201114, China; SpeCloud Technology Co. Ltd., Sanya 572025, China; State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China

We present Depth-Informed Crop Segmentation (DepthCropSeg), an almost unsupervised crop segmentation approach without manual pixel-level *** segmentation is a fundamental vision task in agriculture, which benefits a number of downstream applications such as crop growth monitoring and yield *** the past decade, image-based crop segmentation approaches have shifted from classic color-based paradigms to recent deep learning-based *** latter, however, rely heavily on large amounts of data with high-quality manual annotation such that considerable human labor and time are *** this work, we leverage Depth Anything V2, a vision foundation model, to produce high-quality pseudo crop masks for training segmentation *** compile a dataset of 17,199 images from six public plant segmentation sources, generating pseudo masks from depth maps after normalization and *** a coarse-to-fine manual screening, 1378 images with reliable masks are *** compare four semantic segmentation models and enhance the top-performing one with depth-informed two-stage self-training and depth-informed *** evaluate the feasibility and robustness of DepthCropSeg, we benchmark the segmentation performance on 10 public crop segmentation testing sets and a self-collected dataset covering in-field, laboratory, and unmanned aerial vehicle (UAV) *** results show that our DepthCropSeg approach can achieve crop segmentation performance comparable to the fully supervised model trained with manually annotated data (86.91 vs. 87.10). For the first time, we demonstrate almost unsupervised, close-to-full-supervision crop segmentation successfully.

Keywords: Crop segmentation; Plant phenotyping; Depth anything; Segment anything; Efficient labeling

LInKs "Lifting Independent Keypoints" - Partial Pose Lifting for Occlusion Handling with Improved Accuracy in 2D-3D Human Pose Estimation

IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Authors: Hardy, Peter Kim, Hansung Univ Southampton ECS Vis Learning & Control Southampton Hants England

ISBN: (print) 9798350318920; 9798350318937

We present LInKs, a novel unsupervised learning method to recover 3D human poses from 2D kinematic skeletons obtained from a single image, even when occlusions are present. Our approach follows a unique two-step process, which involves first lifting the occluded 2D pose to the 3D domain, followed by filling in the occluded parts using the partially reconstructed 3D coordinates. This lift-then-fill approach leads to significantly more accurate results compared to models that complete the pose in 2D space alone. Additionally, we improve the stability and likelihood estimation of normalising flows through a custom sampling function replacing PCA dimensionality reduction used in prior work. Furthermore, we are the first to investigate if different parts of the 2D kinematic skeleton can be lifted independently which we find by itself reduces the error of current lifting approaches. We attribute this to the reduction of long-range keypoint correlations. In our detailed evaluation, we quantify the error under various realistic occlusion scenarios, showcasing the versatility and applicability of our model. Our results consistently demonstrate the superiority of handling all types of occlusions in 3D space when compared to others that complete the pose in 2D space. Our approach also exhibits consistent accuracy in scenarios without occlusion, as evidenced by a 7.9% reduction in reconstruction error compared to prior works on the Human3.6M dataset. Furthermore, our method excels in accurately retrieving complete 3D poses even in the presence of occlusions, making it highly applicable in situations where complete 2D pose information is unavailable.

Keywords: 3D computer vision Algorithms Algorithms Algorithms and algorithms Biometrics body pose face formulations gesture machine learning architectures

An Intelligent Self-Driving Car’s Design and Development, Including Lane Detection Using ROS and Machine Vision Algorithms

3rd International Conference on Universal Threats in Expert Applications and Solutions, UNI-TEAS 2024

Authors: Sujatha, E. Sundar, J. Sathiya Jeba Raju, D. Naveen Lakshminarayanan, S. Suganthi, N. Thandalam Chennai India Department of Computer Science and Engineering Velammal Institute of Technology Chennai India Department of Computer Science and Engineering R.M.K. Engineering College Kavaraipettai Chennai India Sri Sairam Engineering College Chennai India SRM Institute of Science and Technology Ramapuram Campus Chennai India

ISBN: (digital) 9789819738106

ISBN: (print) 9789819738090

It is challenging to find a solution for lane detection. It has aroused the curiosity of the computer vision field for many years. It has been found that computer vision and machine learning algorithms struggle to tackle the multi-feature identification problem known as lane detection. Even though there are a few different machine learning approaches that may be used for lane identification, these approaches are often employed for classification rather than feature development. On the other hand, contemporary techniques of machine learning may be used to discover features that have a high recognition value, and they have shown success in feature identification tests. These strategies haven’t been applied correctly, which compromises their efficiency and accuracy when it comes to lane recognition. In this study, we provide a fresh approach to solving the problem. A brand-new preprocessing and Region of Interest (ROI) selection method is presented in this article. The major objective is to extract white features by making use of the HSV color transformation, adding preliminary edge feature detection while doing preprocessing, and then selecting ROI based on the preprocessing that was proposed. With the help of this cutting-edge preprocessing strategy, the lane may be found. The integrated autonomous vehicle that we envision is one that is controlled by a Robot Operating System and that is capable of making intelligent driving choices. The unique filtering and noise reduction techniques that were used on the visual feedback by means of the processing unit served as the basis for the digital image-processing algorithm that was responsible for the greatest performance achieved by the autonomous vehicle. Within the control system, we used two separate control units, one of which was a master and the other of which was a slave. The master control unit is in charge of the visual processing and filtering, while the slave control unit is in charge of the vehicle’s propulsio

Keywords: Autonomous vehicles
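
The white-feature extraction and ROI selection described in the abstract can be sketched as follows. This is a minimal illustration using only the Python standard library: the saturation and brightness thresholds, the function names `is_white` and `white_mask`, and the rule "keep only rows at or below `roi_top`" are assumptions made for the example. The abstract does not give the authors' actual thresholds or ROI rule, and the edge-detection step it mentions is omitted here.

```python
import colorsys


def is_white(rgb, max_saturation=0.2, min_value=0.8):
    """Treat a pixel as 'white' if it is bright and weakly saturated in HSV."""
    r, g, b = (channel / 255.0 for channel in rgb)
    _, saturation, value = colorsys.rgb_to_hsv(r, g, b)
    return saturation <= max_saturation and value >= min_value


def white_mask(image, roi_top):
    """Return a binary mask of white pixels inside the region of interest.

    image is a list of rows, each row a list of (R, G, B) tuples.
    Rows above roi_top (smaller row index) are outside the ROI and masked out.
    """
    return [
        [1 if y >= roi_top and is_white(pixel) else 0 for pixel in row]
        for y, row in enumerate(image)
    ]


# Tiny hypothetical 2x2 frame: the top row is sky, the bottom row is road.
image = [
    [(255, 255, 255), (30, 30, 30)],
    [(250, 250, 245), (40, 120, 40)],
]
mask = white_mask(image, roi_top=1)
print(mask)
```

On real camera frames the same idea is usually expressed with an image library's colour conversion and range thresholding rather than per-pixel Python loops.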

Event-Based Hand Detection on Neuromorphic Hardware Using a Sigma Delta Neural Network

33rd International Conference on Artificial Neural Networks and Machine Learning (ICANN)

Authors: Azzalini, Loic Gluege, Stefan Struckmeier, Jens Sandamirskaya, Yulia ZHAW Zurich Univ Appl Sci Winterthur Switzerland WAIYS GmbH Langen Germany

ISBN: (print) 9783031723582; 9783031723599

The development of deep learning (DL) models has dramatically improved marker-free human pose estimation, including an important task of hand tracking. However, for applications in real-time critical and embedded systems, e.g. in robotics or augmented reality, hand tracking based on standard frame-based cameras is too slow and/or power hungry. The latency is limited by the frame rate of the image sensor already, and any subsequent DL processing further increases the latency gap, while requiring substantial power for processing. Dynamic vision sensors, on the other hand, enable sub-millisecond time resolution and output sparse signals that can be processed with an efficient Sigma Delta Neural Network (SDNN) model that preserves the sparsity advantage in the neural network. This paper presents the training and evaluation of a small SDNN for hand detection, based on event data from the DHP19 dataset deployed on Intel's Loihi 2 neuromorphic development board. We found it possible to deploy a hand detection model in neuromorphic hardware backend without a notable performance difference to the original GPU implementation, at an estimated mean dynamic power consumption for the network running on the chip of approximately 7 mW.

Keywords: Event-based vision; Neuromorphic Hardware; Hand Tracking; Sigma Delta Neural Networks
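
To illustrate why a sigma-delta scheme keeps signals sparse, the sketch below shows the general idea on a single scalar signal: the sender transmits a change only when the value has drifted past a threshold since the last transmission (delta), and the receiver sums the received changes to rebuild the value (sigma). The function names, the threshold of 0.1 and the sample signal are assumptions for illustration; this is not the network from the paper, nor the Loihi 2 implementation.

```python
def delta_encode(signal, threshold):
    """Send a change only when it is at least `threshold` away from the last sent value."""
    sent = 0.0  # value the receiver currently believes
    messages = []
    for x in signal:
        diff = x - sent
        if abs(diff) >= threshold:
            messages.append(diff)  # transmit the change
            sent += diff
        else:
            messages.append(0.0)   # nothing transmitted at this step
    return messages


def sigma_decode(messages):
    """Accumulate received changes to reconstruct the signal."""
    total = 0.0
    reconstructed = []
    for message in messages:
        total += message
        reconstructed.append(total)
    return reconstructed


# Hypothetical slowly varying activation with two larger jumps.
signal = [0.0, 0.05, 0.5, 0.52, 0.55, 1.0]
messages = delta_encode(signal, threshold=0.1)
reconstructed = sigma_decode(messages)
print(messages)
print(reconstructed)
```

Only the steps with a large enough change produce a non-zero message, so a slowly varying input generates little traffic, while the reconstruction error stays below the threshold.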

MLDC: multi-lung disease classification using quantum classifier and artificial neural networks

NEURAL COMPUTING & APPLICATIONS, 2024, Vol. 36, No. 7, pp. 3803-3816

Authors: Arora, Riya Rao, G. V. Eswara Banerjea, Shashwati Rajitha, B. Motilal Nehru Natl Inst Technol Dept Comp Sci & Engn Allahabad 211004 UP India

Lung diseases are among the most common diseases around the world. The risk of these diseases is higher in under-developed and developing countries, where millions of people are battling poverty and living in polluted air. Chest X-ray images are a helpful screening tool for lung disease detection. However, disease diagnosis requires expert medical professionals. Furthermore, in developing and under-developed nations, the doctor-to-patient ratio is comparatively poor. Deep learning algorithms have recently demonstrated promise in the analysis of medical images and the discovery of patterns. In this work, we propose a model, MLDC (Multi-Lung Disease Classification), to detect common lung diseases. It introduces an MLDC feature extraction model with two different new classifiers, considering ANN (an artificial neural network) and QC (a quantum classifier). In this proposed model, tests are performed on the LDD (Lung Disease Dataset), which includes COVID-19, pneumonia, tuberculosis, and a healthy person's lung from chest X-ray images. Our proposed model achieves an accuracy of 95.6% for MLDC-ANN and 97.5% for MLDC-QC at a lower computational cost.

Keywords: COVID-19; Pneumonia; Tuberculosis; image processing; Deep learning; machine learning; Quantum classifier
