Search Results - Inner Mongolia University Library

Image Processing, Computer Vision and Machine Learning (ICICML), International Conference on

Authors: Kai Luo Faculty of Science and Technology Beijing Normal University-Hong Kong Baptist University United International College Zhuhai China

ISBN: (digital) 9798350355413

ISBN: (print) 9798350355420

Brain-computer interfaces (BCIs) hold immense potential for restoring communication to individuals with severe speech impairments. This paper investigates the feasibility of real-time open-vocabulary sentence decoding from magnetoencephalography (MEG) signals using deep learning with a Transformer architecture. We propose a novel end-to-end model that leverages the high temporal and spatial resolution of MEG and the powerful sequence-to-sequence learning capabilities of Transformers. Our model is trained and evaluated on a large dataset of MEG recordings paired with natural language sentences. We demonstrate the effectiveness of our approach in achieving real-time, accurate, and flexible communication, significantly outperforming existing methods. Our findings pave the way for the development of more practical and user-friendly BCIs for individuals with speech disabilities.

Keywords: Deep learning; Error analysis; Magnetoencephalography; Transformers; Brain modeling; Real-time systems; Brain-computer interfaces; Decoding; Speech processing; Spatial resolution

Vehicle detection algorithm based on lightweight YOLOX

SIGNAL IMAGE AND VIDEO PROCESSING, 2023, Vol. 17, No. 5, pp. 1793-1800

Authors: Xiong, Cong Yu, Anning Yuan, Senhao Gao, Xinghua Chongqing Univ Sci & Technol Sch Intelligent Technol & Engn Univ Town East Rd Chongqing 404100 Peoples R China

Accurate and fast vehicle detection is of great significance for building intelligent transportation systems in the era of big data. This paper proposes an improved lightweight YOLOX real-time vehicle detection algorithm. Compared with the original network, the new algorithm improves detection speed and accuracy while using fewer parameters. First, following GhostNet, we adopt a lightweight design for the backbone feature extraction network, which significantly reduces the network parameters, training cost, and inference time. Furthermore, introducing the alpha-CIoU loss function improves the regression accuracy of the bounding box (bbox) and accelerates the convergence of the model. The experimental results show that the mAP of the improved algorithm on the BIT-Vehicle dataset reaches up to 99.21%, with 41.2% fewer network parameters and 12.7% higher FPS than the original network, demonstrating the effectiveness of the proposed method.

Keywords: Vehicle detection; BIT-Vehicle dataset; YOLOX; Deep learning
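
The abstract above names the alpha-CIoU loss without giving its form. The following is a minimal sketch of the commonly published alpha-CIoU formulation, 1 - IoU^α + (ρ²/c²)^α + (βv)^α, for a single pair of boxes. It is not taken from the paper: the function name `alpha_ciou_loss`, the `(x1, y1, x2, y2)` box layout, the default `alpha=3.0`, and the `eps` stabiliser are all assumptions made for illustration.

```python
import math


def alpha_ciou_loss(box, gt, alpha=3.0, eps=1e-9):
    """Alpha-CIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; assumes x2 > x1 and y2 > y1.
    """
    # Intersection over union.
    inter_w = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    inter_h = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = inter_w * inter_h
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    union = w * h + wg * hg - inter
    iou = inter / (union + eps)

    # Squared distance between box centres (rho^2).
    rho2 = ((box[0] + box[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
         + ((box[1] + box[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both (c^2).
    enc_w = max(box[2], gt[2]) - min(box[0], gt[0])
    enc_h = max(box[3], gt[3]) - min(box[1], gt[1])
    c2 = enc_w ** 2 + enc_h ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight beta.
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    beta = v / ((1 - iou) + v + eps)

    return 1 - iou ** alpha + (rho2 / c2) ** alpha + (beta * v) ** alpha


if __name__ == "__main__":
    print(alpha_ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # close to 0 for identical boxes
    print(alpha_ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
    print(alpha_ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes, above 1
```

Setting `alpha=1.0` reduces the expression to the ordinary CIoU loss, which is one way to compare the two on the same boxes.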

OMRNet: A lightweight deep learning model for optical mark recognition

MULTIMEDIA TOOLS AND APPLICATIONS, 2024, Vol. 83, No. 5, pp. 14011-14045

Authors: Mondal, Sayan De, Pratyay Malakar, Samir Sarkar, Ram Natl Inst Technol Durgapur Dept Elect Engn Durgapur 713209 India Asutosh Coll Dept Comp Sci Kolkata 700026 India Jadavpur Univ Dept Comp Sci & Engn Kolkata 700032 India

Existing Optical Mark Recognition (OMR) systems tend to be expensive and rigid in their operation, often resulting in erroneous evaluations due to strict correction protocols. This situation creates the need for a flexible OMR system. Hence, in this work, we propose a lightweight transfer-learning-based Convolutional Neural Network (CNN) model, dubbed OMRNet, which can classify answer boxes on any generalized OMR test sheet. Unlike most existing techniques, which rely on image processing algorithms to sort extracted answer boxes into two classes (confirmed and empty), OMRNet is designed to classify the answer boxes into confirmed, crossed-out, and empty categories. That is, OMRNet allows previously answered questions to be crossed out, thus removing the rigidity of templates in Multiple Choice Question (MCQ) tests. We have built OMRNet on top of a MobileNetV2 backbone connected to four fully connected layers with appropriate dropouts and activation functions in between. We have evaluated OMRNet on the Multiple Choice Answer Boxes dataset available at . We have performed experiments following a 5-fold cross-validation scheme, and OMRNet has achieved accuracies of 95.29%, 95.88%, 93.97%, 97.45%, and 97.20%, with an average accuracy of 95.96%. The experimental results also confirm that the present model performs better than the compared state-of-the-art methods and standard CNN models in terms of accuracy, execution time, and the memory required to store the trained module. Moreover, we have employed a quantization technique to make the trained module more memory efficient and deployed it to a web app using our own Representational State Transfer Application Programming Interface (REST API). This makes OMRNet available via a Hypertext Transfer Protocol (HTTP) endpoint, allowing potential users to connect to it via the Internet. The source code for the work is available at the following link: .

Keywords: Optical mark recognition; Multiple choice question; Transfer learning; OMRNet
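
The average accuracy quoted in the abstract above can be checked directly from the five per-fold figures. The short sketch below does only that arithmetic; the helper name `mean_accuracy` is an illustrative assumption, and the fold values are the ones stated in the abstract.

```python
def mean_accuracy(fold_accuracies):
    """Arithmetic mean of per-fold accuracies, in percent."""
    return sum(fold_accuracies) / len(fold_accuracies)


if __name__ == "__main__":
    # Per-fold accuracies as reported in the abstract (5-fold cross-validation).
    folds = [95.29, 95.88, 93.97, 97.45, 97.20]
    print(f"{mean_accuracy(folds):.2f}")  # prints 95.96
```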

Robust Low Complexity Framework for Early Diagnosis of Autism Spectrum Disorder Based on Cross Wavelet Transform and Deep Transfer Learning

SN Computer Science, 2024, Vol. 5, No. 2, p. 231

Authors: Toranjsimin, Amir Zahedirad, Saeed Moattar, Mohammad Hossein Department of Biomedical Engineering Sadjad University Mashhad Iran Department of Computer Engineering Mashhad Branch Islamic Azad University Mashhad Iran

Autism spectrum disorder (ASD) starts in early childhood. Therefore, diagnosing and classifying it at the right time would prevent long-term damage. EEG signals are non-invasive brain activity signals with excellent temporal resolution and low cost. In this article, the goal is to propose a unified framework for early, efficient, and noise-robust diagnosis of ASD using EEG signals with the help of deep transfer learning. In the proposed method, other than the unified diagnosis framework itself, the main contribution is the use of Cross Wavelet Transform (XWT) images to represent brain signals. After pre-processing and segmentation of the signals, a reference signal is separated from the normal class. Using the reference signal, XWT images are generated. The resulting images are fed as input to deep network architectures such as AlexNet, GoogleNet, VGG19, ResNet-50, and ResNet-101 in a transfer learning procedure. Transfer learning is applied to make use of information from a source image classification domain while compensating for the scarcity of ASD and normal subjects. The approach is evaluated on a dataset of 34 ASD samples and 11 normal cases in two different conditions, without voice and with voice. To validate the early diagnosis hypothesis, EEG signals from children older than 5 years are used as the training set and EEG signals from younger subjects are used as the validation set. Experiments on the proposed framework show that the ResNet-101 deep architecture achieves the best classification performance. This performance is higher than that of recently reported approaches in terms of classification accuracy, sensitivity, specificity, and F1 measure. The results show the effectiveness of the proposed approach in early diagnosis of autism spectrum disorder and also demonstrate the auditory impact on the diagnosis of autism. Also, having evaluated the approach on with-voice and without-voice datasets, the results denote the robustness o

Keywords: Autism spectrum disorder; Cross wavelet transform; Deep learning; Electroencephalography; Transfer learning

A YOLO-based Method for Object Contour Detection and Recognition in Video Sequences

2024 Workshop Cybersecurity Providing in Information and Telecommunication Systems, CPITS 2024

Authors: Nazarkevych, Mariia Kostiak, Maryna Oleksiv, Nazar Vysotska, Victoria Shvahuliak, Andrii-Taras Lviv Polytechnic National University 12 Stepan Bandera str. Lviv 79013 Ukraine Lviv Ivan Franko National University 1 Universytetska str. Lviv 79000 Ukraine

A method for recognizing the contours of objects in a video data stream is proposed. The data will be captured with a video camera, and objects will be recognized in real time. We will use YOLO, a method for identifying and recognizing objects in real time. Recognized objects will be recorded in a video sequence showing the contours of the objects. The approach proposed in the project synthesizes methods of artificial intelligence and theories of computer vision on the one hand, and pattern recognition on the other; it makes it possible to obtain control influences and mathematical functions for decision-making at every moment, with the possibility of analyzing the influence of external factors and forecasting the flow of processes, and it relates to the fundamental problems of mathematical modeling of real processes. The installation of the neural network is shown in detail, along with its characteristics and capabilities. Approaches to computer vision for object extraction are presented. Well-known methods include region-growing methods, clustering-based methods, contour selection, and histogram-based methods. The work envisages building a system for rapid identification of combat vehicles based on the latest image filtering methods developed using deep learning. The time spent on machine identification will be 10–20% shorter, thanks to the newly developed information technology for detecting objects under rapidly changing information. © 2024 Copyright for this paper by its authors.

Keywords: Artificial intelligence; image recognition; segmentation; selection of objects; tracking; YOLO

OCT Image Denoising Based on Bayesian Non-local Mean Filter and Deep Learning Network

2023 International Conference on Image, Signal Processing, and Pattern Recognition, ISPP 2023

Authors: Liu, Haotian College of Science and Engineering The University of Edinburgh Edinburgh United Kingdom

ISBN: (print) 9781510666351

Optical Coherence Tomography (OCT) uses low-coherence light to provide high spatial resolution for detecting changes in the microstructure of living organisms in a non-invasive, real-time manner. A new OCT image denoising method is proposed to address the poor noise reduction of conventional OCT image denoising algorithms. The method combines a Bayesian non-local mean filtering algorithm with deep learning to denoise noisy images more effectively. Compared with the Gaussian filtering algorithm, the median filtering algorithm, the BNLM (Bayesian non-local mean) denoising algorithm, and the BM3D (block-matching and 3D filtering) denoising algorithm, the new algorithm outperforms the traditional methods in terms of noise reduction. The comparison uses peak signal-to-noise ratio, structural similarity, and other metrics, on which the new algorithm shows superiority. © 2023 SPIE.

Keywords: Optical tomography
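
The abstract above compares denoisers by peak signal-to-noise ratio (PSNR). As a reminder of what that metric measures, here is a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE), on images stored as nested lists. The function name `psnr`, the nested-list image layout, and the 8-bit default `max_value=255.0` are assumptions for illustration and do not come from the paper.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images.

    Images are lists of rows of pixel intensities (hypothetical layout).
    """
    flat_ref = [p for row in reference for p in row]
    flat_test = [p for row in test for p in row]
    if len(flat_ref) != len(flat_test):
        raise ValueError("images must have the same number of pixels")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_test)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [[10, 20], [30, 40]]
    noisy = [[11, 19], [31, 39]]  # every pixel off by one, so MSE = 1
    print(round(psnr(clean, noisy), 2))
```

A higher PSNR means the denoised image is closer to the reference; structural similarity (SSIM), also mentioned in the abstract, is a separate metric not covered by this sketch.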

Road Condition Detection and Crowdsourced Data Collection for Accident Prevention: A Deep Learning Approach

12th International Conference on Image Processing Theory, Tools and Applications, IPTA 2023

Authors: Jahan, Md Saroar Islam, Mominul Hossain, Md Sanjid Kabir Mim, Jhuma Oussalah, Mourad Akter, Nasrin Oulu Finland Daffodil International University Dept. of CSE Dhaka Bangladesh LUT University Computer Vision and Pattern Recognition Lappeenranta Finland University of Oulu CMVS Faculty of ITEE Oulu Finland

ISBN: (print) 9798350325416

Bangladesh is one of the countries struggling to prevent road accidents, which are a global cause for concern. An early warning system that indicates road conditions can contribute to prevention. For this purpose, a deep-learning-based approach is developed that uses a Convolutional Neural Network (CNN) to learn the safety factor from random road images. This results in a three-class categorization: (i) severely risky roads, (ii) mildly risky roads, and (iii) normal roads. The application of deep learning techniques in this study yields an accuracy of 95.5% in detecting problematic road conditions. Furthermore, based on the study's findings, a mobile application has been developed. The app enables real-time crowdsourced data collection of road conditions and provides a platform for users to share this information in real time with other drivers, thereby helping to prevent accidents and raising awareness among drivers and users by pinpointing the location of risky roads. Finally, the crowdsourced data has been reused to update the trained model, which further improves the classifier accuracy. © 2023 IEEE.

Keywords: Safety factor

Mobility-Aware Graph Reinforcement Learning for Service Migration in Mobile Edge Computing

17th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2024

Authors: Liu, Shilong Sun, Haifeng School of Computer Science and Technology Southwest University of Science and Technology Mianyang 621010 China

ISBN: (print) 9798331507398

Mobile edge computing (MEC) improves data processing efficiency and reduces latency by deploying computing and storage resources at the network edge, making it suitable for real-time applications. In vehicular networks, due to the high mobility of vehicles and the limited coverage of edge servers, ensuring Quality of Service (QoS) and preventing service interruptions are critical challenges. Service migration and resource reallocation are necessary strategies to maintain QoS. One major challenge in MEC scenarios is how to deliver stable services in the face of high vehicle mobility. To address this, this paper proposes a Mobility-Aware Graph Reinforcement Learning (MA-GRL) framework, designed specifically for service migration in vehicular network systems. The MA-GRL framework consists of two components: the first is a vehicle position prediction module based on a sequence-to-sequence (seq2seq) model, and the second is a graph attention-based reinforcement learning (GRL) module for service migration decision-making and resource allocation. MA-GRL models the vehicular network as a graph and the service migration process as a Markov Decision Process (MDP), utilizing a graph attention network to handle the dynamic observation space and leveraging attention mechanisms to make decisions in a constantly changing action space. Simulation results show that MA-GRL outperforms traditional methods in reducing communication latency. Additionally, the trained model demonstrates adaptability and stability across different network topologies, highlighting its robustness in various environments. © 2024 IEEE.

Keywords: Markov processes

Adversarial Learning Based Semi-supervised Semantic Segmentation of Low Resolution Gram Stained Microscopic Images

8th International Conference on Computer Vision and Image Processing (CVIP)

Authors: Singh, Harshal Kanabur, Vidyashree R. Sumam, S. David Vijayasenan, Deepu Govindan, Sreejith Natl Inst Technol Karnataka Surathkal Karnataka India MAHE Dept Basic Med Sci Manipal Karnataka India

ISBN: (print) 9783031581731; 9783031581748

Urinary tract infections (UTIs) are infections that affect the urinary system. They are usually caused by bacteria and pus cells. Analyzing urine samples, including examining pus cells, is a standard method for diagnosing and monitoring UTIs. However, manually detecting bacteria or pus cells in microscopic urine images is a time-consuming and labour-intensive task for microbiologists. Therefore, segmenting microscopic pus cell images will ease the process of detecting UTIs. Low-resolution microscopic images are especially hard to annotate; therefore, in this study, we propose an adversarial-learning-based semi-supervised method for segmenting pus cell images at low resolution (i.e., 40x) using labeled high-resolution images (i.e., 100x). The proposed methodology aims to ease the process of UTI detection by automating the segmentation of pus cell images. The results demonstrate an increase in the Dice coefficient score of 1%, 1.6%, and 2.4% on 40x images, compared with a fully supervised segmentation model trained on only 100x data, using three different architectures: Unet, ResUnet++, and PSPnet, respectively.

Keywords: Pus cell image segmentation; Semi-supervised learning; Generative Adversarial Network; Fully convolutional networks (FCN); Deep learning
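
The segmentation results in the abstract above are reported as Dice coefficient scores. For reference, the sketch below computes the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name `dice_coefficient`, the nested-list mask layout, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the paper.

```python
def dice_coefficient(pred, target):
    """Dice coefficient between two binary masks given as lists of rows.

    Non-zero entries count as foreground (hypothetical mask layout).
    """
    flat_pred = [bool(p) for row in pred for p in row]
    flat_target = [bool(t) for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    total = sum(flat_pred) + sum(flat_target)
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2 * intersection / total


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    print(dice_coefficient(predicted, ground_truth))  # 2 * 1 / (2 + 1)
```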

Classification of Insect Pest Using Transfer Learning Mechanism

8th International Conference on Computer Vision and Image Processing (CVIP)

Authors: Malik, Parveen Parida, Manoj Kumar KIIT Deemed Be Univ Bhubeneswar 751024 Odisha India

ISBN: (digital) 9783031585357

ISBN: (print) 9783031585340; 9783031585357

Classification of insects is a vital research task for several reasons, including the protection of crops in the agricultural sector. Identifying crop pests is a difficult problem, since pest infestations cause significant crop damage and quality degradation. Most insect species closely resemble one another, which makes detecting insects on field crops such as rice and soybeans more challenging than ordinary object detection. Currently, manual classification is the main method used to distinguish insects in crop fields, but it is time-consuming and expensive. Considering the advances in deep learning, we propose to use a network pre-trained on millions of images from the ImageNet dataset to perform the classification task via transfer learning. Extensive experiments were conducted using various pretrained models, including VGG, Inception, Xception, ResNet, MobileNet, DenseNet, and EfficientNet. Several insect datasets were used for the classification task, and the models were fine-tuned using transfer learning. The EfficientNet B7 model achieved the highest accuracy: 70%, 98%, and 99% on the IP102 (102 classes), Xie (40 classes), and Kaggle village Synthetic (10 classes) datasets, respectively.

Keywords: IP102; Transfer learning; Pretrained Neural Networks; Xception; Inception; ResNet50; VGG16; VGG19; DenseNet121; EfficientNetB7; CNN
