This study presents a human-computer interaction system that combines a brain-machine interface (BMI) with obstacle detection for remote control of a wheeled robot through movement imagery, providing a potential solution for individuals facing challenges with conventional vehicle operation. The primary focus of this work is the classification of surface EEG signals related to mental activity when envisioning movement and deep relaxation states. Additionally, this work presents an obstacle detection system based on image processing; the implemented system constitutes a complementary part of the interface. The main contributions of this work include the proposal of a modified 10-20 electrode setup suitable for motor imagery classification, the design of two convolutional neural network (CNN) models employed to classify signals acquired from sixteen EEG channels, and the implementation of an obstacle detection system based on computer vision integrated with the brain-machine interface. The models developed in this study achieved an accuracy of 83% in classifying EEG signals, and the resulting classification outcomes were subsequently used to control the movement of a mobile robot. Experimental trials conducted on a designated test track demonstrated real-time control of the robot. The findings indicate the feasibility of integrating the obstacle detection system for collision avoidance with motor imagery classification for brain-machine interface control of vehicles. The proposed solution could help paralyzed patients safely control a wheelchair through EEG and effectively prevent unintended vehicle movements.
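The exact CNN architectures are not described in the abstract; the following is a minimal sketch of how a compact 1D CNN for two-class motor-imagery classification over sixteen EEG channels might look, with the window length, layer sizes, and kernel sizes as illustrative assumptions.

```python
# Minimal sketch of a CNN for two-class motor-imagery EEG classification.
# Channel count (16), window length (512 samples) and layer sizes are assumptions,
# not the architecture used in the paper.
import torch
import torch.nn as nn

class MotorImageryCNN(nn.Module):
    def __init__(self, n_channels: int = 16, n_samples: int = 512, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            # temporal convolutions over the multichannel EEG window
            nn.Conv1d(n_channels, 32, kernel_size=25, padding=12),
            nn.BatchNorm1d(32),
            nn.ELU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=11, padding=5),
            nn.BatchNorm1d(64),
            nn.ELU(),
            nn.MaxPool1d(4),
        )
        self.classifier = nn.Linear(64 * (n_samples // 16), n_classes)

    def forward(self, x):  # x: (batch, channels, samples)
        z = self.features(x)
        return self.classifier(z.flatten(1))

# Example: one batch of 8 windows, 16 channels, 512 samples each
logits = MotorImageryCNN()(torch.randn(8, 16, 512))
print(logits.shape)  # (8, 2) -> imagery vs. relaxation scores
```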
With the increasing volume of collected Earth observation (EO) data, artificial intelligence (AI) methods have become state-of-the-art in processing and analyzing them. However, there is still a lack of high-quality, large-scale EO datasets for training robust networks. This paper presents AgriSen-COG, a large-scale benchmark dataset for crop type mapping based on Sentinel-2 data. AgriSen-COG addresses several challenges of remote sensing (RS) datasets. First, it includes data from five different European countries (Austria, Belgium, Spain, Denmark, and the Netherlands), targeting the problem of domain adaptation. Second, it is multitemporal and multiyear (2019-2020), thereby enabling analysis of crop growth over time and of yearly variability. Third, AgriSen-COG includes an anomaly detection preprocessing step, which reduces the amount of mislabeled information. AgriSen-COG comprises 6,972,485 parcels, making it the most extensive available dataset for crop type mapping. It includes two types of data, pixel-level data and parcel-aggregated information, targeting two computer vision (CV) problems: semantic segmentation and classification. To establish the validity of the proposed dataset, we conducted several experiments using state-of-the-art deep-learning models for temporal semantic segmentation with pixel-level data (U-Net and ConvStar networks) and time-series classification with parcel-aggregated information (LSTM, Transformer, and TempCNN networks). The most popular models (U-Net and LSTM) achieve the best performance in the Belgium region, with weighted F1 scores of 0.956 (U-Net) and 0.918 (LSTM). The proposed data are distributed as cloud-optimized GeoTIFFs (COGs), together with a SpatioTemporal Asset Catalog (STAC), which makes AgriSen-COG a findable, accessible, interoperable, and reusable (FAIR) dataset.
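As a rough illustration of the parcel-aggregated classification baseline mentioned above, the sketch below shows a minimal LSTM time-series classifier; the number of Sentinel-2 bands, sequence length, hidden size, and class count are assumptions, not the dataset's actual settings.

```python
# Minimal sketch of LSTM-based crop-type classification from parcel-aggregated
# Sentinel-2 time series. Band count, sequence length and hidden size are
# illustrative assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class CropLSTM(nn.Module):
    def __init__(self, n_bands: int = 10, hidden: int = 128, n_classes: int = 20):
        super().__init__()
        self.lstm = nn.LSTM(n_bands, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):  # x: (batch, timesteps, bands) parcel means per acquisition
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])      # classify from the last hidden state

# Example: 32 parcels, 30 Sentinel-2 acquisitions, 10 spectral bands each
logits = CropLSTM()(torch.randn(32, 30, 10))
print(logits.shape)  # (32, 20) crop-type scores per parcel
```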
ISBN (Print): 9781665440868
In security-sensitive areas, tracking individuals is important and can also be used to understand crowd activity, such as how people move within a crowd or which shops they visit most often. Crowd activity trends can reveal the most frequented locations in public places, which helps in placing advertisements to reach a wider audience. Public safety is a growing concern, and tracking suspicious individuals within a crowd is a difficult task. This paper provides an approach that maps each person to a geographical coordinate denoting their real-time position in the real world, derived from a camera feed. These data make it possible to track on a map the road a person follows, their direction of motion, and where they turn. Such insights provide in-depth descriptions of crowd movement that can support better surveillance and the evaluation of crowd activity.
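One plausible way to realize the described pixel-to-map association is a planar homography estimated from a few known ground landmarks; the sketch below uses OpenCV for that purpose, with made-up calibration points rather than the paper's actual data.

```python
# Minimal sketch of mapping image pixels to geographic coordinates with a planar
# homography. The four pixel/GPS correspondences below are made-up calibration points.
import cv2
import numpy as np

# Pixel locations of four ground landmarks in the camera image ...
pixels = np.array([[100, 400], [1180, 420], [1100, 700], [150, 690]], dtype=np.float32)
# ... and their known (longitude, latitude) positions on the map
geo = np.array([[77.5910, 12.9720], [77.5918, 12.9721],
                [77.5917, 12.9714], [77.5911, 12.9713]], dtype=np.float32)

H, _ = cv2.findHomography(pixels, geo)

def pixel_to_geo(u: float, v: float) -> tuple:
    """Project one detected person's foot point (u, v) onto map coordinates."""
    pt = cv2.perspectiveTransform(np.array([[[u, v]]], dtype=np.float32), H)
    return float(pt[0, 0, 0]), float(pt[0, 0, 1])

print(pixel_to_geo(640.0, 550.0))  # approximate (lon, lat) of a tracked person
```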
ISBN (Print): 9781728176055
The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH 2020, where we open-sourced training and test datasets for researchers to train their noise suppression models. We also open-sourced a subjective evaluation framework and used the tool to evaluate and select the final winners. Many researchers from academia and industry made significant contributions to push the field forward, but we also learned that, as a research community, we still have a long way to go in achieving excellent speech quality in challenging noisy real-time conditions. In this challenge, we expanded both our training and test datasets. Clean speech in the training set has increased by 200% with the addition of singing voice, emotion data, and non-English languages. The test set has increased by 100% with the addition of singing, emotional, non-English (tonal and non-tonal) languages, and personalized DNS test clips. There are two tracks, focusing on (i) real-time denoising and (ii) real-time personalized DNS. We present the challenge results at the end.
Authors: Matsuo, Akira; Yamakawa, Yuji
Affiliations: Univ Tokyo, Grad Sch Interdisciplinary Informat Studies, Tokyo 153-8505, Japan; Univ Tokyo, Interfac Initiat Informat Studies, Tokyo 153-8505, Japan; Univ Tokyo, Inst Ind Sci, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan
Object detection and tracking in camera images is a fundamental technology for computer vision and is used in various applications. In particular, object tracking using high-speed cameras is expected to be applied to real-time control in robotics, which requires increasing both tracking speed and detection accuracy. Currently, however, it is difficult to achieve both simultaneously. In this paper, we propose a tracking method that combines multiple methods: correlation-filter-based object tracking, deep-learning-based object detection, and motion detection with background subtraction. The algorithms work in parallel and assist each other's processing to improve the overall performance of the system. We named it the "Mutual Assist tracker of feature Filters and Detectors" (MAFiD) method. This method aims to achieve both high-speed tracking of moving objects and high detection accuracy. Experiments were conducted to verify the detection performance and processing speed by tracking a transparent capsule moving at high speed. The results show that the tracking speed was 618 frames per second (FPS), the accuracy was 86% in terms of Intersection over Union (IoU), and the detection latency was 3.48 ms. These scores are higher than those of conventional methods, indicating that the MAFiD method achieves fast object tracking while maintaining high detection performance. This proposal will contribute to the improvement of object-tracking technology.
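A rough sketch of the mutual-assist idea follows: a fast correlation-filter tracker runs every frame, and background-subtraction motion detection re-seeds it whenever tracking is lost. This is not the authors' implementation (which also runs a deep detector in parallel); the tracker choice and input video name are assumptions.

```python
# Sketch: correlation-filter tracking (KCF, from opencv-contrib) assisted by
# background-subtraction motion detection. Illustrative only, not the MAFiD code.
import cv2

cap = cv2.VideoCapture("capsule.mp4")            # assumed input video
backsub = cv2.createBackgroundSubtractorMOG2()
tracker, ok = None, False

while True:
    ret, frame = cap.read()
    if not ret:
        break
    if ok:
        ok, box = tracker.update(frame)          # fast per-frame correlation-filter update
    if not ok:
        # Assist path: largest moving blob from background subtraction re-seeds the tracker.
        mask = backsub.apply(frame)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            box = cv2.boundingRect(max(contours, key=cv2.contourArea))
            tracker = cv2.TrackerKCF_create()
            tracker.init(frame, box)
            ok = True
    if ok:
        x, y, w, h = (int(v) for v in box)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cap.release()
```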
Visual perception is a key technology in the Intelligent Visual Internet of Things, and research on object detection methods is of great significance for improving the safety and efficiency of unmanned driving and the intelligent visual Internet of Things. Deep-learning-based 3D point cloud object detection can use deep networks to automatically learn multi-layer abstract feature representations, improving the computational efficiency and detection accuracy of the model, and it also handles object occlusion, missing data, and data sparsity better by exploiting high-dimensional point cloud information. However, reviews of deep-learning-based object detection methods for 3D point clouds remain scarce. To provide a more comprehensive understanding of the development of safety and efficiency in driverless technology, this paper categorizes methods by the main data used by the network model (monocular camera, RGB-D image, and LiDAR point cloud) and further subdivides them according to how the model is applied, analyzing the detection performance of the various methods. The article also summarizes commonly used 3D point cloud object detection datasets, organizes and describes commonly used 3D point cloud detection metrics, and discusses research challenges and development trends. The real-time performance of 3D point cloud object detection in the intelligent visual Internet of Things still needs to be improved.
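As a small illustration of the kind of detection metric such surveys describe, the sketch below computes an axis-aligned 3D IoU between a predicted and a ground-truth box; real benchmarks typically use oriented boxes, so this simplified axis-aligned form is an assumption.

```python
# Axis-aligned 3D IoU between two bounding boxes, a simplified stand-in for the
# oriented-box metrics used by real 3D detection benchmarks.
import numpy as np

def iou_3d_axis_aligned(a: np.ndarray, b: np.ndarray) -> float:
    """Boxes given as (xmin, ymin, zmin, xmax, ymax, zmax)."""
    lo = np.maximum(a[:3], b[:3])
    hi = np.minimum(a[3:], b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    return float(inter / (vol_a + vol_b - inter))

pred = np.array([0.0, 0.0, 0.0, 2.0, 1.0, 1.5])
gt   = np.array([0.5, 0.0, 0.0, 2.5, 1.0, 1.5])
print(iou_3d_axis_aligned(pred, gt))  # 0.6 overlap between the two boxes
```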
ISBN (Print): 9781665401791
UAVs are being intensively developed by various groups to help solve various types of problems. Object detection is important for UAVs in drone chasing and other competitions that require a visual approach based on image processing and deep learning. Unfortunately, the computational capability of the onboard processing unit attached to the UAV is less than optimal for object detection because of storage and memory size constraints. This paper aims to create a new approach that improves precision and recall during UAV detection by using a web application to perform real-time detection. To decide on a pre-trained model, it is necessary to compare which SSD pre-trained model is suitable to be deployed in this web application. The results show that the web application approach outperforms the onboard processing approach, with a high level of precision and recall: an average precision of 0.85 and an average recall of 0.837.
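For reference, precision and recall figures of this kind follow directly from per-detection counts; the sketch below shows the computation with illustrative counts, not the paper's actual data.

```python
# Precision/recall from detection outcome counts; the numbers are illustrative only.
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple:
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall

# e.g. 85 correct drone detections, 15 false alarms, 17 missed drones
p, r = precision_recall(85, 15, 17)
print(f"precision={p:.3f} recall={r:.3f}")
```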
Carrier signal detection, the first step in blind signal processing, has been a long-standing problem. In this paper, we propose a new method for carrier signal detection in the broadband power spectrum based on the fully convolutional network (FCN). The FCN is a deep-learning method used in semantic image segmentation tasks. By regarding the broadband power spectrum sequence as a one-dimensional (1D) image and each subcarrier on the broadband as a target object, we can transform the carrier signal detection problem on the broadband into a semantic segmentation problem on a 1D image without prior knowledge. We design a 1D deep convolutional neural network (CNN) based on the FCN to classify each point of the broadband power spectrum array into two classes, subcarrier or noise, from which we can easily locate the positions of the subcarrier signals on the broadband power spectrum. We train the deep CNN on a simulation dataset and validate it on a real satellite broadband power spectrum dataset. The experimental results show that our method can effectively detect the subcarrier signal in the broadband power spectrum and achieves higher accuracy than the slope tracing method.
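The paper's exact network is not given in the abstract; the sketch below shows a minimal 1D fully convolutional network that assigns a subcarrier-or-noise label to every bin of a power spectrum, with layer widths and kernel sizes as illustrative assumptions.

```python
# Minimal 1D FCN that labels every bin of a broadband power spectrum as subcarrier
# or noise. Layer widths and kernel sizes are assumptions, not the paper's design.
import torch
import torch.nn as nn

class SpectrumFCN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(64, 2, kernel_size=1),   # per-bin logits: noise vs. subcarrier
        )

    def forward(self, x):  # x: (batch, 1, n_bins) power spectrum
        return self.net(x)

spectrum = torch.randn(4, 1, 4096)               # four simulated broadband spectra
labels = SpectrumFCN()(spectrum).argmax(dim=1)   # (4, 4096) per-bin class map
print(labels.shape)
```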
ISBN (Print): 9781728158754
To prevent the spread of the coronavirus, everyone must wear a mask during the pandemic. In these tough times of COVID-19 it is necessary to build a model that detects people with and without masks in real time, as it serves as a simple precautionary measure against the spread of the virus. If deployed correctly, this machine learning technique helps simplify the work of frontline workers and save lives. A basic convolutional neural network (CNN) model is built using TensorFlow, Keras, Scikit-learn, and OpenCV to make the algorithm as accurate as possible. A JavaScript API provides access to the webcam for real-time face mask detection, since Google Colab runs in a web browser and cannot access local hardware such as a camera without APIs. The proposed work contains three stages: (i) pre-processing, (ii) training a CNN, and (iii) real-time classification. The pre-processing stage consists of grayscale conversion of the RGB image followed by image resizing and normalization to avoid false predictions. The proposed CNN then classifies faces with and without masks; its output layer contains two neurons with softmax activation, and categorical cross-entropy is employed as the loss function. The proposed model has a validation accuracy of 96%. If anyone in the video stream is not wearing a protective mask, a red rectangle labeled NO MASK is drawn around the face, and a green rectangle is drawn around the face of a person wearing a MASK.
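A small Keras model consistent with the described pipeline (grayscale input, two-neuron softmax output, categorical cross-entropy) might look like the sketch below; the input resolution and layer widths are assumptions, since the abstract does not state them.

```python
# Sketch of a small Keras CNN for mask / no-mask classification. Input size (100x100
# grayscale) and layer widths are assumptions; only the two-neuron softmax output and
# categorical cross-entropy loss follow the description above.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(100, 100, 1)),            # grayscale, resized, normalized faces
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),        # mask vs. no-mask
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```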