ISBN: 9798350366556 (digital)
ISBN: 9798350366563 (print)
Pedestrian re-identification algorithms are crucial for personnel localization in underground coal mines. Because personnel in this environment wear highly similar attire, general-purpose pedestrian re-identification algorithms are unsuitable for identifying them. This paper introduces an attention-guided feature fusion network to address the poor identification accuracy caused by this high similarity. First, a ResNet network extracts detailed information about the target personnel. Next, an attention-induced cross-level fusion module establishes a new feature fusion branch, enhancing cross-level learning and the representation of inter-class comparable subjects. Finally, contrastive global features are used to generate a powerful feature representation, reducing the difficulty of distinguishing between similar personnel. Experiments on the in-house MineData dataset and the public Market-1501 dataset show that the proposed method outperforms current advanced methods, achieving mAP scores of 88.32 and 65.63, respectively.
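The mAP figures quoted above are mean average precision over ranked retrieval lists. The following is a minimal sketch of that metric only, assuming each query yields a ranked gallery list already marked as match or non-match; the function names are hypothetical, and any dataset-specific filtering used in the paper's evaluation protocol is not modeled.

```python
def average_precision(ranked_matches):
    """Average precision for one query.

    ranked_matches: list of booleans, best-ranked gallery item first,
    True where the gallery item shares the query identity.
    """
    hits = 0
    total = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            total += hits / rank  # precision at this recall point
    return total / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of per-query average precision over a list of ranked lists."""
    return sum(average_precision(q) for q in queries) / len(queries)
```

A ranking that places both correct matches first scores 1.0, while pushing a match further down the list lowers the score, which is why mAP rewards separating look-alike identities.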
ISBN: 9781665483483 (digital)
ISBN: 9781665483483 (print)
The development of Spiking Neural Networks (SNN) and the discipline of Neuromorphic Engineering has resulted in a paradigm shift in how Machine Learning (ML) and Computer Vision (CV) problems are approached. At the heart of this shift is the adoption of event-based sensing and processing methods. An event-based vision sensor produces sparse, asynchronous events that are dynamically connected to the scene, allowing the acquisition of not just spatial data but also high-fidelity temporal data. In this work, we describe a novel method for performing instance segmentation of objects using only their spatio-temporal movement patterns. The method takes the weights of an unsupervised Spiking Convolutional Neural Network that was originally trained for object recognition and extends it to instance segmentation. This takes advantage of the spatial and temporal characteristics encoded within the network's internal feature representation to offer this additional discriminative ability. We demonstrate this through a track path identification problem, where 6 identical blobs complete complex movement patterns within the same area at the same time. The network successfully identifies all 6 individual movements and segments the movement patterns belonging to each. The work also explains how these methods map onto the more complex Track before Detect problem, a track initiation problem in which detection can only be completed after an integration period because of the low-signal, high-noise environment. These problem characteristics seem to complement the properties of event-based sensing and processing, and initial test results are shown.
ISBN: 9789464593617 (digital)
ISBN: 9798331519773 (print)
In order to remedy the underestimation of an advanced convex penalty defined via minimization, this paper proposes its nonconvex enhancement while preserving convexity of the associated regularized least-squares model. We first design a generalized Moreau enhanced minimization induced (GME-MI) penalty function by subtracting from the MI penalty its generalized Moreau envelope. Then, we derive an overall convexity condition for the GME-MI regularized least-squares model. Finally, under the overall convexity condition, characterizing the solution set of the GME-MI model with a carefully designed averaged nonexpansive operator, we develop a proximal splitting algorithm which is guaranteed to converge to a globally optimal solution. Numerical examples demonstrate the effectiveness of the proposed approach.
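The construction "subtracting from the penalty its generalized Moreau envelope" can be written out as follows. This is a sketch under assumed notation, not the paper's exact formulation: $\Psi$ stands for the MI penalty, $B$ for a tuning matrix, $A$ and $y$ for the observation matrix and data, and $\mu > 0$ for the regularization weight. The paper may additionally compose the penalty with linear operators.

```latex
% Generalized Moreau enhancement of a convex penalty \Psi (assumed notation)
\Psi_B(x) \;:=\; \Psi(x) \;-\; \min_{v}\Bigl[\,\Psi(v) + \tfrac{1}{2}\,\lVert B(x - v)\rVert_2^2\,\Bigr]

% Regularized least-squares model using the enhanced penalty
\operatorname*{minimize}_{x}\;\; \tfrac{1}{2}\,\lVert y - A x\rVert_2^2 \;+\; \mu\,\Psi_B(x)
```

With $B = O$ the subtracted term is the constant $\min_v \Psi(v)$, so $\Psi_B$ coincides with $\Psi$ up to a constant. A nonzero $B$ makes $\Psi_B$ nonconvex in general, which is why a separate condition is needed to keep the overall objective convex.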
ISBN: 9798350368741 (digital)
ISBN: 9798350368758 (print)
Graph Neural Networks (GNNs) have drawn great research attention for graph machine learning. However, graph learning techniques are extremely difficult to deploy in industrial applications owing to the scalability challenges incurred by data dependency. Although some works attempt to connect efficient MLPs with GNNs through knowledge distillation (KD), they are primarily designed for graphs in Euclidean spaces, which cannot provide the most powerful geometry for graph representation, as numerous real-world graphs display a combination of Euclidean and hyperbolic geometry. To achieve comprehensive expression for complex graph data with high efficiency, we propose a novel advanced Graph-MLPs Distillation Framework (AGMDF) based on global and local hyperbolic geometry learning. The key point of our method is to fully exploit the complex graph with additional hyperbolic properties based on knowledge distillation. Specifically, the global cross-geometric knowledge fusion exploits the complementary information learned from the Euclidean view and the hyperbolic domain. Then, the local hyperbolic knowledge enhancement employs prominent tree-like components in the graph data to improve the graph representation ability. Thus, the distilled MLP model enjoys the high expressive ability of graph context-awareness based on global and local hyperbolic geometry learning. Extensive experiments show that AGMDF achieves accuracy competitive with GNNs and improves over stand-alone MLPs by 21.84% on average, while inferring faster than GNNs across five benchmark datasets.
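For readers unfamiliar with knowledge distillation, the sketch below shows the generic temperature-softened distillation loss between a teacher (for example, a GNN) and a student (for example, an MLP) for a single node. It is an illustration of plain KD only; the cross-geometric fusion and hyperbolic enhancement of AGMDF are not modeled, and the function names and default temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened class distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl
```

The loss is zero when the student reproduces the teacher's logits and grows as the two class distributions diverge; in practice it is usually combined with a supervised loss on labeled nodes.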
ISBN: 9798350368741 (digital)
ISBN: 9798350368758 (print)
Precise mapping of crop types and estimation of yields are important in gauging agricultural diversity and yield potential, especially in regions dominated by small-scale farming. Nevertheless, these tasks are challenging due to factors such as small field sizes, inter-cropping, and a lack of sufficient ground truth labels for certain regions. In this paper, we propose an approach that combines advanced deep learning algorithms with Sentinel-2 and MODIS satellite data to improve the accuracy of crop type mapping and yield prediction. We used datasets from the main growing season of 2017 in Kenya (Bungoma, Busia and Siaya), coupled with county-level yield data from the US, Argentina and Brazil spanning 2005 to 2016. Our models (CNN, SegNet, MaskRCNN, ResNet, UNet) were evaluated on both tasks: crop type classification and yield prediction.
ISBN: 9798350368741 (digital)
ISBN: 9798350368758 (print)
In recent years, gesture recognition based on data gloves has attracted increasing attention as a human-computer interaction (HCI) method that is natural, convenient, stable, robust, easy to recognize, and applicable to various usage environments. This research first proposes an advanced smart data glove that integrates cutting-edge flexible capacitive sensors on the fingertips and a 6-axis IMU on the back of the hand to recognize gestures. Second, this study proposes a personalized continuous gesture segmentation (PCGS) model that adaptively calculates the most appropriate gesture segmentation threshold for the current user and introduces multi-sliding window theory and kinematic knowledge to perform personalized gesture segmentation. The results show that the PCGS model achieves an average segmentation accuracy of 94.3% and outperforms state-of-the-art approaches by 11.2% to 18.5%.
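The idea of a per-user segmentation threshold applied over sliding windows can be illustrated with a deliberately simple sketch. This is not the PCGS model: it uses a single window size, a threshold of resting mean plus `k` standard deviations of window energy, and a one-dimensional signal, all of which are assumptions made for illustration, as are the function names.

```python
from statistics import mean, pstdev


def window_energy(signal, window):
    """Mean absolute value over each sliding window (stride 1)."""
    return [mean([abs(x) for x in signal[i:i + window]])
            for i in range(len(signal) - window + 1)]


def personal_threshold(rest_signal, window, k=3.0):
    """Per-user threshold from a resting recording: mean + k * std of window energy."""
    energy = window_energy(rest_signal, window)
    return mean(energy) + k * pstdev(energy)


def segment(signal, window, threshold):
    """Return (start, end) window-index pairs where energy stays above the threshold."""
    energy = window_energy(signal, window)
    segments = []
    start = None
    for i, e in enumerate(energy):
        if e > threshold and start is None:
            start = i                      # gesture begins
        elif e <= threshold and start is not None:
            segments.append((start, i))    # gesture ends (end index exclusive)
            start = None
    if start is not None:
        segments.append((start, len(energy)))
    return segments
```

Calibrating the threshold on each user's own resting data is what makes the segmentation personalized; a user with a naturally noisier resting signal gets a higher threshold.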
ISBN: 9798350368741 (digital)
ISBN: 9798350368758 (print)
Video streaming services typically employ traditional codecs, such as H.264, to encode videos into multiple bitrate representations. These codecs are tightly limited by discrete quantization parameters (QPs), resulting in encoded rates that do not align with the target bitrate. Additionally, the subpar video quality produced by conventional codecs does not meet the demands of high-resolution communication. Considering these limitations, we take a new approach to video streaming by leveraging advanced deep learning-based video codecs. Specifically, we develop a neural adaptive contextual video streaming framework that incorporates: 1) an ensemble deep reinforcement learning based adaptive bitrate algorithm named TSAC that enables continuous bitrate adjustment to varying network conditions, and 2) a two-stage proportional-integral-derivative-based rate control module that dynamically fine-tunes QPs to ensure the encoded bitrate aligns with the target bitrate. Furthermore, we implement intra-GoP and inter-GoP techniques to accelerate the inference process of the contextual video codec for real-time processing needs. Our experiments demonstrate that the average relative error in bitrate remains below 2% and that the quality of experience provided by our TSAC agents surpasses that of existing discrete algorithms by 13%-20%. Our optimization techniques enable real-time decoding at approximately 24 frames per second for quad high definition videos.
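How a PID loop can steer a QP toward a target bitrate is sketched below. This is a single-stage controller, not the two-stage module described above. The encoder is replaced by a hypothetical toy rate model in which bitrate halves for every +6 in QP, QP is treated as continuous for simplicity, and the gains are illustrative assumptions.

```python
import math


class PID:
    """Discrete PID controller acting on an error signal."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error):
        self.integral += error
        derivative = 0.0 if self.prev_error is None else error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def toy_encoder_kbps(qp):
    """Hypothetical rate model (an assumption): 8000 kbps at QP 20, halving per +6 QP."""
    return 8000.0 * 2.0 ** (-(qp - 20.0) / 6.0)


def control_qp(target_kbps, qp=30.0, steps=50):
    """Iteratively adjust QP so the toy encoder's bitrate approaches the target."""
    pid = PID(kp=3.0, ki=0.5, kd=0.5)
    for _ in range(steps):
        # Log-domain error: positive when the encoder overshoots the target rate,
        # in which case QP must increase to reduce the bitrate.
        error = math.log2(toy_encoder_kbps(qp) / target_kbps)
        qp = min(51.0, max(0.0, qp + pid.step(error)))
    return qp
```

Working in the log domain makes the toy plant linear in QP, so fixed gains behave the same at any target rate; a real codec would need the error measured on encoded frames rather than computed from a closed-form model.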
ISBN: 9781665427357 (print)
Visual emotion recognition is a very large field. It plays an important role in domains such as security, robotics, and medical tasks. The visual input can be either image or video. Unlike image processing, video processing remains challenging because the information changes over time. Applying deep learning algorithms to video processing has brought significant performance improvements. This paper presents a deep neural network based on the ResNet50 model. The model is evaluated on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), chosen because it contains two kinds of data, speech and song. ResNet was chosen for its ability to address problems such as vanishing gradients, the performance stability it offers, the feature extraction ability of the CNN that forms its base architecture, and its ability to improve accuracy and minimize loss. The achieved results are 57.73% for song and 55.52% for speech. The results show that the ResNet50 model is suitable for both speech and song while maintaining performance stability.
ISBN: 9798331522353 (digital)
ISBN: 9798331522384 (print)
Direct-to-Satellite IoT (DTS-IOT) represents a promising solution for data transmission in remote regions where terrestrial infrastructure deployment is unfeasible. In DTS-IOT scenarios, Low-Earth Orbit (LEO) satellites function as in-orbit gateways. Addressing the need for practical simulation tools, we present FLORASAT 2, an open-source, event-driven, end-to-end simulation tool leveraging OMNET++. The original FLORASAT met many DTS-IOT Medium Access Control (MAC) requirements. This enhanced version adds advanced Inter-Satellite Link (ISL) communication modules, including a helper for constellation creation, dynamic ISL topology control, routing, and analytics, facilitating the thorough evaluation of constellation-grade DTS-IOT networks. These new features allow detailed simulation scenario configuration, flexible support for developing and including diverse routing algorithms, and the tooling to perform automated data analysis from parametric simulations. Overall, the simulator enables the analysis of complex behaviors in DTS-IOT environments, optimizing performance and enhancing connectivity and efficiency in large-scale satellite IoT constellation networks.
ISBN: 9798350368673 (digital)
ISBN: 9798350368680 (print)
Ensuring individuals’ well-being through health monitoring is vital for safety and overall health. Monitoring vital signs, such as heart rate, is essential for understanding physiological conditions in diverse environments. Traditional methods often involve cumbersome and invasive equipment, making them impractical in confined or remote settings. As a result, there is growing interest in non-invasive, remote monitoring technologies that can track vital signs in real-time without compromising comfort or activities. This paper introduces a system utilizing camera-based photoplethysmography (PPG) technology for remote monitoring of vital signs. The system extracts and analyzes PPG signals from video footage captured by cameras, relying on advanced signal processing algorithms to accurately extract these signals despite challenges like motion artifacts and low signal-to-noise ratios. It employs techniques such as motion artifact removal, adaptive filtering, and real-time peak detection algorithms to improve PPG signal quality and reliability. The study demonstrates the system’s capabilities in non-contact physiological monitoring and underscores its potential benefits in various settings. This research highlights the system’s ability to extract critical health information using cameras, emphasizing its significance in maintaining individuals’ health and well-being.
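The peak-detection step mentioned above can be illustrated with a minimal sketch that estimates heart rate from an already extracted PPG trace. It is not the paper's pipeline: a plain moving average stands in for adaptive filtering, motion artifact removal is omitted, and the function names, window width, and 0.3-second refractory gap are assumptions.

```python
def moving_average(x, width):
    """Centered moving average; edge samples use whatever neighbors exist."""
    half = width // 2
    out = []
    for i in range(len(x)):
        seg = x[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out


def detect_peaks(x, min_distance):
    """Indices of local maxima above the signal mean, at least min_distance apart."""
    level = sum(x) / len(x)
    peaks = []
    for i in range(1, len(x) - 1):
        if x[i] > level and x[i] > x[i - 1] and x[i] >= x[i + 1]:
            if not peaks or i - peaks[-1] >= min_distance:
                peaks.append(i)
    return peaks


def heart_rate_bpm(signal, fs):
    """Estimate heart rate (beats per minute) from a PPG trace sampled at fs Hz."""
    smooth = moving_average(signal, width=5)
    peaks = detect_peaks(smooth, min_distance=int(0.3 * fs))
    if len(peaks) < 2:
        return None  # not enough beats to measure an interval
    intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return 60.0 / (sum(intervals) / len(intervals))
```

Averaging the inter-beat intervals rather than counting peaks keeps the estimate stable when the recording starts or ends mid-beat.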