ISBN:
(Print) 9789819786848; 9789819786855
In image fusion, combining infrared and visible images from different sensors is crucial to create a complete representation that merges complementary information. However, current deep learning approaches, mainly using Convolutional Neural Networks (CNNs) or Transformer architectures, do not fully capitalize on the distinct features of infrared and visible images. To overcome this limitation, we introduce a novel Dual-Branch feature extraction network for infrared and visible image fusion (DBIF). DBIF optimally leverages the advantages of CNN and Transformer for feature extraction from different types of images. Specifically, the Transformer's proficiency in extracting global features renders it more suitable for extracting target information from infrared images, while the CNN's superior sensitivity to capturing local information makes it more adept at extracting background texture information from visible images. Consequently, our DBIF architecture incorporates two distinct branches, content and detail, for feature extraction from infrared and visible images, respectively. Additionally, we introduce a Detailed Feature Enhancement Module (DFEM) to consolidate and amplify the prominent features extracted by the detailed branch. Through extensive experimentation across multiple datasets, we validate the effectiveness of our proposed approach, showcasing its superiority over existing fusion algorithms. Furthermore, our method shows substantial performance improvements, especially in object detection tasks. This underscores its practical relevance in various real-world applications that require accurate and efficient fusion of diverse image data types.
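The dual-branch idea behind DBIF can be illustrated with a minimal, hypothetical sketch (plain NumPy, not the authors' implementation): a small convolution stands in for the CNN detail branch that extracts local texture from the visible image, a global-context operator stands in for the Transformer content branch on the infrared image, and the two branch outputs are averaged as a crude fusion step. All function names and the fusion rule here are illustrative assumptions.

```python
import numpy as np

def detail_branch(visible, kernel=None):
    """CNN-like local feature extraction (hypothetical sketch):
    a single 3x3 convolution capturing background texture."""
    if kernel is None:
        kernel = np.ones((3, 3)) / 9.0  # simple smoothing kernel
    h, w = visible.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(visible[i:i + 3, j:j + 3] * kernel)
    return out

def content_branch(infrared):
    """Transformer-like global feature extraction (hypothetical sketch):
    every location is modulated by a global context statistic, a crude
    stand-in for global self-attention."""
    global_ctx = infrared.mean()
    return infrared[1:-1, 1:-1] + global_ctx  # crop to match the conv output

def fuse(infrared, visible):
    """Average the two branch outputs as a placeholder fusion rule."""
    return 0.5 * content_branch(infrared) + 0.5 * detail_branch(visible)
```

In the actual DBIF network the branches are learned and the fused features pass through the DFEM; this sketch only shows the local-vs-global division of labor between the two branches.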
Accurate stress detection is vital for various applications, including health monitoring and well-being assessment. This research study addresses the challenge of stress detection through a multimodal fusion approach,...
Electromyography (EMG) and Inertial Measurement Unit (IMU) sensors are increasingly associated with exoskeletons and prosthetics. Coordination and synchronization of these sensors with a sensor fusion algorithm modifying a...
ISBN:
(Print) 9798350346855
In this work, we propose FieldHAR, an open-source, scalable, end-to-end RTL framework for complex human activity recognition (HAR) from heterogeneous sensors using artificial neural networks (ANNs) optimized for FPGA or ASIC integration. FieldHAR aims to address the lack of apparatus for transforming complex HAR methodologies, often limited to offline evaluation, into efficient run-time edge applications. The framework uses parallel sensor interfaces and integer-based multi-branch convolutional neural networks (CNNs) to support flexible modality extensions with synchronous sampling at the maximum rate of each sensor. To validate the framework, we used a sensor-rich kitchen-scenario HAR application demonstrated in a previous offline study. Through resource-aware optimizations, the entire RTL solution, from data acquisition to ANN inference, was created with FieldHAR using as little as 25% of the logic elements and 2% of the memory bits of a low-end Cyclone IV FPGA, with less than 1% accuracy loss from the original FP32-precision offline study. The RTL implementation also shows advantages over MCU-based solutions, including superior data acquisition performance and virtually eliminating the ANN inference bottleneck.
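Running FP32-trained CNNs as integer arithmetic on an FPGA requires some form of weight quantization. A generic symmetric per-tensor int8 scheme, sketched below, is one common way this is done; it is an illustrative assumption, not FieldHAR's exact method, and the small round-trip error it introduces is the kind of effect behind the reported sub-1% accuracy loss.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: map the FP32 range
    [-max|w|, +max|w|] onto [-127, 127] with a single scale factor."""
    scale = max(np.max(np.abs(w)) / 127.0, 1e-12)  # guard against all-zero w
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 weights; error is bounded by scale / 2."""
    return q.astype(np.float32) * scale
```

On hardware, only the int8 tensor and the scale are stored; multiply-accumulates run in integer arithmetic and the scale is folded in once per layer.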
The rapid growth within the e-commerce industry and the necessity of hiring more inexperienced labor in logistics made the warehouse safety more critical than ever. Monitoring the workers' activity during warehous...
The abundance of available image capturing technologies is driving up demand for fusing images in the domain of image processing. Image fusion creates a single composite image by merging pertinent input from several s...
ISBN:
(Print) 9798331530143
There are over 300 million severely visually impaired people worldwide, and in their constant struggle for independence, there is a pressing need for advanced Electronic Travel Aid (ETA) products. The FireFly wearable ETA device presented here enhances mobility and safety for visually impaired individuals. The design fuses an 8x8-pixel Grid-EYE thermal imaging sensor, which differentiates objects based on their thermal signatures, with a Time-of-Flight (ToF) LiDAR sensor for measuring distances. FireFly provides real-time feedback through acoustic, vibration, and light alerts, which vary in frequency and amplitude depending on the proximity and type of detected object. Its open architecture allows for easy integration into various wearable devices, ensuring application versatility. Experimental validation quantified the system's object detection and human identification capabilities. Future improvements will focus on increasing system speed and thermal sensitivity and on implementing machine learning algorithms for better object classification. FireFly aims to offer a reliable, low-power solution that promotes independence and safety for visually impaired individuals.
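The proximity-and-type-dependent alerts described above amount to a mapping from fused sensor readings to feedback parameters. A minimal sketch of such a mapping is shown below; the function name, thresholds, and frequency law are all hypothetical choices for illustration, not FireFly's published calibration.

```python
def alert_parameters(distance_m, is_human):
    """Map a ToF distance reading and a thermal human/non-human
    classification to alert settings (illustrative thresholds only)."""
    # Closer objects pulse faster, capped at 20 Hz.
    freq_hz = min(20.0, 2.0 / max(distance_m, 0.1))
    # Three coarse amplitude bands by proximity.
    if distance_m < 0.5:
        amplitude = 1.0
    elif distance_m < 2.0:
        amplitude = 0.6
    else:
        amplitude = 0.2
    # Hypothetical policy: humans trigger vibration, obstacles acoustic alerts.
    channel = "vibration" if is_human else "acoustic"
    return freq_hz, amplitude, channel
```

For example, an object at 0.3 m pulses faster and stronger than one at 3 m, which is the graded-urgency behavior the abstract describes.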
This paper presents a sensory fusion neuromorphic dataset collected with precise temporal synchronization using a set of Address-Event-Representation sensors and tools. The target application is the lip reading of sev...
ISBN:
(Print) 9781665451093
This paper presents a sensory fusion neuromorphic dataset collected with precise temporal synchronization using a set of Address-Event-Representation (AER) sensors and tools. The target application is lip reading of several keywords for different machine learning applications, such as digits, robotic commands, and auxiliary rich phonetic short words. The dataset is enlarged with a spiking version of an audio-visual lip reading dataset collected with frame-based cameras. LIPSFUS is publicly available and has been validated with a deep learning architecture for audio and visual classification. It is intended for sensory fusion architectures based on both artificial and spiking neural network algorithms.
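Consuming a temporally synchronized AER dataset like this typically means interleaving the audio and visual event streams by timestamp. A small sketch of that step, assuming each event is a `(timestamp, address)` tuple and each stream is already time-sorted (a hypothetical helper, not part of the LIPSFUS tooling):

```python
import heapq

def merge_aer_streams(audio_events, visual_events):
    """Merge two timestamp-sorted AER event streams into one
    time-ordered stream without re-sorting everything."""
    return list(heapq.merge(audio_events, visual_events, key=lambda e: e[0]))
```

Because both inputs are sorted, `heapq.merge` interleaves them in O(n) time, which matters for event streams that can contain millions of spikes.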
This paper proposes a digital substation standard system to enhance management efficiency. It introduces a model for comprehensive digital information perception and Bayesian algorithms for monitoring. Wireless sensor...
Unmanned Aerial Vehicles (UAVs) are used in different fields ranging from recreational vehicles to agriculture, environmental monitoring, infrastructure inspection, disaster management, security, surveillance and logi...