This paper proposes a design space exploration for edge machine learning using the MathWorks FPGA Deep Learning Processor IP featured in the Deep Learning HDL Toolbox. With the ever-increasing demand for real-time machine learning applications, there is a critical need for efficient, low-latency hardware solutions that can operate at the edge of the network, close to the data source. The Deep Learning HDL Toolbox provides a flexible and customizable platform for deploying deep learning models on FPGAs, enabling effective inference acceleration for embedded IoT applications. In this study, our primary focus is the impact of parallel processing elements on the performance and resource utilization of the FPGA-based processor. By analyzing the trade-offs between accuracy, speed, energy efficiency, and hardware resource utilization, we aim to gain valuable insights into making optimal design choices for FPGA-based implementations. Our evaluation is conducted on the AMD-Xilinx ZC706 development board, which serves as the target device for our experiments. We consider all compatible convolutional neural networks available within the toolbox to comprehensively assess performance.
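As an illustration of the kind of design-space sweep this abstract describes, the Python sketch below enumerates hypothetical parallel-processing-element (PE) counts and scores each configuration with a toy latency/resource model. The candidate PE values, MAC count, and the `estimate_latency_ms`/`estimate_dsp` helpers are assumptions for illustration only, not the toolbox's actual estimators.

```python
# Toy design-space sweep over candidate PE (parallel processing element) counts.
# The cost models below are illustrative assumptions, not MathWorks' estimators.

CANDIDATE_PE = [4, 9, 16, 25, 36, 64]   # hypothetical convolution thread counts
ZC706_DSP_BUDGET = 900                  # DSP48 slices on the ZC706 (XC7Z045)

def estimate_latency_ms(macs, pe, clock_mhz=200.0):
    """Idealized latency: MACs spread evenly over `pe` lanes at `clock_mhz`."""
    cycles = macs / pe
    return cycles / (clock_mhz * 1e3)

def estimate_dsp(pe, dsp_per_pe=14):
    """Assume each PE consumes a fixed number of DSP slices."""
    return pe * dsp_per_pe

def explore(macs=1.8e9):  # illustrative per-inference MAC count
    for pe in CANDIDATE_PE:
        dsp = estimate_dsp(pe)
        if dsp > ZC706_DSP_BUDGET:
            continue                     # configuration does not fit the device
        lat = estimate_latency_ms(macs, pe)
        print(f"PE={pe:3d}  latency~{lat:8.2f} ms  DSP~{dsp:4d}/{ZC706_DSP_BUDGET}")

if __name__ == "__main__":
    explore()
```

The sweep makes the basic trade-off visible: latency falls roughly linearly with PE count while DSP usage grows, and configurations exceeding the device budget are discarded.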
Online detection of action start is a significant and challenging task that requires prompt identification of action start positions and their corresponding categories within streaming videos. The task is difficult due to data imbalance, similarity in boundary content, and real-time detection requirements. Here, a novel Time-Attentive Fusion Network (TAF-Net) is introduced to address the requirements of improved action detection accuracy and operational efficiency. The time-attentive fusion module, which consists of long-term memory attention and a fusion feature learning mechanism, is proposed to improve spatial-temporal feature learning. The temporal memory attention mechanism captures more effective temporal dependencies by employing weighted linear attention. The fusion feature learning mechanism incorporates current-moment action information with historical data, thus enhancing the representation. The proposed method exhibits linear complexity and parallelism, enabling rapid training and inference. The method is evaluated on two challenging datasets, THUMOS'14 and ActivityNet v1.3, and the experimental results demonstrate that it significantly outperforms existing state-of-the-art methods in both detection accuracy and inference speed.
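A minimal sketch of the weighted linear attention idea mentioned above, written in PyTorch. The module name, dimensions, and the ELU-based positive feature map are assumptions, not the paper's TAF-Net implementation; the point is only that kernelized attention reduces the cost from quadratic to linear in sequence length.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAttention(nn.Module):
    """Linear-complexity attention: a positive feature map lets key/value
    statistics be aggregated once, so cost is O(T) in sequence length."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.eps = eps

    def forward(self, x):                      # x: (batch, time, dim)
        q = F.elu(self.to_q(x)) + 1.0          # positive feature map
        k = F.elu(self.to_k(x)) + 1.0
        v = self.to_v(x)
        kv = torch.einsum("btd,bte->bde", k, v)            # summarize history once
        z = 1.0 / (torch.einsum("btd,bd->bt", q, k.sum(dim=1)) + self.eps)
        return torch.einsum("btd,bde,bt->bte", q, kv, z)   # weighted read-out
```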
In this paper, a unified deep learning framework is developed for high-precision direction-of-arrival (DOA) estimation. Unlike previous methods that divide the real and imaginary parts of the complex-valued sparse problem into two separate input channels, a real-valued transformation is adopted to encode the correlation between them. Then, a novel adaptive attention aggregation residual network (A³R-Net) is designed to overcome the challenges posed by low signal-to-noise ratios or small inter-signal angle separations. First, to alleviate the gradient vanishing and gradient explosion caused by network deepening, a residual learning strategy is introduced to construct a deep estimation network that learns the inverse mapping from the array measurement vector to the original spatial spectrum. Second, since feature fusion via simple summation in the shortcut connection ignores the inconsistency in the scale and semantics of features, an adaptive attention aggregation module (A³M) with adaptive channel context aggregators is proposed to capture multi-scale channel contexts and generate element-wise fusion weights. Finally, a dilated convolution with a broader receptive field is embedded into the channel context aggregator to learn wider local cross-channel associations. Extensive simulation results demonstrate the superiority and robustness of the proposed method compared with other state-of-the-art methods.
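To make the attention-weighted shortcut idea concrete, here is a small PyTorch sketch of a fusion block that replaces plain summation with element-wise weights produced from a dilated-convolution context aggregator. The module name, layer sizes, and gating scheme are illustrative assumptions, not the paper's A³M definition.

```python
import torch
import torch.nn as nn

class AttentiveResidualFusion(nn.Module):
    """Sketch of an attention-weighted shortcut: element-wise weights decide
    how much of the identity branch vs. the residual branch to keep."""
    def __init__(self, channels, dilation=2):
        super().__init__()
        # Dilated 1-D conv over the channel axis widens the local
        # cross-channel receptive field used to produce fusion weights.
        self.context = nn.Conv1d(1, 1, kernel_size=3, padding=dilation,
                                 dilation=dilation)
        self.proj = nn.Linear(channels, channels)

    def forward(self, identity, residual):       # both: (batch, channels)
        pooled = identity + residual              # coarse joint descriptor
        ctx = self.context(pooled.unsqueeze(1)).squeeze(1)
        weights = torch.sigmoid(self.proj(ctx))   # element-wise fusion weights
        return weights * identity + (1.0 - weights) * residual
```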
ISBN: (Print) 9798350377040; 9798350377033
This paper proposes an innovative algorithm for optimizing intelligent image data systems based on deep learning. The algorithm combines image feature extraction, data preprocessing, and efficient optimization strategies to improve the performance and accuracy of image data processing systems. First, a deep CNN architecture is designed to extract important features from the image, enabling efficient image recognition and classification. Subsequently, a new multi-level data processing method is proposed that optimizes image data at different levels, thereby improving processing speed and reducing noise interference. A series of simulation experiments shows that the algorithm improves image classification accuracy by about 12 percentage points, from 85.6% with the traditional method to 97.3%. In addition, processing efficiency is improved by about 20%, with the data processing time reduced from 2.5 seconds to 2 seconds, and the introduced optimization strategies improve system stability by about 18%. The optimized algorithm shows significant advantages in both accuracy and efficiency, meeting the needs of efficient intelligent image processing systems.
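For reference, a compact PyTorch sketch of the kind of deep CNN feature extractor plus classifier the abstract describes. The layer widths, pooling scheme, and 10-class head are assumptions chosen for brevity, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SmallImageCNN(nn.Module):
    """Illustrative deep CNN: stacked conv blocks extract features, global
    average pooling feeds a linear classification head."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),                      # /2 spatial resolution
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),                      # /4
            nn.Conv2d(64, 128, 3, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),              # global average pooling
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):                         # x: (batch, 3, H, W)
        feats = self.features(x).flatten(1)       # (batch, 128) feature vector
        return self.classifier(feats)

# Example: logits = SmallImageCNN()(torch.randn(4, 3, 224, 224))
```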
Traffic signal light detection poses significant challenges in the intelligent driving sector, with high precision and efficiency being crucial for system safety. Advances in deep learning have led to significant improvements in image object detection. However, existing methods continue to struggle with balancing detection speed and accuracy. We propose a lightweight model for traffic light detection that uses a streamlined backbone network and a Low-GD neck architecture. The model's backbone employs structured reparameterization and lightweight Vision Transformers, using multi-branch and Feed-Forward Network structures to boost informational richness and positional awareness, respectively. The neck network utilizes the Low-GD structure to enhance the aggregation and integration of multi-scale features, reducing information loss during cross-layer exchanges. We introduce a data augmentation strategy using Stable Diffusion to expand our traffic light dataset with complex weather conditions such as fog, rain, and snow, improving model generalization. Our method excels on the YCTL2024 traffic light dataset, achieving a detection speed of 135 FPS and 98.23% accuracy with only 1.3M model parameters. Testing on the Bosch Small Traffic Lights Dataset confirms the method's strong generalization capabilities. This suggests that our proposed method can effectively provide accurate and real-time traffic light detection.
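A minimal PyTorch sketch of the structured reparameterization idea referenced in this abstract (in the spirit of RepVGG): a 3x3 and a 1x1 branch are trained in parallel and fused into a single 3x3 convolution for inference. BatchNorm fusion is omitted for brevity, and this is not the paper's exact backbone block.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepBranchBlock(nn.Module):
    """Multi-branch block at training time, single 3x3 conv after fusion."""
    def __init__(self, channels):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv1 = nn.Conv2d(channels, channels, 1)

    def forward(self, x):                       # training-time multi-branch path
        return F.relu(self.conv3(x) + self.conv1(x))

    def fuse(self):
        """Return an equivalent single 3x3 conv (inference-time path)."""
        fused = nn.Conv2d(self.conv3.in_channels, self.conv3.out_channels,
                          3, padding=1)
        # Zero-pad the 1x1 kernel to 3x3 (non-zero tap at the centre),
        # then add kernels and biases of the two branches.
        k1 = F.pad(self.conv1.weight, [1, 1, 1, 1])
        fused.weight.data = self.conv3.weight.data + k1
        fused.bias.data = self.conv3.bias.data + self.conv1.bias.data
        return fused

# Sanity check: both paths produce numerically identical outputs.
blk = RepBranchBlock(8).eval()
x = torch.randn(1, 8, 16, 16)
assert torch.allclose(F.relu(blk.fuse()(x)), blk(x), atol=1e-5)
```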
Smart mobility intelligent traffic services have become critical in intelligent transportation systems (ITS). This involves using advanced sensors and controllers and the ability to respond to real-time traffic situat...
Quantifying the phagocytosis of dynamic, unstained cells is essential for evaluating neurodegenerative diseases. However, measuring rapid cell interactions and distinguishing cells from background make this task very challenging when processing time-lapse phase-contrast video microscopy. In this study, we introduce an end-to-end, scalable, and versatile real-time framework for quantifying and analyzing phagocytic activity. Our proposed pipeline is able to process large datasets and includes a data quality verification module to counteract potential perturbations such as microscope movements and frame blurring. We also propose an explainable cell segmentation module to improve the interpretability of deep learning methods compared to black-box algorithms. This includes two interpretable deep learning capabilities: visual explanation and model simplification. We demonstrate that interpretability in deep learning is not the opposite of high performance by additionally providing essential deep learning algorithm optimization insights and solutions. Moreover, incorporating interpretable modules results in an efficient architecture design and optimized execution time. We apply this pipeline to quantify and analyze microglial cell phagocytosis in frontotemporal dementia (FTD) and obtain statistically reliable results showing that FTD mutant cells are larger and more aggressive than control cells. The method has been tested and validated on several public benchmarks, achieving state-of-the-art performance. To stimulate translational approaches and future studies, we release an open-source end-to-end pipeline and a unique microglial cell phagocytosis dataset for immune system characterization in neurodegenerative disease research. This pipeline and the associated dataset will support future advances in this field, promoting the development of efficient and effective interpretable algorithms dedicated to the critical domain of neurodegenerative diseases.
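The data quality verification step mentioned above (rejecting blurred frames and detecting microscope movement) can be approximated with standard image statistics. The OpenCV sketch below uses variance-of-Laplacian blur scoring and phase correlation for drift; the thresholds and function names are illustrative assumptions, not the paper's module.

```python
import cv2
import numpy as np

def frame_is_blurry(gray_frame: np.ndarray, threshold: float = 100.0) -> bool:
    """Variance-of-Laplacian blur test: low variance means few sharp edges.
    The threshold is an illustrative assumption, not the paper's value."""
    return cv2.Laplacian(gray_frame, cv2.CV_64F).var() < threshold

def frame_shift(prev_gray: np.ndarray, curr_gray: np.ndarray) -> float:
    """Estimate global drift between consecutive frames with phase
    correlation, as a proxy for microscope/stage movement."""
    (dx, dy), _ = cv2.phaseCorrelate(prev_gray.astype(np.float32),
                                     curr_gray.astype(np.float32))
    return float(np.hypot(dx, dy))

def verify_sequence(frames, max_shift_px=5.0):
    """Flag frames that are blurred or preceded by a large stage jump."""
    flags = []
    for i, frame in enumerate(frames):
        blurry = frame_is_blurry(frame)
        shifted = i > 0 and frame_shift(frames[i - 1], frame) > max_shift_px
        flags.append({"index": i, "blurry": blurry, "shifted": shifted})
    return flags
```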
Real-time semantic segmentation provides precise insights into dynamic street environments for autonomous driving, traffic control, and urban planning. However, state-of-the-art models based on attention mechanisms and deep convolutional neural networks have improved semantic segmentation at the cost of complex architectures and high computational complexity. This study aims to mitigate gridding artifacts and enhance semantic segmentation performance. In addition, we propose a multi-level downsampling approach before employing the depth-wise split separable global convolution with the bottleneck, achieving a trade-off between accuracy and inference time. The spatial attention module used in this study effectively preserves low-level spatial characteristics, enhancing localization accuracy, robustness against disturbances, processing efficiency, and the ability to handle occlusions. Thorough tests on the publicly available Cityscapes and CamVid datasets indicate that the proposed model can efficiently process high-resolution images in real time with strong performance. The model achieves an accuracy of 72.3% on the Cityscapes dataset and 72.7% on the CamVid dataset.
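Two of the building blocks named above, a depth-wise separable convolution and a spatial attention gate, sketched in PyTorch. The exact kernel sizes, the CBAM-style gating, and the module names are assumptions for illustration, not the paper's modules.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depth-wise conv followed by a point-wise (1x1) conv; a common way to
    cut FLOPs relative to a dense convolution."""
    def __init__(self, in_ch, out_ch, kernel=3, dilation=1):
        super().__init__()
        pad = dilation * (kernel - 1) // 2
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel, padding=pad,
                                   dilation=dilation, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class SpatialAttention(nn.Module):
    """CBAM-style spatial gate: channel-wise avg/max maps -> conv -> sigmoid."""
    def __init__(self, kernel=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel, padding=kernel // 2)

    def forward(self, x):
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.amax(dim=1, keepdim=True)
        gate = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * gate   # re-weight spatial positions, keep low-level detail
```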
The mechanism governing pharmaceutical tablet disintegration is far from fully understood. Despite the importance of controlling a formulation's disintegration process to maximize the active pharmaceutical ingredient's bioavailability and ensure predictable and consistent release profiles, the current understanding of the process is based on indirect or superficial measurements. Formulation science could, therefore, additionally deepen the understanding of the fundamental physical principles governing disintegration based on direct observations of the process. We aim to help bridge the gap by generating a series of time-resolved X-ray microcomputed tomography (μCT) images capturing volumetric images of a broad range of minitablet formulations undergoing disintegration. Automated image segmentation was a prerequisite to overcoming the challenges of analyzing multiple time series of heterogeneous tomographic images at high magnification. We devised and trained a convolutional neural network (CNN) based on the U-Net architecture for autonomous, rapid, and consistent image segmentation. We created our own μCT data reconstruction pipeline and parameterized it to deliver image quality optimal for our CNN-based segmentation. Our approach enabled us to visualize the internal microstructures of the tablets during disintegration and to extract parameters of disintegration kinetics from the time-resolved data. We determine by factor analysis the influence of the different formulation components on the disintegration process in terms of both qualitative and quantitative experimental responses. We relate our findings to known formulation component properties and established experimental results. Our direct imaging approach, enabled by deep learning-based image processing, delivers new insights into the disintegration mechanism of pharmaceutical tablets.
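For orientation, a minimal two-level U-Net in PyTorch showing the encoder/bottleneck/decoder-with-skip structure the abstract refers to. The channel widths, depth, and two-class head are assumptions chosen for brevity, not the authors' trained network.

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """Two-level U-Net: encoder, bottleneck, decoder with skip connection."""
    def __init__(self, in_ch=1, num_classes=2, base=16):
        super().__init__()
        self.enc = double_conv(in_ch, base)
        self.down = nn.MaxPool2d(2)
        self.bottleneck = double_conv(base, base * 2)
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec = double_conv(base * 2, base)
        self.head = nn.Conv2d(base, num_classes, 1)

    def forward(self, x):                      # x: (batch, 1, H, W), H and W even
        skip = self.enc(x)
        mid = self.bottleneck(self.down(skip))
        up = self.up(mid)
        out = self.dec(torch.cat([up, skip], dim=1))
        return self.head(out)                  # per-pixel class logits

# Example: logits = TinyUNet()(torch.randn(1, 1, 64, 64))  -> shape (1, 2, 64, 64)
```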
Object recognition, an essential technique in computer vision, enables machines to identify and understand real-time objects and environments based on input images. The main aim of this technology is to accurately rec...