The Dynamic Vision Sensor (DVS) is a bio-inspired image sensor that offers many advantages, such as high dynamic range, high bandwidth, high temporal resolution, and low power consumption, for Internet of Video Things and Edge Computing applications. However, spuriously generated Background Activity (BA) noise events can significantly degrade the quality of DVS output and cause unnecessary computations throughout the image processing chain, reducing its energy efficiency. Near-sensor filters can mitigate this problem by preventing BA noise events from reaching downstream stages. In this paper, we propose a novel, hardware-efficient, spatio-temporal correlation filter (HAST) for near-sensor BA noise filtering. It uses compact two-dimensional binary arrays along with simple, arithmetic-free hash-based functions for storage and retrieval operations. This approach eliminates the need to use timestamps for determining the chronological order of events. HAST uses much less memory and energy than other hardware-friendly filters (BAF/STCF) while matching their performance in simulations on standard datasets; for a sensor of resolution 346 × 260 pixels, it requires only 5-18% of their memory and about 15% of their energy per event for correlation times tau ranging from 1 to 50 ms. The memory and energy gains of the filter increase with sensor resolution. In an FPGA implementation, HAST achieves about 29% higher throughput than BAF/STCF while utilizing only about 5% of their memory. The filter parameter values can be chosen by Design Space Exploration (DSE) for optimized performance-resource trade-offs based on application requirements.
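To make the filtering idea above concrete, here is a minimal Python sketch of a hash-based spatio-temporal correlation filter in the spirit of HAST; the array size, hash mixing constants, window-rotation scheme, and 8-neighbor test are illustrative assumptions, not the authors' exact design.

```python
import numpy as np

# Hedged sketch of a hash-based spatio-temporal BA filter (not the exact HAST design).
# Two binary arrays stand in for the "current" and "previous" correlation windows of
# length tau; array size, hash function, and neighborhood are illustrative assumptions.
ROWS, COLS, TAU_US = 260, 346, 10_000           # sensor resolution and window length (assumed)
ARRAY_BITS = 4096                                # compact binary array size (assumed)

def pixel_hash(x, y):
    """Mix the coordinate bits into an array index (illustrative hash)."""
    return ((x << 4) ^ (y * 0x9E37)) & (ARRAY_BITS - 1)

cur = np.zeros(ARRAY_BITS, dtype=bool)
prev = np.zeros(ARRAY_BITS, dtype=bool)
window_start = 0

def filter_event(x, y, t_us):
    """Return True if the event is spatio-temporally correlated with a recent neighbor."""
    global cur, prev, window_start
    if t_us - window_start >= TAU_US:            # rotate windows instead of storing timestamps
        prev, cur = cur, np.zeros(ARRAY_BITS, dtype=bool)
        window_start = t_us
    # correlated if any 8-neighbor was seen in the current or previous window
    keep = any(cur[pixel_hash(x + dx, y + dy)] or prev[pixel_hash(x + dx, y + dy)]
               for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0))
    cur[pixel_hash(x, y)] = True                 # record this event for future correlation checks
    return keep
```

The point the sketch illustrates is that two rotating binary arrays bound the correlation window to roughly tau without storing any per-event timestamps.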
X-ray imaging is essential in medical diagnostics, particularly for identifying anomalies like respiratory diseases. However, building accurate and efficient deep learning models for X-ray image classification remains challenging, requiring both optimized architectures and low computational complexity. In this paper, we present a three-stage framework to enhance X-ray image classification using Neural Architecture Search (NAS), Transfer Learning, and Model Compression via filter pruning, specifically targeting the ChestX-Ray14 dataset. First, NAS is employed to automatically discover the optimal convolutional neural network (CNN) architecture tailored to the ChestX-Ray14 dataset, reducing the need for extensive manual tuning. Subsequently, we leverage transfer learning by incorporating pre-trained models, which enhances the model's generalizability and reduces dependency on large volumes of labeled X-ray data. Finally, model compression through filter pruning, driven by evolutionary algorithms, trims redundant parameters to improve computational efficiency while preserving model accuracy. Experimental results demonstrate that this approach not only boosts classification accuracy on the ChestX-Ray14 dataset but also significantly reduces model size, making it suitable for deployment in resource-constrained environments, such as mobile and edge devices. This framework provides a practical, scalable solution to improve both the accuracy and efficiency of medical image classification.
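As a hedged illustration of the compression stage, the sketch below prunes convolutional filters by L1 norm in PyTorch; the paper instead drives pruning with an evolutionary search, and the layer sizes and keep ratio here are arbitrary.

```python
import torch
import torch.nn as nn

# Minimal sketch of filter pruning on one convolutional layer. The paper uses an
# evolutionary search; a simple L1-norm criterion is used here purely for illustration.
def prune_conv_filters(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the filters with the largest L1 norms and rebuild a smaller Conv2d."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))   # one L1 score per output filter
    keep_idx = torch.argsort(scores, descending=True)[:n_keep]
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding, bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    return pruned

# Example: shrink a 64-filter layer to 32 filters.
layer = nn.Conv2d(3, 64, kernel_size=3, padding=1)
smaller = prune_conv_filters(layer, keep_ratio=0.5)
```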
Feature selection is an important preprocessing step in machine learning to remove irrelevant and redundant features. Owing to its ability to effectively maintain the discriminability of extracted features, Trace Ratio Linear Discriminant Analysis (TR-LDA) has become the foundation for many feature selection algorithms. However, TR-LDA is challenging to solve because of its trace-ratio form, and it also suffers from the scale invariance problem. These two drawbacks significantly reduce the performance of feature selection algorithms based on it. To overcome them, this paper proposes the sparse LDA with constant between-class distance (SLDA-CBD) model to select relevant features. The model first transforms TR-LDA into a non-trace-ratio problem with a constant between-class distance constraint, and then imposes row constraints on the projection matrix to implement feature selection. Since the SLDA-CBD model is rooted in TR-LDA, it ensures the discriminative performance of the selected features. The constant between-class distance constraint avoids the scale invariance problem. Additionally, owing to its non-trace-ratio form, the SLDA-CBD model is easy to solve. Experimental results show that the proposed method outperforms the baseline and six state-of-the-art related methods, with improvements of over 1% on image datasets and over 2% on video datasets in most cases, while also demonstrating high stability, proving its effectiveness and advantage in practical applications.
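For readers unfamiliar with the trace-ratio form, the following sketch computes the scatter matrices and the TR-LDA objective on toy data; the SLDA-CBD reformulation (constant between-class distance constraint and row-sparse projection) is not reproduced, and all data and dimensions are illustrative.

```python
import numpy as np

# Illustrative sketch of the trace-ratio objective that TR-LDA-based selectors build on.
def scatter_matrices(X, y):
    """Between-class (Sb) and within-class (Sw) scatter for data X (n x d) with labels y."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sb, Sw = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * diff @ diff.T
        Sw += (Xc - mc).T @ (Xc - mc)
    return Sb, Sw

def trace_ratio(W, Sb, Sw):
    """Objective trace(W^T Sb W) / trace(W^T Sw W) that TR-LDA maximizes."""
    return np.trace(W.T @ Sb @ W) / np.trace(W.T @ Sw @ W)

# Toy check: random 2-D projection of 5-D two-class data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(2, 1, (50, 5))])
y = np.array([0] * 50 + [1] * 50)
Sb, Sw = scatter_matrices(X, y)
print(trace_ratio(rng.normal(size=(5, 2)), Sb, Sw))
```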
Many-core systems are systolic architectures consisting of an arbitrarily large number of processing nodes connected by a point-to-point communication network. Their architecture makes them ideally suited for the implementation of data-flow algorithms, of which Mathematical Morphology (MM) filters are a typical example. However, the performance of data-flow applications on many-core systems depends strongly on the quality of the mapping of application tasks to computational cores. Decomposing the structuring elements of morphological operations improves their performance; however, performing such decomposition on many-core systems increases communication. The need to balance performance against communication is a representative example of the general mapping-optimization problem. The approach presented in this paper explores a two-phase design-time optimization toolchain based on evolutionary algorithms: a front-end single-objective algorithm decomposes an MM filter into smaller operators, while a back-end multi-objective (fault tolerance, energy, and communication) algorithm searches for optimal mappings of the filter onto a specific many-core system, taking into account the architectural parameters of the hardware. The output of the toolchain is a Pareto front of mapping solutions, allowing the designer to select an implementation that matches application-specific requirements. A set of standard benchmark applications was used to determine the optimal parameters for the algorithms, which were then validated on two real-world application examples involving the detection of features in high-resolution PCB images. Two application mapping experiments focusing on energy constraints were conducted, in which the proposed procedure was compared to deterministic mapping techniques. The evolutionary procedure offered a significant advantage over the deterministic approach, with percentage gains of up to 73.78% for smaller grids.
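A minimal sketch of the front-end idea, structuring-element decomposition, is shown below using SciPy; the equivalence of one large dilation and two chained small dilations is what lets each node run a smaller operator, while the evolutionary mapping itself is beyond the scope of the sketch.

```python
import numpy as np
from scipy.ndimage import binary_dilation

# Illustrative decomposition: dilation by a large square structuring element equals
# chained dilations by a smaller one, so each core can run a small operator.
img = np.zeros((11, 11), dtype=bool)
img[5, 5] = True

big_se = np.ones((5, 5), dtype=bool)          # one large 5x5 operator
small_se = np.ones((3, 3), dtype=bool)        # two chained 3x3 operators

direct = binary_dilation(img, structure=big_se)
decomposed = binary_dilation(binary_dilation(img, structure=small_se), structure=small_se)
assert np.array_equal(direct, decomposed)     # same result, smaller per-node operators
```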
Convolutional neural networks (CNNs) are evolving as they are applied to more diverse environments and more difficult challenges. This evolution introduces various convolution modes (e.g., 1×1 convolution, 2-stride convolution, and rectangular convolution) into current CNNs and makes it difficult for hardware accelerators to support all of them efficiently. In this paper, we find that a key difference among these convolution modes is their computation density. The modes are therefore regarded as structured-sparse computations, and we argue that a sparse-based design methodology can be applied to implement a reconfigurable CNN accelerator. Two critical architectural parameters, the input tile size and the convolution engine (CE) scale, are then evaluated using the standard deviation of calculations (SDC), unsupported convolution modes (UCM), unsuitable IFM sizes (UIS), the DSP utilization ratio (DUR), and the hardware resource overhead (HRO). With the optimal parameters, a highly parallel, flexible CE array and a high-performance, reconfigurable CNN architecture are designed. The accelerator was implemented on a Xilinx VC709 FPGA and ran at a clock frequency of 300 MHz, achieving 921.60 to 1382.40 GOPS while supporting various convolution modes. Compared with previous dense- and sparse-based works, the proposed accelerator achieves 1.35x to 10.77x higher performance and 1.22x to 2.84x higher DSP efficiency when deploying VGG16.
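The following sketch illustrates the structured-sparse view of convolution modes in plain NumPy/SciPy: a 1×1 convolution run on a dense 3×3 engine is a 3×3 kernel whose outer ring is zero, and a 2-stride convolution is a dense convolution with a subsampled output; the accelerator's actual CE array is not modeled, and the feature-map sizes are arbitrary.

```python
import numpy as np
from scipy.signal import correlate2d

# Structured-sparse view of convolution modes on a dense 3x3 engine (illustrative only).
fmap = np.random.rand(8, 8).astype(np.float32)

k1x1 = np.array([[0.5]], dtype=np.float32)
k3x3 = np.zeros((3, 3), dtype=np.float32)
k3x3[1, 1] = k1x1[0, 0]                        # embed the 1x1 kernel in the 3x3 engine

out_1x1 = correlate2d(fmap, k1x1, mode="same")
out_engine = correlate2d(fmap, k3x3, mode="same")
assert np.allclose(out_1x1, out_engine)        # same result; most engine lanes compute zeros

k = np.random.rand(3, 3).astype(np.float32)
dense = correlate2d(fmap, k, mode="valid")
strided = dense[::2, ::2]                      # 2-stride mode = dense output, subsampled
```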
A graphic equalizer (GEQ) is a standard tool in audio production and effect design. Its adjustable gain controls sit at fixed frequencies along the logarithmic frequency axis, and an automatic design method matches the magnitude response to the target gains whenever they are changed. Most commonly, the GEQ comprises a set of peak filters centered an octave apart, possibly with a shelving filter at the bottom and top of the frequency range. While accurate designs have been proposed, their dynamic range is typically limited to 24 dB. In this paper, we propose two innovations. First, we introduce a GEQ based on shelving filters only, which can cover an extensive dynamic range of over 60 dB. Second, we introduce an order-switching technique that combines shelf filters of different orders. We demonstrate the performance and advantages of the proposed filter with design examples. The proposed shelf-filter-based GEQ offers a wider dynamic range and a smoother magnitude response than traditional peak-filter-based GEQ designs.
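As a hedged example of the basic building block, the sketch below designs one first-order low-shelving section via an analog prototype and the bilinear transform; the prototype, crossover definition (square-root gain at fc), and frequencies are illustrative and do not reproduce the paper's order-switching or gain-optimization procedure.

```python
import numpy as np
from scipy.signal import bilinear, freqz

# One first-order low-shelving section of the kind a shelf-only GEQ stacks (illustrative).
def low_shelf(fc, gain_db, fs):
    """First-order low shelf: gain_db below fc, 0 dB above, square-root gain at fc."""
    g = 10 ** (gain_db / 20)
    wc = 2 * fs * np.tan(np.pi * fc / fs)      # prewarped analog crossover frequency
    # analog prototype H(s) = (s + wc*sqrt(g)) / (s + wc/sqrt(g)), then bilinear transform
    return bilinear([1, wc * np.sqrt(g)], [1, wc / np.sqrt(g)], fs=fs)

fs = 48_000
b, a = low_shelf(fc=1_000, gain_db=12.0, fs=fs)
w, h = freqz(b, a, worN=1024, fs=fs)
print(f"gain at low end:  {20*np.log10(abs(h[1])):.1f} dB")   # close to +12 dB
print(f"gain near Nyquist: {20*np.log10(abs(h[-1])):.1f} dB") # close to 0 dB
```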
To enhance the quality of speech signals, this paper introduces a novel speech signal processing method that integrates Dynamic Multi-Scale (DMS) and Adaptive Error Minimization (AEM) techniques. This method significantly improves noise reduction and signal fidelity in dynamic environments, distinguishing itself from previous approaches through real-time adaptive filtering, which makes it highly adaptable to complex, non-stationary noise conditions. The proposed method is grounded in dynamic multi-scale analysis, employing multi-scale decomposition of speech signals and optimizing their time-frequency characteristics through dynamic adjustments, thereby forming a new noise reduction approach, DMS. First, the multi-scale decomposition captures the multi-scale features of noisy speech signals. Next, optimizing the time-frequency characteristics and dynamically adjusting the signal removes noise while improving time-frequency resolution. Finally, the method is further enhanced by the adaptive error minimization algorithm, leading to a more pronounced noise reduction effect. Experimental results demonstrate that the proposed method outperforms the dynamic multi-scale technique alone in terms of signal-to-noise ratio (SNR) improvement.
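Since the DMS/AEM algorithms themselves are not spelled out here, the sketch below shows a generic multi-scale (wavelet) denoising pass with an SNR measurement, assuming PyWavelets is available; the wavelet, threshold rule, and synthetic signal are illustrative stand-ins, not the paper's method.

```python
import numpy as np
import pywt  # PyWavelets, assumed available; stands in for the paper's multi-scale decomposition

# Generic multi-scale denoising sketch: decompose, shrink detail coefficients,
# reconstruct, and report the SNR gain. Soft thresholding is used purely for illustration.
fs = 16_000
t = np.arange(0, 1.0, 1 / fs)
clean = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.3 * np.random.default_rng(0).normal(size=clean.shape)

coeffs = pywt.wavedec(noisy, "db8", level=5)                     # multi-scale decomposition
sigma = np.median(np.abs(coeffs[-1])) / 0.6745                   # noise estimate, finest scale
thr = sigma * np.sqrt(2 * np.log(len(noisy)))                    # universal threshold
coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
denoised = pywt.waverec(coeffs, "db8")[: len(noisy)]

snr = lambda ref, x: 10 * np.log10(np.sum(ref**2) / np.sum((ref - x) ** 2))
print(f"SNR before: {snr(clean, noisy):.1f} dB, after: {snr(clean, denoised):.1f} dB")
```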
The standard multi-target transition density assumes that, conditional on the current multi-target state, targets survive and move independently of each other. Although this assumption is adopted by most multi-target tracking (MTT) algorithms, it may not be applicable when tracking group targets that exhibit coordinated motion. This paper presents a principled Bayesian solution to tracking multiple resolvable group targets in the labeled random finite set framework. The transition densities of group targets with collective behavior are derived for both the single-group and multi-group cases. For a single group, the transition density is characterized by a general labeled multi-target density and then approximated by the closest generalized labeled multi-Bernoulli (GLMB) density in terms of Kullback-Leibler divergence. For multiple groups, we augment the multi-target state with the group structure and propose a multiple group structure transition model (MGSTM) to recursively infer it. Additionally, the conjugacy of the structure-augmented multi-group multi-target density is proved. An efficient implementation of the multi-group multi-target tracker, named the MGSTM-LMB filter, and its Gaussian mixture form are devised, preserving the first-order moment of the multi-group multi-target density during recursive propagation. Numerical simulation results demonstrate the capability of the proposed MGSTM-LMB filter in multi-group scenarios.
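As a minimal, hedged sketch of the machinery involved, the code below propagates a single labeled Bernoulli component (existence probability plus Gaussian state) through a constant-velocity prediction step; the group-structure coupling, GLMB approximation, and measurement update of the MGSTM-LMB filter are not modeled, and all parameters are assumptions.

```python
import numpy as np

# Prediction step for one labeled Bernoulli component under a constant-velocity model.
dt, p_survive = 1.0, 0.99
F = np.array([[1, 0, dt, 0],                 # state: [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])
Q = 0.1 * np.eye(4)                          # process noise (assumed)

def predict_bernoulli(r, m, P):
    """Propagate existence probability r and Gaussian (m, P) one step forward."""
    return p_survive * r, F @ m, F @ P @ F.T + Q

r, m, P = 0.9, np.array([0.0, 0.0, 1.0, 0.5]), np.eye(4)
r, m, P = predict_bernoulli(r, m, P)
print(r, m)
```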
Myocardial infarction (MI) is one of the most critical cardiac complications, occurring when blood flow to the heart muscle is partially or completely blocked. Electrocardiography (ECG) is an invaluable tool for detecting diverse cardiac irregularities. Manual investigation of MI-induced ECG changes is tedious, laborious, and time-consuming. Nowadays, deep learning-based algorithms are widely investigated to detect various cardiac abnormalities and enhance the performance of medical diagnostic systems. Therefore, this work presents a lightweight deep learning framework (CardioNet) for MI detection using ECG signals. To construct time-frequency (T-F) spectrograms, filtered ECG sensor data are subjected to the short-time Fourier transform (STFT), the movable Gaussian window-based S-transform (ST), and the smoothed pseudo-Wigner-Ville distribution (SPWVD). To develop an automated MI detection system, the obtained spectrograms are fed to the benchmark Squeeze-Net and Alex-Net models and to a newly developed, lightweight deep learning model. The developed CardioNet with ST-based T-F images obtained an average classification accuracy of 99.82%, a specificity of 99.57%, and a sensitivity of 99.97%. The proposed system, in combination with a cloud-based algorithm, is suitable for designing wearable devices that detect several cardiac diseases from other biological signals of the cardiovascular system.
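To illustrate the T-F front end, the sketch below filters an ECG segment and produces an STFT log-magnitude spectrogram with SciPy; the sampling rate, filter band, and window settings are assumptions, and the ST and SPWVD branches as well as the CNN itself are omitted.

```python
import numpy as np
from scipy.signal import butter, filtfilt, stft

# STFT branch of a T-F front end: filter an ECG segment and turn it into a
# log-magnitude spectrogram image suitable for a CNN (parameters are illustrative).
fs = 360                                               # sampling rate (assumed, Hz)
ecg = np.random.default_rng(0).normal(size=10 * fs)    # placeholder for a real ECG segment

b, a = butter(4, [0.5, 45], btype="bandpass", fs=fs)   # remove baseline wander and HF noise
ecg_filt = filtfilt(b, a, ecg)

f, t, Z = stft(ecg_filt, fs=fs, nperseg=128, noverlap=96)
spectrogram_db = 20 * np.log10(np.abs(Z) + 1e-12)      # T-F image, later resized for the CNN
print(spectrogram_db.shape)                            # (freq bins, time frames)
```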
Data is the essential fuel for deep neural networks (DNNs), and its quality affects the practical performance of DNNs. In real-world training scenarios, the successful generalization performance of DNNs is severely challenged by noisy samples with incorrect labels. To combat noisy samples in image classification, numerous methods based on sample selection and semi-supervised learning (SSL) have been developed, where sample selection is used to provide the supervision signal for SSL, achieving great success in resisting noisy samples. Due to the necessary warm-up training on noisy datasets and the basic sample selection mechanism, DNNs are still confronted with the challenge of memorizing noisy samples. However, existing methods do not address the memorization of noisy samples by DNNs explicitly, which hinders the generalization performance of DNNs. To alleviate this issue, we present a new approach to combat noisy samples. First, we propose a memorized noise detection method to detect noisy samples that DNNs have already memorized during the training process. Next, we design a noise-excluded sample selection method and a noise-alleviated MixMatch to alleviate the memorization of DNNs to noisy samples. Finally, we integrate our approach with the established method DivideMix, proposing Modified-DivideMix. The experimental results on CIFAR-10, CIFAR-100, and Clothing1M demonstrate the effectiveness of our approach.
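As a hedged sketch of the selection machinery such methods build on, the code below performs DivideMix-style clean-sample selection by fitting a two-component Gaussian mixture to per-sample losses with scikit-learn; the memorized-noise detection and noise-alleviated MixMatch proposed in the paper are not reproduced.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# DivideMix-style selection: fit a two-component GMM to per-sample losses and treat
# the low-loss component as "clean" (illustrative threshold and normalization).
def select_clean(losses, threshold=0.5):
    """Return a boolean mask of samples whose clean-component posterior exceeds threshold."""
    losses = np.asarray(losses, dtype=np.float64).reshape(-1, 1)
    losses = (losses - losses.min()) / (losses.max() - losses.min() + 1e-12)  # scale to [0, 1]
    gmm = GaussianMixture(n_components=2, reg_covar=5e-4).fit(losses)
    clean_comp = np.argmin(gmm.means_.ravel())           # component with the smaller mean loss
    prob_clean = gmm.predict_proba(losses)[:, clean_comp]
    return prob_clean > threshold

# Toy usage: low losses should be kept, high losses flagged as noisy.
mask = select_clean([0.05, 0.1, 0.08, 2.3, 1.9, 0.07])
print(mask)
```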