ISBN (print): 9798350389937; 9798350389920
This paper presents the implementation of chirp match filtering for Synthetic Aperture Radar (SAR) on a field-programmable gate array (FPGA) using High-Level Synthesis (HLS). HLS enables higher productivity by significantly reducing development time compared to traditional hardware description languages (HDLs). A C-based programming approach is employed to leverage parallel processing, which is critical for high-performance signal processing tasks. The design uses a symmetric chirp signal as the reference for match filtering, with optimizations implemented through parallelism in the filtering process. Experimental results show that the FPGA-based match filtering, programmed using HLS, achieves low latency and high performance, running 937 times faster than the same process in Octave on a standard computer.
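The match filtering described in this abstract (more commonly called matched filtering) can be illustrated with a minimal pure-Python sketch. It is not the authors' HLS design: the function names, the chirp parameters, and the reading of "symmetric" as a chirp whose time axis is centered on zero are all assumptions made for illustration. The filter correlates the received samples with the conjugate of the reference chirp, and the output peak marks the echo delay.

```python
import cmath
import math


def linear_chirp(n_samples, bandwidth, sample_rate):
    """Complex baseband linear FM chirp with its time axis centered on zero."""
    duration = n_samples / sample_rate
    rate = bandwidth / duration  # chirp rate in Hz per second
    chirp = []
    for n in range(n_samples):
        t = (n - (n_samples - 1) / 2) / sample_rate
        chirp.append(cmath.exp(1j * math.pi * rate * t * t))
    return chirp


def matched_filter(received, reference):
    """Correlate `received` with the conjugate of `reference` (direct form)."""
    n_out = len(received) - len(reference) + 1
    out = []
    for lag in range(n_out):
        acc = 0j
        # Each product below is independent of the others, which is the
        # kind of loop a parallel implementation could unroll.
        for k, ref in enumerate(reference):
            acc += received[lag + k] * ref.conjugate()
        out.append(acc)
    return out


# Hypothetical parameters, chosen only for illustration.
reference = linear_chirp(n_samples=64, bandwidth=20e6, sample_rate=50e6)
delay = 100
received = [0j] * delay + [0.5 * s for s in reference] + [0j] * 92

compressed = matched_filter(received, reference)
peak_index = max(range(len(compressed)), key=lambda i: abs(compressed[i]))
```

In this sketch the peak of `compressed` falls at the sample index where the echo starts, so `peak_index` recovers `delay`. A frequency-domain (FFT-based) formulation would give the same result and is a common alternative for long reference signals.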
ISBN (print): 9798350326598; 9798350326581
Contemporary data center CPUs are experiencing an unprecedented surge in core count. This trend necessitates scrutinized Last-Level Cache (LLC) strategies to accommodate increasing capacity demands. While DRAM offers significant capacity, using it as a cache poses challenges related to latency and energy. This paper introduces Native DRAM Cache (NDC), a novel DRAM architecture specifically designed to operate as a cache. NDC features innovative approaches, such as conducting tag matching and way selection within a DRAM subarray and repurposing existing precharge transistors for tag matching. These innovations facilitate Caching-In-Memory (CIM) and enable NDC to serve as a high-capacity LLC with high set-associativity, low latency, high throughput, and low energy. Our evaluation demonstrates that NDC significantly outperforms state-of-the-art DRAM cache solutions, enhancing performance by 2.8%/52.5%/44.2% (up to 8.4%/140.6%/85.5%) in the SPEC/NPB/GAP benchmark suites, respectively.
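For readers unfamiliar with the terms "tag matching", "way selection", and "set-associativity", the following toy software model may help. It is only a sketch of a generic set-associative cache lookup and says nothing about how NDC performs these steps inside a DRAM subarray. The class name, the LRU replacement policy, and the sizes are assumptions made for illustration.

```python
class SetAssociativeCache:
    """Toy set-associative cache with LRU replacement (software model only)."""

    def __init__(self, num_sets, num_ways, line_bytes):
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.line_bytes = line_bytes
        # Each set holds its tags ordered from least to most recently used.
        self.sets = [[] for _ in range(num_sets)]

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then filled)."""
        line = address // self.line_bytes
        index = line % self.num_sets   # selects the set
        tag = line // self.num_sets    # identifies the line within the set
        ways = self.sets[index]
        if tag in ways:                # tag matching across the ways of the set
            ways.remove(tag)
            ways.append(tag)           # mark as most recently used
            return True
        if len(ways) == self.num_ways:
            ways.pop(0)                # evict the least recently used way
        ways.append(tag)
        return False


# Hypothetical configuration and address trace; every address maps to set 0.
cache = SetAssociativeCache(num_sets=4, num_ways=2, line_bytes=64)
trace = [0, 256, 0, 512, 256, 0]
results = [cache.access(address) for address in trace]
```

With only two ways, the three lines competing for set 0 keep evicting each other, so only the third access in the trace hits. Raising `num_ways` removes these conflict misses, which is the usual motivation for high set-associativity.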
ISBN (print): 9798350363869; 9798350363852
The emerging technology of quantum computing has the potential to change the way problems are solved in the future. This work presents a centralized network control algorithm executable on existing quantum computers that are based on the principle of quantum annealing, such as the D-Wave Advantage. We introduce a resource reoccupation algorithm for traffic engineering in wide-area networks. The proposed optimization algorithm changes traffic steering and resource allocation when transceivers are overloaded. For stability reasons, the settings of active components such as fiber amplifiers and transceivers are not changed. The algorithm is beneficial when network traffic fluctuates on time scales of seconds or when spontaneous bursts occur. Further, we developed a discrete-time flow simulator to study the algorithm's performance in wide-area networks. Our network simulator models backlog and loss on buffered transmission lines. Competing flows are handled equally in case of a backlog. This work provides an ILP-based network configuration algorithm that is applicable to quantum annealing computers. We show that traffic losses can be reduced by a factor of 2 if a resource reoccupation algorithm is applied in a network with bursty traffic. As reoccupation uses resources more efficiently under heavy load, overprovisioning of networks can be reduced. Thus, this new form of network operation leads toward a zero-margin network. We show that our newly introduced network simulator enables analyses of short-time effects such as buffering within fat-pipe networks. As the calculation of network configurations in real-sized networks is typically time-consuming, quantum computing can make the proposed network configuration algorithm applicable to real-sized wide-area networks.
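The backlog and loss behavior of a buffered transmission line can be sketched for a single link as follows. This is not the authors' simulator: the function name, the equal-share (water-filling) service rule used as one reading of "handled equally", and the proportional dropping of buffer overflow are assumptions made for illustration.

```python
def simulate_link(arrivals, capacity, buffer_size):
    """Simulate one buffered link in discrete time.

    arrivals: list of time steps, each a list with one arrival volume per flow.
    capacity: volume the link can transmit per time step.
    buffer_size: maximum total backlog carried over to the next step.
    Returns (sent, lost, backlog): per-flow totals and the final backlog.
    """
    n_flows = len(arrivals[0])
    backlog = [0.0] * n_flows
    sent = [0.0] * n_flows
    lost = [0.0] * n_flows
    for step in arrivals:
        queued = [b + a for b, a in zip(backlog, step)]
        # Serve waiting flows with equal shares; capacity left unused by a
        # small flow is redistributed among the remaining flows.
        remaining = capacity
        active = [i for i in range(n_flows) if queued[i] > 0]
        while active and remaining > 1e-12:
            share = remaining / len(active)
            for i in list(active):
                served = min(share, queued[i])
                queued[i] -= served
                sent[i] += served
                remaining -= served
                if queued[i] <= 1e-12:
                    active.remove(i)
        # Whatever exceeds the buffer is dropped, proportionally per flow.
        total = sum(queued)
        if total > buffer_size:
            scale = buffer_size / total
            for i in range(n_flows):
                lost[i] += queued[i] * (1 - scale)
                queued[i] *= scale
        backlog = queued
    return sent, lost, backlog


# Hypothetical two-flow burst followed by one idle time step.
sent, lost, backlog = simulate_link(
    arrivals=[[12.0, 4.0], [0.0, 0.0]], capacity=10.0, buffer_size=5.0
)
```

In this example the burst of 16 units meets a capacity of 10 per step: the small flow is served completely, the large flow keeps 5 units in the buffer and loses 1 unit, and the backlog drains in the idle step.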
The construction industry is one of the largest consumers of natural resources, including water, materials, and energy. Towards the goal of more sustainable building design, we present a "tiny home" case stu...
To improve the control performance and anti-interference ability of the mining high-energy-density permanent magnet direct-drive all-in-one machine, and because the performance and efficiency of traditional PID control are not high, and...
Genetic programming (GP) presents a unique challenge in fitness evaluation due to the need to repeatedly execute the evolved programs, often represented as tree structures, to assess their quality on multiple input da...
Beatnik is a novel open source mini-application that exercises the complex communication patterns often found in production codes but rarely found in benchmarks or mini-applications. It simulates 3D Rayleigh-Taylor ins...
ISBN (print): 9781665493468
Video denoising is a fundamental problem in numerous computer vision applications. State-of-the-art attention-based denoising methods typically yield good results, but require vast amounts of GPU memory and usually suffer from very long computation times. Especially in the field of restoring digitized high-resolution historic films, these techniques are not applicable in practice. To overcome these issues, we introduce a lightweight video denoising network that combines efficient axial-coronal-sagittal (ACS) convolutions with a novel shifted window attention formulation (ASwin), which is based on the memory-efficient aggregation of self- and cross-attention across video frames. We numerically validate the performance and efficiency of our approach on synthetic Gaussian noise. Moreover, we train our network as a general-purpose blind denoising model for real-world videos, using a realistic noise synthesis pipeline to generate clean-noisy video pairs. A user study and non-reference quality assessment prove that our method outperforms the state-of-the-art on real-world historic videos in terms of denoising performance and temporal consistency.
ISBN (print): 9798350307184
Camera-based 3D object detection in BEV (Bird's Eye View) space has drawn great attention over the past few years. Dense detectors typically follow a two-stage pipeline by first constructing a dense BEV feature and then performing object detection in BEV space, which suffers from complex view transformations and high computation cost. On the other hand, sparse detectors follow a query-based paradigm without explicit dense BEV feature construction, but achieve worse performance than their dense counterparts. In this paper, we find that the key to mitigating this performance gap is the adaptability of the detector in both BEV and image space. To achieve this goal, we propose SparseBEV, a fully sparse 3D object detector that outperforms the dense counterparts. SparseBEV contains three key designs: (1) scale-adaptive self-attention to aggregate features with an adaptive receptive field in BEV space, (2) adaptive spatio-temporal sampling to generate sampling locations under the guidance of queries, and (3) adaptive mixing to decode the sampled features with dynamic weights from the queries. On the test split of nuScenes, SparseBEV achieves state-of-the-art performance of 67.5 NDS. On the val split, SparseBEV achieves 55.8 NDS while maintaining a real-time inference speed of 23.5 FPS. Code is available at https://***/MCG-NJU/SparseBEV.
Salient object detection is essential for many computer vision tasks and aims to detect the most prominent objects in images. However, existing methods often perform poorly when dealing with complex scenes. To overcom...