ISBN (print): 9798350342918
NVMe zoned namespace (ZNS) SSDs present a new class of storage devices with attractive features including low cost, software definability, and stable performance. However, a primary culprit that hinders the adoption of ZNS is the high garbage collection (GC) overhead it imposes on host software. The ZNS interface divides the logical address space into fixed-size zones that must be written sequentially. While friendly to flash memory, this requires host software to perform out-of-place updates and GC on individual zones. Current ZNS SSDs typically employ a large zone size (e.g., several GBs) to be conducive to die-level RAID protection on flash memory. This impedes flexible data placement, such as mixing data with different lifetimes in the same zone, and incurs sizable data migration during zone GC. To address this problem, we propose FlexZNS, a novel ZNS SSD design that provides reliable zoned storage while allowing host software to flexibly configure the zone size and even use multiple zone sizes at once. The size variability of zones poses two interrelated challenges: one for the SSD controller, which must establish per-zone RAID protection, and the other for host software, which must manage the variable zone capacity loss caused by parity storage. To tackle these challenges, FlexZNS decouples the storage of parity from individual zones on flash memory and hides the zone capacity loss from the host software. We verify FlexZNS with the ZNS-compatible file system F2FS and the popular key-value store RocksDB. Extensive experiments demonstrate that FlexZNS significantly improves system performance and reduces GC-induced write amplification compared with a conventional ZNS SSD with large zones.
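To make the zone-GC cost concrete, here is a minimal Python sketch (a hypothetical model, not the FlexZNS implementation): it captures the append-only write rule of ZNS zones and shows why GC must rewrite every still-valid block in a victim zone, so large zones that mix data lifetimes migrate more data.

```python
# Hypothetical model of ZNS zones (not the FlexZNS code): each zone is
# append-only and must be reset wholesale, so GC rewrites all live data.

class Zone:
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.write_pointer = 0          # ZNS: writes must land here, in order
        self.live = {}                  # zone offset -> logical block address

    def append(self, lba):
        if self.write_pointer >= self.capacity:
            raise IOError("zone full: must reset (erase) before reuse")
        self.live[self.write_pointer] = lba
        self.write_pointer += 1

    def invalidate(self, offset):
        self.live.pop(offset, None)     # out-of-place update landed elsewhere

def gc_zone(victim, open_zone):
    """Migrate valid blocks out of victim, then reset it.
    Returns the number of migrated blocks (the GC write amplification)."""
    moved = 0
    for lba in list(victim.live.values()):
        open_zone.append(lba)           # rewrite still-valid data
        moved += 1
    victim.live.clear()
    victim.write_pointer = 0            # zone reset
    return moved
```

Under this model the migration cost of a reset is exactly `moved`; smaller or flexibly sized zones keep that number down, which is the effect FlexZNS targets while preserving per-zone RAID parity off to the side.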
ISBN (print): 9789819772315; 9789819772322
Existing retina OCT image automatic classification systems encounter challenges in deployment due to their substantial size. To address this, we propose Light-AP-EfficientNet, a lightweight model leveraging adaptive pooling for efficient feature extraction and classification, specifically designed for effective data mining applications in medical imaging. Firstly, we optimize EfficientNet's convolutional layer settings to remove redundant convolutional structures, significantly reducing the model's parameters. Then, we integrate adaptive pooling layers to help the model learn both global and local features, enhancing classification performance. Experimental results demonstrate that Light-AP-EfficientNet achieves 99.7% accuracy on the UCSD dataset, while requiring only 17% of the parameters of ShuffleNetV2 and 19% of the computation of MobileNetV2. Additionally, it processes a single image in just 0.028 s on a CPU and 0.009 s on a GPU. Furthermore, compared to recent models on the same dataset, our model demonstrates significant improvements in metrics such as accuracy and precision: the maximum improvement in accuracy is 4.5%, and in precision 5.42%. With its high accuracy and reduced hardware requirements, Light-AP-EfficientNet is ideal for data mining tasks in resource-constrained scenarios.
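As an illustration of the adaptive-pooling idea (the layer sizes and dimensions below are assumptions, not the paper's exact configuration), a classification head can pool the same feature map at two output resolutions to combine global context with coarse local features:

```python
# Illustrative sketch, not the paper's architecture: adaptive pooling yields
# fixed-size outputs regardless of input resolution, so a head can fuse a
# global descriptor (1x1) with a coarse local grid (2x2).
import torch
import torch.nn as nn

class AdaptivePoolHead(nn.Module):
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.global_pool = nn.AdaptiveAvgPool2d(1)   # 1x1: global context
        self.local_pool = nn.AdaptiveAvgPool2d(2)    # 2x2: local features
        self.fc = nn.Linear(in_channels * (1 + 4), num_classes)

    def forward(self, x):                            # x: (N, C, H, W) features
        g = self.global_pool(x).flatten(1)           # (N, C)
        l = self.local_pool(x).flatten(1)            # (N, 4C)
        return self.fc(torch.cat([g, l], dim=1))

# Usage: 4 classes as in the UCSD OCT benchmark (CNV, DME, drusen, normal)
head = AdaptivePoolHead(in_channels=1280, num_classes=4)
logits = head(torch.randn(2, 1280, 7, 7))
```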
ISBN (digital): 9789819708116
ISBN (print): 9789819708109; 9789819708116
As the application scenarios for large-scale spiking neural networks (SNN) increase, efficient SNN simulation becomes more essential. However, simulating such a large-scale network faces expensive overhead in terms of computation and communication, especially for high firing rates. To address this problem, we propose an effective accelerated optimization method for simulating SNN on GPGPUs, which simultaneously takes into account workload balancing and communication overhead. We design a workload-oriented network partition algorithm to minimize the number of external synapses and ensure workload balance. Additionally, we propose spike synchronization optimization by incorporating fine-grained scale, data compression, and full-duplex communication. This optimization aims to achieve lower communication overhead and better performance improvement. Furthermore, to avoid thread warp divergence, we assign an entire thread block for each neuron without collecting information on fired neurons in the spike propagation phase, which simplifies the execution flow and enhances performance. Experimental results demonstrate that our simulator can achieve up to 1.31x to 6.74x speedup for SNN with different configurations, and the efficiency is improved by 40.21% to 51.11% compared with the state-of-the-art methods.
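The abstract does not give the partition algorithm itself; the following is a hypothetical greedy sketch in the spirit of a workload-oriented partition: each neuron is assigned to the GPU partition where it would create the fewest external synapses, subject to a load-balance cap.

```python
# Hypothetical sketch, not the authors' algorithm: greedy neuron placement
# that trades off cut synapses (inter-GPU communication) against balance.

def partition(neurons, synapses, num_parts):
    """neurons: iterable of ids; synapses: dict neuron -> set of neighbors."""
    load_cap = len(neurons) / num_parts * 1.05      # 5% imbalance tolerance
    assignment, loads = {}, [0] * num_parts
    for n in neurons:
        best, best_cut = None, None
        for p in range(num_parts):
            if loads[p] >= load_cap:
                continue                            # partition already full
            # external synapses this placement would create so far
            cut = sum(1 for m in synapses.get(n, ())
                      if m in assignment and assignment[m] != p)
            if best_cut is None or cut < best_cut:
                best, best_cut = p, cut
        assignment[n] = best
        loads[best] += 1
    return assignment
```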
Multiple Sclerosis is a chronic disease of the central nervous system that affects millions worldwide, and early detection is crucial for better treatment outcomes. Current detection methods are expensive and invasive...
In robotics applications, ensuring reliable performance in the presence of actuator faults is essential for maintaining system safety and reliability. This paper presents a tolerant tracking control method for permane...
ISBN (print): 9798350301298
Existing dehazing approaches struggle to process real-world hazy images owing to the lack of paired real data and robust priors. In this work, we present a new paradigm for real image dehazing from the perspectives of synthesizing more realistic hazy data and introducing more robust priors into the network. Specifically, (1) instead of adopting the de facto physical scattering model, we rethink the degradation of real hazy images and propose a phenomenological pipeline considering diverse degradation types. (2) We propose a Real Image Dehazing network via high-quality Codebook Priors (RIDCP). Firstly, a VQGAN is pre-trained on a large-scale high-quality dataset to obtain a discrete codebook encapsulating high-quality priors (HQPs). After the negative effects brought by haze are replaced with HQPs, the decoder, equipped with a novel normalized feature alignment module, can effectively utilize high-quality features and produce clean results. However, although our degradation pipeline drastically mitigates the domain gap between synthetic and real data, the gap cannot be eliminated entirely, which makes HQP matching in the wild challenging. Thus, we re-calculate the distance when matching features to HQPs via a controllable matching operation, which facilitates finding better counterparts. We derive a recommended setting for this control from an explainable analysis, and users can also flexibly adjust the enhancement degree as per their preference. Extensive experiments verify the effectiveness of our data synthesis pipeline and the superior performance of RIDCP in real image dehazing. Code and data are released at https://***/projects/RIDCP.
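A minimal sketch of what a controllable matching operation over codebook priors can look like (the variable names, the additive bias form, and all constants are assumptions, not the released RIDCP code):

```python
# Hypothetical sketch of controllable codebook matching: a user-tunable
# alpha re-weights the nearest-neighbor distances over HQP entries.
import torch

def controllable_match(z, codebook, bias, alpha=0.0):
    """z: (N, D) encoder features; codebook: (K, D) HQP entries;
    bias: (K,) per-entry adjustment; alpha: enhancement-degree control."""
    dist = torch.cdist(z, codebook)          # (N, K) Euclidean distances
    dist = dist + alpha * bias               # re-calculated, controllable
    idx = dist.argmin(dim=1)                 # best-matching HQP per feature
    return codebook[idx], idx

codebook = torch.randn(1024, 256)            # K=1024 codes of dim 256 (illustrative)
z = torch.randn(8, 256)
quantized, idx = controllable_match(z, codebook,
                                    bias=torch.randn(1024), alpha=0.5)
```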
The high-voltage power supply based on pulse modulation requires a more comprehensive management system to address the deficiencies in the original system, including inadequate functionality and a lack of storage for ...
Hardware development for CRYSTALS-Kyber is essential to prevent future quantum computer attacks. However, existing CRYSTALS-Kyber hardware often has low performance and lacks the flexibility to support the three opera...
ISBN (print): 9798350344868; 9798350344851
Virtual try-on can significantly improve the garment shopping experience in both online and in-store scenarios, attracting broad interest in computer vision. However, to achieve high-fidelity try-on performance, most state-of-the-art methods still rely on accurate segmentation masks, which are often produced by near-perfect parsers or manual labeling. To overcome this bottleneck, we propose a parser-free virtual try-on method based on the diffusion model (PFDM). Given two images, PFDM can seamlessly "wear" the garment on the target person through implicit warping, without requiring any additional information. To learn the model effectively, we synthesize many pseudo-images and construct sample pairs by wearing various garments on persons. Supervised by this large-scale expanded dataset, we fuse the person and garment features using a proposed Garment Fusion Attention (GFA) mechanism. Experiments demonstrate that PFDM can successfully handle complex cases, synthesize high-fidelity images, and outperform both state-of-the-art parser-free and parser-based models.
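The abstract does not detail GFA's internals; a plausible minimal sketch of fusing person and garment features with cross-attention (PyTorch, all hyperparameters illustrative, not the paper's design) is:

```python
# Hypothetical fusion block: person tokens query garment tokens so garment
# appearance is injected at the spatial positions where it is "worn".
import torch
import torch.nn as nn

class FusionAttention(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, person_tokens, garment_tokens):
        fused, _ = self.attn(query=person_tokens,
                             key=garment_tokens,
                             value=garment_tokens)
        return self.norm(person_tokens + fused)   # residual + norm

gfa = FusionAttention(dim=320)
out = gfa(torch.randn(1, 4096, 320), torch.randn(1, 4096, 320))
```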