The simulation of interconnection network and MPI message transportation behavior is an important part of the simulation of high-performance computing systems. Current research on the simulation of message transportat...
ISBN: 9781665469463 (digital)
ISBN: 9781665469463 (print)
Recent learning-based lossless image compression methods encode an image in units of subimages and achieve performance comparable to conventional non-learning algorithms. However, these methods treat low- and high-frequency areas equally and do not account for the performance drop in high-frequency regions. In this paper, we propose a new lossless image compression method that encodes in a coarse-to-fine manner so that low- and high-frequency regions are separated and processed differently. We first compress the low-frequency components and then use them as additional input for encoding the remaining high-frequency region. The low-frequency components act as a strong prior in this case, which improves estimation in the high-frequency area. In addition, we design the frequency decomposition process to adapt to color channel, spatial location, and image characteristics. As a result, our method derives an image-specific optimal ratio of low- to high-frequency components. Experiments show that the proposed method achieves state-of-the-art performance on benchmark high-resolution datasets.
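The coarse-to-fine idea in this abstract can be illustrated with a minimal sketch. It assumes a fixed 2×2 block mean as the low-frequency component, which is only a stand-in for the learned, adaptive decomposition the abstract describes; the function names `split_low_high` and `merge_low_high` are hypothetical. The sketch shows only that the split is exactly invertible and that the high-frequency residual is computed relative to the already-known low-frequency part.

```python
def split_low_high(img, block=2):
    """Split an integer image (list of rows) into a coarse low-frequency
    image of block means and a full-resolution residual.

    Assumption: a fixed block mean replaces the paper's adaptive decomposition.
    """
    h, w = len(img), len(img[0])
    low = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            vals = [img[y][x]
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))]
            row.append(sum(vals) // len(vals))  # integer (floor) mean
        low.append(row)
    # The residual is taken relative to the low-frequency value of each block.
    high = [[img[y][x] - low[y // block][x // block] for x in range(w)]
            for y in range(h)]
    return low, high


def merge_low_high(low, high, block=2):
    """Invert split_low_high exactly: add the block mean back to each residual."""
    return [[high[y][x] + low[y // block][x // block]
             for x in range(len(high[0]))]
            for y in range(len(high))]


img = [[10, 12, 200, 202],
       [11, 13, 201, 203],
       [50, 50, 50, 50]]
low, high = split_low_high(img)
restored = merge_low_high(low, high)
```

In this toy image the residuals stay within a few gray levels of zero, even though pixel values span 10 to 203. That is the sense in which the coarse component can serve as a prior for the fine one. A real codec would feed both parts to an entropy coder, which this sketch omits.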
The development of nanotechnology in recent years makes intra-body flow-guided nanonetworks more feasible. Monitoring different biomarkers, detecting infectious agents, locating cancerous cells, delivering drugs accur...
During the rapid evolution of 5G networks, the efficient allocation of bandwidth, frequency spectrum, and computing power is critical to maintaining a high standard of service and performance. A methodology for predic...
Recently, transformer-based methods have achieved strong performance in the human-object interaction (HOI) detection task. However, most of them directly utilize the semantically high-level feature from the d...
With advances in neural network technology, Processing-In-Memory (PIM) has emerged as a solution to performance bottlenecks between processors and memory. Among various PIM design techniques, integrating processing u...
In Artificial Intelligence (AI) and high-performance computing (HPC), growing data and model sizes require distributed processing across multiple nodes due to single-node limitations, increasing inter-node communicati...
Event extraction is an important task in natural language processing, and it is widely utilized in intelligence domains such as business and military for information extraction. Recently, many works have successfully ...
This paper explores an enhanced learning model of the Elman Neural Network that uses the sine cosine algorithm to address the problems of local minima and slow convergence. Elman Neural Network that is ...
ISBN: 9798400701900 (print)
This paper presents an octree construction method, called Cornerstone, that facilitates global domain decomposition and interactions between particles in mesh-free numerical simulations. Our method is based on algorithms developed for 3D computer graphics, which we extend to distributed high-performance computing (HPC) systems. Cornerstone yields global and locally essential octrees and is able to operate on all levels of tree hierarchies in parallel. The resulting octrees are suitable for supporting the computation of various kinds of short- and long-range interactions in N-body methods, such as Barnes-Hut and the Fast Multipole Method (FMM). While we provide a CPU implementation, Cornerstone can also run entirely on GPUs. This results in significantly faster tree construction than on CPUs and serves as a powerful building block for the design of simulation codes that move beyond an offloading approach, where only numerically intensive tasks are dispatched to GPUs. With data residing exclusively in GPU memory, Cornerstone eliminates data movements between CPUs and GPUs. As an example, we employ Cornerstone to generate locally essential octrees for a Barnes-Hut treecode running on almost the full LUMI-G system with up to 8 trillion particles.
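The abstract does not say which key scheme Cornerstone uses. The sketch below assumes Morton (Z-order) keys, a common choice in tree-building algorithms from computer graphics, to show how sorted integer keys relate to octree nodes. All names (`morton3d`, `particle_key`, `counts_at_level`) and the 10-bits-per-axis resolution are illustrative assumptions, not Cornerstone's API.

```python
from collections import Counter


def morton3d(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three integer coordinates (x highest)."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b + 2)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b)
    return key


def quantize(c, bits=10):
    """Map a coordinate in [0, 1) to an integer cell index in [0, 2**bits)."""
    n = 1 << bits
    return min(int(c * n), n - 1)


def particle_key(x, y, z, bits=10):
    """Morton key of a particle located in the unit cube."""
    return morton3d(quantize(x, bits), quantize(y, bits), quantize(z, bits), bits)


def counts_at_level(keys, level, bits=10):
    """Count particles per octree node at `level` (root = 0).

    A node at a given level is identified by the leading 3*level key bits.
    """
    shift = 3 * (bits - level)
    return Counter(k >> shift for k in keys)


particles = [(0.1, 0.1, 0.1), (0.2, 0.1, 0.1), (0.9, 0.9, 0.9), (0.75, 0.25, 0.25)]
keys = sorted(particle_key(*p) for p in particles)
level1 = counts_at_level(keys, 1)
```

Because every node corresponds to a contiguous range of sorted keys, per-node particle counts at any level reduce to prefix comparisons. This property makes this style of construction amenable to data-parallel execution.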