ISBN (digital): 9789819947492
ISBN (print): 9789819947485; 9789819947492
Three-dimensional (3D) reconstruction in cryo-electron tomography (cryo-ET) plays an important role in studying in situ biological macromolecular structures at the nanometer level. Because the tilt angle range is limited, 3D reconstruction in cryo-ET suffers from a "missing wedge" problem that severely degrades accuracy. Multi-tilt reconstruction is an effective method to reduce artifacts and suppress the effect of the missing wedge. As the number of tilt series increases, however, the large data size leads to high computational cost and a huge memory overhead. Because of this memory limit, multi-tilt reconstruction cannot be performed in parallel on GPUs, especially when the image size reaches 1 K, 2 K, or even larger. To optimize large-scale multi-tilt reconstruction of cryo-ET, we propose a new GPU-based large-scale multi-tilt tomographic reconstruction algorithm (GM-SIRT). Furthermore, we design a two-level data partition strategy in GM-SIRT to greatly reduce the memory required during the whole reconstruction process. Experimental results show that GM-SIRT performs significantly better than DM-SIRT, the distributed multi-tilt reconstruction algorithm on a CPU cluster. The acceleration ratio is over 300%, and the memory requirement drops to one-third of that of DM-SIRT when the image size reaches 2 K.
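For orientation, the following is a minimal sketch of a plain SIRT update on a toy linear system, assuming that the "SIRT" in GM-SIRT refers to the standard simultaneous iterative reconstruction technique. It is not the authors' GPU implementation and does not model their two-level data partition; the function name `sirt`, the matrix `A`, and the data `b` are hypothetical and chosen only for illustration. Each iteration forward-projects the current estimate, normalizes the residual by the row sums, back-projects it, and normalizes by the column sums.

```python
def sirt(A, b, iterations=200):
    """Plain SIRT for a non-negative system A x = b (illustrative sketch).

    Update rule: x <- x + C * A^T * R * (b - A x), where R holds the
    inverse row sums of A and C holds the inverse column sums of A.
    """
    rows, cols = len(A), len(A[0])
    row_sums = [sum(row) for row in A]
    col_sums = [sum(A[i][j] for i in range(rows)) for j in range(cols)]
    x = [0.0] * cols
    for _ in range(iterations):
        # Forward-project the estimate and weight the residual by 1/row_sum.
        residual = [
            (b[i] - sum(A[i][j] * x[j] for j in range(cols))) / row_sums[i]
            for i in range(rows)
        ]
        # Back-project the weighted residual and normalize by 1/col_sum.
        for j in range(cols):
            x[j] += sum(A[i][j] * residual[i] for i in range(rows)) / col_sums[j]
    return x


# Toy example: four "rays" through three "voxels" (hypothetical values).
A = [
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]
x_true = [1.0, 2.0, 3.0]
b = [sum(a * t for a, t in zip(row, x_true)) for row in A]

x_est = sirt(A, b)
print([round(v, 6) for v in x_est])
```

In a real reconstruction the matrix is never stored explicitly; the forward and back projections are computed on the fly, and this is where memory partitioning of the volume and projections becomes relevant.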
ISBN (print): 9798331524937
Over the last decade, radio astronomy has entered a new era: the advent of the Square Kilometer Array (SKA), preceded by its pathfinders, will produce a huge amount of data that will be hard to process with a traditional approach. This means that the current state-of-the-art software for data reduction and imaging will have to be redesigned to face this data challenge. To manage such an increase in data size and computational requirements, scientists need to exploit modern high-performance computing (HPC) architectures. In particular, heterogeneous systems, based on complex combinations of CPUs, accelerators, high-speed networks and composite storage devices, need to be used efficiently and effectively. In this paper, we present an overview of Radio Imaging Code Kernels (RICK [1], [2], [3]), a code able to perform the most computationally demanding steps of the w-stacking gridder algorithm by exploiting distributed parallelism and GPU acceleration. GPU offloading is possible through CUDA, HIP, and OpenMP, aiming at the widest possible usability across multiple architectures. After detailing the (multi-)GPU approach to the problem and listing all the new code implementations, we analyze its performance considering both the computational and the communication workload. We will show how the full, distributed GPU offload of the code, the first of its kind and crucial to deal with increasingly large interferometric data, represents not only an extremely fast and optimized approach, but also the greenest one when compared to its parallel CPU counterpart. This code, now publicly available, has been tested with a wide variety of modern interferometers and SKA pathfinders. This represents, to date, the first example of radio imaging software fully enabled for GPUs, making it a potential state-of-the-art approach for the upcoming SKA. Finally, we will also present the future perspectives for the code, planned to be converted into a library and possibly be used by any of the most
The QR factorization, which is a fundamental operation in linear algebra, is used extensively in scientific simulations. Accelerating it and reducing its memory usage are important research targets. QR factorization usi...
According to the traditional multi-dimensional FFT, memory layouts of high-dimensional data are discontinuous. Transposition is introduced to keep high-dimensional data continuous in memory. However, transposition inc...
Despite the increasing adoption of FPGAs in compute clouds, there remains a significant gap in programming tools and abstractions which can leverage network-connected, cloud-scale, multi-die FPGAs to generate accelera...
The solver module of the Astrometric Verification Unit Global Sphere Reconstruction (AVU GSR) pipeline aims to find the astrometric parameters of $\sim 10^8$ stars in the Milky Way, the attitude and instrumental settings of...
ISBN (print): 9783030967727; 9783030967710
In this paper, we propose a Zero-shot Face Swapping Network (ZFSNet) to swap novel identities for which no training data is available, a setting of high practical value. In contrast to many existing methods that consist of several stages, the proposed model can generate images containing the unseen identity in a single forward pass without fine-tuning. To achieve this, building on the basic encoder-decoder framework, we add a de-identification (De-ID) module after the encoder. It removes the source identity information retained in the encoding stream and improves the model's generalization capability. We then introduce an attention component (ASSM) to adaptively blend the encoded source feature and the target identity feature. It amplifies proper local details and helps the decoder attend to the related identity feature. Extensive experiments on synthesized and real images demonstrate that the proposed modules are effective in zero-shot face swapping. In addition, we evaluate our framework on zero-shot facial expression translation to show its versatility and flexibility.
The accuracy of processor power modeling is an important foundation for power management and optimization in parallel computing systems. It is difficult to build a high-accuracy instantaneous CPU/DRAM power prediction ...
The General Data Protection Regulation (GDPR) of the European Union became binding in May 2018. The objective of the GDPR is essentially twofold. On the one hand, it seeks to facilitate the free movement of personal d...
ISBN (digital): 9798350391954
ISBN (print): 9798350391961
With the rapid development of wireless communication networks, the channel has become the most important part of the communication process, and the problem of assigning channels reasonably is increasingly serious. To address this problem, we study the radio labeling problem on graphs. A graph is commonly used to model channel assignment in wireless communication, so that channel assignment in the network is converted to a vertex labeling problem on the graph. Radio labeling has various applications, such as frequency assignment in mobile communication systems, signal processing, parallel and distributed computing, and circuit and sensor network design, and it plays an important role in the channel assignment process of wireless communication networks. The maximum radio label of the graph is called its span, and the minimum possible span is called the radio number of the graph. The aim is to find an optimal radio labeling that reduces the channel utilization rate in the network and thereby reduces interference during network communication. In this paper, we mainly study the topology of the Cartesian product of stars with $n$ vertices and the middle graph of cycles, where $m \geq 3$. We simulate the channel assignment of a wireless communication network with the same structure, obtain a lower bound on its radio labels, and determine an optimal radio labeling.
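The span and radio number described above can be illustrated on very small graphs. The sketch below assumes the standard radio labeling condition, which the abstract does not spell out: for every pair of distinct vertices $u, v$ of a connected graph $G$, $|f(u) - f(v)| \geq \mathrm{diam}(G) + 1 - d(u, v)$. It also assumes that labels start at 0 (some authors start at 1, which shifts the radio number by one). The helper names `all_pairs_distances`, `is_radio_labeling`, and `radio_number` are hypothetical, and the exhaustive search is only practical for a handful of vertices; it is not the method used in the paper.

```python
from collections import deque
from itertools import permutations


def all_pairs_distances(adj):
    """Breadth-first search from every vertex of a connected graph."""
    dist = {}
    for source in adj:
        d = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[source] = d
    return dist


def is_radio_labeling(adj, labels):
    """Check |f(u) - f(v)| >= diam + 1 - d(u, v) for all distinct u, v."""
    dist = all_pairs_distances(adj)
    diam = max(max(d.values()) for d in dist.values())
    verts = list(adj)
    return all(
        abs(labels[u] - labels[v]) >= diam + 1 - dist[u][v]
        for i, u in enumerate(verts)
        for v in verts[i + 1:]
    )


def radio_number(adj):
    """Smallest span of a radio labeling, found by exhaustive search.

    Labels must be pairwise distinct, so the span is at least n - 1.
    """
    verts = list(adj)
    span = len(verts) - 1
    while True:
        for perm in permutations(range(span + 1), len(verts)):
            if is_radio_labeling(adj, dict(zip(verts, perm))):
                return span
        span += 1


# Star with one center (vertex 0) and three leaves.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
# Path on four vertices.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

print(radio_number(star))  # 4
print(radio_number(path))  # 5
```

For the star, the center can take label 0 and the leaves 2, 3, 4, and no labeling with span 3 exists because the center must differ by at least 2 from every leaf.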