This paper presents a novel approach to address the issue of identity protection in facial image datasets. Our goal is to prevent any violation of privacy for the individuals depicted in the dataset while ensuring tha...
ISBN (print): 9781665473156
Most supercomputers adopt a data forwarding architecture to achieve storage scalability. However, this results in a significant reduction in single-process bandwidth compared to direct file system access. Moreover, considering that a majority of applications use only a single process for writing and reading data, the low single-process performance also imposes a time overhead on these applications. This paper proposes a user-space forwarding mechanism, DFBUFFER, with two performance optimizations: user-space multi-threaded request processing and a per-file data write buffer. The DFBUFFER client is embedded in the application as a library, reducing software overhead, and the server implements multi-threaded I/O request processing to improve bandwidth efficiency. The data write buffer handles write requests asynchronously, which accelerates the write bandwidth of compute nodes. We evaluate DFBUFFER on the Sunway exascale prototype system. The results indicate that in the regular mode of DFBUFFER, both write and read latency are reduced, and single-process write bandwidth and large-block read bandwidth are increased by 1.8 times and 2.8 times, respectively. The DFBUFFER buffer mode increases single-process write bandwidth by a further 0.8 times over the regular mode. Although the performance advantage of the regular mode weakens as the number of concurrent processes grows, the buffer mode still improves write bandwidth, by 0.2 times for an application with 64 I/O processes.
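To make the buffer mode more concrete, the following Python sketch shows one way an asynchronous per-file write buffer can be organized: the application thread enqueues writes and returns immediately, while a background thread applies them to the file. The class and function names here are hypothetical; the actual DFBUFFER client is a library linked into the application, not this Python code.

# Minimal sketch of an asynchronous per-file write buffer (hypothetical names,
# not the DFBUFFER implementation itself).
import os
import queue
import threading

_SENTINEL = object()

class BufferedFileWriter:
    """Buffers writes for one file and flushes them from a background thread,
    so the compute thread does not block on the forwarding layer."""

    def __init__(self, path):
        self._queue = queue.Queue()
        # Ensure the file exists so the flusher can open it for updates.
        open(path, "ab").close()
        self._file = open(path, "r+b")
        self._worker = threading.Thread(target=self._flush_loop, daemon=True)
        self._worker.start()

    def write(self, offset, data):
        """Enqueue a write; returns immediately (buffer mode)."""
        self._queue.put((offset, data))

    def close(self):
        """Block until all buffered writes have reached the file."""
        self._queue.put(_SENTINEL)
        self._worker.join()
        self._file.close()

    def _flush_loop(self):
        while True:
            item = self._queue.get()
            if item is _SENTINEL:
                break
            offset, data = item
            self._file.seek(offset)
            self._file.write(data)
        self._file.flush()
        os.fsync(self._file.fileno())

# Usage: writes return quickly; persistence happens in the background.
w = BufferedFileWriter("output.dat")
w.write(0, b"header")
w.write(4096, b"payload")
w.close()

The point of the sketch is the decoupling the abstract describes: the latency the compute process sees is the cost of an enqueue, not the cost of pushing data through the forwarding path.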
Diffusion models have exhibited remarkable capabilities in text-to-image generation. However, their performance in image-to-text generation, specifically image captioning, has lagged behind Auto-Regressive (AR) models...
The emerging class of high-velocity and high-volume data analytic workflows comprises interwoven data ingestion, organization, and processing stages, with ingestion and organization steps often contributing comparable ...
ISBN (print): 9798350359329; 9798350359312
Generative adversarial networks (GANs) have advanced remarkably in diverse domains, especially image generation and editing. However, the misuse of GANs for generating deceptive images, such as face replacement, raises significant security concerns that have gained widespread attention. It is therefore urgent to develop effective detection methods to distinguish between real and fake images. Current research centers on the application of transfer learning. Nevertheless, it encounters challenges such as forgetting knowledge from the original dataset and inadequate performance when dealing with imbalanced data during training. To alleviate these issues, this paper introduces a novel GAN-generated image detection algorithm called X-Transfer, which enhances transfer learning by utilizing two neural networks that employ interleaved parallel gradient transmission. In addition, we combine an AUC loss with cross-entropy loss to improve the model's performance. We carry out comprehensive experiments on multiple facial image datasets. The results show that our model outperforms the general transfer-learning approach, with the best metric reaching 99.04%, an improvement of approximately 10%. Furthermore, the model performs well on non-face datasets, validating its generality and broader application prospects.
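Since the abstract only names the combined objective, here is a minimal, hypothetical Python/PyTorch sketch of one way an AUC-oriented loss can be mixed with cross-entropy: a pairwise squared-hinge surrogate stands in for AUC, and a weighting factor alpha balances the two terms. Both the surrogate and the weighting are assumptions, not the paper's exact formulation.

# Illustrative combined objective: binary cross-entropy plus a pairwise
# squared-hinge surrogate for AUC. Surrogate choice and `alpha` are assumptions.
import torch
import torch.nn.functional as F

def auc_surrogate_loss(logits, labels, margin=1.0):
    """Penalize fake (label 1) scores that do not exceed real (label 0)
    scores by at least `margin`; a common differentiable proxy for AUC."""
    pos = logits[labels == 1]          # scores for generated images
    neg = logits[labels == 0]          # scores for real images
    if pos.numel() == 0 or neg.numel() == 0:
        return logits.new_zeros(())
    diff = pos.unsqueeze(1) - neg.unsqueeze(0)   # all positive/negative pairs
    return torch.clamp(margin - diff, min=0).pow(2).mean()

def combined_loss(logits, labels, alpha=0.5):
    """Weighted sum of cross-entropy and the AUC surrogate."""
    bce = F.binary_cross_entropy_with_logits(logits, labels.float())
    return bce + alpha * auc_surrogate_loss(logits, labels)

# Usage with a batch of detector logits and 0/1 labels (1 = generated):
logits = torch.randn(8, requires_grad=True)
labels = torch.randint(0, 2, (8,))
loss = combined_loss(logits, labels)
loss.backward()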
Despite the remarkable progress made in deep compressed sensing (DCS), improving the reconstruction quality is still a major challenge. Existing DCS models generally still have some issues, especially in re...
With the application of blockchain light nodes in embedded devices, how to alleviate the computing pressure that complex operations such as a transaction's SPV verification place on the CPUs of embedded devices and improve ...
Many deep learning models exhibit good performance, but noise introduced during dataset construction or during image generation and transmission may be included in the training data, so the models cannot achieve the desired re...
ISBN (print): 9781450397339
Variations of stochastic gradient descent (SGD) are at the core of training deep neural network models. However, in distributed deep learning, where multiple computing devices and data segments are employed in the training process, the performance of SGD can be significantly limited by the overhead of gradient communication. Local SGD methods are designed to overcome this bottleneck by averaging the local models trained on parallel workers after multiple local iterations. Currently, both in theoretical analyses and in practical applications, most studies employ a periodic synchronization scheme by default, while few focus on aperiodic schemes that could yield better-performing models under limited computation and communication overhead. In this paper, we investigate local SGD with an arbitrary synchronization scheme to answer two questions: (1) Is the periodic synchronization scheme the best? (2) If not, what is the optimal one? First, for any synchronization scheme, we derive the performance bound under fixed overhead and formulate the performance optimization under given computation and communication constraints. We then find a succinct property of the optimal scheme: the number of local iterations decreases as training continues, which indicates that the periodic scheme is suboptimal. Furthermore, with some reasonable approximations, we obtain an explicit form of the optimal scheme and propose Aperiodic Local SGD (ALSGD) as an improved substitute for local SGD without any additional overhead. Our experiments also confirm that, with the same computation and communication overhead, ALSGD outperforms local SGD, especially on heterogeneous data.
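As an illustration of the property stated above (local iteration counts that shrink as training proceeds), the following Python sketch builds a decreasing synchronization schedule under a fixed budget of local steps and synchronizations, and runs local SGD over it. The geometric decay rule and the helper names (aperiodic_schedule, train_step, average_models) are assumptions for illustration, not the explicit optimal scheme derived in the paper.

# Illustrative aperiodic schedule: local steps between synchronizations shrink
# over training. The decay rule is an assumption, not the paper's optimum.
def aperiodic_schedule(total_steps, num_syncs, decay=0.8):
    """Split `total_steps` local iterations into `num_syncs` intervals whose
    lengths decrease geometrically by `decay`."""
    weights = [decay ** i for i in range(num_syncs)]
    scale = total_steps / sum(weights)
    intervals = [max(1, round(w * scale)) for w in weights]
    # Adjust the first (largest) interval so the total matches exactly.
    intervals[0] += total_steps - sum(intervals)
    return intervals

def local_sgd(train_step, average_models, total_steps, num_syncs):
    """Run local SGD with the aperiodic schedule: every worker takes `k`
    local steps, then all workers average their models, for each interval."""
    for k in aperiodic_schedule(total_steps, num_syncs):
        for _ in range(k):
            train_step()        # one local gradient step on every worker
        average_models()        # synchronize: average parameters across workers

# Example: 1000 local steps, 10 synchronizations -> decreasing intervals like
# [224, 179, 143, ...], versus a periodic scheme's constant 100.
print(aperiodic_schedule(1000, 10))

Under the same overhead budget (1000 local steps, 10 communication rounds), the only difference from periodic local SGD is how the rounds are spaced, which is exactly the degree of freedom the paper optimizes.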
Fast distributed garden planning systems in colleges and universities are one of the mainstream applications of contemporary computing. The latest traditional methods usually use GPU full virtualization technolo...