ISBN: (Print) 1595936025
With single-thread performance starting to plateau, hardware architects have turned to chip-level multiprocessing (CMP) to increase processing power. All major microprocessor companies are aggressively shipping multi-core products into the mainstream computing market. Moore's law will largely be used to increase hardware thread-level parallelism through higher core counts in a CMP environment. CMPs bring new challenges to the design of the software system stack. In this tutorial, we discuss the shift to multi-core processors and its programming implications. In particular, we focus on transactional programming. Transactions have emerged as a promising alternative to lock-based synchronization, eliminating many of its well-known problems. We discuss the design of both hardware and software transactional memory and quantify the tradeoffs between the different design points. We show how to extend the Java and C languages with transactional constructs, and how to integrate transactions with compiler optimizations and the language runtime (e.g., the memory manager and garbage collector).
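As a hedged illustration of the style of construct this abstract alludes to (not the tutorial's actual API), a software transactional memory can be sketched as a read/write set validated at commit time, with retry on conflict:

```python
import threading

class STM:
    """Minimal software transactional memory sketch: per-variable version
    counters, read/write sets per transaction, validation at commit, and
    retry on conflict. Illustrative only; names are assumptions."""
    def __init__(self):
        self._lock = threading.Lock()
        self._versions = {}   # variable name -> version counter
        self._values = {}     # variable name -> committed value

    def read(self, txn, name):
        if name in txn["writes"]:          # read-your-own-writes
            return txn["writes"][name]
        txn["reads"][name] = self._versions.get(name, 0)
        return self._values.get(name, 0)

    def write(self, txn, name, value):
        txn["writes"][name] = value

    def atomic(self, fn):
        while True:                        # retry loop on conflict
            txn = {"reads": {}, "writes": {}}
            result = fn(txn)
            with self._lock:               # commit: validate read versions
                if all(self._versions.get(n, 0) == v
                       for n, v in txn["reads"].items()):
                    for n, v in txn["writes"].items():
                        self._values[n] = v
                        self._versions[n] = self._versions.get(n, 0) + 1
                    return result
            # validation failed: a concurrent commit intervened; retry

stm = STM()

def deposit(txn):
    balance = stm.read(txn, "a")
    stm.write(txn, "a", balance + 10)

stm.atomic(deposit)
print(stm._values["a"])  # 10
```

A hardware TM would track the read/write sets in cache metadata instead, but the commit-time validation shown here is the common idea behind both design points.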
GPGPUs are increasingly being used as performance accelerators for HPC (High Performance Computing) applications in CPU/GPU heterogeneous computing systems, including TianHe-1A, the world's fastest supercomputer in the TOP500 list, built at NUDT (National University of Defense Technology) last year. However, despite their performance advantages, GPGPUs do not provide built-in fault-tolerance mechanisms to offer the reliability guarantees required by many HPC applications. By analyzing the SIMT (single-instruction, multiple-thread) characteristics of programs running on GPGPUs, we have developed PartialRC, a new checkpoint-based, compiler-directed partial recomputing method for achieving efficient fault recovery that leverages the phenomenal computing power of GPGPUs. In this paper, we introduce our PartialRC method, which recovers from errors detected in a code region by partially re-computing the region, describe a checkpoint-based fault-tolerance framework built on PartialRC, and discuss an implementation on the CUDA platform. Validation using a range of representative CUDA programs on NVIDIA GPGPUs against FullRC (a traditional full-recomputing Checkpoint-Rollback-Restart fault recovery method for CPUs) shows that PartialRC significantly reduces the fault recovery overheads incurred by FullRC: by 73.5% on average when errors occur earlier during execution and by 74.6% when they occur later. In addition, PartialRC also reduces the error detection overheads incurred by FullRC during fault recovery while incurring negligible performance overheads when no fault occurs.
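The core idea of checkpoint-based partial recomputation, sketched in the spirit of PartialRC rather than its CUDA implementation, can be shown in a few lines: checkpoint state before each code region, and on a detected error roll back and re-run only that region instead of restarting the whole computation (the function and field names below are assumptions):

```python
import copy

def run_with_partial_recovery(regions, state, detect_error):
    """Checkpoint state before each code region; on a detected error,
    roll back to the region's checkpoint and re-execute only that region.
    A CPU-side sketch of the partial-recomputing idea, not PartialRC itself."""
    for i, region in enumerate(regions):
        checkpoint = copy.deepcopy(state)  # cheap stand-in for a GPU checkpoint
        while True:
            region(state)
            if not detect_error(i):
                break                      # region committed; move on
            state.clear()                  # partial rollback: restore and retry
            state.update(copy.deepcopy(checkpoint))
    return state

# usage: the second region "fails" once, then succeeds on recomputation
failures = {1: 1}  # region index -> injected failures remaining
def detect_error(i):
    if failures.get(i, 0) > 0:
        failures[i] -= 1
        return True
    return False

state = {"x": 0}
regions = [lambda s: s.update(x=s["x"] + 1),
           lambda s: s.update(x=s["x"] * 3)]
print(run_with_partial_recovery(regions, state, detect_error))  # {'x': 3}
```

The contrast with FullRC is the scope of the rollback: a full-recomputing scheme would restore the initial state and re-run every region, which is what drives the overhead reductions the abstract reports.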
The NBS Data Encryption Standard may be integrated into computer networks to protect personal (nonshared) files, to communicate securely both on- and off-line with local and remote users, to protect against key substitution, to authenticate system users, to authenticate data, and to provide digital signatures using a nonpublic key encryption algorithm. Key notarization facilities give users the capability of exercising a set of commands for key management as well as for data encryption functions. The facilities perform notarization which, upon encryption, seals a key or password with the identities of the transmitter and intended receiver. Thus, in order to decrypt a message, the receiver must authenticate himself and supply the correct identity of the transmitter. This feature eliminates the threat of key substitution which must be protected against to attain a high level of security.
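The notarization idea, sealing a key under the identities of the transmitter and intended receiver so that decryption with a wrong identity pair yields garbage, can be sketched as follows. This is illustrative only: it substitutes a hash-based keystream for the DES encryption the standard actually uses, and all names are assumptions:

```python
import hashlib

def _keystream(secret: bytes, n: int) -> bytes:
    # Hash-based keystream as a stand-in for DES encryption (assumption).
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(secret + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def notarize_key(master: bytes, key: bytes, sender: str, receiver: str) -> bytes:
    """Seal `key` under `master` bound to the (sender, receiver) identity
    pair; unsealing with any other pair produces a different keystream and
    hence garbage, which is the key-substitution protection described."""
    seal = master + sender.encode() + b"|" + receiver.encode()
    ks = _keystream(seal, len(key))
    return bytes(a ^ b for a, b in zip(key, ks))

def unnotarize_key(master, sealed, sender, receiver):
    return notarize_key(master, sealed, sender, receiver)  # XOR is symmetric

sealed = notarize_key(b"master", b"sessionkey", "alice", "bob")
ok = unnotarize_key(b"master", sealed, "alice", "bob")
bad = unnotarize_key(b"master", sealed, "mallory", "bob")
print(ok == b"sessionkey", bad == b"sessionkey")  # True False
```

The receiver must authenticate and supply the correct transmitter identity to recover the key, exactly the workflow the abstract describes.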
In this paper definite Horn clause programs are investigated within a proof-theoretic framework; program clauses are considered rules of a formal system. Based on this approach, the soundness and completeness of SLD-...
This paper introduces optimization and deoptimization technologies for escape analysis in an open world. These technologies are used in a novel escape analysis framework implemented in the Open Runtime Platform, Intel's open-source Java virtual machine. We introduce the optimization technologies for synchronization removal and object stack allocation, as well as the runtime deoptimization and compensation work. The deoptimization and compensation technologies are crucial for a practical escape analysis in an open world. We evaluate the runtime efficiency of the deoptimization and compensation work on benchmarks such as SPECjbb2000 and SPECjvm98.
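The decision that drives both optimizations can be sketched as a toy intraprocedural escape analysis (illustrative only, not the paper's framework): an allocation escapes if it, or an alias of it, is returned, stored to a global, or passed to a call; non-escaping objects are candidates for stack allocation and synchronization removal.

```python
def escapes(alloc_var, instructions):
    """Toy escape analysis over a tiny tuple-based IR (an assumption, not
    ORP's IR). Propagates aliases to a fixed point, then checks whether any
    alias reaches a return, global store, or call."""
    aliases = {alloc_var}
    changed = True
    while changed:                       # alias propagation to a fixed point
        changed = False
        for op, *args in instructions:
            # ("move", dst, src): dst aliases src
            if op == "move" and args[1] in aliases and args[0] not in aliases:
                aliases.add(args[0])
                changed = True
    for op, *args in instructions:
        if op in ("return", "store_global", "call") and any(a in aliases for a in args):
            return True                  # the object leaves the method
    return False

print(escapes("a", [("move", "b", "a"), ("call", "b")]))  # True
print(escapes("a", [("move", "b", "a")]))                 # False
```

Deoptimization enters the picture because, in an open world, later class loading can invalidate a "does not escape" verdict, forcing compensation such as heap-relocating a stack-allocated object.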
We introduce a definitional extension of logic programming by means of an inference schema (Ph), which, in a certain sense, is dual to the (1-P) schema of rule application discussed in Part I. In the operational semant...
In this paper, a Multi-Choice Stochastic Bi-Level Programming Problem (MCSBLPP) is considered where all the constraint parameters follow a normal distribution. The cost coefficients of the objective functio...
Authors:
Mucahit Soylu, Resul Das
Inonu University, Organized Industrial Zone Vocational School, Department of Computer Programming, Malatya, Türkiye
Firat University, Faculty of Technology, Department of Software Engineering, 23119 Elazig, Türkiye
This study proposes a hybrid approach for visualizing cyberattacks by combining the deep learning-based GAT model with JavaScript-based graph visualization tools. The model processes large, heterogeneous data from the UNSW-NB15 dataset to generate dynamic and meaningful graphs. In the data cleaning phase, missing and erroneous data were removed, unnecessary columns were discarded, and the data was transformed into a format suitable for modeling. Then, the data was converted into homogeneous graphs, and heterogeneous structures were created for analysis using the GAT model. GAT prioritizes relationships between nodes in the graph with an attention mechanism, effectively detecting attack patterns. The analyzed data was then converted into interactive graphs using tools like SigmaJS, with attacks between the same nodes grouped to reduce graph complexity. Users can explore these dynamic graphs in detail, examine attack types, and track events over time. This approach significantly benefits cybersecurity professionals, allowing them to better understand, track, and develop defense strategies against cyberattacks.
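The grouping step mentioned above, collapsing multiple attack events between the same node pair into one weighted edge before rendering, can be sketched as follows (field names are assumptions, not the UNSW-NB15 schema):

```python
from collections import defaultdict

def group_edges(events):
    """Group attack events sharing the same (src, dst) pair into a single
    edge carrying a count and the set of attack types, as one would do
    before handing the graph to a SigmaJS-style renderer. Sketch only."""
    grouped = defaultdict(lambda: {"count": 0, "types": set()})
    for e in events:
        key = (e["src"], e["dst"])
        grouped[key]["count"] += 1
        grouped[key]["types"].add(e["attack"])
    # sets -> sorted lists so the result is JSON-friendly and deterministic
    return {k: {"count": v["count"], "types": sorted(v["types"])}
            for k, v in grouped.items()}

events = [
    {"src": "10.0.0.1", "dst": "10.0.0.9", "attack": "DoS"},
    {"src": "10.0.0.1", "dst": "10.0.0.9", "attack": "DoS"},
    {"src": "10.0.0.1", "dst": "10.0.0.9", "attack": "Exploits"},
]
print(group_edges(events))
# {('10.0.0.1', '10.0.0.9'): {'count': 3, 'types': ['DoS', 'Exploits']}}
```

Edge weight and type set can then drive edge thickness and color in the interactive view, which is what keeps large heterogeneous traces readable.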
Authors:
Synefaki, A.I.
[a] University of Thessaloniki, Faculty of Technology, General Department, Div. of Computational Methods and Computer Programming, Thessaloniki 540 06, Greece
[b] University of Macedonia, Thessaloniki, Greece
In this paper we develop an algorithm to solve pseudo-Boolean nonlinear equations and inequalities. The solutions to these problems are given in the form of set families. The objective of the proposed algorithm is to minimise the number of solution families. Computational experiments showed that the time needed to obtain a solution family is almost constant.
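For concreteness, a pseudo-Boolean nonlinear inequality constrains a polynomial over 0/1 variables. The naive baseline below enumerates individual solutions (it is not the paper's family-minimising algorithm, which instead returns compact families of such assignments):

```python
from itertools import product

def pb_solutions(coeffs, bound, n):
    """Brute-force the solutions of a pseudo-Boolean inequality
        sum over terms T of c_T * prod(x_i for i in T) <= bound
    over x in {0,1}^n. coeffs maps a tuple of variable indices to its
    coefficient. Exponential baseline for illustration only."""
    sols = []
    for x in product((0, 1), repeat=n):
        val = sum(c * all(x[i] for i in term) for term, c in coeffs.items())
        if val <= bound:
            sols.append(x)
    return sols

# nonlinear constraint: 3*x0*x1 + 2*x2 <= 2 over {0,1}^3
sols = pb_solutions({(0, 1): 3, (2,): 2}, 2, 3)
print(len(sols))  # 6: every assignment except the two with x0 = x1 = 1
```

Here all six solutions share the single condition "not both x0 and x1", so they form one family: representing the solution set by such families rather than by explicit assignments is exactly the compression the paper targets.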
DRAM row buffer conflicts can increase memory access latency significantly. This paper presents a new page-allocation-based optimization that works seamlessly with existing hardware and software optimizations to eliminate significantly more row buffer conflicts. Validation in simulation, using a set of scientific and engineering benchmarks against a few representative memory controller optimizations, shows that our method can reduce row buffer miss rates by up to 76% (with an average of 37.4%). This reduction in row buffer miss rates translates into performance speedups of up to 15% (with an average of 5%).
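The general mechanism behind such page-allocation optimizations can be sketched as follows: derive each page frame's DRAM "color" (which bank and row it maps to) from the physical address, then steer allocation away from the color of recently allocated hot pages. The address mapping below is an assumed example, not the paper's memory-controller scheme:

```python
def page_color(frame, banks=8, rows_per_bank=4096):
    """Map a physical page frame to its (bank, row) under a simple
    bank-interleaved address mapping (an assumption for illustration)."""
    bank = frame % banks
    row = (frame // banks) % rows_per_bank
    return bank, row

def allocate(frames_free, last_bank, banks=8):
    """Pick a free frame whose bank differs from the previous allocation's,
    so consecutively allocated hot pages do not compete for one row buffer."""
    for f in frames_free:
        if page_color(f, banks)[0] != last_bank:
            frames_free.remove(f)
            return f
    return frames_free.pop(0)  # fall back when no conflict-free frame exists

free = list(range(16))
a = allocate(free, last_bank=0)               # skips frame 0 (bank 0)
b = allocate(free, last_bank=page_color(a)[0])
print(page_color(a)[0] != page_color(b)[0])   # True
```

A real implementation would sit in the OS page allocator and cooperate with the memory controller's own open-page and scheduling policies, which is the "seamless" combination the abstract claims.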