Calculation of many-body correlation functions is one of the critical kernels utilized in many scientific computing areas, especially in Lattice Quantum Chromodynamics (Lattice QCD). It is formalized as a sum of a large number of contraction terms, each of which can be represented by a graph consisting of vertices describing quarks inside a hadron node and edges designating quark propagations at specific time intervals. Due to its computation- and memory-intensive nature, real-world physics systems (e.g., multi-meson or multi-baryon systems) explored by Lattice QCD prefer to leverage multiple GPUs. Different from general graph processing, many-body correlation function calculations show two specific features: a large number of computation-/data-intensive kernels and frequently repeated appearances of original and intermediate data. The former results in expensive memory operations such as tensor movements and evictions. The latter offers data reuse opportunities to mitigate the data-intensive nature of many-body correlation function calculations. However, existing graph-based multi-GPU schedulers cannot capture these data-centric features, resulting in sub-optimal performance for many-body correlation function calculations. To address this issue, this paper presents a multi-GPU scheduling framework, MICCO, to accelerate contractions for correlation functions, particularly by taking the data dimension (e.g., data reuse and data eviction) into account. This work first performs a comprehensive study on the interplay of data reuse and load balance, and introduces two new concepts, local reuse pattern and reuse bound, to study the opportunity of achieving the optimal trade-off between them. Based on this study, MICCO proposes a heuristic scheduling algorithm and a machine-learning-based regression model to generate the optimal setting of reuse bounds. Specifically, MICCO is integrated into a real-world Lattice QCD system, Redstar, for the first time running on multiple GPU
One type of neurocomputer recently proposed, the folded-array digital neural emulator using tree accumulation and communication structures, incorporates a new concept in representing an artificial digital neuron. Beginning from the parallel distributed processing (PDP) neuron model, the folded-array digital neural emulator is briefly described. Then, by applying the folded-array concepts to the PDP model, the folded axon/dendrite tree neuron is created which, in a general form, represents a new model for the neural paradigm.
ISBN: 9781665497473 (digital)
ISBN: 9781665497480 (print)
As research and practice in artificial intelligence (A.I.) grow in leaps and bounds, the resources necessary to sustain and support their operations also grow at an increasing pace. While innovations and applications from A.I. have brought significant advances, from applications to vision and natural language to improvements to fields like medical imaging and materials engineering, their costs should not be neglected. As we embrace a world with ever-increasing amounts of data as well as research and development of A.I. applications, we are sure to face an ever-mounting energy footprint to sustain these computational budgets, data storage needs, and more. But is this sustainable and, more importantly, what kind of setting is best positioned to nurture such sustainable A.I. in both research and practice? In this paper, we outline our outlook for Green A.I. (a more sustainable, energy-efficient and energy-aware ecosystem for developing A.I. across the research, computing, and practitioner communities alike) and the steps required to arrive there. We present a bird's eye view of various areas for potential changes and improvements, from the ground floor of A.I.'s operational and hardware optimizations for datacenters and HPC systems to the current incentive structures in the world of A.I. research and practice, and more. We hope these points will spur further discussion, and action, on some of these issues and their potential solutions.
The need for open computer support for cooperative work (CSCW) systems is examined. Their requirements and impact on distributed systems are discussed. The Mocca project for developing an environment which will support open CSCW systems is described. It is shown that the open distributed systems (ODP) experience and approach can be a valuable tool in realizing open CSCW systems. Similarly, the ODP standardization effort can benefit both from the experience of CSCW application developers and from the requirements which CSCW systems place on their distributed platform.
Complex applications today involve multiple processes, multiple threads of control, distributed processing, thread pools, event handling, and messages. The behaviors and misbehaviors of these nondeterministic, message-based systems are difficult to capture and understand. The typical approach is to trace the behavior of the systems and track how the different incoming messages are processed throughout the system. While messages between processes can be captured automatically at the network or library level, tracing the message processing within a system, which is often more complex and error-prone, requires the programmer to manually instrument the code by identifying the different message handlers, thread states, processing stages, and shared queues accurately and completely. In this paper, we show how dynamic analysis can be used to automatically identify the transactions, stages, and shared queues in Java programs as a prelude to trace-based comprehension.
ISBN: 9798331509712 (digital)
ISBN: 9798331509729 (print)
Graph-based multi-view clustering methods have gained significant attention due to their outstanding ability to represent clustering structure. Because noise degrades the quality of pre-constructed graphs, in this paper we propose a novel method called ANGVR. Unlike existing methods that aim to directly learn a consensus graph from multi-view data, ANGVR rebuilds the graph constructed from raw data to seek a consensus graph across views for clustering. Furthermore, to guide graph construction, an embedding constraint based on neighboring group structures is introduced, which explores the neighborhood structure information corresponding to neighborhood sets. The experimental results show accuracy improvements of 5.56%, 6.77%, and 4.02% on the 3Sources dataset with 10%, 30%, and 50% missing views, respectively, compared to existing works.
ISBN: 9798331509712 (digital)
ISBN: 9798331509729 (print)
Cyber attack campaigns are becoming increasingly complex and severe, causing significant impacts on institutions and individuals. Cyber Threat Intelligence (CTI) provides important evidential knowledge about attackers and is critical to the shift from reactive to proactive defense against cyber attacks. Attack detection based on Indicators of Compromise (IOCs), a type of CTI, is vulnerable to the limitation of insufficient context of attack scenarios. In contrast, attack behavior intelligence is associated with information on attackers’ techniques, targets, and intentions, providing a solid foundation for security practitioners to conduct attack investigations or other applications. Many current CTI mining systems are limited to extracting CTI from a single source, leading to challenges such as fragmented attack behavior view and low-value density. To address these issues, we propose an unsupervised fusion framework named CTIFuser, which includes a comprehensive pipeline of four subtasks aimed at mining and fusing multi-source attack behaviors at the attack technique level. In our evaluation of 739 real-world CTI reports from 542 sources, experimental results demonstrate that CTIFuser can obtain a complete view of the attack behaviors at the attack technique level.
Test oracles are widely used to verify whether a system under test is running as desired. Since the correctness of real-time systems depends both on the logical results of the computation and on the time at which those results are produced, an optimized model checking-based method for test oracle generation is proposed to check whether system traces satisfy their real-time specifications at run time. Inspired by the idea of real-time model checking, the test oracles can be automatically generated, in a simpler way, from their specifications in the real-time logic MITL_[0,d] and modelled by a variant of timed automata. Assertions are used to acquire the traces of real-time systems. A case study is presented to demonstrate the usefulness of the proposed method.
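As a rough illustration of what a run-time oracle for a timed property checks, the sketch below tests a bounded-response requirement ("every trigger is followed by a response within d time units") against a recorded trace. It is not the paper's automaton-based construction. The function name, the trace format of (timestamp, event) pairs, and the event names are all assumptions made for this example. On a finite trace, a trigger whose deadline falls after the end of the trace is reported as a violation, which is also an assumption.

```python
from typing import List, Tuple

# Hypothetical trace format: (timestamp, event name), sorted by timestamp.
Event = Tuple[float, str]


def bounded_response_violations(
    trace: List[Event], trigger: str, response: str, d: float
) -> List[float]:
    """Return the timestamps of `trigger` events that are not followed
    by a `response` event within `d` time units (inclusive)."""
    response_times = [t for t, name in trace if name == response]
    violations = []
    for t, name in trace:
        if name != trigger:
            continue
        # The property holds for this trigger if some response lies in [t, t + d].
        if not any(t <= r <= t + d for r in response_times):
            violations.append(t)
    return violations


# Example trace: the second request is acknowledged too late for d = 2.0.
trace = [(0.0, "req"), (1.5, "ack"), (4.0, "req"), (9.0, "ack")]
print(bounded_response_violations(trace, "req", "ack", 2.0))  # [4.0]
```

An empty result means the trace satisfies the property. A quadratic scan keeps the sketch short; a monitor built from a timed automaton would process events incrementally instead.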
ISBN: 9781538655566; 9781538655559 (print)
Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.