ISBN: 9781450360074 (print)
The ever-growing processing ability of in-memory processing logic makes data sharing and coherence between processors and in-memory logic increasingly important in Processing-in-Memory (PIM) systems. Unfortunately, the existing state-of-the-art coarse-grained PIM coherence solutions suffer from unnecessary data movements and stalls caused by a data ping-pong issue. This work proposes CuckooPIM, a criticality-aware and less-blocking coherence mechanism that effectively avoids unnecessary data movements and stalls. Experiments reveal that CuckooPIM achieves a 1.68x speedup on average compared with coarse-grained PIM coherence.
Hybrid density-functional calculation is one of the most commonly adopted electronic structure theories in computational chemistry and materials science because of its balance between accuracy and computational cos...
This work presents a dynamic parallel distribution scheme for the Hartree-Fock exchange (HFX) calculations based on the real-space NAO2GTO framework. The most time-consuming electron repulsion integrals (ERIs) calcula...
Materials science literature contains a vast amount of structure-activity relationship knowledge crucial for materials discovery and design. However, automatic extraction of domain knowledge from literature remains challenging due to its unstructured and heterogeneous format. Herein, we propose a framework for automating knowledge acquisition, which involves a materials entity-aware relational extraction model (MatRE) to mine triples, an approach to construct a knowledge graph (KG) for detecting associations among triples, and the inference and representation of structure-activity relationships in a machine learning (ML)-compatible format. We demonstrate its application in predicting the sodium ion activation energy for the NASICON solid-state electrolyte (SSE) system. MatRE, trained on a NASICON SSE dataset, achieves an F1-score of 0.80 and is used to extract 260,475 entity–relation triples from 1,808 scientific publications. Furthermore, embedding 24 knowledge bullets from the KG into the data pre-processing and feature engineering stages improves the performance and interpretability of six common ML models by up to 25.7%. This work offers key insights into automatic knowledge acquisition from literature and heralds a new paradigm for AI-assisted materials genome engineering driven by both data and knowledge.
ISBN: 9781728167824 (digital)
ISBN: 9781728167831 (print)
With the continuous development of power grids, supercomputing clusters have gradually grown in scale to carry a large number of power system simulation calculations, and high energy consumption has become a problem. To solve this problem, we propose a container virtualization-based supercomputing cluster for power systems. We analyze the impact of containers on power simulation calculations and compare the energy consumption of various container scheduling and migration algorithms on clusters. Experiments show that, compared to hypervisor-based virtual machines, which consume massive resources and reduce performance by 28.4%, containers degrade power simulation performance by only 1.3%, which is negligible. The load-concentration and resource-and-load-balance container scheduling algorithms consume at least 2.2% and up to 4.0% less energy than the other algorithms. In container migration, the method combining an autoregressive model with the most-correlation and resource-and-load-balance algorithms outperforms the other methods: it not only minimizes energy consumption but also has the lowest number of migrations and SLA violations. Experiments verify the feasibility and advantages of container migration in power system computing clusters.
The increasing computational cost of deep neural network models limits the applicability of intelligent applications on resource-constrained edge devices. While a number of neural network pruning methods have been pro...
Double buffering is an effective mechanism to hide the latency of data transfers between on-chip and off-chip memory. However, in a dataflow architecture, swapping the two buffers during the execution of many tiles decreases performance because the dataflow accelerator is repeatedly filled and drained. In this work, we propose a non-stop double buffering mechanism for dataflow architecture. The proposed mechanism assigns tiles to the processing element array without stopping the execution of the processing elements, by optimizing the control logic of the dataflow architecture. Moreover, we propose a work-flow program to cooperate with the non-stop double buffering mechanism. After optimizing both the control logic and the work-flow program, the array needs to be filled and drained only once across the execution of all tiles belonging to the same dataflow graph. Experimental results show that the proposed double buffering mechanism for dataflow architecture achieves a 16.2% average efficiency improvement over the unoptimized design.
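As an illustration of the baseline idea only, the following minimal Python sketch shows conventional ping-pong double buffering in software: while one buffer is being consumed, the other is filled with the next tile. It is not the non-stop hardware mechanism proposed in the abstract. The names `load_tile`, `compute`, and `run_double_buffered` are hypothetical stand-ins, and the thread pool merely imitates an asynchronous transfer engine.

```python
from concurrent.futures import ThreadPoolExecutor


def load_tile(tiles, index):
    """Hypothetical stand-in for an off-chip to on-chip transfer."""
    return list(tiles[index])


def compute(tile):
    """Hypothetical stand-in for the work performed on one tile."""
    return sum(x * x for x in tile)


def run_double_buffered(tiles):
    """Process tiles with two buffers, overlapping the next load with compute."""
    results = []
    if not tiles:
        return results
    buffers = [None, None]  # the two ping-pong buffers
    with ThreadPoolExecutor(max_workers=1) as loader:
        buffers[0] = load_tile(tiles, 0)  # prime the first buffer
        for i in range(len(tiles)):
            current = i % 2
            pending = None
            if i + 1 < len(tiles):
                # Start filling the other buffer while this one is consumed.
                pending = loader.submit(load_tile, tiles, i + 1)
            results.append(compute(buffers[current]))
            if pending is not None:
                # Swap point: wait for the transfer before the next iteration.
                buffers[1 - current] = pending.result()
    return results


if __name__ == "__main__":
    print(run_double_buffered([[1, 2], [3], [4, 5, 6]]))
```

The swap point at the end of each iteration is where a conventional design pauses between tiles; the abstract's contribution is to avoid stopping the processing elements at that boundary.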
Packet classification has been studied for decades; it classifies packets into specific flows based on a given rule set. Since software-defined networking was proposed, a recent trend in packet classification has been to scale the five-tuple model to multi-tuple. In general, packet classification on multiple fields is a complex problem. Although most existing software-based algorithms have proved extraordinary in practice, they are suitable only for the classic five-tuple model and are difficult to scale up. Meanwhile, hardware-specific solutions are inflexible and expensive, and some of them are power-hungry. In this paper, we propose a universal multi-dimensional packet classification approach for multi-core systems. In our approach, novel data structures and four decomposition-based algorithms are designed to optimize the classification and updating of rules. For multi-field rules, a rule set is cut into several parts according to the number of fields, and each part works independently. In this way, the fields are searched in parallel and all the partial results are merged at the end. To demonstrate the feasibility of our approach, we implement a prototype and evaluate its throughput and latency. Experimental results show that our approach achieves 40% higher throughput than other decomposition-based algorithms and, on average, 43% lower latency for incremental rule updates than the other algorithms. Furthermore, our approach reduces memory consumption by 39% on average and scales well.
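The decomposition idea described above (search each field independently, then merge the partial results) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's data structures or any of its four algorithms: the `Rule` type, the inclusive range representation, the linear per-field scan, and the convention that a lower `priority` number wins are all hypothetical choices made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int   # assumption: lower number means higher priority
    fields: tuple   # one inclusive (low, high) range per field
    action: str


def match_field(rules, field_index, value):
    """Return the indices of rules whose range on one field contains value."""
    return {
        i for i, rule in enumerate(rules)
        if rule.fields[field_index][0] <= value <= rule.fields[field_index][1]
    }


def classify(rules, packet):
    """Search every field independently, then intersect the partial results."""
    if not packet:
        return None
    with ThreadPoolExecutor() as pool:
        # Each field lookup is independent, so the lookups can run in parallel.
        partial = list(
            pool.map(lambda fv: match_field(rules, fv[0], fv[1]),
                     enumerate(packet))
        )
    candidates = set.intersection(*partial)  # merge step
    if not candidates:
        return None
    best = min(candidates, key=lambda i: rules[i].priority)
    return rules[best].action


if __name__ == "__main__":
    rules = [
        Rule(0, ((10, 20), (80, 80)), "drop"),
        Rule(1, ((0, 255), (0, 65535)), "allow"),
    ]
    print(classify(rules, (15, 80)))
    print(classify(rules, (15, 443)))
```

Because each per-field search touches only its own slice of the rule set, adding a field adds one more independent lookup rather than changing the existing ones, which is what makes this style of decomposition amenable to multi-core execution.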
UAV networks often partition into separate clusters due to high node and link dynamics. As a result, network connectivity recovery is an important issue in this area. Existing solutions always need excessive movem...
Earlier-stage evaluations of a new AI architecture/system need affordable AI benchmarks. Only using a few AI component benchmarks like MLPerf alone in the other stages may lead to misleading conclusions. Moreover, the...