Activity recognition has broad application prospects in many fields including pervasive computing and human-computer interaction. In this paper, the technology of wireless-based activity recognition is introduced. By ...
Vision-Language Navigation (VLN) tasks require an agent to follow human language instructions to navigate in previously unseen environments. This challenging field involving problems in natural language processing, co...
ISBN (digital): 9798331531409
ISBN (print): 9798331531416
In the field of nuclear energy, the Loss of Coolant Accident (LOCA) is recognized as one of the most severe types of nuclear reactor accidents, characterized by its complex physical processes and potentially catastrophic consequences. These challenges impose stringent requirements on safety analysis and emergency response. Accurate prediction and analysis of fluid behavior within pipelines under LOCA conditions are critical for evaluating accident outcomes and formulating response strategies. This paper introduces an innovative intelligent computing approach: a Physics-Informed Neural Networks (PINNs) model driven by both physical constraints and simulation data, specifically tailored for LOCA conditions. To address the challenges posed by multivariable and complex physical relationships, the six-equation two-fluid model is first simplified to represent the physical processes. Subsequently, a dual-driven PINNs network is developed, integrating both simulation data and physical constraints. The proposed model achieves a root mean square error (RMSE) of 0.02, a mean absolute error (MAE) of 0.044, and an R² of 0.81 in predicting outcomes under the six-equation model.
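As a hedged illustration of the dual-driven idea (the function names, the weight `lam`, and the toy inputs below are our assumptions, not the paper's actual network), a PINN training loss typically sums a data-fit term over simulation samples and a physics term that penalizes the residual of the governing equations at collocation points:

```python
import numpy as np

def mse(x):
    """Mean squared error of a residual vector."""
    return float(np.mean(x ** 2))

def pinn_loss(u_pred, u_sim, residual, lam=1.0):
    """Dual-driven PINN loss sketch: simulation-data misfit plus a
    physics term penalizing the residual of the (simplified) two-fluid
    equations. `lam` balances the two terms and is a tunable choice.

    u_pred   : network predictions at simulation sample points
    u_sim    : reference values from the simulation code
    residual : PDE residual evaluated at collocation points
    """
    return mse(u_pred - u_sim) + lam * mse(residual)

# Toy check: a perfect fit with a vanishing physics residual gives zero loss.
u = np.linspace(0.0, 1.0, 10)
print(pinn_loss(u, u, np.zeros(5)))  # 0.0
```

In practice the residual is computed by automatic differentiation of the network outputs; the sketch only shows how the two data sources enter one objective.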
ISBN (print): 9781479947638
The concerns of data-intensiveness and energy awareness are actively reshaping the design of today's high-performance computing (HPC) systems. The Graph500 is a widely adopted benchmark for evaluating the performance of computing systems on data-intensive workloads. In this paper, we introduce a data-parallel implementation of Graph500 on the Intel Single-chip Cloud Computer (SCC). The SCC features a non-coherent many-core architecture and multi-domain on-chip DVFS support for dynamic power management. With our custom-made shared virtual memory programming library, memory sharing among threads is done efficiently via the shared physical memory (SPM), while the library takes care of coherence. We conduct an in-depth study of the power and performance characteristics of Graph500 workloads running on this system with varying system scales and power states. Our experimental results offer insights for the design of energy-efficient many-core systems for data-intensive applications.
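The core kernel that Graph500 times is a breadth-first search over a large synthetic graph. For reference, a plain sequential, level-synchronous sketch (not the paper's data-parallel SCC implementation) that produces the parent array Graph500 validates could look like:

```python
def bfs_parents(adj, root):
    """Level-synchronous BFS returning the parent map, the quantity
    checked by the Graph500 validation step. `adj` maps each vertex
    to its neighbor list; the root is its own parent by convention."""
    parent = {root: root}
    frontier = [root]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj.get(u, []):
                if v not in parent:      # first visit wins
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier         # advance one BFS level
    return parent

adj = {0: [1, 2], 1: [3], 2: [3]}
print(bfs_parents(adj, 0))  # {0: 0, 1: 0, 2: 0, 3: 1}
```

The level-by-level structure is what makes the kernel amenable to the data-parallel distribution across SCC cores that the paper studies.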
Network traffic classification is crucial for network security and network management and is one of the most important network tasks. Current state-of-the-art traffic classifiers rely on deep learning models to automatically extract features from packet streams. Unfortunately, current approaches fail to effectively combine the structural information of traffic packets with the content features of the packets, resulting in limited classification accuracy. In this paper, we propose a graph neural network model for network traffic classification that effectively captures the interaction features of packets in traffic. First, we design a graph structure for packet flows to hold the interaction information between packets, embedding both packet contents and sequence relationships into a unified graph. Second, we propose a graph neural network framework for graph classification that automatically learns the structural features of packet flows together with the packet features. Extensive evaluation results on real-world traffic data show that the proposed model improves prediction accuracy by 2% to 37% for malicious traffic classification.
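To make the graph construction concrete, here is a minimal sketch of turning a packet flow into a graph (the node/edge layout is our illustration of "packet contents plus sequence relationships in a unified graph", not the paper's exact schema):

```python
def build_flow_graph(packets):
    """Build a per-flow graph: each packet becomes a node carrying its
    content features (here, the raw bytes stand in for a feature
    vector), and consecutive packets are linked by a directed edge
    that encodes their sequence order within the flow."""
    nodes = [{"id": i, "features": p} for i, p in enumerate(packets)]
    edges = [(i, i + 1) for i in range(len(packets) - 1)]
    return nodes, edges

# Three packets of a hypothetical flow; edges follow arrival order.
nodes, edges = build_flow_graph([b"\x16\x03", b"\x17\x03", b"\x15\x03"])
print(len(nodes), edges)  # 3 [(0, 1), (1, 2)]
```

A graph neural network for graph classification then pools over these nodes and edges to produce one label per flow.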
In this paper, an improved algorithm is proposed for the reconstruction of singularity connectivity from the available pairwise connections during the preprocessing phase. To evaluate the performance of our algorithm, an ...
Unstructured sparse pruning significantly reduces the computational and parametric complexities of deep neural network models. Nevertheless, the highly irregular nature of sparse models limits their performance and efficiency on traditional computing platforms, prompting the development of specialized hardware solutions. To improve computational efficiency, we introduce the Sparse Dataflow Fusion Accelerator (SPDFA), a specialized architecture designed for sparse deep neural networks. First, we present a non-blocking data distribution-computing engine that integrates inner product and column product; this engine boosts computational efficiency by decomposing matrix multiplication and convolution into rectangular matrix-vector multiplications. Second, we implement a computation array to further exploit parallelism and design an on-chip buffer structure that supports a multi-line memory access mode. Last, to bolster the adaptability of our accelerator, we propose an innovative macroinstruction set coupled with a micro-kernel scheme, and we refine the macroinstruction issue strategy to further enhance computational efficiency. Our evaluation results demonstrate that SPDFA achieves an average 1.29× to 2.38× improvement in computational efficiency over state-of-the-art SpMM accelerators on unstructured sparse deep neural network models, outperforms existing sparse neural network accelerators by a factor of 1.03× to 1.83×, and exhibits excellent scalability with a scaling efficiency exceeding 80%.
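The column-product dataflow mentioned above can be sketched in software: a matrix product is accumulated as a sum of outer products over the columns of the sparse operand, and entirely zero columns are skipped. This is a toy analogue of the dataflow, not the SPDFA hardware itself:

```python
import numpy as np

def spmm_column_product(A, B):
    """Column-product SpMM sketch: C = A @ B accumulated as outer
    products of A's columns with the matching rows of B. Skipping
    all-zero columns of A is what lets a sparse accelerator avoid
    wasted work on unstructured sparsity."""
    C = np.zeros((A.shape[0], B.shape[1]))
    for k in range(A.shape[1]):
        col = A[:, k]
        if np.any(col):                      # skip fully zero columns
            C += np.outer(col, B[k, :])      # rank-1 update
    return C

A = np.array([[0.0, 2.0], [1.0, 0.0]])
B = np.array([[3.0, 0.0], [0.0, 4.0]])
print(np.allclose(spmm_column_product(A, B), A @ B))  # True
```

In hardware, each rank-1 update maps naturally onto a rectangular matrix-vector operation fed through the computation array.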
Low power is a first-class design requirement for HPC systems. Dynamic voltage and frequency scaling (DVFS) has become a commonly used and efficient technique for trading off power consumption against system performance. However, most prior work using DVFS did not take into account the latency of voltage/frequency scaling, which on real hardware is a critical factor determining the power efficiency of a power management algorithm. This paper first investigates the latency features of DVFS on a real many-core hardware platform. Second, we propose a latency-aware DVFS algorithm for profile-based power management that avoids aggressive power state transitions. Finally, we evaluate our algorithm on the Intel SCC platform using a data-intensive workload, the Graph 500 benchmark. The experimental results not only show impressive potential for energy saving in data-intensive applications (up to 31% energy saving and 60% EDP reduction), but also demonstrate the efficiency of our latency-aware DVFS algorithm, which achieves 12.0% additional energy saving and 5.0% additional EDP reduction while improving execution performance by 22.4%.
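The "avoid aggressive power state transitions" idea can be sketched as a simple gate: only change the voltage/frequency state when the upcoming program phase is long enough to amortize the transition latency. The function name and the margin factor below are illustrative assumptions, not the paper's exact policy:

```python
def should_scale(phase_duration, transition_latency, margin=2.0):
    """Latency-aware DVFS gate: permit a power state transition only
    when the predicted phase is at least `margin` times longer than
    the measured voltage/frequency transition latency, so the switch
    overhead cannot dominate the phase."""
    return phase_duration > margin * transition_latency

# A 1 ms phase is too short to justify a 0.8 ms transition,
# while a 10 ms phase comfortably amortizes it:
print(should_scale(1.0, 0.8))   # False
print(should_scale(10.0, 0.8))  # True
```

Profile-based management supplies the phase-duration predictions; the measured latency comes from the hardware characterization step.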
ISBN (print): 9781479974719
Three-dimensional multiple-input multiple-output (3D MIMO) systems are attracting growing interest among researchers in wireless communication, owing to their potential to enable a variety of strategies such as user-specific 3D beamforming and cell splitting. In this paper, we evaluate the performance of a 3D MIMO system employing zero-forcing (ZF) receivers, accounting for both large- and small-scale fading. In particular, we consider the classical log-normal model together with an antenna gain that depends on the tilting angle, and we derive the optimal tilting angle that maximizes the sum rate of the 3D MIMO system. Furthermore, based on the optimal tilting angle, we propose closed-form linear approximations of the achievable sum rate in the asymptotically high signal-to-noise ratio (SNR) and low-SNR regimes, respectively. We investigate the effect of the antenna tilting angle, the number of base station (BS) antennas, and the distance between the BS and the center of the building on the achievable sum rate of 3D MIMO.
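To illustrate how the tilting angle enters the sum rate, the sketch below uses a 3GPP-style quadratic vertical antenna pattern and a brute-force search for the best tilt. Both the pattern parameters and the toy rate model are our assumptions; the paper derives the optimum in closed form for its own gain model:

```python
import math

def vertical_gain_db(theta, tilt, theta_3db=10.0, sla=20.0):
    """3GPP-style vertical pattern: attenuation grows quadratically
    with the offset from the tilt angle, floored at the side-lobe
    level `sla` (all angles in degrees, gains in dB)."""
    return -min(12.0 * ((theta - tilt) / theta_3db) ** 2, sla)

def sum_rate(users_theta, tilt, snr_db=10.0):
    """Toy sum rate: log2(1 + SNR * gain) accumulated over users at
    the given elevation angles."""
    rate = 0.0
    for theta in users_theta:
        snr_lin = 10 ** ((snr_db + vertical_gain_db(theta, tilt)) / 10.0)
        rate += math.log2(1.0 + snr_lin)
    return rate

# Brute-force stand-in for the paper's optimal-tilt derivation: with
# users centered around 10 degrees, the best integer tilt is 10.
users = [8.0, 10.0, 12.0]
best = max(range(0, 31), key=lambda t: sum_rate(users, float(t)))
print(best)  # 10
```

The same scaffold makes it easy to explore how the sum rate degrades as the tilt moves away from the user cluster.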
As the energy consumption of embedded multiprocessor systems becomes increasingly prominent, real-time energy-efficient scheduling in multiprocessor systems has become an urgent problem: reducing system energy consumption while meeting real-time constraints. For a multiprocessor with independent DVFS and DPM at each processor, this paper proposes an energy-efficient real-time scheduling algorithm named LRE-DVFS-EACH, based on LRE-TL, an optimal real-time scheduling algorithm for sporadic tasks. LRE-DVFS-EACH uses the concept of the TL plane and the idea of fluid scheduling to dynamically scale the voltage and frequency of the processors at the initial time of each TL plane as well as at the release time of a sporadic task within each TL plane. Consequently, LRE-DVFS-EACH obtains a reasonable trade-off between real-time constraints and energy saving. It also adapts to workload changes caused by the dynamic release of sporadic tasks, which yields further energy savings. Experimental results show that, compared with existing algorithms, LRE-DVFS-EACH not only guarantees the optimal feasibility of sporadic tasks but also achieves greater energy savings in all cases, especially under high workloads.
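A simplified reading of the per-plane frequency choice can be sketched as follows (this ignores per-processor DPM and the mid-plane release adjustments that LRE-DVFS-EACH also handles; the function name and the feasibility rule are our illustration):

```python
def plane_frequency(local_work, m, plane_length, f_levels):
    """Pick the lowest available normalized frequency that keeps a TL
    plane feasible under fluid scheduling: the plane's total local
    work must fit on m processors, and no single task may demand more
    than one full processor for the rest of the plane."""
    demand = max(sum(local_work) / (m * plane_length),
                 max(local_work) / plane_length)
    for f in sorted(f_levels):
        if f >= demand:
            return f
    return max(f_levels)      # fall back to full speed if infeasible

# Three tasks with local work 2, 1 and 1 on 2 processors in a plane of
# length 4: demand = max(4/8, 2/4) = 0.5, so the 0.6 level suffices.
print(plane_frequency([2.0, 1.0, 1.0], 2, 4.0, [0.4, 0.6, 0.8, 1.0]))  # 0.6
```

Recomputing this bound at each TL plane boundary, and again when a sporadic task is released inside a plane, is what lets the scheduler track the actual workload.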