ISBN (Digital): 9781728164120
ISBN (Print): 9781728164137
Network monitoring is vital in modern clouds and data center networks that need diverse traffic statistics, ranging from flow size distributions to heavy hitters. To cope with increasing network rates and massive traffic volumes, sketch-based approximate measurement has been extensively studied to trade accuracy for memory and computation cost, but it is unfortunately sensitive to hash collisions. This paper presents a clustering-preserving sketch method that is resilient to hash collisions. We provide an equivalence analysis of the sketch in terms of K-means clustering. Based on this analysis, we cluster similar network flows into the same bucket array to reduce the estimation variance and use the average to obtain an unbiased estimation. Testbed experiments show that the framework adapts to line rates and provides accurate query results. Real-world trace-driven simulations show that LSS maintains stable performance over wide ranges of parameters and dramatically outperforms state-of-the-art sketching structures, with a 10³ to 10⁵ times reduction in relative errors for per-flow queries as the ratio of the number of buckets to the number of network flows drops from 10% to 0.1%.
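The bucket-average idea can be illustrated with a short Python toy (our own construction for illustration only; LSS's actual structure is more involved): flows mapped to the same bucket are estimated by the bucket mean, so grouping similar flows shrinks the within-bucket spread and hence the estimation error, while random hash collisions mix small and large flows together.

```python
import random
from statistics import mean

random.seed(0)

# Toy flow sizes: a bimodal mix of small (~1-7) and large (~1000-1105) flows.
flows = [random.choice([1, 2, 1000, 1100]) + random.randint(0, 5) for _ in range(400)]

def bucket_error(assignment, n_buckets):
    """Average per-flow error when each flow is estimated by its bucket's mean."""
    buckets = [[] for _ in range(n_buckets)]
    for f, b in zip(flows, assignment):
        buckets[b].append(f)
    err = 0.0
    for b in buckets:
        if not b:
            continue
        avg = mean(b)  # the bucket average used as every member's estimate
        err += sum(abs(f - avg) for f in b)
    return err / len(flows)

n_buckets = 4
# Hash-like random placement mixes dissimilar flows in one bucket.
random_assign = [random.randrange(n_buckets) for _ in flows]
# Size-aware (cluster-based) placement keeps similar flows together.
clustered_assign = [0 if f < 500 else 1 for f in flows]

print(bucket_error(random_assign, n_buckets) > bucket_error(clustered_assign, n_buckets))
```

With the seed fixed, the size-aware assignment gives a far smaller average error, which is the variance-reduction effect the clustering-preserving sketch exploits.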
Fault resilience has become a major issue for HPC systems, particularly in the perspective of future E-scale systems, which will consist of millions of CPU cores and other components. MPI-level fault tolerant constru...
ISBN (Digital): 9781665403986
ISBN (Print): 9781665403993
This paper presents a load balancing method for a multi-block-grid-based CFD (Computational Fluid Dynamics) application on a heterogeneous platform. The method includes an asymmetric task scheduling scheme and a load balancing model. The idea is to balance the computing speed of the CPU and the coprocessor by adjusting the workload and the number of threads on each side. Optimal load balance parameters are empirically selected, guided by a performance model. Performance evaluation is conducted on a computer server consisting of two Intel Xeon E5-2670 v3 CPUs and two MIC coprocessors (Xeon Phi 5110P and Xeon Phi 7120P) for the simulation of turbulent combustion in a supersonic combustor. The results show that performance is highly sensitive to the load balance parameters. With the optimal parameters, heterogeneous computing achieves a maximum speedup of 2.30× for a 6-block mesh and a maximum speedup of 2.66× for an 8-block mesh over CPU-only computing.
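The core of such a load balancing model, choosing the workload split so that both devices finish at the same time, can be sketched in a few lines (a minimal model of our own; the paper's scheme also tunes thread counts and uses an empirically guided performance model):

```python
# Split total work between CPU and coprocessor in proportion to their measured
# speeds, so both sides take the same wall-clock time:
#   t_cpu = cpu_work / cpu_speed == mic_work / mic_speed = t_mic
def balanced_split(total_work, cpu_speed, mic_speed):
    cpu_work = total_work * cpu_speed / (cpu_speed + mic_speed)
    return cpu_work, total_work - cpu_work

# Hypothetical speeds: the coprocessor processes blocks twice as fast.
cpu_work, mic_work = balanced_split(120.0, cpu_speed=2.0, mic_speed=4.0)
print(cpu_work, mic_work)  # 40.0 80.0
# Both sides finish together: 40/2 == 80/4 == 20 time units.
```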
In the relay-trading mode of wireless cognitive radio networks, a secondary user (SU) can obtain a promised spectrum access opportunity by relaying for the primary user (PU). How to utilize the exchanged resource efficiently and fairly is an interesting and practical problem. In this paper we propose a cooperative spectrum sharing strategy (RT-CSS) for the relay-trading mode from a fairness perspective. The cooperating SUs are gathered into a cooperative sharing group (CSG), and a contribution metric (CM) is proposed to measure each member's contribution to, as well as benefit from, the CSG. Adjusting the CM guarantees both the fairness and the efficiency of spectrum sharing. Numerical simulations show that RT-CSS achieves better performance than the sense-uncooperative mode.
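The contribution-metric bookkeeping can be sketched as a toy ledger (our own simplification; the actual CM adjustment in RT-CSS is richer): members earn CM by relaying for the PU and can draw spectrum-access benefit only up to what they have contributed, which ties benefit to contribution and hence enforces fairness.

```python
# Toy model: CM is a per-member balance, earned by relaying, spent on access.
class CSGMember:
    def __init__(self, name):
        self.name, self.cm = name, 0.0

    def relay(self, amount):
        """Contribution to the CSG: relaying work raises the member's CM."""
        self.cm += amount

    def access(self, amount):
        """Benefit from the CSG: access is capped by the accumulated CM."""
        granted = min(amount, self.cm)
        self.cm -= granted
        return granted

a, b = CSGMember("SU1"), CSGMember("SU2")
a.relay(10.0)
b.relay(3.0)
got_a, got_b = a.access(5.0), b.access(5.0)
print(got_a, got_b)  # SU2 is capped by its smaller contribution
```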
Network monitoring is vital in modern clouds and data center networks for traffic engineering, network diagnosis, and network intrusion detection, all of which need diverse traffic statistics ranging from flow size distributions ...
ISBN (Digital): 9798331509712
ISBN (Print): 9798331509729
Blockchain technology has been extensively utilized in decentralized data-sharing applications, with the immutability of blockchain providing a witness for the circulation of data. However, current blockchain data-sharing solutions still fail to address the simultaneous screening needs of both the sender and the receiver with multiple keywords. Without support for bilateral simultaneous filtering, disclosing the reasons for matching failures could inadvertently expose sensitive user data. The challenge therefore lies in enabling ciphertexts with multiple keywords and receivers with multiple interests to match each other mutually and simultaneously. Based on the technical foundations of SE (Searchable Encryption), MABE (Multi-Attribute-Based Encryption), and polynomial fitting, this paper proposes a scheme called DMSA (Decentralized and Multi-keyword selective Sharing and selective Acquisition). The scheme satisfies soundness, enabling ciphertexts carrying multiple keywords and receivers representing multiple interests to match each other simultaneously. We conduct a security analysis that confirms the security of DMSA against chosen-plaintext attacks. Our experimental results demonstrate a significant efficiency improvement: a 67% increase over single-keyword data-sharing schemes and a 16% improvement over an existing multi-keyword data-sharing solution.
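The polynomial-fitting ingredient can be illustrated in plain, non-cryptographic Python (our own toy, not DMSA itself, which wraps the idea in SE/MABE): the ciphertext's keywords become the roots of a polynomial, and a receiver's interests all match exactly when each interest evaluates the polynomial to zero, i.e. the interest set is a subset of the keyword set.

```python
import zlib

def h(word):
    # Deterministic stand-in for a keyed hash; a real scheme would use a PRF.
    return zlib.crc32(word.encode())

def keyword_poly(keywords):
    """Build p(x) = prod (x - h(k)) whose roots encode the keyword set."""
    roots = [h(k) for k in keywords]
    def p(x):
        out = 1
        for r in roots:
            out *= (x - r)
        return out
    return p

def matches(poly, interests):
    """All interests match iff each one is a root of the keyword polynomial."""
    return all(poly(h(i)) == 0 for i in interests)

p = keyword_poly({"cloud", "blockchain", "sharing"})
print(matches(p, {"cloud", "sharing"}))   # interests are a subset of keywords
print(matches(p, {"cloud", "privacy"}))   # "privacy" is not a root
```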
Virtualization is the foundation of cloud computing, and it cannot be achieved without software-defined, elastic, flexible, and scalable virtual layers. Unfortunately, if multiple virtual storage devices are chained together, the system may suffer severe performance degradation. While the read-ahead (RA) mechanism in storage devices plays a very important role in improving I/O performance, RA may not be as effective as expected across multiple virtualization layers, since it was originally designed for a single layer. When I/O requests pass through a long I/O path, they may trigger a chain reaction and lead to unnecessary data transmission and thus wasted bandwidth. In this paper, we study the dynamic behavior of RA through multiple I/O layers and demonstrate that, if controlled well, RA can greatly accelerate I/O. We present RAFlow, an RA control mechanism that effectively improves I/O performance by strategically expanding the RA window at each layer. Our real-world experiments show that it achieves 20% to 50% performance improvement in I/O paths with up to 8 virtualized storage devices.
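The chain-reaction problem can be modeled in a few lines (our own illustration, not RAFlow's actual policy): if each of n stacked layers independently applies a read-ahead factor r to whatever the layer above requested, the prefetched volume grows like r**n, whereas a coordinated scheme expands the window once for the whole path.

```python
# Toy model of read-ahead amplification along a chained virtual-device path.
def naive_prefetch(request_blocks, ra_factor, layers):
    """Each layer re-amplifies the (already expanded) request from above."""
    for _ in range(layers):
        request_blocks *= ra_factor
    return request_blocks

def coordinated_prefetch(request_blocks, ra_factor, layers):
    """One read-ahead expansion, shared by all layers on the path."""
    return request_blocks * ra_factor

print(naive_prefetch(4, 2, 8))        # 4 * 2**8 = 1024 blocks of chained prefetch
print(coordinated_prefetch(4, 2, 8))  # 8 blocks
```

The gap between the two numbers is the unnecessary data transmission the abstract refers to; RAFlow's per-layer window control sits between these two extremes.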
Many scientific computing applications demand a great amount of memory and are usually run on supercomputers with large physical memory. Applications whose completion time is bounded by memory capacity are usually called memory-intensive applications. When the physical memory of the computing environment is insufficient, memory-intensive applications cause many disk I/O operations, which degrade system performance dramatically. Traditional network memory or memory server schemes try to share the free memory of idle nodes in a cluster, but they are often hampered by heavy cluster load or internal network congestion. Combining network memory with current service computing and grid computing technology, we propose a memory-service-based memory sharing grid system named RAM Grid. RAM Grid improves on traditional network memory schemes and extends the application area of grid computing. As one of the major computing resources, memory can be shared over the Internet through RAM Grid, just as computing power and disk storage are shared through computational grids and data grids. The system criteria and properties are analyzed, and one concrete system scheme is proposed in detail. Through real-trace-based simulation, the performance of RAM Grid is evaluated and shown to be clearly better than traditional network memory or disk I/O.
ISBN (Digital): 9798350387339
ISBN (Print): 9798350387346
With the rapid growth of large language models, cloud computing has become an indispensable component of the AI industry. Cloud service providers (CSPs) are establishing AI data centers to serve AI workloads. In the face of this surging need for AI computing power, building a connected computing environment across various clouds and forming a JointCloud presents an attractive solution. However, scheduling AI tasks across multiple AI data centers within a JointCloud environment poses a significant challenge: how to balance users' demands while ensuring fairness to CSPs in scheduling. Existing research primarily focuses on optimizing scheduling quality, with limited consideration for fairness. This paper therefore proposes a Fairness-Aware AI-Workloads Allocation method (F3A), a fair cross-cloud allocation technique for AI tasks. F3A uses Points and Tokens to reflect both the resource status and the historical task allocations of AI data centers, enabling consideration of users' multidimensional demands and facilitating fair task allocation across multiple centers. To better assess scheduling fairness, we also devise a fairness indicator (FI) based on the Gini coefficient to measure the fairness of task allocation. Experimental results demonstrate that F3A consistently keeps FI within 0.1 across various cluster sizes and task quantities, a 76.45% improvement over the classical round-robin fair scheduling algorithm. F3A performs well in ensuring fair task allocation while also reducing cost and enhancing user satisfaction.
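A Gini-based fairness indicator in this spirit can be sketched as follows (the exact FI definition in F3A may differ): 0 means tasks are spread perfectly evenly across data centers, and values approaching 1 mean one center receives almost everything.

```python
# Standard Gini coefficient over per-center task allocations:
#   G = (2 * sum_i i*x_(i)) / (n * sum x) - (n + 1) / n,  x sorted ascending.
def gini(allocations):
    xs = sorted(allocations)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0  # nothing allocated: treat as perfectly fair
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * cum) / (n * total) - (n + 1) / n

print(gini([25, 25, 25, 25]))  # 0.0 : tasks split evenly across 4 centers
print(gini([100, 0, 0, 0]))    # 0.75: one center takes every task
```

An FI threshold such as 0.1 then corresponds to allocations that stay close to the even split.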
An improved algorithm is proposed for reconstructing singular connectivity from the available pairwise connections during the preprocessing phase. To evaluate the performance of the algorithm, an in-house computational fluid dynamics (CFD) code is used, which employs a high-order finite-difference method for spatial discretization and runs on the Tianhe-1A supercomputer. Test cases with varying numbers of mesh points are chosen, and the results indicate that the improved singular connection reconstruction algorithm achieves a speedup of at least 2000× compared with the naive search method adopted in the former version of our code. Moreover, parallel efficiency benefits from the local communication strategy enabled by the algorithm.
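The likely source of such a speedup, replacing an all-pairs search with a hash-map index over grid points, can be sketched in Python (our own toy model, not the actual CFD code): indexing each point by its coordinates turns the search for points shared by several blocks into a single linear pass.

```python
from collections import defaultdict

def singular_points(block_points, threshold=3):
    """Toy model: a point shared by >= threshold blocks is 'singular'.

    block_points maps block_id -> list of (x, y) grid points; one pass over
    all points replaces the naive O(n^2) pairwise comparison.
    """
    seen = defaultdict(set)
    for blk, pts in block_points.items():
        for p in pts:
            seen[p].add(blk)
    return {p for p, blks in seen.items() if len(blks) >= threshold}

# Three blocks meeting at the corner point (0, 0).
blocks = {0: [(0, 0), (1, 0)], 1: [(0, 0), (0, 1)], 2: [(0, 0), (1, 1)]}
print(singular_points(blocks))  # {(0, 0)}
```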