ISBN: 9781665486682 (print)
Electric vehicles (EVs) are thriving as a means of alleviating environmental issues. A conventional two-stage onboard charger (OBC) in an EV contains only one high-power DC/DC converter to connect the whole battery pack to the inverter. It requires dozens of battery cells to be connected in parallel and then in series for charging. Parallel connection causes circulating current among batteries, increasing losses and safety risk and shortening battery life. To diminish the circulating current by reducing parallel connections of battery cells, a distributed OBC architecture is proposed in this paper. It contains a bi-directional inverter and numerous paralleled bi-directional low-power DC/DC converters. The batteries are divided into multiple clusters with fewer paralleled cells, and each cluster interfaces with its own DC/DC converter. Furthermore, a novel virtual synchronous machine (VSM) control is proposed for the distributed OBC, enabling the OBC to provide inertia and frequency regulation to the grid and to serve as an emergency power supply in island mode. Compared to the conventional OBC, the distributed OBC under the proposed VSM control achieves higher fault tolerance, better power allocation, less circulating current among batteries, and less current impact on the batteries. These advantages are verified by simulation results.
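For orientation, VSM schemes are generally built around an emulated rotor swing equation. The form below is the common textbook version, not the novel control law proposed in the paper, which the abstract does not spell out. The symbols are assumptions introduced here: $J$ is the virtual inertia, $D_p$ the damping (droop) coefficient, $T_m$ and $T_e$ the virtual mechanical and electrical torques, $\omega$ the virtual rotor speed, $\omega_0$ the nominal grid angular frequency, and $\theta$ the phase angle of the inverter voltage reference.

```latex
\begin{aligned}
J \,\frac{d\omega}{dt} &= T_m - T_e - D_p\,(\omega - \omega_0), \\
\frac{d\theta}{dt} &= \omega .
\end{aligned}
```

In this generic form, the $J$ term supplies the inertia response, and the $D_p$ term acts as frequency droop: when $\omega$ falls below $\omega_0$, the controller raises the injected power.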
ISBN: 9781665495721 (print)
To effectively control large-scale distributed systems online, model predictive control (MPC) has to swiftly solve the underlying high-dimensional optimization. Multiple techniques have been applied in the literature to accelerate the solving process, mainly software-based algorithmic advancements and hardware-assisted computation enhancements. However, those methods focus on arithmetic acceleration and overlook the benefits of the underlying system's structure. In particular, the existing decoupled software-hardware algorithm design, which naively parallelizes the arithmetic operations in hardware, does not tackle hardware overheads such as CPU-GPU and thread-to-thread communication in a principled manner. Also, the advantages of parallelizable subproblem decomposition in distributed MPC are not well recognized and exploited. As a result, the full potential of hardware acceleration for MPC has not been reached. In this paper, we explore those opportunities by leveraging GPUs to parallelize the distributed and localized MPC (DLMPC) algorithm. We exploit the locality constraints embedded in the DLMPC formulation to reduce the hardware-intrinsic communication overheads. Our parallel implementation achieves up to 50x faster runtime than its CPU counterparts under various parameters. Furthermore, we find that locality-aware GPU parallelization can halve the optimization runtime compared to naive acceleration. Overall, our results demonstrate the performance gains brought by software-hardware co-design with the information exchange structure in mind.
ISBN: 9798400703850 (print)
Trusted Execution Environments (TEEs), such as Intel SGX/TDX, AMD SEV-SNP, and ARM TrustZone/CCA, have been widely adopted in prevailing architectures. However, these TEEs typically do not treat I/O isolation (e.g., defending against malicious DMA requests) as a first-class citizen, which may degrade I/O performance. Traditional methods such as using an IOMMU or software I/O can degrade throughput by at least 20% for I/O-intensive workloads. The main reason is that the isolation requirements for I/O devices differ from those for CPUs. This paper proposes a novel I/O isolation mechanism for TEEs, named sIOPMP (scalable I/O Physical Memory Protection), with three key features. First, we design a Multi-stage-Tree-based checker, supporting more than 1,000 hardware regions. Second, we classify the devices into hot and cold, and support unlimited devices with the mountable entry. Third, we propose a remapping mechanism to switch devices between hot and cold status for dynamic I/O workloads. Evaluation results show that sIOPMP introduces only negligible performance overhead for both benchmarks and real-world workloads, and improves network throughput by 20% to 38% compared with IOMMU-based mechanisms or software I/O adopted in TEEs.
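The hot/cold split with remapping can be pictured with a small software model. The sketch below is only an illustration under stated assumptions, not the sIOPMP hardware design: the class name `HotColdTable`, the fixed number of hot slots, the least-recently-used eviction policy, and the representation of permitted regions as half-open `(low, high)` address ranges are all hypothetical choices made here.

```python
from collections import OrderedDict


class HotColdTable:
    """Toy model: a few 'hot' device slots backed by an unbounded 'cold' table."""

    def __init__(self, hot_slots):
        if hot_slots < 1:
            raise ValueError("need at least one hot slot")
        self.hot_slots = hot_slots
        self.hot = OrderedDict()  # device_id -> regions, oldest use first
        self.cold = {}            # device_id -> regions, no size limit

    def register(self, device_id, regions):
        # Assumption: newly registered devices start in the cold table.
        self.cold[device_id] = list(regions)

    def _promote(self, device_id):
        # Remap a cold device into a hot slot, evicting the LRU hot device.
        if len(self.hot) >= self.hot_slots:
            victim, regions = self.hot.popitem(last=False)
            self.cold[victim] = regions
        self.hot[device_id] = self.cold.pop(device_id)

    def check(self, device_id, addr):
        """Return True if the device may access addr; unknown devices are denied."""
        if device_id in self.hot:
            self.hot.move_to_end(device_id)  # mark as most recently used
        elif device_id in self.cold:
            self._promote(device_id)
        else:
            return False
        return any(low <= addr < high for low, high in self.hot[device_id])


table = HotColdTable(hot_slots=1)
table.register("nic", [(0x1000, 0x2000)])
table.register("disk", [(0x3000, 0x4000)])
nic_ok = table.check("nic", 0x1800)    # promotes "nic" to hot
disk_ok = table.check("disk", 0x1800)  # promotes "disk", demotes "nic"
```

The point of the model is that the number of registered devices is bounded only by the cold table, while the fast-path lookup structure stays small.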
Edge computing is a rapidly developing research area known for its ability to reduce latency and improve energy efficiency, and it also has a potential for green computing. Many geographically distributed edge servers...
ISBN: 9783030975494; 9783030975487 (print)
The minimum spanning tree is a critical problem for many applications in network analysis, communication network design, and computer science. Parallel implementations of minimum spanning tree algorithms improve performance on large graph problems by using high-performance computational resources. Minimum spanning tree algorithms generally use traditional parallel programming models for distributed and shared memory systems, such as the Message Passing Interface or OpenMP. Furthermore, the partitioned global address space model offers new capabilities in the form of asynchronous computations on distributed shared memory, positively affecting the performance and scalability of the algorithms. The paper presents a new minimum spanning tree algorithm implemented in a partitioned global address space model. Experiments with diverse parameters were conducted to study the efficiency of the asynchronous implementation of the algorithm.
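The abstract does not say which algorithm the paper builds on. As a point of reference only, here is a sequential sketch of Borůvka's algorithm, which is often chosen for parallel minimum spanning tree work because every component selects its cheapest outgoing edge independently. It is not the paper's partitioned global address space implementation; the function name `boruvka_mst` and the edge-list input format are assumptions made for this example.

```python
def boruvka_mst(num_vertices, edges):
    """Return (total_weight, chosen_edges) for a connected, undirected graph.

    edges is a list of (u, v, weight) tuples with 0 <= u, v < num_vertices.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    total = 0
    components = num_vertices
    while components > 1:
        # Find the cheapest outgoing edge of every component. This scan is
        # the step a parallel version would distribute across workers.
        cheapest = {}
        for idx, (u, v, w) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for root in (ru, rv):
                # Ties are broken by edge index so the choice is deterministic
                # and cannot create a cycle.
                if root not in cheapest or (w, idx) < cheapest[root][0]:
                    cheapest[root] = ((w, idx), (u, v, w))
        if not cheapest:
            raise ValueError("graph is not connected")
        # Merge components along the selected edges.
        for _, (u, v, w) in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                chosen.append((u, v, w))
                total += w
                components -= 1
    return total, chosen


example_edges = [(0, 1, 1), (1, 2, 2), (2, 3, 3), (0, 3, 4), (0, 2, 5)]
mst_weight, mst_edges = boruvka_mst(4, example_edges)
```

Each round at least halves the number of components, so the outer loop runs a logarithmic number of times in the vertex count.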
Existing multi-FPGA architectures often leverage high-speed interconnect technologies to achieve higher performance by exploiting ample communication bandwidth. In this paper, we propose an effective mapping approach for accelerating CNNs on bandwidth-constrained distributed multi-FPGA architectures. We formulate the system-level mapping problem and then introduce a method based on Genetic Algorithm (GA) and Mixed-Integer Nonlinear Programming (MINLP) to attain optimal solutions.
Automated Guided Vehicles (AGVs) are mobile robots designed for transportation purposes, and one of the most important problems in intelligent logistics is job scheduling. The goal is to find the optimal allocation of jobs across the available devices. The problem can be addressed with a simulation in which different scenarios are evaluated. However, creating such a simulation model requires a statistical description of the problem. In this paper, we implement the simulation model for the AGV environment. Based on the mathematical description of the model, the discrete event simulation is created using the Python programming language and the SimPy library. We use the simulation to compare solutions of the job scheduling problem obtained with simulated annealing and genetic algorithms.
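A minimal sketch of the simulated annealing side of such a comparison follows. It makes several simplifying assumptions that are not from the paper: the SimPy discrete event simulation is replaced by a plain makespan function, jobs have fixed known durations, any AGV can execute any job, and the names `makespan` and `anneal_schedule` as well as the cooling parameters are hypothetical.

```python
import math
import random


def makespan(assignment, durations, num_agvs):
    """Time at which the busiest AGV finishes its assigned jobs."""
    load = [0.0] * num_agvs
    for job, agv in enumerate(assignment):
        load[agv] += durations[job]
    return max(load)


def anneal_schedule(durations, num_agvs, steps=5000,
                    t_start=10.0, t_end=0.01, seed=0):
    """Search for a job-to-AGV assignment with a small makespan."""
    rng = random.Random(seed)
    current = [rng.randrange(num_agvs) for _ in durations]
    current_cost = makespan(current, durations, num_agvs)
    best, best_cost = current[:], current_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Neighbor move: reassign one random job to a random AGV.
        candidate = current[:]
        candidate[rng.randrange(len(durations))] = rng.randrange(num_agvs)
        cost = makespan(candidate, durations, num_agvs)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if cost <= current_cost or rng.random() < math.exp((current_cost - cost) / temp):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = candidate[:], cost
    return best, best_cost


job_durations = [4, 3, 7, 2, 5, 6, 1, 8]
plan, plan_cost = anneal_schedule(job_durations, num_agvs=3)
```

In a setup like the one the abstract describes, the call to `makespan` would be replaced by a run of the simulation model, so that travel times and waiting are reflected in the cost.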
ISBN: 9798350317152 (digital)
ISBN: 9798350317169 (print)
With the rapid development of storage and network technology, emerging high-performance hardware is being widely applied to distributed storage clusters. However, existing distributed storage systems that employ multi-layer abstractions to provide table data services leave high-speed hardware under-exploited. In this paper, we propose TEngine, a native distributed table storage engine designed for NVMe SSD and RDMA. The key is that TEngine removes the file abstraction to construct table structures on the device directly. For metadata service, TEngine designs a decoupled single metadata server, reducing distributed coordination, easing the burden on the metadata node, and enabling localized data node access. For data service, TEngine optimizes the parallel processing capability of NVMe devices by integrating upper-level multi-thread parallel operations with lower-level NVMe devices' parallel I/O processing. Moreover, TEngine introduces a periodic pull-based data synchronization approach to transform data pushing into periodic data pulling, which offloads the synchronization burden from the leader to the followers. The experimental results show that TEngine outperforms state-of-the-art distributed storage systems using the same hardware environment.
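The pull-based synchronization idea can be illustrated with a toy in-memory model. This is a sketch under assumptions, not TEngine's protocol: the `Leader` and `Follower` classes, the append-only log, the batch size, and tracking progress by log offset are all hypothetical, and the network, RDMA, and timers are left out entirely.

```python
class Leader:
    """Holds an append-only log and never pushes entries to followers."""

    def __init__(self):
        self.log = []

    def append(self, entry):
        self.log.append(entry)

    def read_from(self, offset, max_entries):
        # Serve a bounded batch starting at the offset the follower supplies.
        return self.log[offset:offset + max_entries]


class Follower:
    """Tracks its own progress and fetches missing entries on each pull."""

    def __init__(self, leader, batch_size=2):
        self.leader = leader
        self.batch_size = batch_size
        self.log = []

    def pull_once(self):
        """One periodic pull; returns the number of entries fetched."""
        batch = self.leader.read_from(len(self.log), self.batch_size)
        self.log.extend(batch)
        return len(batch)


leader = Leader()
for i in range(5):
    leader.append(f"row-{i}")

follower = Follower(leader)
pulls = 0
while follower.pull_once() > 0:  # stands in for a periodic timer
    pulls += 1
```

Because each follower keeps its own offset, the leader holds no per-follower replication state in this model, which is one way to read the abstract's claim that the burden moves from the leader to the followers.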
With the digital transformation and increasing demand for informatization in the healthcare industry, traditional centralized information systems can no longer meet the requirements of high-concurrency access. Microservices architecture, however, offers a flexible and scalable solution that effectively addresses the complexity and high-concurrency access demands of healthcare information systems. In this paper, we propose a microservices system that adopts a distributed architecture combined with high-concurrency processing mechanisms. The system distributes its modules across different servers as microservices, and each microservice can independently handle requests. This distributed and parallel processing approach improves system responsiveness and throughput while reducing the risk of single-point failures. To validate the feasibility and performance of the system, we conducted a series of experiments and evaluations. The results demonstrate that the distributed healthcare information system based on microservices architecture performs exceptionally well in handling large-scale data and high-concurrency access. The system not only provides efficient data storage and retrieval capabilities but also exhibits good scalability and fault tolerance.
The digital transformation opens new opportunities for enterprises to optimize their business processes by applying data-driven analysis techniques. For storing and organizing the required huge amounts of data, differ...