Many large science projects rely on remote clusters for (near) real-time data processing, so they demand reliable wide-area data transfer performance for smooth end-to-end workflow execution. However, data transfers are often exposed to performance variations due to changing network (e.g., background traffic) and dataset (e.g., average file size) conditions, necessitating adaptive solutions to meet the stringent performance requirements of delay-sensitive streaming workflows. In this article, we propose FStream++ to provide reliable transfer performance for large streaming science applications by dynamically adjusting transfer settings to adapt to changing transfer conditions. FStream++ combines three optimization methods, namely dynamic tuning, online profiling, and historical analysis, to swiftly and accurately discover optimal transfer settings that can meet workflow requirements. Dynamic tuning uses a heuristic model to predict the values of transfer parameters based on dataset characteristics and network settings. Since heuristic models fall short of incorporating many important factors such as I/O throughput and resource interference, we complement them with online profiling, which executes a real-time search over a subset of transfer settings. Finally, historical analysis takes advantage of the long-running nature of streaming workflows by storing and analyzing previous performance observations to shorten the execution time of online profiling. We evaluate the performance of FStream++ by transferring several synthetic and real-world workloads over high-performance production networks and show that it offers up to 3.6x performance improvement over legacy transfer applications and up to 24% over our previous work, FStream.
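As a rough illustration of how the three stages described in the abstract could fit together, the Python sketch below seeds a short online search with either a stored historical record or a heuristic prediction. All function names, parameters, and the heuristic itself are assumptions for exposition, not FStream++'s actual interface or model.

```python
# Hypothetical sketch of three-stage transfer tuning; names and formulas
# are illustrative assumptions, not FStream++'s actual API.

def heuristic_tuning(avg_file_size_mb: float, bandwidth_gbps: float) -> dict:
    """Stage 1: predict transfer parameters from dataset/network features.
    Small average file sizes typically call for higher concurrency."""
    concurrency = max(1, min(32, int(bandwidth_gbps * 64 / max(avg_file_size_mb, 1.0))))
    parallelism = 4 if avg_file_size_mb > 256 else 1   # parallel streams per file
    return {"concurrency": concurrency, "parallelism": parallelism}

def online_profiling(seed: dict, measure) -> dict:
    """Stage 2: short real-time search around the prediction, since the
    heuristic ignores factors such as I/O throughput and interference."""
    best, best_tput = seed, measure(seed)
    for delta in (-2, 2):
        cand = dict(seed, concurrency=max(1, seed["concurrency"] + delta))
        tput = measure(cand)                  # brief probe transfer
        if tput > best_tput:
            best, best_tput = cand, tput
    return best

def tune(history: dict, cond_key, avg_file_size_mb, bandwidth_gbps, measure) -> dict:
    """Stage 3: seed the search with past observations for similar transfer
    conditions, exploiting the long-running nature of streaming workflows."""
    seed = history.get(cond_key) or heuristic_tuning(avg_file_size_mb, bandwidth_gbps)
    best = online_profiling(seed, measure)
    history[cond_key] = best                  # remember for future iterations
    return best
```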
ISBN:
(Print) 9798350339864
Data partitioning is the most fundamental procedure before parallelizing complex analysis on very big graphs. As a classical NP-complete problem, graph partitioning usually employs offline or online/streaming heuristics to find approximately optimal solutions. However, these are either heavyweight in space and time overheads or suboptimal in quality, as measured by workload balance and the number of cut edges across partitions, and neither scales well with the ever-growing demand for quickly analyzing big graphs. This paper therefore proposes a new vertex partitioner for better scalability. It preserves the lightweight advantage of existing streaming heuristics and, more importantly, fully utilizes the knowledge embedded in the local view when streaming a vertex, which significantly improves quality. We present a sliding window technique to compensate for the additional memory costs caused by knowledge utilization. A parallel technique with dependency-detection optimization is also designed to further enhance efficiency. Experiments on a range of real-world datasets validate that our proposals achieve overall success in terms of partitioning quality, memory consumption, and runtime efficiency.
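A minimal sketch of a streaming vertex partitioner of this kind is given below, assuming an LDG-style placement score and a fixed-size sliding window over recently placed vertices; the paper's exact scoring function and parallel dependency-detection machinery are not reproduced here.

```python
# Illustrative greedy streaming partitioner; the scoring function and
# window policy are assumptions, not the paper's exact heuristic.
from collections import deque

def stream_partition(vertex_stream, k: int, capacity: int, window: int = 100_000):
    """vertex_stream yields (vertex, neighbors); returns vertex -> partition."""
    assign, loads = {}, [0] * k
    recent, in_window = deque(), set()        # sliding window bounding memory
    for v, nbrs in vertex_stream:
        visible = [u for u in nbrs if u in in_window]   # local-view knowledge
        # Co-locate with already-placed neighbors, but discount nearly-full
        # partitions so the workload stays balanced (LDG-style score).
        def score(p):
            locality = sum(1 for u in visible if assign[u] == p)
            return (locality + 1) * (1 - loads[p] / capacity)
        p = max(range(k), key=score)
        assign[v] = p
        loads[p] += 1
        recent.append(v)
        in_window.add(v)
        if len(recent) > window:              # evict the oldest knowledge
            in_window.discard(recent.popleft())
    return assign
```

The window caps memory at O(window) vertex records regardless of graph size, which is the trade-off the abstract's sliding window technique addresses.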
Authors:
Xie, Guoqi; Zeng, Gang; Li, Renfa
Hunan Univ, Coll Comp Sci & Elect Engn, Key Lab Embedded & Cyber Phys Syst Hunan Prov, Changsha 410082, Hunan, Peoples R China
Nagoya Univ, Grad Sch Engn, Nagoya, Aichi 4648603, Japan
In distributed automotive embedded systems, safety issues run through the entire life cycle, and safety mechanisms for error handling are desirable for risk control. This article focuses on safety enhancement (i.e., safety mechanisms for error handling) for a safety-critical automotive application within its deadline. A stable stopping approach to safety enhancement for an automotive application is proposed, based on the static recovery mechanism provided in ISO 26262. The Stable Stopping-based Safety Enhancement (SSSE) approach combines known backward recovery, a proposed forward recovery, and a proposed forward-and-backward recovery through primary-backup replication. The stable stopping (i.e., SSSE) approach is a convergence algorithm, meaning that the algorithm can stop once the reliability value reaches a steady state. Experimental results reveal that the exposure level defined in ISO 26262 drops from E3 to E1 after using SSSE, and this improvement enables a higher-level safety guarantee.
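The loop below is a minimal sketch of the stable-stopping convergence idea under an assumed task model: backup replicas are added greedily to the least reliable task until the application's reliability stops improving by more than a tolerance. It is not the SSSE algorithm itself, which also accounts for deadlines and the three recovery modes.

```python
# Minimal sketch of "stable stopping": add primary-backup replicas until
# reliability reaches a steady state. The task model is illustrative.

def reliability(tasks):
    """tasks: per-task lists of replica success probabilities; the application
    succeeds if every task succeeds on at least one of its replicas."""
    r = 1.0
    for replicas in tasks:
        fail_all = 1.0
        for p in replicas:
            fail_all *= 1.0 - p
        r *= 1.0 - fail_all
    return r

def stable_stopping(tasks, epsilon=1e-6):
    prev = reliability(tasks)
    while True:
        # Greedy recovery step: back up the currently least reliable task.
        worst = min(range(len(tasks)), key=lambda i: reliability([tasks[i]]))
        tasks[worst].append(tasks[worst][0])       # add one backup replica
        cur = reliability(tasks)
        if cur - prev < epsilon:                   # reliability is steady: stop
            return cur, tasks
        prev = cur

print(stable_stopping([[0.99], [0.95], [0.97]]))
```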
ISBN:
(Print) 9798350371000; 9798350370997
The rise and proliferation of Artificial Intelligence (AI) technologies are bringing transformative changes to various sectors, signaling a new era of innovation in fields as diverse as medicine, manufacturing, and even day-to-day social interactions. Notable advancements are not just confined to textual understanding, as seen in models like GPT, but also extend to visual cognition through image recognition and more. Beyond surface interactions and predictions, AI finds profound applications in life-saving domains such as medical diagnostics and becomes an integral part of daily life through chatbot-based customer interactions. However, as the horizon of AI expands, a crucial yet often overlooked aspect emerges: the underlying mission-critical infrastructure required to support and deploy these models effectively. The intricacies of efficient communication systems, foundational for real-time AI model operations, take center stage in ensuring the seamless functioning of AI-driven applications. This paper explores the quintessential changes needed in communication paradigms to keep pace with the evolving AI landscape. Specifically, we highlight the pivotal role of multipath communication in enhancing the responsiveness and efficiency of AI applications [1]. As a case in point, we investigate its impact on mission-critical operations in robotics. Through experimentation and analysis, the results elucidate the substantial benefits of this approach, revealing a significant improvement in delay metrics. This work underscores the imperative of aligning communication systems with the ever-growing demands of AI, ensuring that infrastructural capabilities do not lag in the race for innovation.
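The toy asyncio script below illustrates the core latency argument for multipath communication: duplicating a delay-sensitive message over two independent paths and acting on the first arrival bounds the delay by the faster path. It is an illustrative sketch, not the paper's robotics testbed or protocol.

```python
# Toy illustration of multipath redundancy: race the same request over two
# independent paths and keep whichever reply arrives first, so a single
# congested path no longer determines the end-to-end delay.
import asyncio
import random

async def send_over(path: str) -> tuple[str, float]:
    latency = random.uniform(0.01, 0.2)      # stand-in for per-path network delay
    await asyncio.sleep(latency)
    return path, latency

async def multipath_request():
    tasks = [asyncio.create_task(send_over(p)) for p in ("wifi", "cellular")]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for t in pending:                         # drop the slower duplicate
        t.cancel()
    path, latency = done.pop().result()
    print(f"first reply via {path} after {latency * 1000:.1f} ms")

asyncio.run(multipath_request())
```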
The application of blockchain in financial services has opened new avenues for efficiency in transaction processing, asset management, and security. The application of parallel, distributed, and grid computing with blockchain...
Anomaly detection from remote sensing images aims to detect pixels whose spectral signatures differ from their background. Anomalies are often man-made targets. Because such target signatures are unknown, anomaly detection has many important applications, such as water quality monitoring, crop stress surveying, and law enforcement-related uses, where prior information about targets is often unavailable. The key to success is accurate background modeling. Anomaly detection from remote sensing images is challenging because the spatial coverage is very large and the background is highly heterogeneous. For pixel-based anomaly detection, the computing cost of background modeling and of the spatial-convolution-type detection process is very high. Thus, parallel and distributed computing is critical to reducing execution time, which fits the need for real-time or near-real-time detection from airborne and spaceborne platforms in support of immediate decision-making. This article reviews recent advances in anomaly detection from hyperspectral remote sensing images and their implementation on parallel and distributed systems. The classical methods, i.e., the Reed-Xiaoli (RX) algorithm and its variants, including its real-time processing version, are illustrated in commodity graphics processing unit (GPU), cloud, and field-programmable gate array (FPGA) implementations. Practical issues and future development trends are also discussed.
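For reference, the classical global RX detector scores each pixel by its Mahalanobis distance from background statistics estimated over the whole image. A plain NumPy sketch is shown below; it is the textbook baseline, not one of the GPU, cloud, or FPGA implementations the survey covers.

```python
# Standard (global) Reed-Xiaoli detector: each pixel is scored by its
# Mahalanobis distance from the image-wide background mean and covariance.
import numpy as np

def rx_detector(cube: np.ndarray) -> np.ndarray:
    """cube: (rows, cols, bands) hyperspectral image -> (rows, cols) RX scores."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(np.float64)
    mu = pixels.mean(axis=0)                  # background mean spectrum
    cov = np.cov(pixels, rowvar=False)        # background covariance (b x b)
    cov_inv = np.linalg.pinv(cov)             # pinv guards against singularity
    centered = pixels - mu
    # Mahalanobis distance per pixel: d^T Sigma^-1 d
    scores = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return scores.reshape(h, w)

# Pixels scoring above a threshold (e.g., a high percentile) are flagged
# as anomalies; local-window RX variants re-estimate mu/cov per pixel.
```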
Existing data-parallel (DP) training for deep neural networks (DNNs) often experiences limited scalability in speedup due to substantial communication overheads. While the overlapping technique can mitigate this problem b...
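Although this abstract is truncated, the overlapping technique it names is standard: start each layer's gradient all-reduce as soon as that layer's backward pass finishes, rather than communicating only after the whole backward pass. The sketch below simulates this with threads; the all_reduce here is a timed stand-in, not a real collective call.

```python
# Hedged sketch of communication/computation overlap in data-parallel
# training: gradient "all-reduce" for layer i runs in a worker thread
# while the main thread computes backward for layer i-1.
from concurrent.futures import ThreadPoolExecutor
import time

def backward_layer(layer: int) -> str:
    time.sleep(0.05)                  # simulated gradient computation
    return f"grad[{layer}]"

def all_reduce(grad: str) -> str:
    time.sleep(0.08)                  # simulated network transfer time
    return f"reduced {grad}"

with ThreadPoolExecutor(max_workers=2) as comm:
    handles = []
    for layer in reversed(range(4)):               # backward visits last layer first
        grad = backward_layer(layer)
        handles.append(comm.submit(all_reduce, grad))  # overlap with next layer
    synced = [h.result() for h in handles]         # barrier before optimizer step
print(synced)
```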
The rise of multiprocessors has led to the incorporation of parallel processing in virtually all segments of industry. The creation and maintenance of the software to run these systems, as well as of the applications...
ISBN:
(Digital) 9798331527211
ISBN:
(Print) 9798331527228
This paper compares sequential and parallel Java and C++ implementations of the B algorithm, a relatively new algorithm for user-equilibrium (UE) road traffic assignment (TA). All versions were implemented in an optimized way, following best practices in both languages. Their performance was thoroughly tested on multiple large real-world road traffic networks. The tests showed that the C++ version is indeed faster in most cases, but only by 8.54% on average.
ISBN:
(Digital) 9798350350128
ISBN:
(Print) 9798350350135
In this paper, we present a new algorithm named distributed Multi-root Pipeline MCTS (DMP-MCTS) to improve real-time search efficiency in multi-machine scenarios. By combining root parallelization with significance detection and the pipeline pattern for parallel MCTS (3PMCTS), we not only reduce repetitive computing tasks among subtrees but also allow flexible, operation-time-based resource allocation. Experiments show that this work achieves better computational performance under linear acceleration conditions compared with other existing works.
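For context, plain root parallelization, the baseline DMP-MCTS builds on, can be sketched as below: independent workers grow separate trees from the same root and their root visit counts are merged at the end. The pipelining and significance-detection components of DMP-MCTS are not modeled in this toy.

```python
# Toy root-parallel MCTS: each process runs an independent search from the
# shared root; merged root visit counts pick the final action.
import math
import random
from multiprocessing import Pool

ACTIONS = range(4)

def rollout_value(action: int, rng: random.Random) -> float:
    # Stand-in simulation: action 2 is secretly best on average.
    return rng.random() + (0.3 if action == 2 else 0.0)

def worker(args):
    seed, iterations = args
    rng = random.Random(seed)
    visits, value = [0] * len(ACTIONS), [0.0] * len(ACTIONS)
    for _ in range(iterations):
        total = sum(visits) + 1
        # UCB1 selection at the root; unvisited actions are tried first.
        a = max(ACTIONS, key=lambda a: float("inf") if visits[a] == 0
                else value[a] / visits[a] + math.sqrt(2 * math.log(total) / visits[a]))
        value[a] += rollout_value(a, rng)
        visits[a] += 1
    return visits

if __name__ == "__main__":
    with Pool(4) as pool:
        per_tree = pool.map(worker, [(s, 500) for s in range(4)])
    merged = [sum(v[a] for v in per_tree) for a in ACTIONS]   # combine subtrees
    print("best action:", max(ACTIONS, key=lambda a: merged[a]))
```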