检索结果-内蒙古大学图书馆

arXiv 2017年

作者： Wu, Mingyu Zhao, Ziming Li, Haoyu Li, Heting Chen, Haibo Zang, Binyu Guan, Haibing Shanghai Key Lab. of Scalable Computing and Systems Institute of Parallel and Distributed Systems Shanghai Jiao Tong University

Fast, byte-addressable non-volatile memory (NVM) em-braces both near-DRAM latency and disk-like persistence, which has generated considerable interests to revolutionize system software stack and programming models. However, it is less understood how NVM can be combined with managed runtime like Java virtual machine (JVM) to ease persistence management. This paper proposes Espresso1, a holistic ex-tension to Java and its runtime, to enable Java programmers to exploit NVM for persistence management with high perfor-mance. Espresso first provides a general persistent heap de-sign called Persistent Java Heap (PJH) to manage persistent data as normal Java objects. The heap is then strengthened with a recoverable mechanism to provide crash consistency for heap metadata. It then provides a new abstraction called Persistent Java Object (PJO) to provide an easy-to-use but safe persistent programming model for programmers to per-sist application data. The evaluation confirms that Espresso significantly outperforms state-of-art NVM support for Java (i.e., JPA and PCJ) while being compatible to existing data structures in Java programs. Copyright © 2017, The Authors. All rights reserved.

关键词： Nonvolatile storage

来源：评论

学校读者我要写书评

暂无评论

An efficient local address generation for the block-cyclic distribution 3

An efficient local address generation for the block-cyclic d...

引用

3rd International Conference on Algorithms and Architectures for parallel Processing, ICA3PP 1997

作者： Kwon, Oh-Young Kim, Tae-Geun Han, Tack-Don Yang, Sung-Bong Kim, Shin-Dug Distributed Computing Lab. Systems Engineering Research Institute 1 Ueun-dong Yusong-gu Taejon305-333 Korea Republic of Parallel Processing System Laboratory Dept. of Computer Science Yonsei University Seoul120-749 Korea Republic of

ISBN: (纸本)0780342291

In order to generate local addresses for an array section A(l:h:s) with block-cyclic distribution, an efficient compiling method is required. In this paper, two local address generation methods for the block-cyclic distribution are presented. One is a simple local address generation method that is modified from the virtual-block scheme. The other is a linear-time ΔM table construction method. The array elements of A(l:h:s) to be accessed at run-time build up a family of lines. By using the equation of the lines, a ΔM table can be generated in O(k) time. Experimental results show that a simple local address generation method has poor performance but a linear-time ΔM table generation method is faster than other algorithms in ΔM table generation time and access time for 10,000 array elements. © 1997 IEEE.

关键词： Virtual addresses

来源：评论

学校读者我要写书评

暂无评论

NebulaFL: Effective Asynchronous Federated Learning for JointCloud computing

arXiv

引用

arXiv 2024年

作者： Gao, Fei Hu, Ming Xie, Zhiyu Shi, Peichang Xie, Xiaofei Yi, Guodong Wang, Huaimin National Key Lab. of Parallel and Distributed Processing National University of Defense Technology Changsha China School of Computing and Information Systems Singapore Management University Singapore Xiangjiang Lab Changsha China

With advancements in AI infrastructure and Trusted Execution Environment (TEE) technology, Federated Learning as a Service (FLaaS) through JointCloud computing (JCC) is promising to break through the resource constraints caused by heterogeneous edge devices in the traditional Federated Learning (FL) paradigm. Specifically, with the protection from TEE, data owners can achieve efficient model training with high-performance AI services in the cloud. By providing additional FL services, cloud service providers can achieve collab.rative learning among data owners. However, FLaaS still faces three challenges, i.e., i) low training performance caused by heterogeneous data among data owners, ii) high communication overhead among different clouds (i.e., data centers), and iii) lack of efficient resource scheduling strategies to balance training time and cost. To address these challenges, this paper presents a novel asynchronous FL approach named NebulaFL for collab.rative model training among multiple clouds. To address data heterogeneity issues, NebulaFL adopts a version control-based asynchronous FL training scheme in each data center to balance training time among data owners. To reduce communication overhead, NebulaFL adopts a decentralized model rotation mechanism to achieve effective knowledge sharing among data centers. To balance training time and cost, NebulaFL integrates a reward-guided strategy for data owners selection and resource scheduling. The experimental results demonstrate that, compared to the state-of-the-art FL methods, NebulaFL can achieve up to 5.71% accuracy improvement. In addition, NebulaFL can reduce up to 50% communication overhead and 61.94% costs under a target accuracy. Copyright © 2024, The Authors. All rights reserved.

关键词： Costs

来源：评论

学校读者我要写书评

暂无评论

HPDedup: A Hybrid prioritized data deduplication mechanism for primary storage in the cloud 33

HPDedup: A Hybrid prioritized data deduplication mechanism f...

引用

33rd International Conference on Massive Storage Systems and Technology, MSST 2017

作者： Wu, Huijun Wang, Chen Fu, Yinjin Sakr, Sherif Zhu, Liming Lux, Kai Data CSIRO University of New South Wales Australia PLA University of Science and Technology China Science and Technology on Parallel and Distributed Laboratory State Key Laboratory of High Performance Computing State Key Lab. of High-end Server and Storage Technology Coll. of Computer Natl. Univ. of Def. Technol. China

Eliminating duplicate data in primary storage of clouds increases the cost-efficiency of cloud service providers as well as reduces the cost of users for using cloud services. Most existing primary deduplication techniques either use inline caching to exploit locality in primary workloads or use postprocessing deduplication running in system idle time to avoid the negative impact on I/O performance. However, neither of them works well in the cloud servers running multiple services or applications for the following two reasons: Firstly, the temporal locality of duplicate data writes may not exist in some primary storage workloads thus inline caching often fails to achieve good deduplication ratio. Secondly, the post-processing deduplication allows duplicate data to be written to disks, therefore does not provide the benefit of I/O deduplication and requires high peak storage capacity. This paper presents HPDedup, a Hybrid Prioritized data Deduplication mechanism to deal with the storage system shared by applications running in co-located virtual machines or containers by fusing an inline and a post-processing process for exact deduplication. In the inline deduplication phase, HPDedup gives a fingerprint caching mechanism that estimates the temporal locality of duplicates in data streams from different VMs or applications and prioritizes the cache allocation for these streams based on the estimation. HPDedup also allows different deduplication threshold for streams based on their spatial locality to reduce the disk fragmentation. The post-processing phase removes duplicates whose fingerprints are not able to be cached due to weak temporal locality from disks. The hybrid deduplication mechanism significantly reduces the amount of redundant data written to the storage system while maintaining inline data writing performance. Our experimental results show that HPDedup clearly outperforms the state-of-the-art primary storage deduplication techniques in terms of inline cac

关键词： Efficiency

来源：评论

学校读者我要写书评

暂无评论

Load Balancing a Multi-Block Grids-based Application on Heterogeneous Platform

Load Balancing a Multi-Block Grids-based Application on Hete...

引用

IEEE International Conference on Computational Science and Engineering, CSE

作者： Yonggang Che Chuanfu Xu Zhenghua Wang Institute for Quantum Information & State Key Lab. of High Performance Computing College of Computer National University of Defense Technology Changsha China Science and Technology on Parallel and Distributed Processing Lab National University of Defense Technology Changsha China

ISBN: (数字)9781665403986

ISBN: (纸本)9781665403993

This paper presents a load balancing method for a multi-block grids-based CFD (Computational Fluid Dynamics) application on heterogeneous platform. This method includes an asymmetric task scheduling scheme and a load balancing model. The idea is to balance the computing speed between the CPU and the coprocessor by adjusting the workload and the numbers of threads on both sides. Optimal load balance parameters are empirically selected, guided by a performance model. Performance evaluation is conducted on a computer server consists of two Intel Xeon E5-2670 v3 CPUs and two MIC coprocessors (Xeon Phi 5110P and Xeon Phi 7120P) for the simulation of turbulent combustion in a supersonic combustor. The results show that the performance is highly sensitive to the load balance parameters. With the optimal parameters, the heterogeneous computing achieves a maximum speedup of 2.30 × for a 6-block mesh, and a maximum speedup of 2.66 × for a 8-block mesh, over the CPU-only computing.

关键词： Computational modeling Computational fluid dynamics Load management Servers Task analysis Load modeling Coprocessors

来源：评论

学校读者我要写书评

暂无评论

Preface

引用

International Journal of High Performance computing and Networking 2017年第1-2期10卷 1-2页

作者： Vijayakumar, V. Anderst-Kotsis, Gabriele Cai, Zhipeng Abawajy, Jemal H. School of Computing Science and Engineering VIT University Chennai600127 India Department of Telecooperation Johannes Kepler Universität AltenbergerStrasse 69 Linz4040 Austria Department of Computer Science Georgia State University AtlantaGA30303 United States Parallel and Distributed Computing Lab. Deakin University GeelongVIC3220 Australia

来源：评论

学校读者我要写书评

暂无评论

Journal of Information Science and Engineering: Editorial notice

引用

Journal of Information Science and Engineering 2005年第5期21卷 i-ii页

作者： Lee, Der-Tsai Amato, Nancy M. Chang, Shih-Fu Chen, Homer H. Chen, Tsuhan Hsu, Tsan-Sheng Hwang, Jenq-Neng Kuo, Sy-Yen Kuo, Tei-Wei Li, Chung-Sheng Tokuyama, Takeshi Wing, Jeannette Wu, Tzong-Chen Yen, John Computer Science at Texas A and M University United States Parasol Laboratory United States IEEE Transactions on Parallel and Distributed Systems Computing Research Association's Committee on the Status of Women in Computing Research CRA-W's Distributed Mentor Program United States Department of Electrical Engineering Columbia University United States Lab. United States College of Electrical Engineering and Computer Science National Taiwan University Taiwan IEEE Transactions on Circuits and Systems for Video Technology IEEE Department of Electrical and Computer Engineering Carnegie Mellon University Pittsburgh PA United States ACM ACM SIGACT IEEE Computer Society IICM Austria Research and Development of the Department Multimedia Signal Processing Technical Committee IEEE Signal Processing Society United States IEEE Transactions on Circuits and systems for Video Technology College of Electrical Engineering and Computer Science National Taiwan Ocean University Keelung Taiwan Department of Electrical Engineering National Taiwan University Taiwan Department of Computer Science and Information Engineering National Taiwan University Taipei Taiwan IEEE Technical Committee on Real-Time Systems Computer Science Division IBM T.J. Watson Research Center United States IBM Research Division Graduate School of Information Sciences Tohoku University Japan ACM IPSJ Mathematical Society of Japan Japan Department of Computer Science Computer Science Department Carnegie Mellon University United States National Academies of Science's Computer Science and Telecommunications Board United States Microsoft's Trustworthy Academic Advisory Board Intel Research Pittsburgh's Advisory Board United States Dartmouth's Institute for Security Technology Studies Advisory Committee Canada Sloan Research Fellowships Program Committee United States ACM Taiwan China Information Sciences and Technology Pennsylvania State University United States Laboratory for Intelligent Agents Penn State's School of Info

No abstract availab.e

关键词：

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：