ISBN (digital): 9798350387339
ISBN (print): 9798350387346
With the rapid growth of large language models, cloud computing has become an indispensable component of the AI industry. Cloud service providers (CSPs) are establishing AI data centers to serve AI workloads. In the face of this surging demand for AI computing power, building a connected computing environment across various clouds and forming a JointCloud is an attractive solution. However, scheduling AI tasks across multiple AI data centers within a JointCloud environment presents a significant challenge: how to satisfy users' demands while ensuring fairness among CSPs in scheduling. Existing research primarily focuses on optimizing scheduling quality, with limited consideration for fairness. This paper therefore proposes the Fairness-Aware AI-Workloads Allocation method (F3A), a fair cross-cloud allocation technique for AI tasks. F3A uses Points and Tokens to reflect both the resource status and the historical task allocations of AI data centers, enabling consideration of users' multidimensional demands and facilitating fair task allocation across multiple centers. To better assess scheduling fairness, we also devise a fairness indicator (FI) based on the Gini coefficient to measure the fairness of task allocation. Experimental results demonstrate that F3A consistently keeps FI within 0.1 across various cluster sizes and task quantities, a 76.45% improvement over the classical round-robin fair scheduling algorithm. F3A performs well in ensuring fair task allocation while also reducing cost and improving user satisfaction.
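The abstract says only that FI is built on the Gini coefficient; as a minimal sketch (the use of raw per-center task counts and this particular normalization are our assumptions, not the paper's exact definition, which also involves Points and Tokens), a Gini-style fairness indicator over allocations could look like:

```python
import numpy as np

def gini(x):
    """Gini coefficient of non-negative allocation values.

    0 means perfectly equal allocation across data centers; values
    near 1 mean allocation is concentrated on a few centers.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if n == 0 or x.sum() == 0:
        return 0.0
    # Standard form: G = 2*sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    idx = np.arange(1, n + 1)
    return (2.0 * np.sum(idx * x) / (n * x.sum())) - (n + 1.0) / n

# Hypothetical example: tasks allocated to 4 AI data centers.
allocated_tasks = [120, 95, 110, 100]   # fairly even allocation
print(gini(allocated_tasks))            # ~0.05, i.e. within the FI < 0.1 regime
```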
ISBN (print): 9781450398336
Offline imitation learning (OIL) is often used to solve complex continuous decision-making tasks. For tasks such as robot control and autonomous driving, it is either difficult to design an effective reward for learning, or expensive and time-consuming for agents to collect data by interacting with the environment. However, the data used in previous OIL methods are gathered by reinforcement learning algorithms guided by task-specific rewards, which violates the reward-free premise and still suffers from the difficulty of designing an effective reward function for real tasks. To this end, we propose the reward-free exploratory data driven offline imitation learning (ExDOIL) framework. ExDOIL first trains an unsupervised reinforcement learning agent by interacting with the environment and collects sufficient unsupervised exploration data during training. Then, a task-independent yet simple and efficient reward function is used to relabel the collected data. Finally, an agent is trained to imitate the expert and complete the task through a conventional RL algorithm such as TD3. Extensive experiments on continuous control tasks demonstrate that the proposed framework achieves better imitation performance (28% higher episode returns on average) than the previous SOTA method (ORIL) without any task-specific rewards.
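The abstract does not spell out the task-independent relabeling function, so the reward below (proximity of a transition's next state to expert states) is purely an illustrative assumption of what such a relabeling step could look like:

```python
import numpy as np

def relabel_with_expert_proximity(transitions, expert_states, sigma=1.0):
    """Relabel reward-free exploration data with a task-independent reward.

    transitions:   list of (state, action, next_state) tuples from
                   unsupervised exploration
    expert_states: 2-D array of states visited in expert demonstrations

    The reward exp(-min_j ||s' - e_j||^2 / sigma) is our assumption;
    the paper's actual relabeling function may differ.
    """
    expert = np.asarray(expert_states)
    relabeled = []
    for s, a, s_next in transitions:
        d2 = np.min(np.sum((expert - s_next) ** 2, axis=1))
        r = np.exp(-d2 / sigma)  # high when the agent lands near expert states
        relabeled.append((s, a, r, s_next))
    return relabeled

# The relabeled tuples can then serve as a fixed replay buffer for an
# off-the-shelf TD3 (or similar) implementation.
```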
The scale of model parameters and the amount of training data are increasing exponentially, and training requires correspondingly more GPU memory. Recomputation and swapping are the two main memory optimization methods and have been extensively studied, along with strategies that combine them. However, most existing approaches are based on heuristic search, which does not explore the complete solution space and cannot guarantee optimality. An optimal search strategy with tensor-level recomputation and swapping is needed for large-scale model training. In this paper, we propose an optimal strategy-searching algorithm that combines tensor-based recomputation and swapping. Specifically, the memory swapping strategy is reformulated as an optimization problem that converts the memory constraints into a mixed integer program, from which the optimal memory optimization strategy is found. By leveraging the advantages of both recomputation and swapping, this approach minimizes computation cost without exceeding the available memory. Experimental results show that our method reduces memory requirements by about 60% during training. Furthermore, it reduces overall training time beyond existing algorithms: compared to Checkmate, our approach achieves about a 0.3-0.9% reduction in computation cost per iteration.
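To make the mixed-integer formulation concrete, here is a toy per-tensor decision model, far simpler than the paper's program (which also encodes execution order and overlap); the tensor names, sizes, and costs are illustrative assumptions:

```python
# Toy MIP: each tensor is kept in GPU memory, recomputed, or swapped out;
# minimize the extra time paid, subject to a memory cap.
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

tensors = {            # tensor -> (memory in MB, recompute cost, swap cost)
    "act1": (512, 3.0, 5.0),
    "act2": (1024, 8.0, 6.0),
    "act3": (256, 1.5, 4.0),
}
MEM_LIMIT = 1024       # MB of GPU memory available for these tensors

prob = LpProblem("recompute_vs_swap", LpMinimize)
keep = {t: LpVariable(f"keep_{t}", cat=LpBinary) for t in tensors}
rec  = {t: LpVariable(f"rec_{t}",  cat=LpBinary) for t in tensors}
swp  = {t: LpVariable(f"swp_{t}",  cat=LpBinary) for t in tensors}

# Each tensor receives exactly one treatment.
for t in tensors:
    prob += keep[t] + rec[t] + swp[t] == 1

# Only kept tensors occupy GPU memory (a deliberate simplification).
prob += lpSum(tensors[t][0] * keep[t] for t in tensors) <= MEM_LIMIT

# Objective: minimize recomputation plus swapping overhead.
prob += lpSum(tensors[t][1] * rec[t] + tensors[t][2] * swp[t] for t in tensors)

prob.solve()
for t in tensors:
    choice = "keep" if keep[t].value() else ("recompute" if rec[t].value() else "swap")
    print(t, "->", choice)
```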
Instruction tuning for large language models (LLMs) can drive them to produce results consistent with human goals in specific downstream tasks. However, the process of continual instruction tuning (CIT) for LLMs may b...
In airfoil numerical simulation, the mesh quality has an important influence on the accuracy and error of numerical simulation. The existing mesh quality evaluation requires a lot of manual interaction, which greatly ...
ISBN (digital): 9798350387339
ISBN (print): 9798350387346
Serverless computing, comprising Function as a Service (FaaS) and Backend as a Service (BaaS), has garnered widespread attention owing to features such as maintenance-free operation, pay-per-use pricing, and automatic scalability. However, practical usage encounters several challenges: 1) The diversity of user applications makes comprehensive performance evaluation difficult, as benchmarks and application tests only reflect performance under specific conditions and cannot fully capture users' actual experience across different serverless platforms. 2) Disparities in performance and cost across serverless platforms make it hard to achieve optimal performance and cost efficiency through single-cloud deployment, underutilizing the advantages of each platform. 3) Vendor lock-in restricts the migration of user applications and exacerbates dependence on a single cloud. To address these challenges, this paper proposes a collaborative mechanism, referred to as DCSA, which integrates FaaS and storage services to achieve automatic cross-cloud deployment of user applications while considering both performance and cost. First, we adapt the interfaces of different serverless platforms, effectively reducing the complexity of cross-cloud deployment. Second, we develop cost and latency models for the cross-cloud deployment of chained serverless applications and propose a deployment scheduling algorithm that considers latency and cost simultaneously. Finally, we conduct experiments to evaluate the proposed algorithm. Results demonstrate that our method can effectively reduce latency (by up to 2.3%) and lower costs (by up to 9.9%).
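As a rough illustration of placing a chained application across clouds under a joint latency/cost objective (the cloud names, per-function numbers, cross-cloud penalty, and weighted objective below are all assumptions, not DCSA's actual models):

```python
# Enumerate placements of a 3-function chain over two clouds and pick the
# one minimizing cost + ALPHA * latency; a DP would scale to long chains.
import itertools

clouds = ["cloudA", "cloudB"]
# per-function: cloud -> (exec_latency_ms, cost_per_call)
funcs = [
    {"cloudA": (30, 0.8), "cloudB": (25, 1.0)},
    {"cloudA": (50, 1.2), "cloudB": (60, 0.9)},
    {"cloudA": (20, 0.5), "cloudB": (15, 0.7)},
]
CROSS_CLOUD_MS = 40     # extra latency when adjacent functions differ in cloud
ALPHA = 0.01            # weight converting latency (ms) into cost units

def plan_score(plan):
    latency = sum(funcs[i][c][0] for i, c in enumerate(plan))
    latency += CROSS_CLOUD_MS * sum(a != b for a, b in zip(plan, plan[1:]))
    cost = sum(funcs[i][c][1] for i, c in enumerate(plan))
    return cost + ALPHA * latency

best = min(itertools.product(clouds, repeat=len(funcs)), key=plan_score)
print("placement:", best, "score:", round(plan_score(best), 3))
```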
ISBN (digital): 9798350359312
ISBN (print): 9798350359329
Sparse matrix reordering is an important step in Cholesky decomposition: by reordering the rows and columns of the matrix, computation time and storage cost can be greatly reduced. With the proliferation of reordering algorithms, selecting a suitable reordering method for a given matrix has become an important research topic. In this paper, we propose a method that predicts the optimal reordering method by visualizing sparse matrices in chunks in parallel and feeding the result into a deep convolutional neural network. The results show that the theoretical performance reaches 95% of the optimum, the prediction accuracy reaches up to 85%, the parallel framework achieves an average speedup of 11.35x over the serial framework, and performance is greatly improved compared with traversal-based selection on large sparse matrices.
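A minimal version of the "visualize in chunks" step is to downsample the nonzero pattern into a fixed-size density image a CNN can consume; the 128x128 resolution and normalization here are our assumptions, not the paper's settings:

```python
import numpy as np
from scipy.sparse import random as sparse_random

def density_image(A, res=128):
    """Map an m x n sparse matrix to a res x res grid of per-block nnz counts."""
    A = A.tocoo()
    img = np.zeros((res, res), dtype=np.float32)
    rows = A.row * res // A.shape[0]
    cols = A.col * res // A.shape[1]
    np.add.at(img, (rows, cols), 1.0)       # accumulate nonzeros per block
    return img / max(img.max(), 1.0)        # normalize to [0, 1]

A = sparse_random(10000, 10000, density=1e-4, format="coo")
x = density_image(A)   # chunks of A could be rasterized in parallel and
print(x.shape)         # stitched together before entering the CNN
```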
With advancements in AI infrastructure and Trusted Execution Environment (TEE) technology, Federated Learning as a Service (FLaaS) through JointCloud Computing (JCC) is promising to break through the resource constrai...
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Self-supervised time series anomaly detection (TSAD) achieves remarkable performance improvements by extracting high-level data semantics through proxy tasks. Nonetheless, most existing self-supervised TSAD techniques rely on manual or neural transformations when designing proxy tasks, overlooking the intrinsic temporal patterns of time series. This paper proposes local temporal pattern learning-based time series anomaly detection (LTPAD). LTPAD first generates sub-sequences. Pairwise sub-sequences naturally manifest proximity relationships along the time axis, and such correlations can be used to construct supervision and train neural networks to learn temporal patterns. The time interval between two sub-sequences serves as the label for each sub-sequence pair. By classifying these labeled pairs, our model captures the local temporal patterns of time series, thereby modeling temporal pattern-aware "normality". Anomaly scores for test data are obtained by evaluating their conformity to the learned patterns shared in the training data. Extensive experiments show that LTPAD significantly outperforms state-of-the-art competitors.
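A sketch of this proxy-task construction, labeling sub-sequence pairs by their time interval so a classifier can be trained on them (the window length, gap range, and sampling scheme are illustrative assumptions):

```python
import numpy as np

def make_pairs(series, win=16, max_gap=4, pairs_per_gap=100, rng=None):
    """Return (pairs, labels) where the label is the interval between windows."""
    rng = rng or np.random.default_rng(0)
    X, y = [], []
    for gap in range(1, max_gap + 1):          # gap measured in window units
        for _ in range(pairs_per_gap):
            i = rng.integers(0, len(series) - win * (gap + 1))
            a = series[i : i + win]
            b = series[i + win * gap : i + win * gap + win]
            X.append(np.stack([a, b]))         # one pairwise sub-sequence sample
            y.append(gap - 1)                  # class index = time interval
    return np.asarray(X), np.asarray(y)

series = np.sin(np.linspace(0, 60, 2000)) + 0.05 * np.random.randn(2000)
X, y = make_pairs(series)
print(X.shape, y.shape)   # a network classifying y from X learns local temporal
                          # patterns; at test time, misfit yields the anomaly score
```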
The talking head generation aims to synthesize a speech video of the source identity from a driving video or audio or text data irrelevant to the source identity. It can not only be applied to games and virtual realit...