
Refine Search Results

Document Type

  • 509 Conference papers
  • 191 Journal articles
  • 2 Books

Holdings

  • 702 Electronic resources
  • 0 Print holdings

Date Distribution

Subject Classification

  • 459 Engineering
    • 353 Computer Science and Technology...
    • 258 Software Engineering
    • 86 Information and Communication Engineering
    • 58 Electronic Science and Technology...
    • 53 Control Science and Engineering
    • 35 Mechanical Engineering
    • 35 Biological Engineering
    • 28 Electrical Engineering
    • 18 Instrument Science and Technology
    • 16 Power Engineering and Engineering Therm...
    • 11 Civil Engineering
    • 10 Materials Science and Engineering...
    • 10 Cyberspace Security
    • 8 Chemical Engineering and Technology
    • 8 Agricultural Engineering
    • 8 Environmental Science and Engineering...
    • 7 Transportation Engineering
    • 6 Optical Engineering
  • 168 Science
    • 101 Mathematics
    • 36 Biology
    • 29 Systems Science
    • 25 Physics
    • 24 Statistics...
    • 11 Chemistry
  • 120 Management
    • 81 Management Science and Engineering...
    • 42 Library, Information and Archives Manage...
    • 23 Business Administration
  • 13 Economics
    • 13 Applied Economics
  • 13 Law
    • 11 Sociology
  • 9 Agriculture
    • 8 Crop Science
  • 3 Education
  • 3 Literature
  • 3 Medicine
  • 3 Military Science
  • 1 Art

Topics

  • 32 篇 computational mo...
  • 22 篇 training
  • 19 篇 benchmark testin...
  • 18 篇 fault tolerance
  • 18 篇 distributed proc...
  • 18 篇 feature extracti...
  • 17 篇 kernel
  • 16 篇 computer archite...
  • 16 篇 semantics
  • 15 篇 deep learning
  • 15 篇 concurrent compu...
  • 15 篇 laboratories
  • 14 篇 servers
  • 14 篇 hardware
  • 13 篇 algorithm design...
  • 13 篇 cloud computing
  • 12 篇 parallel process...
  • 12 篇 graphics process...
  • 12 篇 optimization
  • 12 篇 protocols

Institutions

  • 112 篇 college of compu...
  • 81 篇 national laborat...
  • 77 篇 science and tech...
  • 47 篇 national laborat...
  • 35 篇 school of comput...
  • 30 篇 national laborat...
  • 22 篇 science and tech...
  • 22 篇 national key lab...
  • 18 篇 national key lab...
  • 18 篇 national laborat...
  • 16 篇 national laborat...
  • 14 篇 national laborat...
  • 13 篇 science and tech...
  • 13 篇 school of comput...
  • 12 篇 national key lab...
  • 11 篇 science and tech...
  • 11 篇 national key lab...
  • 10 篇 national laborat...
  • 10 篇 national key lab...
  • 10 篇 national key lab...

Authors

  • 32 篇 dongsheng li
  • 28 篇 yijie wang
  • 28 篇 wang yijie
  • 26 篇 li dongsheng
  • 25 篇 wang huaimin
  • 21 篇 huaimin wang
  • 20 篇 zhigang luo
  • 18 篇 naiyang guan
  • 18 篇 peng yuxing
  • 16 篇 yuxing peng
  • 14 篇 dou yong
  • 14 篇 liu jie
  • 14 篇 ji wang
  • 14 篇 yin gang
  • 13 篇 wang ji
  • 13 篇 ding bo
  • 13 篇 jie liu
  • 12 篇 xiang zhang
  • 12 篇 lai zhiquan
  • 11 篇 zhiquan lai

Language

  • 657 English
  • 42 Chinese
  • 3 Other
Search query: Institution = "National Laboratory of Parallel and Distributed Processing College of Computer"
702 records; results 1-10 shown below
Training large-scale language models with limited GPU memory: a survey
Frontiers of Information Technology & Electronic Engineering, 2025, Vol. 26, No. 3, pp. 309-331
Authors: Yu TANG, Linbo QIAO, Lujia YIN, Peng LIANG, Ao SHEN, Zhilin YANG, Lizhi ZHANG, Dongsheng LI (National Key Laboratory of Parallel and Distributed Computing, College of Computer, National University of Defense Technology, Changsha 410073, China)
Large-scale models have gained significant attention in a wide range of fields, such as computer vision and natural language processing, due to their effectiveness across various ***, a notable hurdle in training these l...
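As background for the survey's topic: one representative memory-saving technique in this area is activation checkpointing, which discards intermediate activations during the forward pass and recomputes them during backward, trading compute for peak memory. A minimal PyTorch sketch (illustrative only; the module and sizes are hypothetical and not taken from the paper):

```python
# Minimal sketch of activation checkpointing, a common GPU-memory-saving
# technique (illustrative; not the survey's own method).
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedMLP(nn.Module):
    def __init__(self, dim=1024, depth=8):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(depth)
        )

    def forward(self, x):
        for block in self.blocks:
            # Activations inside `block` are freed after the forward pass
            # and recomputed during backward, cutting peak memory.
            x = checkpoint(block, x, use_reentrant=False)
        return x

model = CheckpointedMLP()
x = torch.randn(32, 1024, requires_grad=True)
model(x).sum().backward()  # recomputation happens during this backward pass
```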
SIGNGD with Error Feedback Meets Lazily Aggregated Technique: Communication-Efficient Algorithms for Distributed Learning
Tsinghua Science and Technology, 2022, Vol. 27, No. 1, pp. 174-185
Authors: Xiaoge Deng, Tao Sun, Feng Liu, Dongsheng Li (National Laboratory for Parallel and Distributed Processing (PDL), College of Computer, National University of Defense Technology, Changsha 410073, China)
The proliferation of massive datasets has led to significant interest in distributed algorithms for solving large-scale machine learning ***, the communication overhead is a major bottleneck that hampers the scalabili...
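For orientation, the sign-based compression with error feedback that the title builds on can be sketched in a few lines. The single-worker toy below (assumed hyperparameters; not the paper's distributed, lazily aggregated algorithm) shows the error-feedback residual that keeps 1-bit compression convergent:

```python
# Single-worker sketch of sign-based gradient descent with error feedback,
# in the spirit of the algorithm family named in the title (assumptions:
# mean-magnitude scaling, fixed step size; not the paper's method).
import numpy as np

def ef_sign_step(x, grad, error, lr=0.1):
    """Compress the corrected gradient to its sign (scaled by the mean
    magnitude) and carry the compression residual into the next step."""
    p = lr * grad + error                    # error-corrected update
    delta = np.mean(np.abs(p)) * np.sign(p)  # 1-bit quantization + scale
    return x - delta, p - delta              # new iterate, new residual

# Toy run on f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x = np.random.randn(5)
err = np.zeros_like(x)
for _ in range(300):
    x, err = ef_sign_step(x, grad=x, error=err)
print(np.linalg.norm(x))  # norm should have shrunk toward 0
```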
FMCC-RT: a scalable and fine-grained all-reduce algorithm for large-scale SMP clusters
Science China (Information Sciences), 2025, Vol. 68, No. 5, pp. 362-379
Authors: Jintao PENG, Jie LIU, Jianbin FANG, Min XIE, Yi DAI, Zhiquan LAI, Bo YANG, Chunye GONG, Xinjun MAO, Guo MAO, Jie REN (School of Computer Science and Technology, National University of Defense Technology; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; Laboratory of Digitizing Software for Frontier Equipment, National University of Defense Technology; National Supercomputer Center in Tianjin; School of Computer Science, Shaanxi Normal University)
All-reduce is a widely used communication technique for distributed and parallel applications, typically implemented using either a tree-based or ring-based scheme. Each of these approaches has its own limitations: tre...
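For reference, the ring-based scheme the abstract contrasts against can be simulated directly: a reduce-scatter phase followed by an all-gather phase, each taking p-1 steps. The sketch below is the textbook algorithm, not the paper's FMCC-RT:

```python
# Textbook ring all-reduce (reduce-scatter, then all-gather) -- the classic
# ring-based scheme mentioned in the abstract, not the FMCC-RT algorithm.
import numpy as np

def ring_allreduce(vectors):
    """Simulate ring all-reduce over p workers; each vector is split into
    p segments, and each step passes one segment to the next worker."""
    p = len(vectors)
    parts = [np.array_split(v.astype(float), p) for v in vectors]

    # Reduce-scatter: after p-1 steps, worker i holds the full sum of
    # segment (i+1) % p.
    for t in range(p - 1):
        for i in range(p):
            s = (i - t) % p
            parts[(i + 1) % p][s] = parts[(i + 1) % p][s] + parts[i][s]

    # All-gather: circulate the fully reduced segments for p-1 more steps.
    for t in range(p - 1):
        for i in range(p):
            s = (i + 1 - t) % p
            parts[(i + 1) % p][s] = parts[i][s].copy()

    return [np.concatenate(pt) for pt in parts]

out = ring_allreduce([np.ones(8) * (i + 1) for i in range(4)])
print(out[0])  # every worker ends with the elementwise sum: all 10s
```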
CD-Sched: An Automated Scheduling Framework for Accelerating Neural Network Training on Shared Memory CPU-DSP Platforms
2023 International Conference on Power, Communication, Computing and Networking Technologies, PCCNT 2023
Authors: Xiao, Yuanyuan; Lai, Zhiquan; Li, Dongsheng (National Key Laboratory of Parallel and Distributed Processing, Computer College, National University of Defense Technology, Changsha, China)
DSP holds significant potential for important applications in Deep Neural Networks. However, there is currently a lack of research focused on shared-memory CPU-DSP heterogeneous chips. This paper proposes CD-Sched, an...
Smoothing Point Adjustment-Based Evaluation of Time Series Anomaly Detection
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Liu, Mingyu; Wang, Yijie; Xu, Hongzuo; Zhou, Xiaohui; Li, Bin; Wang, Yongjun (National University of Defense Technology, Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer, Changsha, China)
Anomalies in time series appear consecutively, forming anomaly segments. Applying the classical point-based evaluation metrics to evaluate the detection performance of segments leads to considerable underestimation, s...
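For context, the widely used point-adjustment remedy for this underestimation marks an entire ground-truth segment as detected once any single point in it is flagged; the paper proposes a smoothing variant of this idea. A sketch of the classical protocol (not the paper's method):

```python
# Classical point-adjustment protocol for segment-wise evaluation of time
# series anomaly detection (illustrative; the paper's smoothing variant
# is not reproduced here).
import numpy as np

def point_adjust(labels, preds):
    """labels, preds: 0/1 arrays. Returns adjusted predictions."""
    adjusted = preds.copy()
    i, n = 0, len(labels)
    while i < n:
        if labels[i] == 1:
            j = i
            while j < n and labels[j] == 1:  # find the anomaly segment [i, j)
                j += 1
            if adjusted[i:j].any():          # one hit anywhere in the segment...
                adjusted[i:j] = 1            # ...marks the whole segment detected
            i = j
        else:
            i += 1
    return adjusted

labels = np.array([0, 1, 1, 1, 0, 0, 1, 1, 0])
preds  = np.array([0, 0, 1, 0, 0, 0, 0, 0, 0])
print(point_adjust(labels, preds))  # -> [0 1 1 1 0 0 0 0 0]
```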
AFMA-Track: Adaptive Fusion of Motion and Appearance for Robust Multi-object Tracking
27th International Conference on Pattern Recognition, ICPR 2024
Authors: Liao, Wei; Luo, Lei; Zhang, Chunyuan (College of Computer Science and Technology, National University of Defence Technology, Changsha, China; Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer Science and Technology, National University of Defense Technology, Changsha, China)
Motion and appearance cues play a crucial role in Multi-object Tracking (MOT) algorithms for associating objects across consecutive frames. While most MOT methods prioritize accurate motion modeling and distincti...
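As a baseline illustration of combining the two cues, many trackers blend a motion cost (e.g., 1 - IoU) and an appearance cost (cosine distance between embeddings) into one assignment problem. The fixed weight `lam` below is precisely what adaptive fusion methods replace with a dynamic scheme; the helper names are hypothetical and this is not AFMA-Track's algorithm:

```python
# Generic fusion of motion (IoU) and appearance (cosine) costs for MOT
# association -- a common baseline pattern, not the paper's adaptive scheme.
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """a, b: boxes as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def associate(track_boxes, det_boxes, track_feats, det_feats, lam=0.5):
    """Blend motion and appearance into one cost matrix, then solve the
    track-to-detection assignment with the Hungarian algorithm."""
    cost = np.zeros((len(track_boxes), len(det_boxes)))
    for t in range(len(track_boxes)):
        for d in range(len(det_boxes)):
            motion_cost = 1.0 - iou(track_boxes[t], det_boxes[d])
            app_cost = 1.0 - np.dot(track_feats[t], det_feats[d]) / (
                np.linalg.norm(track_feats[t]) * np.linalg.norm(det_feats[d]) + 1e-9)
            cost[t, d] = lam * motion_cost + (1 - lam) * app_cost
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows, cols))  # matched (track, detection) index pairs
```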
Funnel: An Efficient Sparse Attention Accelerator with Multi-Dataflow Fusion
22nd IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2024
Authors: Ma, Shenghong; Xu, Jinwei; Jiang, Jingfei; Wang, Yaohua; Li, Dongsheng (National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, College of Computer, Changsha, China)
The self-attention mechanism is the core component of Transformer, which provides a powerful ability to understand the sequence context. However, the self-attention mechanism also suffers from a large amount of redund...
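For context, the redundancy the abstract refers to lives in the n-by-n score matrix of scaled dot-product attention; sparse-attention designs keep only a subset of scores per query. The sketch below contrasts dense attention with a generic top-k sparsification (illustrative only; Funnel's dataflow fusion operates at the accelerator level and is not modeled here):

```python
# Dense scaled dot-product self-attention next to a simple top-k sparse
# variant (a generic sketch of sparse attention, not Funnel's design).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, top_k=None):
    """Q, K, V: (n, d). If top_k is set, keep only the k largest scores
    per query and mask the rest out before the softmax."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # (n, n) score matrix
    if top_k is not None:
        thresh = np.sort(scores, axis=-1)[:, -top_k][:, None]
        scores = np.where(scores >= thresh, scores, -np.inf)  # sparsify
    return softmax(scores) @ V

n, d = 16, 32
Q, K, V = (np.random.randn(n, d) for _ in range(3))
dense = attention(Q, K, V)
sparse = attention(Q, K, V, top_k=4)  # each query attends to only 4 keys
```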
Mbapp: Efficient Memory-Balanced Pipeline Parallelism for Large Model Fine-Tuning on Commodity GPU Servers
5th International Conference on Computer Information and Big Data Applications, CIBDA 2024
Authors: Liu, Yujie; Lai, Zhiquan; Li, Dongsheng (National Key Laboratory of Parallel and Distributed Computing, College of Computer, National University of Defense Technology, Changsha 410000, China)
Large-scale models have demonstrated outstanding performance across various downstream tasks. Pipeline parallelism is essential for fine-tuning large models on commodity GPU servers, as it plays a crucial role in maki...
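To make the memory-balancing objective concrete: assigning contiguous layers to pipeline stages so that the largest per-stage memory footprint is minimized is the classic linear-partition problem, solvable by binary search over the bottleneck value. A toy sketch with hypothetical per-layer costs (a simplification; the paper's Mbapp policy is not reproduced here):

```python
# Toy memory-balanced pipeline partitioning: split a layer sequence into
# contiguous stages minimizing the maximum stage memory (binary search on
# the bottleneck). Per-layer costs are hypothetical, not from the paper.

def min_max_stage_memory(layer_mem, num_stages):
    """Smallest achievable maximum stage memory for a contiguous split."""
    def fits(cap):
        stages, cur = 1, 0
        for m in layer_mem:
            if cur + m > cap:          # close the stage, start a new one
                stages, cur = stages + 1, m
            else:
                cur += m
        return stages <= num_stages

    lo, hi = max(layer_mem), sum(layer_mem)
    while lo < hi:                     # binary search the bottleneck value
        mid = (lo + hi) // 2
        if fits(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

# Hypothetical per-layer memory (MB) for 8 layers split into 3 stages:
print(min_max_stage_memory([120, 90, 60, 60, 60, 40, 40, 30], 3))  # -> 170
```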
Communication Analysis for Multidimensional Parallel Training of Large-scale DNN Models
25th IEEE International Conferences on High Performance Computing and Communications, 9th International Conference on Data Science and Systems, 21st IEEE International Conference on Smart City and 9th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC/DSS/SmartCity/DependSys 2023
Authors: Lai, Zhiquan; Hao, Yanqi; Li, Shengwei; Li, Dongsheng (College of Computer, National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, Changsha, China)
Multidimensional parallel training has been widely applied to train large-scale deep learning models like GPT-3. The efficiency of parameter communication among training devices/processes is often the performance bott...
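As a baseline for this kind of analysis, the standard cost model for the ring all-reduce used to synchronize gradients in data parallelism gives the per-device traffic below (a textbook estimate under the usual uniform-bandwidth assumptions, not the paper's model):

```latex
% Per-device bytes moved by a ring all-reduce of an M-byte gradient buffer
% over p devices: (p-1)/p of the buffer in each of the reduce-scatter and
% all-gather phases.
V_{\text{ring}}(p, M) = 2\,\frac{p-1}{p}\,M \;\approx\; 2M \quad (p \gg 1)
```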
Efficient Large Models Fine-tuning on Commodity Servers via Memory-balanced Pipeline Parallelism
25th IEEE International Conferences on High Performance Computing and Communications, 9th International Conference on Data Science and Systems, 21st IEEE International Conference on Smart City and 9th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC/DSS/SmartCity/DependSys 2023
Authors: Liu, Yujie; Lai, Zhiquan; Liu, Weijie; Wang, Wei; Li, Dongsheng (College of Computer, National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, Changsha, China)
Large models have achieved impressive performance in many downstream tasks. Using pipeline parallelism to fine-tune large models on commodity GPU servers is an important way to make the excellent performance of large ...