
Refine Search Results

Document Type

  • 1 Journal article

Holdings

  • 1 Electronic document
  • 0 Print holdings

Date Distribution

Subject Classification

  • 1 Engineering
    • 1 Electrical Engineering
    • 1 Electronic Science and Technology (...
    • 1 Information and Communication Engineering
    • 1 Computer Science and Technology...
    • 1 Software Engineering
    • 1 Cyberspace Security

Topics

  • 1 tp181
  • 1 zero-copy memory...
  • 1 transformer infe...
  • 1 three-tier sched...
  • 1 fast model loadi...

Institutions

  • 1 School of Non-Co...
  • 1 National Researc...
  • 1 National Superco...
  • 1 State Key Labora...
  • 1 Zhejiang Lab

Authors

  • 1 Zhao Yulong
  • 1 Zhang Yaguang
  • 1 Liu Xin
  • 1 Wang Yizhuo
  • 1 Shen Wenyuan
  • 1 Wu Chunzhi
  • 1 Fan Hao
  • 1 Qin Yi
  • 1 Zhang Lufei
  • 1 Fang Hankang

Language

  • 1 English
Search criteria: Subject = "Transformer inference optimization"
1 record found, showing 1-10
Minimizing transformer inference overhead using controlling element on Shenwei AI accelerator
Frontiers of Information Technology & Electronic Engineering, 2025, Vol. 26, No. 4, pp. 605-622
Authors: Zhao, Yulong; Wu, Chunzhi; Wang, Yizhuo; Zhang, Lufei; Zhang, Yaguang; Shen, Wenyuan; Fan, Hao; Fang, Hankang; Qin, Yi; Liu, Xin
Affiliations: State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi, China; School of Non-Commissioned Officer, Space Engineering University, Beijing, China; National Supercomputing Center in Wuxi, Wuxi, China; Zhejiang Lab, Hangzhou, China; National Research Centre of Parallel Computer Engineering and Technology, Beijing, China
Transformer models have become a cornerstone of various natural language processing (NLP) tasks. However, the substantial computational overhead during inference remains a significant challenge, limiting their dep...