Refine Search Results

Document Type

  • 2 journal articles
  • 2 conference papers

Holdings

  • 4 electronic documents
  • 0 print holdings

Date Distribution

Discipline Classification

  • 4 Engineering
    • 4 Computer Science and Technology...
    • 3 Information and Communication Engineering
    • 2 Electrical Engineering
  • 1 Management
    • 1 Management Science and Engineering (...

Subject

  • 4 pipeline inferen...
  • 2 model deployment
  • 1 throughput
  • 1 optimization
  • 1 computational mo...
  • 1 large model
  • 1 edge intelligenc...
  • 1 pipelines
  • 1 kernel
  • 1 mobile computing
  • 1 llm inference ac...
  • 1 heteroscedastic ...
  • 1 deep learning (d...
  • 1 heterogeneous mu...
  • 1 bayesian optimiz...
  • 1 serverless
  • 1 systems-on-chip ...
  • 1 latency aware
  • 1 convolutional ne...
  • 1 mobile handsets

Institutions

  • 1 univ chinese aca...
  • 1 yale univ new ha...
  • 1 univ macau taipa...
  • 1 chinese acad sci...
  • 1 tencent ai lab p...
  • 1 xidian univ sch ...
  • 1 beijing univ pos...
  • 1 beijing univ pos...
  • 1 southern univ sc...
  • 1 univ virginia ch...
  • 1 hong kong polyte...
  • 1 key lab smart hu...

Authors

  • 2 wang jingyu
  • 2 yang xiang
  • 2 sun haifeng
  • 2 qi qi
  • 2 liao jianxin
  • 1 zhang bowen
  • 1 ma ruilong
  • 1 lin chengmin
  • 1 zhuang zirui
  • 1 guo song
  • 1 lv wenkai
  • 1 shen haiying
  • 1 hu linwei
  • 1 tang yingfei
  • 1 wang zhenyi
  • 1 xu zikang
  • 1 peng shijie
  • 1 luo shutian
  • 1 yang pengfei
  • 1 wang quan

Language

  • 4 English
Search condition: "Subject = pipeline inference"
4 records, showing 1-10
PICO: Pipeline Inference Framework for Versatile CNNs on Diverse Mobile Devices
IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, Vol. 23, No. 4, pp. 2712-2730
Authors: Yang, Xiang Xu, Zikang Qi, Qi Wang, Jingyu Sun, Haifeng Liao, Jianxin Guo, Song Beijing Univ Posts & Telecommun State Key Lab Networking & Switching Technol Beijing 100876 Peoples R China Hong Kong Polytech Univ Dept Comp Kowloon Hong Kong Peoples R China
Distributing the inference of a convolutional neural network (CNN) to multiple mobile devices has been studied in recent years to achieve real-time inference without losing accuracy. However, how to map CNN to devices r...
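The record above concerns mapping a CNN's layers onto several mobile devices as pipeline stages. As a rough illustration of that mapping problem only (not the paper's PICO algorithm), the sketch below splits a layer sequence into contiguous stages so the slowest stage is as fast as possible; the layer costs and device speeds are made-up placeholders.

```python
# Illustrative sketch, NOT the paper's PICO algorithm: split a CNN's layer
# sequence into contiguous pipeline stages, one per device, minimizing the
# slowest stage time (the throughput bottleneck). Costs/speeds are made up.
from itertools import combinations

def bottleneck(layer_costs, device_speeds, cuts):
    """Max stage time for the given cut points (stage i runs on device i)."""
    bounds = [0, *cuts, len(layer_costs)]
    return max(
        sum(layer_costs[bounds[i]:bounds[i + 1]]) / device_speeds[i]
        for i in range(len(device_speeds))
    )

def best_partition(layer_costs, device_speeds):
    """Brute-force all contiguous partitions (fine for small layer counts)."""
    n, k = len(layer_costs), len(device_speeds)
    return min(
        (bottleneck(layer_costs, device_speeds, cuts), cuts)
        for cuts in combinations(range(1, n), k - 1)
    )

# Example: 6 layers, 2 devices, the second device twice as fast.
best_time, best_cuts = best_partition([4, 2, 3, 1, 5, 2], [1.0, 2.0])
```

Real systems must additionally account for inter-device transfer cost and non-chain model graphs, which this toy search ignores.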
Flexi-BOPI: Flexible granularity pipeline inference with Bayesian optimization for deep learning models on HMPSoC
INFORMATION SCIENCES, 2024, Vol. 678
Authors: Wang, Zhenyi Yang, Pengfei Zhang, Bowen Hu, Linwei Lv, Wenkai Lin, Chengmin Wang, Quan Xidian Univ Sch Comp Sci & Technol Xian 710071 Peoples R China Key Lab Smart Human Comp Interact & Wearable Techn Xian 710071 Peoples R China Tencent AI Lab Shenzhen 518000 Peoples R China
To achieve high-throughput deep learning (DL) model inference on heterogeneous multiprocessor systems-on-chip (HMPSoC) platforms, the use of pipelining for the simultaneous utilization of multiple resources has eme...
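Flexi-BOPI's granularity tuning can be pictured as a black-box search: evaluate the pipeline throughput at a candidate slice granularity, then pick the best. The sketch below is an illustration only, not Flexi-BOPI itself: it substitutes exhaustive search for the paper's Bayesian optimization, and `toy_throughput` is a made-up cost model rather than an on-device measurement.

```python
# Illustrative sketch, not Flexi-BOPI: slice granularity trades off load
# balance across HMPSoC processors against per-slice overhead. The paper
# searches this space with Bayesian optimization; exhaustive search stands
# in here, and toy_throughput() is a hypothetical cost model.
def toy_throughput(granularity, work=24.0, devices=3, overhead_per_slice=0.3):
    """Finer slices balance devices better but add per-slice launch overhead."""
    slices = max(1.0, work / granularity)
    stage_time = work / min(slices, devices) + overhead_per_slice * slices
    return 1.0 / stage_time

def pick_granularity(candidates):
    """Exhaustive stand-in for the paper's Bayesian-optimization loop."""
    return max(candidates, key=toy_throughput)

best = pick_granularity([2, 4, 6, 8, 12, 24])  # evaluates every candidate once
```

Bayesian optimization earns its keep when each evaluation is an expensive on-device run, since it needs far fewer measurements than the exhaustive sweep shown here.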
Poster: PipeLLM: Pipeline LLM Inference on Heterogeneous Devices with Sequence Slicing
ACM SIGCOMM Conference (SIGCOMM)
Authors: Ma, Ruilong Wang, Jingyu Qi, Qi Yang, Xiang Sun, Haifeng Zhuang, Zirui Liao, Jianxin Beijing Univ Posts & Telecommun Beijing Beijing Peoples R China
Large Language Models (LLMs) have fostered the creation of innovative requirements. Locally deployed LLMs for micro-enterprises mitigate potential issues such as privacy infringements and sluggish response. However, th...
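Sequence slicing overlaps work across devices: while one stage processes chunk j, the previous stage can already start chunk j+1. The classic pipeline-fill recurrence below is a generic illustration (not PipeLLM's scheduler) of why slicing helps: with C chunks over S equal stages of time t each, the makespan drops from C·S·t to (S + C − 1)·t.

```python
# Generic pipeline-fill illustration, not PipeLLM's scheduler: a prompt cut
# into chunks streams through per-device stages, so devices overlap instead
# of idling while one device holds the whole sequence.
def pipelined_makespan(num_chunks, stage_times):
    """Finish time of the last chunk on the last stage. A chunk enters a
    stage once the stage is free AND the chunk has left the previous stage."""
    finish = [0.0] * len(stage_times)  # finish[s]: latest finish time on stage s
    for _ in range(num_chunks):
        prev_stage_done = 0.0
        for s, t in enumerate(stage_times):
            finish[s] = max(finish[s], prev_stage_done) + t
            prev_stage_done = finish[s]
    return finish[-1]

# 3 equal stages of 1.0 each: one unsliced pass takes 3.0; four slices
# overlap and finish in 3 + 4 - 1 = 6.0 instead of 4 * 3 = 12.0.
```

With unequal stage times the same recurrence applies, and the slowest stage dominates the steady-state rate, which is why slicing is usually paired with careful stage balancing across heterogeneous devices.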
Quart: Latency-Aware FaaS System for Pipelining Large Model Inference
44th IEEE International Conference on Distributed Computing Systems (ICDCS)
Authors: Lin, Yanying Li, Yanbo Peng, Shijie Tang, Yingfei Luo, Shutian Shen, Haiying Xu, Chengzhong Ye, Kejiang Chinese Acad Sci Shenzhen Inst Adv Technol Shenzhen Peoples R China Univ Chinese Acad Sci Beijing Peoples R China Southern Univ Sci & Technol Shenzhen Peoples R China Yale Univ New Haven CT USA Univ Virginia Charlottesville VA USA Univ Macau Taipa Macao Peoples R China
Pipeline parallelism is a key mechanism to ensure the performance of large model serving systems. These systems need to deal with unpredictable online workloads with low latency and high goodput. However, due to the s...
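The abstract distinguishes latency from goodput: goodput counts only requests served within their latency SLO, so raw throughput gains that blow deadlines do not help. A minimal illustration of that metric (not Quart's implementation, and with made-up latencies):

```python
# Minimal illustration of SLO-bounded goodput, not Quart's implementation:
# only requests that finish within the latency SLO count toward goodput.
def goodput_fraction(latencies_ms, slo_ms):
    """Fraction of requests that met the SLO."""
    return sum(1 for lat in latencies_ms if lat <= slo_ms) / len(latencies_ms)

share = goodput_fraction([120, 180, 250, 90, 400], slo_ms=200.0)  # 3 of 5 meet it
```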