
Refine Search Results

Document Type

  • 19 journal articles
  • 16 conference papers

Collection Scope

  • 35 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 24 Engineering
    • 17 Computer Science and Technology...
    • 13 Software Engineering
    • 8 Bioengineering
    • 4 Biomedical Engineering (degree awardable in...
    • 2 Electronic Science and Technology (degree awa...
    • 2 Information and Communication Engineering
    • 2 Control Science and Engineering
    • 2 Chemical Engineering and Technology
    • 1 Mechanical Engineering
    • 1 Instrument Science and Technology
    • 1 Power Engineering and Engineering Ther...
    • 1 Architecture
    • 1 Civil Engineering
    • 1 Aeronautics and Astronautics Science and Tech...
    • 1 Nuclear Science and Technology
  • 23 Science
    • 14 Mathematics
    • 8 Biology
    • 7 Statistics (degree awardable in Science...
    • 4 Physics
    • 3 Geophysics
    • 2 Chemistry
    • 1 Astronomy
    • 1 Geology
  • 7 Management
    • 6 Management Science and Engineering (degree awa...
  • 4 Law
    • 4 Sociology
  • 3 Medicine
    • 3 Basic Medicine (degree awardable in Medicine...
    • 3 Clinical Medicine
    • 3 Pharmacy (degree awardable in Medicine, Sci...
    • 1 Public Health and Preventive Medi...
  • 1 Economics
    • 1 Applied Economics

Topics

  • 3 篇 convolution
  • 3 篇 optimization
  • 3 篇 coprocessors
  • 3 篇 kernel
  • 2 篇 parallel process...
  • 2 篇 reinforcement le...
  • 2 篇 deep learning
  • 2 篇 deep neural netw...
  • 2 篇 vectors
  • 2 篇 costs
  • 2 篇 instruction sets
  • 1 篇 covid-19
  • 1 篇 polygenic predic...
  • 1 篇 chemical activat...
  • 1 篇 lattices
  • 1 篇 approximation al...
  • 1 篇 message systems
  • 1 篇 magnetic resonan...
  • 1 篇 computational fl...
  • 1 篇 trees (mathemati...

Institutions

  • 6 篇 parallel computi...
  • 6 篇 parallel computi...
  • 5 篇 parallel computi...
  • 2 篇 school of comput...
  • 2 篇 riken center for...
  • 2 篇 parallel computi...
  • 2 篇 center for space...
  • 2 篇 department of as...
  • 2 篇 division of sola...
  • 2 篇 chugai pharmaceu...
  • 2 篇 engineering mech...
  • 1 篇 parallel computi...
  • 1 篇 iit kharagpur
  • 1 篇 stony brook univ...
  • 1 篇 parallel computi...
  • 1 篇 university of tu...
  • 1 篇 university of co...
  • 1 篇 the ohio state u...
  • 1 篇 eindhoven univer...
  • 1 篇 product architec...

Authors

  • 8 篇 mudigere dheevat...
  • 8 篇 kaul bharat
  • 7 篇 das dipankar
  • 6 篇 banerjee kunal
  • 6 篇 dubey pradeep
  • 6 篇 avancha sasikant...
  • 5 篇 kundu abhisek
  • 5 篇 bharat kaul
  • 4 篇 sanchit misra
  • 4 篇 mellempudi navee...
  • 3 篇 kalamkar dhiraj
  • 3 篇 alexander heinec...
  • 3 篇 mikhail smelyans...
  • 3 篇 kiran pamnany
  • 3 篇 pradeep dubey
  • 2 篇 misra sanchit
  • 2 篇 santara anirban
  • 2 篇 benomar othman
  • 2 篇 aasawat tanuj kr
  • 2 篇 shustrov nikita

Language

  • 35 English

Search criteria: Institution = "Parallel Computing Lab - India"
35 records, showing 1-10
Distributed Hessian-free optimization for deep neural network
31st AAAI Conference on Artificial Intelligence, AAAI 2017
Authors: He, Xi; Mudigere, Dheevatsa; Smelyanskiy, Mikhail; Takáč, Martin. Affiliations: Industrial and Systems Engineering, Lehigh University, United States; Parallel Computing Lab, Intel Labs India; Parallel Computing Lab, Intel Labs SC, United States
Training a deep neural network is a high-dimensional and highly non-convex optimization problem. In this paper, we revisit the Hessian-free optimization method for deep networks with negative curvature direction detection...
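The core primitive behind Hessian-free optimization is the Hessian-vector product, which can be formed without ever materializing the Hessian. A minimal illustrative sketch (central finite differences on a toy quadratic; not the paper's distributed implementation):

```python
import numpy as np

def hessian_vector_product(grad_fn, w, v, eps=1e-5):
    """Finite-difference approximation of H @ v, where H is the Hessian
    of the loss at w: Hv ~ (grad(w + eps*v) - grad(w - eps*v)) / (2*eps).
    Hessian-free methods only ever need such products, never H itself."""
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2 * eps)

# Toy quadratic loss L(w) = 0.5 * w^T A w, whose Hessian is exactly A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
grad = lambda w: A @ w              # gradient of the quadratic is A w
v = np.array([1.0, 0.0])
hv = hessian_vector_product(grad, np.zeros(2), v)
# hv approximates A @ v = [3, 1]
```

A conjugate-gradient solver can then use this product to approximately solve H p = -g for the update direction p.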
Mixed Precision Training of Convolutional Neural Networks Using Integer Operations
6th International Conference on Learning Representations, ICLR 2018
Authors: Das, Dipankar; Mellempudi, Naveen; Mudigere, Dheevatsa; Kalamkar, Dhiraj; Avancha, Sasikanth; Banerjee, Kunal; Sridharan, Srinivas; Vaidyanathan, Karthik; Kaul, Bharat; Georganas, Evangelos; Heinecke, Alexander; Dubey, Pradeep; Corbal, Jesus; Shustrov, Nikita; Dubtsov, Roma; Fomenko, Evarist; Pirogov, Vadim. Affiliations: Parallel Computing Lab, Intel Labs India; Parallel Computing Lab, Intel Labs SC; Product Architecture Group, Intel, OR, United States; Software Services Group, Intel, OR, United States
The state-of-the-art (SOTA) for mixed precision training is dominated by variants of low-precision floating point operations, in particular FP16 accumulating into FP32 (Micikevicius et al., 2017). On the other hand...
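The integer-operation approach above relies on narrow multiplies feeding a wide accumulator. A simplified sketch of that pattern (int16 operands, int32/int64 accumulation, a shared power-of-two scale; illustrative only, not the paper's training scheme):

```python
import numpy as np

def int16_dot_int32_acc(a_fp, b_fp, scale=2**8):
    """Quantize two FP32 vectors to int16 with a shared power-of-two
    scale, multiply elementwise, and accumulate the products in a wide
    integer -- the narrow-multiply / wide-accumulate pattern integer
    training builds on."""
    a_q = np.clip(np.round(a_fp * scale), -32768, 32767).astype(np.int16)
    b_q = np.clip(np.round(b_fp * scale), -32768, 32767).astype(np.int16)
    # widen BEFORE multiplying so the products cannot overflow int16
    acc = np.sum(a_q.astype(np.int32) * b_q.astype(np.int32), dtype=np.int64)
    return acc / (scale * scale)    # rescale back to a float result

a = np.array([0.5, -1.25, 2.0], dtype=np.float32)
b = np.array([1.0, 0.5, -0.25], dtype=np.float32)
approx = int16_dot_int32_acc(a, b)
# the exact FP dot product is 0.5*1.0 - 1.25*0.5 - 2.0*0.25 = -0.625
```

Hardware instructions such as AVX-512 VNNI implement exactly this multiply-and-widen-accumulate step in one operation.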
Translation validation of loop and arithmetic transformations in the presence of recurrences
17th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, Tools and Theory for Embedded Systems, LCTES 2016
Authors: Banerjee, Kunal; Mandal, Chittaranjan; Sarkar, Dipankar. Affiliations: Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, India; Intel Parallel Computing Lab, Bangalore, India
Compiler optimization of array-intensive programs involves extensive application of loop transformations and arithmetic transformations. Hence, translation validation of array-intensive programs requires manipulation...
BlackOut: Speeding up recurrent neural network language models with very large vocabularies
4th International Conference on Learning Representations, ICLR 2016
Authors: Ji, Shihao; Vishwanathan, S.V.N.; Satish, Nadathur; Anderson, Michael J.; Dubey, Pradeep. Affiliations: Parallel Computing Lab., Intel, India; Univ. of California, Santa Cruz, United States
We propose BlackOut, an approximation algorithm to efficiently train massive recurrent neural network language models (RNNLMs) with million-word vocabularies. BlackOut is motivated by using a discriminative loss, and...
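The general idea behind sampling-based softmax approximations like BlackOut is to score only the target word plus a handful of sampled negatives instead of the full vocabulary. A generic sketch of that idea (BlackOut's actual weighted estimator differs in its sampling distribution and loss):

```python
import numpy as np

rng = np.random.default_rng(0)

def sampled_output_scores(h, W, target, num_samples, vocab_size):
    """Compute output scores for the target word plus `num_samples`
    uniformly sampled negatives, instead of all `vocab_size` words.
    This reduces the per-step output cost from O(V) to O(k)."""
    negatives = rng.choice(vocab_size, size=num_samples, replace=False)
    negatives = negatives[negatives != target]   # drop accidental hits
    idx = np.concatenate(([target], negatives))  # target goes first
    logits = W[idx] @ h                          # only 1+k dot products
    return idx, logits

V, d = 10_000, 16                  # toy vocabulary and hidden size
W = rng.standard_normal((V, d))    # output embedding matrix
h = rng.standard_normal(d)         # current hidden state
idx, logits = sampled_output_scores(h, W, target=42, num_samples=20,
                                    vocab_size=V)
# training would apply a softmax/cross-entropy over just these scores
```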
Lattice QCD on Intel® Xeon Phi™ coprocessors
28th International Supercomputing Conference on Supercomputing, ISC 2013
Authors: Joó, Bálint; Kalamkar, Dhiraj D.; Vaidyanathan, Karthikeyan; Smelyanskiy, Mikhail; Pamnany, Kiran; Lee, Victor W.; Dubey, Pradeep; Watson III, William. Affiliations: Thomas Jefferson National Accelerator Facility, Newport News, VA, United States; Parallel Computing Lab., Intel Corporation, Bangalore, India; Parallel Computing Lab., Intel Corporation, Santa Clara, CA, United States
Lattice Quantum Chromodynamics (LQCD) is currently the only known model-independent, non-perturbative computational method for calculations in the theory of the strong interactions, and is of importance in studies of...
Data-race detection: The missing piece for an end-to-end semantic equivalence checker for parallelizing transformations of array-intensive programs
3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, ARRAY 2016
Authors: Banerjee, Kunal; Banerjee, Soumyadip; Sarkar, Santonu. Affiliations: Dept of Computer Sc and Engg, IIT Kharagpur, India; Dept of CSIS, BITS Pilani-Goa, India; Intel Parallel Computing Lab., Bangalore, India
The parallelizing transformation (hand-crafted or compiler-assisted) is error-prone, as it is often performed without verifying any semantic equivalence with the sequential counterpart. Even when the parallel program c...
Practical Massively Parallel Monte-Carlo Tree Search Applied to Molecular Design
9th International Conference on Learning Representations, ICLR 2021
Authors: Yang, Xiufeng; Aasawat, Tanuj Kr; Yoshizoe, Kazuki. Affiliations: Chugai Pharmaceutical Co. Ltd, Japan; Parallel Computing Lab - India, Intel Labs, India; RIKEN Center for Advanced Intelligence Project, Japan
It is common practice to use large computational resources to train neural networks, as is known from many examples such as reinforcement learning applications. However, while massively parallel computing is often used for...
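The paper's contribution is distributing MCTS; the sequential building block it parallelizes is the UCB-style child selection rule, sketched below with made-up visit statistics (not the paper's code or parallelization scheme):

```python
import math

def ucb1(child_value_sum, child_visits, parent_visits, c=1.4):
    """UCB1 score MCTS uses to pick which child to descend into:
    exploitation (mean value so far) plus an exploration bonus.
    Unvisited children score +inf so every action is tried once."""
    if child_visits == 0:
        return math.inf
    mean = child_value_sum / child_visits
    return mean + c * math.sqrt(math.log(parent_visits) / child_visits)

# Three children of a node that has been visited 100 times:
stats = [(30.0, 50), (12.0, 15), (2.0, 2)]   # (value_sum, visits)
scores = [ucb1(v, n, 100) for v, n in stats]
best = max(range(3), key=lambda i: scores[i])
# the rarely visited third child wins on its exploration bonus
```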
Ternary Residual Networks
arXiv, 2017
Authors: Kundu, Abhisek; Banerjee, Kunal; Mellempudi, Naveen; Mudigere, Dheevatsa; Das, Dipankar; Kaul, Bharat; Dubey, Pradeep. Affiliations: Parallel Computing Lab, Bangalore, India; Parallel Computing Lab, Santa Clara, CA, United States
Sub-8-bit representations of DNNs incur some discernible loss of accuracy despite rigorous (re)training at low precision. Such a loss of accuracy essentially makes them equivalent to a much shallower counterpart, diminis...
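Ternary networks constrain each weight to {-a, 0, +a}. A common threshold-based recipe for that conversion is sketched below (one standard formulation, not necessarily this paper's exact method):

```python
import numpy as np

def ternarize(w, t=0.7):
    """Threshold-based ternary quantization: weights below the threshold
    delta = t * mean(|w|) collapse to 0; the rest become +/- alpha,
    where alpha is the mean magnitude of the surviving weights."""
    delta = t * np.mean(np.abs(w))
    mask = np.abs(w) > delta
    alpha = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.array([0.9, -0.05, 0.4, -0.8, 0.02])
wt = ternarize(w)
# small weights collapse to 0; large ones share a single magnitude alpha
```

Storing only the sign pattern plus one scalar per tensor is what makes sub-8-bit deployment attractive.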
AutoSparse: Towards Automated Sparse Training of Deep Neural Networks
arXiv, 2023
Authors: Kundu, Abhisek; Mellempudi, Naveen K.; Vooturi, Dharma Teja; Kaul, Bharat; Dubey, Pradeep. Affiliation: Parallel Computing Lab, Intel Labs India
Sparse training is emerging as a promising avenue for reducing the computational cost of training neural networks. Several recent studies have proposed pruning methods using learnable thresholds to efficiently explore...
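Threshold-based pruning of the kind the abstract mentions zeroes out weights whose magnitude falls below a per-layer threshold. A minimal sketch with a fixed threshold (in AutoSparse-style methods the threshold itself is learned; that part is omitted here):

```python
import numpy as np

def prune_with_threshold(w, threshold):
    """Magnitude pruning: zero out weights with |w| below the threshold
    and return both the sparse weights and the binary mask.  In learnable
    -threshold schemes this threshold is a trained parameter per layer."""
    mask = (np.abs(w) >= threshold).astype(w.dtype)
    return w * mask, mask

w = np.array([0.30, -0.02, 0.15, -0.40, 0.01])
sparse_w, mask = prune_with_threshold(w, threshold=0.10)
sparsity = 1.0 - mask.mean()   # fraction of weights zeroed out
```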
Mixed low-precision deep learning inference using dynamic fixed point
arXiv, 2017
Authors: Mellempudi, Naveen; Kundu, Abhisek; Das, Dipankar; Mudigere, Dheevatsa; Kaul, Bharat. Affiliation: Parallel Computing Lab, Intel Labs, Bangalore, India
We propose a cluster-based quantization method to convert pre-trained full-precision weights into ternary weights with minimal impact on accuracy. In addition, we constrain the activations to 8 bits, thus enabl...
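Dynamic fixed point, the 8-bit activation format the abstract refers to, stores an integer mantissa per value plus one shared exponent per tensor. A sketch from the general definition (the paper's clustering step for weights is not shown):

```python
import numpy as np

def to_dynamic_fixed_point(x, bits=8):
    """Dynamic fixed point: integer mantissas with one shared power-of-two
    exponent per tensor, chosen so the largest magnitude still fits in
    `bits` (sign included).  Each value is represented as q * 2**exp."""
    qmax = 2 ** (bits - 1) - 1                              # 127 for 8 bits
    exp = int(np.ceil(np.log2(np.max(np.abs(x)) / qmax)))   # shared exponent
    q = np.clip(np.round(x / 2.0 ** exp), -qmax - 1, qmax).astype(np.int8)
    return q, exp

x = np.array([0.50, -0.25, 0.125, 0.8])
q, exp = to_dynamic_fixed_point(x)
recovered = q.astype(np.float64) * 2.0 ** exp   # dequantize
# quantization error is bounded by the step size 2**exp
```

Because the exponent is shared, the inner products reduce to pure integer arithmetic with a single final rescale.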