Data prefetching is a widely used technique to alleviate the "memory wall" problem by fetching, in advance, data that may be touched in the near future. Generally, data prefetching is classified into hardware p...
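The preview above describes hardware data prefetching. A minimal sketch of one classic hardware scheme, a stride prefetcher, is shown below; the table layout and names are illustrative assumptions, not taken from the paper.

```python
class StridePrefetcher:
    """Tracks the last address and stride per load PC; predicts the
    next address once the same stride is observed twice in a row."""

    def __init__(self):
        self.table = {}  # pc -> (last_addr, last_stride)

    def access(self, pc, addr):
        prefetch = None
        if pc in self.table:
            last_addr, last_stride = self.table[pc]
            stride = addr - last_addr
            if stride == last_stride and stride != 0:
                prefetch = addr + stride  # predicted next access
            self.table[pc] = (addr, stride)
        else:
            self.table[pc] = (addr, 0)
        return prefetch
```

For a load walking an array with an 8-byte stride, the third access trains the entry and triggers a prefetch of the next element.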
This paper explores the problem of boundary-data classification ambiguity that arises when machine learning techniques are applied in the field of intrusion detection. The features and attributes of the boundary data ...
With the rapid development of neural networks and deep learning, speech synthesis technology has improved significantly. The end-to-end speech synthesis systems based on deep learning have been able to synthesize...
Multi-objective neural architecture search (NAS) algorithms aim to automatically search for neural architectures suited to different computing-power platforms by using multi-objective optimization methods. The LEMON...
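Multi-objective NAS, as described above, ranks candidate architectures by several competing objectives at once. A minimal sketch of the underlying idea, selecting the Pareto-optimal (non-dominated) set under two objectives, is given below; the objectives (accuracy up, latency down) are illustrative assumptions, not details of the LEMON work.

```python
def pareto_front(candidates):
    """candidates: list of (accuracy, latency) pairs, where higher
    accuracy and lower latency are better. Returns the set of
    candidates not dominated by any other candidate."""
    front = []
    for a in candidates:
        dominated = any(
            b != a and b[0] >= a[0] and b[1] <= a[1]
            for b in candidates
        )
        if not dominated:
            front.append(a)
    return front
```

A search loop would keep only this front as the archive of architectures worth deploying across different platforms.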
This research proposes a novel strategy for addressing the limitations of centralized architectures in IoT data processing. Traditional systems experience significant bandwidth use, privacy difficulties, and scalabili...
ISBN: (Print) 9798350327038
This paper addresses the challenges of voltage-sensing read operations on a PRAM-based 1S1R crossbar array, which can be used for MAC operations in processing-in-memory architectures. The nonlinearity of the readout voltage due to the parallel resistance of the accessed cells leads to a narrow sensing margin. Moreover, the SAR ADC widely used in readout circuits for area and power efficiency leads to high latency. To overcome these challenges, we introduce active feedback using a Gilbert multiplier in the bitline (BL) structure to regulate the resistance of the BL transmission gate, and an input-aware SAR logic to optimize the conversion time. The proposed macro design in a 65nm process achieves a 3.79x voltage-sensing margin with the Gilbert multiplier under a 3x3 kernel convolution operation. Furthermore, a 6-bit input-aware SAR ADC reduces average latency from 6 to 4.4 clock cycles.
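The latency reduction above comes from input-aware SAR logic that skips conversion cycles. A simplified behavioral model is sketched below, assuming the input awareness takes the form of a coarse pre-estimate that resolves the leading bits in advance; the actual circuit technique is not specified here.

```python
def sar_convert(vin, vref=1.0, bits=6, known_msbs=0, msb_hint=0):
    """Successive-approximation conversion: one comparison cycle per
    resolved bit. If `known_msbs` leading bits are supplied by an
    input-aware pre-estimate (`msb_hint`), those cycles are skipped."""
    code = msb_hint << (bits - known_msbs)
    cycles = 0
    for i in range(bits - known_msbs - 1, -1, -1):
        trial = code | (1 << i)          # tentatively set next bit
        cycles += 1                      # one comparison per bit
        if vin >= (trial / (1 << bits)) * vref:
            code = trial                 # keep the bit if DAC <= input
    return code, cycles
```

With a correct 2-bit pre-estimate, a 6-bit conversion finishes in 4 cycles instead of 6, matching the direction (not the exact mechanism) of the 6-to-4.4-cycle average reported above.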
ISBN: (Print) 9781665469586
In stream processing, data arrives constantly and is often unpredictable. It can show large fluctuations in arrival frequency, size, complexity, and other factors. These fluctuations can strongly impact application latency and throughput, which are critical factors in this domain. Therefore, there is a significant amount of research on self-adaptive techniques involving elasticity or micro-batching as a way to mitigate this impact. However, there is a lack of benchmarks and tools to help researchers investigate the implications of micro-batching and data stream frequency. In this paper, we extend a benchmarking framework to support dynamic micro-batching and data stream frequency management. We used it to create custom benchmarks and compare latency and throughput aspects of two different parallel libraries. We validate our solution through an extensive analysis of the impact of micro-batching and data stream frequency on stream processing applications using Intel TBB and FastFlow, two libraries that leverage stream parallelism on multi-core architectures. Our results demonstrate up to a 33% throughput gain, at the cost of latency, when using micro-batches. Additionally, while TBB ensures lower latency, FastFlow ensures higher throughput in the parallel applications for different data stream frequency configurations.
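The micro-batching technique studied above trades latency for throughput by grouping stream items before processing. A minimal sketch of such a batcher is shown below; the size/timeout policy and all names are illustrative assumptions, not the framework's actual API.

```python
import time

class MicroBatcher:
    """Groups incoming stream items into batches of up to `batch_size`
    items; a time-based flush bounds the latency added for slow or
    bursty streams."""

    def __init__(self, batch_size, max_wait_s):
        self.batch_size = batch_size
        self.max_wait_s = max_wait_s
        self.buf = []
        self.first_ts = None

    def push(self, item, now=None):
        """Add one item; return a full batch when it is ready to emit,
        otherwise None. `now` defaults to a monotonic clock reading."""
        now = time.monotonic() if now is None else now
        if not self.buf:
            self.first_ts = now  # start the wait timer on first item
        self.buf.append(item)
        if (len(self.buf) >= self.batch_size
                or now - self.first_ts >= self.max_wait_s):
            batch, self.buf = self.buf, []
            return batch
        return None
```

Larger `batch_size` amortizes per-item overhead (higher throughput); smaller `max_wait_s` caps the latency a straggling item can accumulate, which is exactly the trade-off the benchmarks above measure.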
ISBN: (Print) 9781665473156
More recently, it has become possible to run deep learning algorithms on edge devices such as microcontrollers due to continuous improvements in neural network optimization algorithms such as quantization and neural architecture search. Nonetheless, most of the embedded hardware available today still falls short of the requirements of running deep neural networks. As a result, specialized processors have emerged to improve the inference efficiency of deep learning algorithms. However, most are not designed for edge applications that require efficient, low-cost hardware. Therefore, we design and prototype a low-cost configurable sparse Neural Processing Unit (NPU). The NPU has a built-in buffer and a reshapable mixed-precision multiply-accumulator (MAC) array. The computing and memory resources of the NPU are parameterized, and different NPUs can be derived. Besides, users can also configure the NPU at runtime to fully utilize the resources. In our experiments, the 200MHz NPU with only 32 MACs is more than 32 times faster than the 400MHz STM32H7 when running MobileNet-V1 inference. Moreover, the yielded NPUs can achieve roofline or even beyond-roofline performance. The buffer and reshapable MAC array push the NPU's attainable performance to the roofline, while support for sparsity allows the NPU to obtain performance beyond the roofline.
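The roofline claim above refers to the standard roofline model, in which attainable performance is the minimum of the compute peak and the memory-bandwidth ceiling. A sketch with illustrative numbers (not the paper's measurements) follows.

```python
def roofline_attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline model: attainable performance is capped either by the
    compute peak or by memory bandwidth times arithmetic intensity
    (FLOPs performed per byte moved from memory)."""
    return min(peak_gflops, bandwidth_gbs * intensity)
```

Below the ridge point (intensity = peak / bandwidth) a kernel is memory-bound; above it, compute-bound. Exploiting sparsity effectively raises the useful work done per byte and per MAC cycle, which is how a design can appear to exceed its dense roofline.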
This research explores optimization strategies employed within ***, an advanced question-generation system driven by natural language processing (NLP) and machine learning (ML) algorithms. The study delves into three ...
The massive success of blockchains has significantly catalyzed interest in the extensive deployment of practical asynchronous Byzantine fault-tolerant (BFT) consensus protocols across wide-area networks. However, exis...