Search Results - Inner Mongolia University Library

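Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = subject classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". Operators must be uppercase, with one space on each side.

Search examples:

Example 1 (subject "library science" or "information science", author Fan Bingsi, years 1982 to 2016): (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2 (publication "Computer Applications and Software", any field C++ or Basic, subject not Visual, years 2011 to 2016): P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016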
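
The field codes and operator rules above can be sketched as a small query builder. This is only an illustration: the helper names (term, any_of, all_of) are hypothetical and not part of the library's system, and only the field codes, the uppercase operators, and the single-space spacing are taken from the help text.

```python
# Hypothetical helpers; only the field codes and the operator rules
# (uppercase AND/OR, one space on each side) come from the help text.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}


def term(field: str, value: str) -> str:
    """Build one FIELD=value term, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field}")
    return f"{field}={value}"


def any_of(*terms: str) -> str:
    """Join terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with AND."""
    return " AND ".join(terms)


# Rebuild the first example query from the help text.
query = all_of(
    any_of(term("K", "图书馆学"), term("K", "情报学")),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Validating the field code up front means a typo such as "Z" fails early instead of being sent as part of a query string.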

Search condition: "Any field = International Conference on Parallel and Distributed Processing Techniques and Applications"

11,317 records in total; showing records 641-650

A New Distributed-STBC Scheme for Cooperative Relaying in Wireless Networks

7th International Conference on Image and Signal Processing and their Applications, ISPA 2022

Authors: Tayakout, Hakim; Zohra Bouchibane, Fatma; Boutellaa, Elhocine (Centre de Développement des Technologies Avancées, Division Télécommunication, Algiers, Algeria)

ISBN: (print) 9781665480420

Space-time encoded communications aim at improving the quality and reliability of a wireless high-data rate link by exploiting both the temporal and spatial signal dimensions. However, these techniques are impeded by typically low profile requirement of the terminals, making impractical the deployment of multiple antennas, more particularly at the relay side. In this paper, we propose a new and simplified distributed space-time block coding (D-STBC) scheme in which the STBC code is artificially generated by relying on three single-antenna nodes in the cooperative network and using less time slots than the conventional counterpart. The proposed scheme is then generalized to encompass both multiple cooperative relays and multiple receive antennas cases. Furthermore, our proposal is investigated with both decode-and-forward (DaF) and amplify-and-forward (AaF) relaying protocols. At the destination, zero-forcing detector is adopted for the D-STBC decoding, and maximum ratio combining (MRC) of the relays signals is performed when retaining multiple relays configuration. It is shown that, with one relay, our proposed scheme exhibits similar performance in terms of data reliability with the conventional STBC alternative, with the advantage of resorting to only one antenna per transmitting/retransmitting node. Moreover, the viable capabilities of the proposed scheme are fully pointed out with an increased number of relays. © 2022 IEEE.

Keywords: Cooperative communication

Optimal Rate Control for Latency-constrained High Throughput Big Data Applications

2022 IEEE International Conference on Big Data, Big Data 2022

Authors: Xiao, Ziren; Harwood, Aaron; Rodriguez, Maria A. (The University of Melbourne, Melbourne, Australia)

ISBN: (print) 9781665480451

High performance distributed systems such as distributed stream processing systems and message-passing parallel programs are often deployed on platforms that make use of vanilla TCP/IP communication, which in turn uses the conventional Nagle's algorithm for congestion control. Recent research in Reinforcement Learning (RL) techniques to either replace or control the conventional TCP approach shows promise in achieving a greater degree of performance, especially when the demand for network resources in a multi-tenant platform is highly dynamic and infeasible to model. Existing results are, however, focused on RL for general Internet communication, with the learning objective being some combination of throughput, loss, and latency, and predominately use a continuous action space to adjust the packet rate at the sender. In this work, we propose a coefficient-free RL objective that perfectly matches the data transmission rate to the underlying communication system's bottleneck, which naturally deters packet loss and thereby converges to the ideal throughput even in lock-free and latency-constrained Big Data applications where packets are dropped due to load shedding or exceeding latency thresholds. Our results compare favorably to other state-of-the-art objective functions using an RL framework, e.g., providing up to 48% reduction in packet loss while obtaining up to a 4% increase in overall throughput when packet latency is highly constrained. © 2022 IEEE.

Keywords: Reinforcement learning

An Advanced Ensemble Directionality Pattern (EDP) based Block Ensemble Neural Network (BENN) Classification Model for Face Recognition System

2022 IEEE International Conference on Distributed Computing and Electrical Circuits and Electronics, ICDCECE 2022

Authors: Harshitha, C.J.; Bharathi, R.K. (JSS Science and Technology University, Dept. of Computer Applications, Mysuru, India)

ISBN: (print) 9781665483162

Developing the face detection and recognition system is one of the most demanding and challenging tasks in the recent days. For this purpose, there are different types of image processing techniques have been developed in the conventional works, which are mainly focusing on detecting the facial images under occlusions. Still, it limits with the major problems of high complexity in algorithm design, does not have the ability to handle large dimensional data, requires more time consumption for training the model, and increased misclassification results. Thus, this paper intends to develop an efficient face recognition system for detecting the occluded faces by implementing an advanced image processing techniques using COFW-100 dataset which includes 507 occluded images. Here, the Gaussian filter technique is utilized to improve the overall quality of original image by suppressing the noise/artifacts. After that, an Ensemble Directionality Pattern (EDP) extraction technique is applied to extract the texture features from the normalized image by estimating the similarity between the pixel intensity in different angles of projection plane. It is mainly used to obtain the clear features of the person in the cell of each image frame and, this type of feature extraction helps to increase the accuracy of overall classification system. Then, the Block Ensemble Neural Network (BENN) classification model is deployed to accurately detect the occluded faces by processing the image with separate patterns of features. During experimentation, the performance of both existing and proposed techniques is validated and compared by using various evaluation metrics. Then, the results show that the proposed technique outperforms the other approaches by accurately detecting the face based on the image patterns with separate blocks. © 2022 IEEE.

Keywords: Face recognition

Misty: Microservice-Based Streaming Trajectory Similarity Search

20th International Conference on Service-Oriented Computing (ICSOC)

Authors: Tao, Jiachun; Pan, Zhicheng; Fang, Junhua; Chao, Pingfu; Zhao, Pengpeng; Xu, Jiajie (Soochow Univ, Dept Comp Sci & Technol, Suzhou, Peoples R China)

ISBN: (print) 9783031209833; 9783031209840

As a fundamental operation in various LBS (Location Based Service) applications, the trajectory similarity search has long been a performance bottleneck in applications such as traffic optimization and contact tracing. When handling streaming trajectory data, the variable workload and stateful compute requirement are two crucial challenges that further complicate the problem. Distributed microservice, a mainstream industrial software design architecture, is the preferred way to address such issues. However, the trajectory instance will inevitably be split under the parallel framework. Therefore, how to distribute trajectory data among the parallel processing tasks in a real-time and lightweight manner is the crux. In this paper, we propose a Microservice-based real-time processing framework for streaming trajectory similarity search, called Misty, which effectively reduces the update cost of the secondary index and supports high scalability. Moreover, on top of Misty, we can build resilient and stateful cloud-native applications. Misty is composed of the assembler, index, coordinator, and executor. Specifically, the assembler and the index module ensure retrieval performance, while the coordinator and executor module enable the system with elastic scaling. Extensive experimental studies on real-world data demonstrate higher query throughput and lower latency over traditional approaches.

Keywords: Real-time data processing; Trajectory similarity; Microservice; Distributed processing; Streaming spatio-temporal data

Plex: Scaling Parallel Lexing with Backtrack-Free Prescanning

35th IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Authors: Li, Le; Sato, Shigeyuki; Liu, Qiheng; Taura, Kenjiro (Univ Tokyo, Tokyo, Japan)

ISBN: (print) 9781665440660

Lexical analysis, which converts input text into a list of tokens, plays an important role in many applications, including compilation and data extraction from texts. To recognize token patterns, a lexer incorporates a sequential computation model - automaton as its basic building component. As such, it is considered difficult to parallelize due to the inherent data dependency. Much work has been done to accelerate lexical analysis through parallel techniques. Unfortunately, existing attempts mainly rely on language-specific remedies for input segmentation, which makes it not only tricky for language extension, but also challenging for automatic lexer generation. This paper presents Plex - an automated tool for generating parallel lexers from user-defined grammars. To overcome the inherent sequentiality, Plex applies a fast prescanning phase to collect context information prior to scanning. To reduce the overheads brought by prescanning, Plex adopts a special automaton, which is derived from that of the scanner, to avoid backtracking behavior and exploits data-parallel techniques. The evaluation under several languages shows that the prescanning overhead is small, and consequently Plex is scalable and achieves 9.8-11.5X speedups using 18 threads.

Keywords: Lexical Analysis; Finite Automaton; Parallelism

A flexible hardware accelerator for morphological filters on FPGA

8th International Conference on Control, Decision and Information Technologies (CoDIT)

Authors: Elloumi, Hejer; Sellami, Dorra; Rabah, Hassan (Natl Engn Sch Sfax, Elect Dept, CEM Res Lab, Sfax 3038, Tunisia; Digital Res Ctr Sfax, Km 10 3021 Tunis Rd, Sfax, Tunisia)

ISBN: (print) 9781665496070

Nowadays, the demand for embedded image processing, especially for computer vision applications, is growing. Digital image processing is seeing intensive use in real-time applications for several areas, such as the industrial environment, military equipment, automobile, entertainment, and medical instruments. Mathematical morphological theory offers a wide variety of tools, applied efficiently in most image processing steps at different scales, like image enhancement, segmentation, and classification. Therefore, such features yield a high potential of reconfiguration in image processing blocks. Morphological opening and closing operators are used as primary blocks in diverse image processing applications such as image cleaning and contrast enhancement. Moreover, using appropriate shapes and scales of structuring elements (SE) provides more flexibility and can significantly improve the enhancement performance. Nevertheless, such operators can be computationally intensive due to the increased complexity and the significant memory requirement. Also, the use of high image resolution and re-configurable size and shape of structuring elements may deepen this limitation significantly. In this paper, we consider hardware acceleration on FPGA as a promising solution to address the above-mentioned challenges. Both intra-level and inter-level parallelism techniques based on the reuse of optimized dilation and erosion blocks are exploited to accelerate computation and save memory consumption. Unlike most existing works, this paper proposes a highly flexible architecture for opening and closing-based morphological filters. The architecture supports re-configurable shapes and sizes of structuring elements while keeping a high ability to meet real-time processing requirements and low power consumption. An improvement in terms of frame rates up to 8% with more than 2000 Fps for moderate size of structuring elements and high image resolution is achieved.

Keywords: Image resolution; Shape; Memory management; Parallel processing; Real-time systems; Filtering theory; Cleaning

Joint Partitioning and Sampling Algorithm for Scaling Graph Neural Network

29th Annual IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC)

Authors: Das, Manohar Lal; Jatala, Vishwesh; Gupta, Gagan Raj (Indian Inst Technol Bhilai, Dept EECS, Raipur, Madhya Pradesh, India)

ISBN: (print) 9781665494236

Graph Neural Network (GNN) has emerged as a popular toolbox for solving complex problems on graph data structures. Graph neural networks use machine learning techniques to learn the vector representations of nodes and/or edges. Learning these representations demands a huge amount of memory and computing power. The traditional shared-memory multiprocessors are insufficient to meet real-world data's computing requirements; hence, research has gained momentum toward distributed GNN. Scaling the distributed GNN has the following challenges: (1) the input graph needs to be efficiently partitioned, (2) the cost of communication between compute nodes should be reduced, and (3) the sampling strategy should be efficiently chosen to minimize the loss in accuracy. To address these challenges, we propose a joint partitioning and sampling algorithm, which partitions the input graph with weighted METIS and uses a bias sampling strategy to minimize total communication costs. We implemented our approach using the DistDGL framework and evaluated it using several real-world datasets. We observe that our approach (1) shows an average reduction in communication overhead by 53%, (2) requires less partitioning time to partition a graph, (3) shows improved accuracy, (4) shows a speed up of 1.5x on OGB-Arxiv dataset, when compared to the state-of-the-art DistDGL implementation.

Keywords: Graph Neural Network; Graph Partitioning; Distributed processing; Deep Learning

Performance Analysis of Compiler Support for Parallel Evaluation of C++ Constant Expressions

24th Conference on Practical Aspects of and Solutions for Software Engineering, KKIO 2023 and 8th Workshop on Advances in Programming Languages, WAPL 2023, Held as Part of FedCSIS 2023

Authors: Gozillon, Andrew; Haeri, Seyed Hossein; Riordan, James; Keir, Paul (Advanced Micro Devices AB, Nordenskioldsgatan 11 A, Office 233, Malmö 211 19, Sweden; School of Computing Engineering and Physical Sciences, University of the West of Scotland, Paisley PA1 2BE, United Kingdom; University of Bergen, Norway & PLWorkz R&D, Belgium; Av. Chapelle-aux-Champs 49, Brussels, Belgium)

ISBN: (print) 9783031510748

Metaprogramming, the practice of writing programs that manipulate other programs at compile-time, continues to impact software development; enabling new approaches to optimisation, static analysis, and reflection. Nevertheless, a challenge associated with metaprogramming techniques, including C++ constexpr functionality, is an increase in compilation times. This paper presents ClangOz, a novel Clang-based research compiler that addresses this issue by evaluating annotated constant expressions in parallel, thereby reducing compilation time. By evaluating constant expressions in parallel, ClangOz significantly reduces compilation times for metaprogramming-intensive codebases, enhancing developer productivity and iterative software development processes. To control this, ClangOz includes novel compiler intrinsics allowing developers to take full advantage of constexpr language features. New i9-13900K benchmark results here demonstrate the performance advantage of ClangOz over traditional compilers, including a decrease in compile times across more benchmarks; and 100% parallel efficiency in two cases. Also introduced here is the C’est library, which provides a subset of the C++ standard library, with extended constexpr support. We highlight applications of the constexpr language feature, and emphasise the relevance of ClangOz, a compiler tailored for parallel evaluation of relevant constant expressions. Developers can now utilise modern metaprogramming, while minimising compile times parametrically. © 2024, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: Static analysis

Optimizing Distributed Training on Frontier for Large Language Models

39th International Conference on High Performance Computing, ISC High Performance 2024

Authors: Dash, Sajal; Lyngaas, Isaac R.; Yin, Junqi; Wang, Xiao; Egele, Romain; Ellis, J. Austin; Maiterth, Matthias; Cong, Guojing; Wang, Feiyi; Balaprakash, Prasanna (Oak Ridge National Laboratory, United States; Université Paris-Saclay, France; AMD)

ISBN: (print) 9783982633602

Large language models (LLMs) have demonstrated remarkable success as foundational models, benefiting various downstream applications through fine-tuning. Loss scaling studies have demonstrated the superior performance of larger LLMs compared to their smaller counterparts. Nevertheless, training LLMs with billions of parameters poses significant challenges and requires considerable computational resources. For example, training a one trillion parameter GPT-style model on 20 trillion tokens requires a staggering 120 million exaflops. This research explores efficient distributed training strategies to extract this computation from Frontier, the world's first exascale supercomputer. We enable and investigate various model and data parallel training techniques, such as tensor parallelism, pipeline parallelism, and sharded data parallelism, to facilitate training a trillion-parameter model on Frontier. We empirically assess these techniques and their associated parameters to determine their impact on memory footprint, communication latency, and GPU's computational efficiency. We analyze the complex interplay among these techniques and find a strategy to combine them to achieve high throughput through hyperparameter tuning. We have identified efficient strategies for training large LLMs of varying sizes through empirical analysis and hyperparameter tuning. For 22 Billion, 175 Billion, and 1 Trillion parameters, we achieved GPU throughputs of 38.38%, 36.14%, and 31.96%, respectively. For the training of the 175 Billion parameter model and the 1 Trillion parameter model, we achieved 100% weak scaling efficiency on 1024 and 3072 MI250X GPUs, respectively. We also achieved strong scaling efficiencies of 89% and 87% for these two models. We trained these models only tens of iterations instead of training till completion. © 2024 Research Paper Proceedings of the ISC High Performance 2024. All rights reserved.

Keywords: Supercomputers

Dynamic Adaptive Checkpoint Mechanism for Streaming Applications Based on Reinforcement Learning

International Conference on Parallel and Distributed Systems (ICPADS)

Authors: Zhan Zhang; Tianming Liu; Yanjun Shu; Siyuan Chen; Xian Liu (Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China)

For a stream processing system that uses checkpoints as a fault-tolerant method, selecting the appropriate checkpoint period is the key to ensuring the efficient operation of streaming applications. State-of-art stream processing systems currently only support fixed-cycle checkpoints, which is difficult to make a good trade-off between fault-tolerant processing and the cost of failure recovery in dynamically changing streaming application scenarios. Moreover, in a complex distributed streaming application environment, the dynamic environmental indicators (e.g., the values of workloads and failure rates) are not in coincidence with the model assumptions, such as the dynamics of Twitter’s hot events data changing quickly. In this paper, we consider the dynamic changes of environmental indicators and adaptively optimize the processing delay and fault recovery time. Then, we propose a dynamic adjustment method for the checkpoint interval by reinforcement learning, which is named DACM. DACM adaptively optimizes the processing delay and fault recovery time, while avoiding the overall environment modeling of streaming applications. The experiments conducted on the Flink platform show that DACM reduces the processing delay by 10% and the failure recovery time by 37% compared with the existing checkpoint interval optimization models.

Keywords: Fault tolerance; Adaptation models; Costs; Q-learning; Heuristic algorithms; Fault tolerant systems; Distributed databases

235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
