Search Results - Inner Mongolia University Library

International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC)

Authors: Yuxin Guo; Xiaofeng Li; Linmin Jia; Yong Qin. Affiliations: School of Traffic and Transportation, Beijing Jiaotong University, Beijing, China; State Key Lab of Rail Traffic Control & Safety, Beijing Jiaotong University, Beijing, China

ISBN: (digital) 9781728170503

ISBN: (print) 9781728170510

In this paper, a rail recognition scheme is presented to pilot a UAV flying autonomously along the track rail. First, a pulse coupled neural network is used to iteratively process the single-channel brightness image, yielding a binary image of the track contour; the image entropy serves as the criterion for stopping the iteration. Then, a third-order Bézier curve is fitted to the contour, and the vanishing point is obtained. The yaw and pitch are calculated from the vanishing point and used in the inverse perspective transformation to calculate the local flight target point. Finally, the angle between the flight direction and the target point is calculated to pilot the UAV along the track. To speed up processing, CUDA-based parallel programming on an NVIDIA TX2 is adopted. Various scene images, including forward flight, side flight, changes in UAV height, and the influence of intruding objects, are used to test the effectiveness of the scheme. Experiments show that the rail recognition rate is 96.08% and the false alarm rate is 1.80%.

Keywords: Rails Target tracking parallel programming Neural networks Brightness Graphics processing units Rail transportation
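The abstract above mentions a third-order (cubic) Bézier curve for describing the rail contour. As a minimal sketch of what evaluating such a curve looks like, the snippet below samples a cubic Bézier from four control points using the Bernstein basis. The function name `cubic_bezier` and the control-point coordinates are hypothetical, chosen only for illustration; the paper's actual fitting procedure is not described in this record and is not reproduced here.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1].

    p0..p3 are (x, y) control points; p0 and p3 are the curve endpoints.
    """
    u = 1.0 - t
    # Degree-3 Bernstein basis weights; they always sum to 1.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


# Hypothetical control points for one rail contour, in pixel coordinates.
control = [(100.0, 480.0), (140.0, 360.0), (190.0, 240.0), (250.0, 120.0)]

# Sample the curve at 11 evenly spaced parameter values.
samples = [cubic_bezier(*control, t=i / 10) for i in range(11)]
```

The first and last samples coincide with the first and last control points, which is a quick sanity check for any Bézier evaluator.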

A Comprehensive Comparison and Analysis of OpenACC and OpenMP 4.5 for NVIDIA GPUs

IEEE Conference on High Performance Extreme Computing (HPEC)

Authors: R. Usha; Prachi Pandey; N. Mangala. Affiliation: Centre for Development of Advanced Computing, Bengaluru

ISBN: (digital) 9781728192192

ISBN: (print) 9781728192208

HPC systems with attached accelerators are the new normal. However, programming these accelerators to get good performance is complex and tedious. Hence, directive-based programming models such as OpenMP and OpenACC are gaining wide popularity for parallel programming. They simplify the programming experience by abstracting the low-level complexities from the user. In this paper, we present an extensive comparison of OpenMP 4.5 and OpenACC for GPU programming. A performance comparison of these two APIs on NVIDIA Tesla GPUs, namely the P100 and V100, is also reported. Data transfer times, kernel execution times, total execution times, and performance portability are the criteria for comparison. The challenges faced while parallelizing the applications with the directives, which led to incorrect outputs, are also noted.

Keywords: parallel programming Conferences Graphics processing units Data transfer Complexity theory Kernel

PQL: A Purely-Declarative Java Extension for Parallel Programming

26th European Conference on Object-Oriented Programming (ECOOP)

Authors: Reichenbach, Christoph; Smaragdakis, Yannis; Immerman, Neil. Affiliations: Univ Massachusetts, Amherst, MA 01003, USA; Univ Athens, GR-10679 Athens, Greece

ISBN: (print) 9783642310577; 9783642310560

The popularization of parallelism is arguably the most fundamental computing challenge for years to come. We present an approach where parallel programming takes place in a restricted (sub-Turing-complete), logic-based declarative language, embedded in Java. Our logic-based language, PQL, can express the parallel elements of a computing task, while regular Java code captures sequential elements. This approach offers a key property: the purely declarative nature of our language allows for aggressive optimization, in much the same way that relational queries are optimized by a database engine. At the same time, declarative queries can operate on plain Java data, extending patterns such as map-reduce to arbitrary levels of nesting and composition complexity. We have implemented PQL as an extension to a Java compiler and showcase its expressiveness as well as its scalability compared to competitive techniques for similar tasks (Java + relational queries, in-memory Hadoop, etc.).

Keywords: parallel programming

Introducing Parallel Programming to Traditional Undergraduate Courses

Frontiers in Education Conference (FIE)

Author: de Freitas, Henrique Cota. Affiliation: Pontificia Univ Catolica Minas Gerais PUC Minas, Dept Comp Sci, Belo Horizonte, MG, Brazil

ISBN: (print) 9781467313513

Parallel programming is an important issue for current multi-core processors and necessary for new generations of many-core architectures. This includes processors, computers, and clusters. However, the introduction of parallel programming in undergraduate courses demands new efforts to prepare students for this new reality. This paper describes an experiment on a traditional Computer Science course during a two-year period. The main focus is the question of when to introduce parallel programming models in order to improve the quality of learning. The goal is to propose a method of introducing parallel programming based on OpenMP (a shared-variable model) and MPI (a message-passing model). Results show that the best results are achieved when the OpenMP model is introduced before the MPI model. The main contribution of this paper is the proposed method that correlates several concepts such as concurrency, parallelism, speedup, and scalability to improve student motivation and learning.

Keywords: parallel programming Computer Science and Engineering Education Learning Evaluation

Architectural Support for Synchronization-Free Deterministic Parallel Programming

18th IEEE International Symposium on High-Performance Computer Architecture (HPCA)

Authors: Segulja, Cedomir; Abdelrahman, Tarek S. Affiliation: Univ Toronto, Edward S Rogers Sr Dept Elect & Comp Engn, Toronto, ON M5S 1A1, Canada

ISBN: (print) 9781467308243; 9781467308267

We propose a novel synchronization mechanism called versioning. It dynamically establishes a deterministic order of memory accesses in parallel programs that have serial semantics, in a way that is transparent to the programmer. This order is created in a distributed manner and is enforced by monitoring memory accesses and stalling threads if necessary. Versioning gives rise to parallel programming models in which programmers need not explicitly synchronize threads and only need to specify shared data, which greatly simplifies parallel programming. However, versioning introduces overheads and thus demands architectural support. We describe versioning and the architectural support it needs. We also propose one parallel programming model that utilizes versioning and use it to parallelize 13 benchmark applications. We build an FPGA prototype of a multiprocessor system with versioning support and show that good parallel speedups are obtained. Our analysis shows minimal impact of versioning, both in terms of timing overheads and in terms of additional hardware.

Keywords: parallel programming THREADS Stalling Computer personnel deterministic

A Unified Approach to Parallel Programming

World Congress on Engineering and Computer Science

Author: Eijkhout, Victor. Affiliation: Univ Texas Austin, Texas Adv Comp Ctr TACC, Austin, TX 78712, USA

ISBN: (print) 9789881925169

We propose a new theoretical model for parallelism. The model is explicitly based on data and work distributions, a feature missing from other theoretical models. The major theoretical result is that data movement can then be derived by formal reasoning. While the model has an immediate interpretation in distributed memory parallelism, we show that it can also accommodate shared memory and hybrid architectures such as clusters with accelerators. The model gives rise in a natural way to objects appearing in widely different parallel programming systems such as the PETSc library or the Quark task scheduler. Thus we argue that the model offers the prospect of a high-productivity programming system that can be compiled down to proven high-performance environments.

Keywords: parallel programming DAG model distributed memory

The Spanish Parallel Programming Contests and its use as an educational resource

26th IEEE International Parallel and Distributed Processing Symposium (IPDPS) / Workshop on High Performance Data Intensive Computing

Authors: Almeida, Francisco; Cuenca, Javier; Fernandez-Pascual, Ricardo; Gimenez, Domingo; Palomino Benito, Juan Alejandro. Affiliations: Univ La Laguna, Dept Estadist Invest Operat & Computac, E-38207 San Cristobal la Laguna, Spain; Univ Murcia, Dept Ingenieria & Tecnol Comp, E-30001 Murcia, Spain; Univ Murcia, Dept Informat & Sistemas, E-30001 Murcia, Spain; Fundac Parque Cientifico, Ctr Supercomputac, Murcia, Spain

ISBN: (print) 9780769546766

The first Spanish Parallel Programming Contest was organized in September 2011 within the Spanish Jornadas de Paralelismo. The aim of the contest is to disseminate parallelism among Computer Science students. The website and the material generated can be used for educational purposes. This paper comments on the organization of the contest and summarizes some training activities in which the material of the contest is being or can be used.

Keywords: programming contests Online judge parallel programming parallel algorithms

Improved Viewshed Analysis Algorithms For Avionics Applications

Author: Mustafa Ozkidik. Affiliation: Middle East Technical University

Degree level: Master's

Viewshed analysis is a common GIS capability used in various domains with various requirements. In avionics, viewshed analysis is a part of accuracy-critical applications, and the real-time operating systems in embedded devices use preemptive scheduling algorithms to satisfy performance requirements. Therefore, to benefit effectively from viewshed analysis, a method should be both fast and accurate. Although the R3 algorithm is accepted as an accuracy benchmark, the R2 algorithm, with lower accuracy, is preferred in many cases due to its better execution time. This thesis prioritizes accuracy and presents an alternative approach to improve the execution time of the R3 algorithm. Considering different execution environments, improved versions of R3 are implemented for CPU and GPU. The experimental results show that the CPU implementations of the improved algorithms achieve 1.23x to 13.51x speedup, depending on the observer altitude, range, and topology of the terrain. In the GPU implementation experiments, up to 2.27x speedup is recorded. In addition to the execution time improvements, the analysis results show that the proposed algorithms are capable of providing accuracy as high as that of R3.

Keywords: Geographic Information Systems Avionic Applications Viewshed Analysis Line Of Sight Analysis parallel programming
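Viewshed analysis is built on repeated line-of-sight tests between an observer and target cells. The sketch below shows that basic test on a one-dimensional terrain profile. It is a simplified illustration under stated assumptions (equally spaced samples, observer at index 0, no interpolation, no earth curvature) and is not the thesis's R3 or improved-R3 implementation; the function name `is_visible` and the sample elevations are hypothetical.

```python
def is_visible(profile, observer_height, target_index):
    """Line-of-sight test along a 1-D terrain profile.

    profile: terrain elevations at equally spaced samples (assumption).
    The observer stands at index 0, observer_height above the terrain.
    Returns True if no intermediate sample rises above the sight line.
    """
    if target_index < 1:
        raise ValueError("target_index must be at least 1")
    eye = profile[0] + observer_height
    # Slope of the sight line from the eye to the target sample.
    target_slope = (profile[target_index] - eye) / target_index
    for i in range(1, target_index):
        # An intermediate sample with a steeper slope blocks the view.
        if (profile[i] - eye) / i > target_slope:
            return False
    return True


# Hypothetical elevations (metres) along one ray from the observer.
terrain = [10.0, 12.0, 20.0, 15.0, 30.0]
visibility = [is_visible(terrain, 2.0, k) for k in range(1, len(terrain))]
```

With these sample values, the peak at index 2 hides the lower sample at index 3, while the higher sample at index 4 remains visible. A full viewshed repeats this kind of test for every target cell of a 2-D raster, which is why the per-cell tests are a natural candidate for CPU and GPU parallelization.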

Introductory Concurrency and Parallelism Education

4th ACM Conference on Global Computing Education (CompEd)

Authors: Giacaman, Nasser; Adams, Joel. Affiliations: Univ Auckland, Auckland, New Zealand; Calvin Coll, Grand Rapids, MI 49506, USA

ISBN: (print) 9781450362597

Undergraduate or novice programmers are often challenged by higher-level and abstract concepts in programming courses. Compared to constructing a sequential program, parallel and concurrent programming requires a different and more complex mental model of control flow. Now that multi-core processors have become the norm for computers and mobile devices, the responsibility of developing software to take advantage of this extra computing power rests with the modern software developer. In recognition of this new era, curricular guidelines have been proposed specifically targeting the complex world of parallel and distributed computing. CS2013 also recognizes this with a dedicated Parallel and Distributed Computing knowledge area with core hours, as well as dispersing parallelism concepts across other fundamental knowledge areas. Parallel programming was once considered an advanced area of computing, and only taught to students by experts in graduate-level elective courses. However, it is now expected that all undergraduate computing students will become familiar with the fundamentals of parallelism. Concurrency and parallelism concepts are undoubtedly difficult for students to learn. This can even be daunting for teachers who are inexperienced with all elements of the underlying parallelism concepts, but even more daunting is devising pedagogically sound materials that will allow undergraduate students to grasp the concepts. This is especially challenging for early undergraduate courses where students are often novice programmers, barely confident in sequential programming let alone parallel programming. This session will provide an opportunity for instructors to discuss and share ideas and experiences in this area, as well as explore potential collaboration opportunities.

Keywords: Concurrency parallel programming Multi-Core Multi-Threading

A parallel solution to finding nodal neighbors in generic meshes

METHODSX, 2020, Vol. 7, p. 100954

Authors: Qi, Pian; Mei, Gang; Xu, Nengxiong; Tian, Hong. Affiliations: China Univ Geosci Beijing, Sch Engn & Technol, Beijing 100083, Peoples R China; China Univ Geosci Wuhan, Fac Engn, Wuhan 430074, Peoples R China

In this paper we present a parallel solution to finding the one-ring neighboring nodes and elements for each vertex in generic meshes. Finding nodal neighbors is computationally straightforward but expensive for large meshes. To improve the efficiency, parallelism is adopted by utilizing the modern Graphics Processing Unit (GPU). The presented parallel solution depends heavily on parallel sorting, scan, and reduction. Our parallel solution is efficient and easy to implement, but requires the allocation of large device memory. It achieves speedups of approximately 55 and 90 over the serial solution when finding the neighboring nodes and elements, respectively. It is easy to implement because it does not need to perform mesh coloring before finding neighbors. There are no complex data structures; only integer arrays are needed, which makes our parallel solution very effective. (C) 2020 The Author(s). Published by Elsevier B.V.

Keywords: Computational geometry Mesh topology Neighbors finding parallel programming GPU
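The abstract above says the parallel solution relies on sorting, scan, and reduction over plain integer arrays. The sketch below shows how those three primitives can produce one-ring neighboring nodes, written serially on the CPU for readability. It is an illustration of the pattern under stated assumptions (triangular elements, zero-based node indices), not the authors' GPU implementation; the function name `nodal_neighbors` and the sample mesh are hypothetical.

```python
from itertools import accumulate


def nodal_neighbors(num_nodes, triangles):
    """Return (offsets, neighbors) in a compressed, integer-array layout.

    The neighbors of node v are neighbors[offsets[v]:offsets[v + 1]].
    """
    # 1. Emit a (node, neighbor) pair for every directed edge of every element.
    pairs = []
    for a, b, c in triangles:
        pairs += [(a, b), (a, c), (b, a), (b, c), (c, a), (c, b)]

    # 2. Sort the pairs, then drop adjacent duplicates (shared edges).
    pairs.sort()
    unique = [p for i, p in enumerate(pairs) if i == 0 or p != pairs[i - 1]]

    # 3. Reduce by key: count the neighbors of each node.
    counts = [0] * num_nodes
    for node, _ in unique:
        counts[node] += 1

    # 4. Prefix sum (scan) of the counts gives each node's start offset.
    offsets = [0] + list(accumulate(counts))
    neighbors = [nbr for _, nbr in unique]
    return offsets, neighbors


# Hypothetical mesh: two triangles sharing the edge between nodes 1 and 2.
mesh = [(0, 1, 2), (1, 3, 2)]
offsets, neighbors = nodal_neighbors(4, mesh)
node1_ring = neighbors[offsets[1]:offsets[2]]
```

Each of the four steps maps onto a data-parallel primitive (pair generation, sort, reduce by key, scan), which is why no mesh coloring or pointer-based data structure is needed in this formulation. The temporary pair array is also what makes the memory footprint large.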
