Search Results - Inner Mongolia University Library

Innovations in mathematical modeling, AI, and optimization techniques

JOURNAL OF SUPERCOMPUTING, 2025, vol. 81, no. 1, pp. 1-4

Authors: Ohue, Masahito; Yasuo, Nobuaki; Takata, Masami. Affiliations: Inst Sci Tokyo, Sch Comp, Dept Comp Sci, Yokohama, Kanagawa 2268501, Japan; Inst Sci Tokyo, Acad Convergence Mat & Informat (TAC MI), Tokyo 1528550, Japan; Nara Womens Univ, Res Grp Informat & Commun Technol Life, Nara 6308506, Japan

This special issue is dedicated to examining the rapidly evolving fields of artificial intelligence, mathematical modeling, and optimization, with particular emphasis on their growing importance in computational science. It features the most notable papers from the "Mathematical Modeling and Problem Solving" workshop at PDPTA'24, the 30th International Conference on Parallel and Distributed Processing Techniques and Applications. The issue showcases pioneering research in areas such as natural language processing, system optimization, and high-performance computing. The nine selected studies include novel AI-driven methods for chemical compound generation, historical text recognition, and music recommendation, along with advancements in hardware optimization through reconfigurable accelerators and vector register sharing. Additionally, evolutionary and hyper-heuristic algorithms are explored for sophisticated problem-solving in engineering design, and innovative techniques are introduced for high-speed numerical methods in large-scale systems. Collectively, these contributions demonstrate the significance of AI, supercomputing, and advanced algorithms in driving the next generation of scientific discovery.

Keywords: Mathematical modeling; Artificial intelligence; Parallel and distributed computing; Reconfigurable computing; Drug discovery

QR-PULP: Streamlining QR Decomposition for RISC-V Parallel Ultra-Low-Power Platforms

21st ACM International Conference on Computing Frontiers (CF)

Authors: Kiamarzi, Amirhossein; Rossi, Davide; Tagliavini, Giuseppe. Affiliation: Univ Bologna, Bologna, Italy

ISBN: (print) 9798400705977

QR decomposition is a numerical method used in many applications from the High-Performance Computing (HPC) domain to embedded systems. This broad spectrum of applications has drawn academic and commercial attention to developing many software libraries and domain-specific hardware solutions. In the Internet of Things (IoT) domain, multicore Parallel Ultra-Low-Power (PULP) architectures are emerging as energy-efficient alternatives, outperforming conventional single-core devices by coupling parallel processing with near-threshold computing. To the best of the authors' knowledge, our study introduces the first parallelized and optimized implementation of three distinct QR decomposition methods (Givens rotations, Gram-Schmidt process, and Householder transformation) on GAP9, a commercial embodiment of the PULP architecture. Parallel execution on the 8-core cluster leads to a reduction in the total number of cycles by 241% for Givens rotations, 470% for Gram-Schmidt, and 567% for Householder, compared to the GAP9 single-core scenario, while consuming only 0.013 mJ, 0.012 mJ, and 0.216 mJ, respectively. Compared to traditional single-core ARM-based architectures, we achieve 8x, 24x, and 30x better performance and 36x, 35x, and 30x better energy efficiency, paving the way for broad adoption of complex linear algebra tasks in the IoT domain.
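For readers unfamiliar with the methods named in this abstract, the following is a minimal sequential sketch of one of them, the Gram-Schmidt process (in its modified form). It is not the authors' parallel GAP9 implementation; the function names `qr_gram_schmidt` and `matmul` are hypothetical, and the input is assumed to have full column rank.

```python
import math


def qr_gram_schmidt(a):
    """Factor an m x n matrix (list of rows) as A = Q R by modified Gram-Schmidt.

    Assumes full column rank, so no diagonal entry of R is zero.
    Returns Q (m x n, orthonormal columns) and R (n x n, upper triangular).
    """
    m, n = len(a), len(a[0])
    cols = [[a[i][j] for i in range(m)] for j in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        # Remove the components along the already computed orthonormal columns.
        for k in range(j):
            r[k][j] = sum(q_cols[k][i] * v[i] for i in range(m))
            v = [v[i] - r[k][j] * q_cols[k][i] for i in range(m)]
        # Normalise what is left to obtain the next column of Q.
        r[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / r[j][j] for x in v])
    q = [[q_cols[j][i] for j in range(n)] for i in range(m)]
    return q, r


def matmul(x, y):
    """Plain nested-list matrix product, used only to check the factorisation."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


if __name__ == "__main__":
    a = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    q, r = qr_gram_schmidt(a)
    print(matmul(q, r))  # should reproduce a up to rounding error
```

The inner loop over `k` is the part a parallel implementation would typically distribute across cores; how the paper partitions the work is not stated in the abstract.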

Keywords: QR decomposition; parallel algorithms; ultra-low-power computing

Efficient OAM-Based Programmable Hardware Accelerator Architecture

14th International Conference on Advanced Computer Information Technologies, ACIT 2024

Authors: Melnyk, Viktor; Melnyk, Anatoliy; Rahma, Mohammad. Affiliations: Lviv Polytechnic National University, Department of Information Technologies Security, Lviv, Ukraine; John Paul II Catholic University of Lublin, Faculty of Natural and Technical Sciences, Lublin, Poland; IT Step University, Lviv, Ukraine; Al-Mustaqbal University, Computer Engineering Department, Babylon, Iraq

ISBN: (print) 9798350350036

Today, there are a large number of tasks that involve processing data in the form of arrays. These include, in particular, algorithms for fast orthogonal transformations and cryptographic protection of information, whose structure remains consistent regardless of the data. For such algorithms, hardware accelerators are usually used to improve the system performance characteristics. Their architecture traditionally involves an addressable memory, which limits their productivity increase. This paper proposes an efficient programmable hardware accelerator architecture based on conflict-free parallel ordered access memory (OAM) and the corresponding model of computing. The approach to OAM-based programmable parallel hardware accelerator design is described and its structure is proposed. The benefits of the proposed programmable hardware accelerator architecture compared to existing conventional architectures are emphasized. © 2024 IEEE.

Keywords: Parallel architectures

2024 9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

ISBN: (print) 9798350376548

The proceedings contain 359 papers. The topics discussed include: parallel channel separate attention network for concealed object detection in millimeter-wave images; YOLOv8 detection head improvements for FPGA deployments; improving the performance of OPTICS on short text clustering by isokernel and UMAP; robust multi-object tracking with CVAE motion model for maritime vessels; molecular property prediction based on graph contrastive learning; a novel design and simulation of band-pass filter for the 5G System based on MEMS; transcending information cocoons: integrating TransR embeddings with tensor decomposition in recommender systems; topic analysis of Chinese documents based on key phrases and latent Dirichlet allocation model; and real-valued beamspace direct position determination for multi-station systems.

Multi-Objective Optimization Design of Cable Driven Parallel Robot Based on Aircraft Spraying

9th International Conference on Automation, Control and Robotics Engineering (CACRE)

Authors: Yang, Zonghui; Liu, Ting; Jiao, Sen; Chang, Xudong. Affiliations: Jilin Inst Chem Technol, Coll Informat & Control Engn, Jilin, Jilin, Peoples R China; Jilin Inst Chem Technol, Coll Aerosp Engn, Jilin, Jilin, Peoples R China

ISBN: (print) 9798350350319; 9798350350302

This article focuses on the cable driven parallel robot (CDPR) for aircraft spraying. Based on a multi-objective optimization model, the performance of the CDPR is optimized using the Prairie Dogs optimization algorithm. First, a static model of an 8-cable, 6-degree-of-freedom CDPR suitable for aircraft spraying is established, and corresponding evaluation indicators are designed for four performance indicators: workspace, average stiffness, stiffness fluctuation, and flexibility. A multi-objective optimization model is then established by processing the performance indicators and solved using the Prairie Dogs optimization algorithm. The final results indicate that the Prairie Dogs optimization algorithm based on multi-objective optimization design has a good effect on the performance optimization of the CDPR, providing an important reference for the design and optimization of aircraft spraying robots. This study has important theoretical and practical significance in the field of aircraft spraying robots, providing new ideas and methods for optimizing robot performance, and has certain guiding significance for engineering practice.

Keywords: Cable driven parallel robot; multi-objective optimization design; average stiffness; workspace; stiffness fluctuation; flexibility; Prairie Dogs optimization algorithm

BachLedger: Orchestrating Parallel Execution with Dynamic Dependency Detection and Seamless Scheduling

30th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2024

Authors: Yang, Yi; Shang, Guangyong; Qi, Guangpeng; Ma, Zhen; Liu, Yaxiong; Tian, Jiazhou; Duan, Aocheng; Zhang, Meng; Li, Jingying; Ding, Xuan. Affiliations: Tsinghua University, School of Software, Beijing, China; Inspur Yunzhou Industrial Internet Co. Ltd, Shandong, Jinan, China; Xidian University, School of Cyber Engineering, Shaanxi, Xi'an, China

ISBN: (print) 9798331515966

Blockchain technology inherently necessitates redundant computation to achieve consensus among untrusted parties because of its fundamental threat model. This requirement, however, compromises system performance and impedes the widespread adoption of blockchain. To leverage existing physical resources, current research on high-performance consortium blockchain algorithms and architectures frequently employs cluster-node architectures to expand the parallel processing capability of traditional single physical nodes. Our investigation reveals a significant trend as the parallel capability of individual nodes improves: the idle time caused by synchronization of all transactions within each block, previously considered negligible, has become increasingly significant. To address this, we present BachLedger, which implements Seamless Scheduling to fully utilize inter-block thread idle time, thereby augmenting system resource utilization and achieving overall performance improvements. Our experimental results demonstrate that our algorithm surpasses current state-of-the-art (SOTA) performance levels in high-performance consortium blockchains and effectively resolves the aforementioned synchronization issue. Furthermore, this scheduling algorithm offers enhanced scalability for BachLedger, positioning it as a promising solution for future blockchain implementations. © 2024 IEEE.

Keywords: Scheduling algorithms

Hardware Implementation of Image Processing Morphological and Convolution Operations as SoC on FPGAs

7th International Conference on Digital Medicine and Image Processing, DMIP 2024

Authors: Moussa, Alfred M.; Groves, Richard; Rafla, Nader. Affiliation: Electrical and Computer Engineering Department, Boise State University, Boise, ID, United States

ISBN: (print) 9798400709586

Efficient image processing architectures are consistently in demand across a multitude of applications, particularly those customized for resource-constrained systems-on-chip (SoC). The increasing need for high-performance image processing in various sectors has driven the development of specialized architectures. However, deploying such architectures on platforms with limited resources, such as SoCs, poses significant challenges. Furthermore, the implementation of complex algorithms to handle large datasets using software solutions often leads to slower response times, prompting exploration into hardware implementations. Field-Programmable Gate Arrays (FPGAs) are becoming popular for hardware implementations because of their attributes: low latency, connectivity, parallel computing capabilities, and flexibility. Consequently, the utilization of FPGA-based implementations has resulted in faster and more efficient performance of unique architectures tailored to specific requirements. This paper presents a novel hardware/software co-design approach to implement erosion, dilation, and neighborhood image processing operations on the FPGA development board, "Zedboard". In this approach, the FPGA is programmed by connecting it to a PC via USB, facilitating the transfer of an image pixel by pixel. The pixels are temporarily stored in on-chip DDR and accessed through DMA (Direct Memory Access) until they are requested by an interrupt signal from the Image Processing IP, at which point they are moved to line buffers for faster processing. Once processed, the image is transmitted back to the PC via UART, facilitating pixel-by-pixel transfer for verification, where it is compared with a reference image generated using Python. This comparison confirms a 99.22% match between the processed image and the reference image, with the discrepancy occurring at the image's edges due to initial padding. Additionally, the time required to process the entire image was measured and displayed
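As an illustration of the erosion and dilation operations mentioned in this abstract, here is a minimal software sketch in the spirit of the Python reference image the authors describe. It is not their code: the 3x3 square structuring element, the padding value of 0, and the names `erode`, `dilate` and `_morph` are all assumptions made for this example.

```python
def _morph(img, reduce_fn, pad):
    """Apply a 3x3 neighbourhood reduction to a 2-D list of pixel values.

    Pixels outside the image are treated as `pad`; this padding choice is an
    assumption and directly affects the results along the image border.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[y + dy][x + dx]
                if 0 <= y + dy < h and 0 <= x + dx < w else pad
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = reduce_fn(window)
    return out


def erode(img):
    """Erosion: each output pixel is the minimum over its 3x3 neighbourhood."""
    return _morph(img, min, 0)


def dilate(img):
    """Dilation: each output pixel is the maximum over its 3x3 neighbourhood."""
    return _morph(img, max, 0)


if __name__ == "__main__":
    square = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)]
              for y in range(5)]
    for row in erode(square):
        print(row)
```

Because border pixels depend on the padding value, two implementations that pad differently will disagree only along the edges, which is consistent with the kind of edge discrepancy the abstract reports.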

Keywords: System-on-chip

Comparative Analysis of YOLO Architectures for Automated Detection of Liver Disease in Histopathological Images

9th International Conference on Biomedical Imaging Signal Processing

Authors: Zou, Junting; Arshad, Mohd Rizal; Wang, Ziyan. Affiliations: Univ Sains Malaysia, George Town, Malaysia; Osaka Metropolitan Univ, Osaka, Japan

ISBN: (print) 9798400717499

Liver disease is one of the major health problems worldwide and usually leads to serious complications if not diagnosed accurately and in time. Effective detection and classification of liver pathology at early stages is crucial, in which histopathologic examination of liver tissue plays a key role. However, manual analysis of histopathological images is easily affected by inter-observer variability. Recent advances in deep learning, on the other hand, have introduced methods to significantly improve the accuracy and efficiency of image-based diagnosis. This study focuses on the application of the You Only Look Once (YOLO) object detection model, specifically YOLOv4, v5, v7, v8, and v9, for automated detection of liver diseases from stained microscopic liver slices. We perform a comprehensive comparative analysis to evaluate the detection accuracy of these models across four common liver conditions: ballooning, fibrosis, inflammation, and steatosis. The results of the study show that the latest versions, in particular YOLOv9, show significant improvements in accuracy and computational efficiency compared to other versions. In this paper, the performance of each model is evaluated in detail, and our results emphasize the potential of the advanced YOLO architecture to enhance medical diagnostics by facilitating faster and more reliable detection of liver disease.

Keywords: Liver disease; Histopathologic; Deep learning; YOLO; Detection

Proceedings of the 9th International Conference on Communication and Electronics Systems, ICCES 2024

9th International Conference on Communication and Electronics Systems, ICCES 2024

ISBN: (print) 9798350377972

The proceedings contain 336 papers. The topics discussed include: unveiling the potential of natural language processing in collaborative robots (Cobots): a comprehensive survey; unleashing the power of machine learning for enhanced capabilities in consumer electronics drones; convolution driven vision transformer for the prediction of mild cognitive impairment to Alzheimer’s disease progression; enhancing the fairness and performance of edge cameras with explainable AI; theoretical analysis of serial/parallel variations of hash-mining for smaller variance of confirmation time; fault detection in 3D-printing with deep learning; development of a battery-less wireless sensor node for sediment disaster monitoring system; and evaluation of ensemble learning models for hardware-trojan identification at gate-level netlists.

SuperCut: Communication-Aware Partitioning for Near-Memory Graph Processing

20th ACM International Conference on Computing Frontiers (CF)

Authors: Zhao, Chenfeng; Chamberlain, Roger D.; Zhang, Xuan. Affiliation: Washington Univ, McKelvey Sch Engn, St Louis, MO 63110, USA

ISBN: (print) 9798400701405

The parallel execution of many graph algorithms is frequently dominated by data communication overheads between compute nodes. This bottleneck becomes even more pronounced in Near-Memory Processing (NMP) architectures with multiple memory cubes, as local memory accesses are less expensive. Existing near-memory architectures typically use graph partitioning methods with a fixed vertex assignment, which limits their potential to improve performance and reduce energy consumption. Here, we argue that an NMP-based graph processing system should also consider the distribution of vertices onto memory cubes. We propose SuperCut, a framework for near-memory architectures to effectively reduce communication overheads while maintaining computational balance. We evaluate SuperCut via architectural simulation with 6 real-world datasets and 4 representative applications. The results show that it provides up to 1.8x total energy reduction and 2.6x speedup relative to current state-of-the-art approaches.
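To make the trade-off in this abstract concrete, the sketch below measures two quantities for a given assignment of vertices to memory cubes: the number of edges that cross cubes (a simple proxy for communication) and the number of vertices per cube (a simple proxy for balance). This is not the SuperCut algorithm, which the abstract does not describe; the functions `cut_edges` and `load_per_cube` and the toy graph are hypothetical.

```python
from collections import Counter


def cut_edges(edges, assignment):
    """Count edges whose endpoints are placed on different memory cubes."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def load_per_cube(assignment):
    """Count how many vertices each memory cube holds."""
    return Counter(assignment.values())


if __name__ == "__main__":
    # Toy graph: a 4-cycle 0-1-2-3-0, placed on two cubes in two different ways.
    edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
    contiguous = {0: 0, 1: 0, 2: 1, 3: 1}   # neighbours kept together
    alternating = {0: 0, 1: 1, 2: 0, 3: 1}  # neighbours split apart
    for name, placement in (("contiguous", contiguous),
                            ("alternating", alternating)):
        print(name, cut_edges(edges, placement), dict(load_per_cube(placement)))
```

Both placements put two vertices on each cube, yet they differ in how many edges cross cubes, which illustrates why the vertex-to-cube distribution matters even when the load is balanced.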

Keywords: near-data processing; 3D-stacked memory; graph processing
