ISBN (Digital): 9798350317152
ISBN (Print): 9798350317169
In key-value storage scenarios where storage space is at a premium, our focus is on a class of solutions that store only the value, which is highly space-efficient. While these solutions have proven their worth in distributed storage, networking, and bioinformatics, they still face two significant issues: first, their space cost could be further reduced; second, they are vulnerable to update failures, which can necessitate a complete table reconstruction. To address these issues, we introduce VisionEmbedder, a compact key-value embedder with constant-time lookup, fast dynamic updates, and a near-zero risk of reconstruction. VisionEmbedder cuts the storage requirement from 2.2L bits to just 1.6L bits per key-value pair with an L-bit value, and it reduces the chance of update failures by a factor of $n$, where $n$ is the number of keys (for instance, 1 million or more). The trade-off is a minor reduction in query throughput at certain data sizes. The improvements offered by VisionEmbedder have been theoretically validated and hold for any dataset. Additionally, we have implemented VisionEmbedder on both FPGA and CPU platforms, and all code is available as open source.
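For readers unfamiliar with value-only key-value structures, the sketch below illustrates the general idea in the family VisionEmbedder belongs to: each key hashes to a few L-bit cells, a lookup XORs those cells, and only the cells (never the keys) are stored. The three-hash layout, the peeling-based build, the 1.6-cells-per-key budget (chosen only to mirror the 1.6L-bits-per-key figure above), and all names are illustrative assumptions, not the paper's construction.

```python
# Minimal sketch of a value-only XOR-table embedder (Bloomier-filter family).
# Illustrative only; NOT VisionEmbedder's actual layout, hashing, or update path.
import random

class XorValueTable:
    def __init__(self, keys, values, value_bits, cells_per_key=1.6, seed=1):
        self.mask = (1 << value_bits) - 1          # values are L-bit integers
        self.m = max(8, int(cells_per_key * len(keys)) + 3)
        self.seed = seed
        self.cells = [0] * self.m                  # only L-bit cells are stored
        self._build(keys, values)

    def _slots(self, key):
        # Three distinct cell indices per key. Python's built-in hash is stable
        # only within one process, which is fine for a sketch.
        rnd = random.Random(hash((key, self.seed)))
        s = set()
        while len(s) < 3:
            s.add(rnd.randrange(self.m))
        return sorted(s)

    def _build(self, keys, values):
        slots = {k: self._slots(k) for k in keys}
        incident = [set() for _ in range(self.m)]
        for k in keys:
            for c in slots[k]:
                incident[c].add(k)
        # Hypergraph peeling: repeatedly remove a key that owns a singleton cell.
        order, stack = [], [c for c in range(self.m) if len(incident[c]) == 1]
        while stack:
            c = stack.pop()
            if len(incident[c]) != 1:
                continue
            (k,) = incident[c]
            order.append((k, c))
            for c2 in slots[k]:
                incident[c2].discard(k)
                if len(incident[c2]) == 1:
                    stack.append(c2)
        if len(order) != len(keys):
            raise RuntimeError("peeling failed; rebuild with a different seed")
        # Assign cells in reverse peel order so each key's three-cell XOR equals its value.
        val = {k: v & self.mask for k, v in zip(keys, values)}
        for k, c in reversed(order):
            self.cells[c] = val[k]
            for c2 in slots[k]:
                if c2 != c:
                    self.cells[c] ^= self.cells[c2]

    def lookup(self, key):
        # Constant time: read three cells and XOR them. Keys that were never
        # inserted return an arbitrary value, as in any value-only structure.
        a, b, c = self._slots(key)
        return (self.cells[a] ^ self.cells[b] ^ self.cells[c]) & self.mask
```

With three hash positions per key, peeling is known to succeed with high probability once the table has roughly 1.23 or more cells per key, which is why the 1.6-cells-per-key budget above typically builds on the first attempt.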
In a Loss of Coolant Accident (LOCA), reactor core temperatures can rise rapidly, leading to potential fuel damage and radioactive material release. This research presents a groundbreaking method that combines the pow...
ISBN (Print): 9798350342543
Understanding program behavior is crucial in computer architecture research, but the growing size of benchmarks makes analyzing and simulating entire programs increasingly challenging. In practice, researchers often select representative program intervals for analysis and testing; each interval is a contiguous section of a program's execution. SimPoint is a well-known method for selecting representative intervals using hardware-independent information. However, when focusing on a specific microarchitecture study, it is desirable to select intervals that are more relevant to that study. For instance, intervals with more branch mispredictions are more appropriate for branch prediction studies. We refer to these intervals as "tailored intervals" for branch prediction studies.

This paper presents Multi-level Behavior Analysis guided Program Interval Selection (MBAPIS) for selecting tailored intervals. For a given microarchitecture study, the first level of MBAPIS uses hardware performance counters to prioritize intervals that exhibit clearer microarchitectural characteristics relevant to that study. The second level analyzes processor performance bottlenecks to further select the intervals where the concerned microarchitecture design more strongly impacts performance. Finally, MBAPIS performs clustering analysis on the basic block information of the intervals selected by the first two levels and picks representative intervals among them while preserving diverse software behavior. Additionally, we present a general and extensible interval-replaying design to accurately re-execute the selected intervals.

The SPEC CPU2006 and CPU2017 benchmarks are used for evaluation. The results demonstrate that MBAPIS can select representative and tailored intervals for two typical microarchitecture studies and deliver accurate estimates of the concerned hardware events for all tailored intervals in each benchmark, with an average error rate of le
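As a rough illustration of the flow this abstract describes (not the authors' code), the sketch below filters intervals with a study-specific counter metric, using an assumed branch-MPKI threshold as a stand-in for level 1, skips the level-2 bottleneck analysis, and then runs a plain k-means over basic-block vectors to keep one representative interval per cluster, SimPoint-style. The function name, threshold, and use of numpy are all assumptions.

```python
import numpy as np

def select_tailored_intervals(bb_vectors, mpki, mpki_threshold=5.0, k=8, iters=50, seed=0):
    """bb_vectors: (num_intervals, num_basic_blocks) basic-block execution counts.
    mpki: per-interval branch mispredictions per kilo-instruction from counters
    (an assumed stand-in for the level-1 metric)."""
    candidates = np.where(np.asarray(mpki) >= mpki_threshold)[0]   # level 1: counter filter
    if len(candidates) == 0:
        return []
    X = np.asarray(bb_vectors, dtype=float)[candidates]
    X /= np.maximum(X.sum(axis=1, keepdims=True), 1e-12)           # normalize BBVs
    k = min(k, len(candidates))
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):                                         # plain k-means clustering
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    reps = []
    for j in range(k):                                             # representative = closest to centroid
        members = np.where(labels == j)[0]
        if len(members):
            best = members[((X[members] - centers[j]) ** 2).sum(axis=-1).argmin()]
            reps.append(int(candidates[best]))
    return sorted(set(reps))
```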
ISBN (Digital): 9798331509712
ISBN (Print): 9798331509729
General Matrix Multiplication (GEMM) is a critical computational operation in scientific computing and machine learning. While traditional GEMM performs well on large matrices, it is inefficient in terms of data transfer and computation for small matrices. Many high-performance computing (HPC) tasks can be decomposed into large batches of small matrix multiplications, and multi-core Digital Signal Processors (DSPs) are commonly used to accelerate such workloads. We present a design for batched fusion small matrix multiplication (BFMM) tailored to multi-core DSP architectures. To address the inefficiency and redundancy in the storage and computational operations associated with batched small matrix multiplication, we design a matrix fusion concatenation strategy, an access coordination mechanism, and a fragment aggregation mechanism. BFMM supports an efficient K-dimension multi-core parallelization strategy, and its parameter constraint model makes it highly portable. BFMM also includes a performance evaluation model that facilitates assessment and verification. Experimental results demonstrate that BFMM outperforms both traditional GEMM (TGEMM) on multi-core DSP and traditional GEMM with concatenated data access (TGEMM Op). For large batches of small matrices, our design achieves 1.21x to 18x higher performance than TGEMM Op on a single-core DSP, and 1.14x to 18.1x on a multi-core DSP.
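To make the fusion idea concrete, here is a toy sketch (numpy on a CPU, not the DSP implementation): many tiny GEMMs that share one right-hand matrix are concatenated along the M dimension so that a single larger GEMM replaces the whole batch. The shared-B restriction, the function names, and the sizes are illustrative assumptions; BFMM's access coordination, fragment aggregation, and K-dimension multi-core split are not modeled here.

```python
import numpy as np

def batched_gemm_naive(As, B):
    # One tiny GEMM per matrix: poor data reuse, many small kernel invocations.
    return [A @ B for A in As]

def batched_gemm_fused(As, B):
    # Fuse the batch by concatenating the A_i along the M dimension, run a
    # single larger GEMM, then split the result back into per-matrix tiles.
    rows = [A.shape[0] for A in As]
    fused = np.concatenate(As, axis=0)               # shape (sum(m_i), K)
    C = fused @ B                                    # one large GEMM
    return np.split(C, np.cumsum(rows)[:-1], axis=0)

# Example: 1024 matrices of size 8x16 multiplied by a shared 16x8 matrix.
As = [np.random.rand(8, 16) for _ in range(1024)]
B = np.random.rand(16, 8)
assert all(np.allclose(x, y) for x, y in zip(batched_gemm_naive(As, B),
                                             batched_gemm_fused(As, B)))
```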
Traditional unlearnable strategies have been proposed to prevent unauthorized users from training on 2D image data. With more 3D point cloud data containing sensitive information, unauthorized usage of this new ...
Basic recursive summation and the common dot product algorithm have backward error bounds that grow linearly with the vector dimension. Blanchard [1] proposed a class of fast and accurate summation and dot product algor...
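For reference, the linear-in-$n$ growth this preview alludes to can be written out from the classical rounding-error analysis (standard textbook material, not taken from the paper itself), with $u$ the unit roundoff: recursive summation computes $\hat{s} = \sum_{i=1}^{n} x_i (1+\theta_i)$ with $|\theta_i| \le \gamma_{n-1} := \frac{(n-1)u}{1-(n-1)u}$, hence $|\hat{s} - \sum_{i=1}^{n} x_i| \le \gamma_{n-1} \sum_{i=1}^{n} |x_i|$. The common dot product algorithm satisfies the analogous bound with $\gamma_n$ in place of $\gamma_{n-1}$; in both cases the error constant grows linearly with the dimension $n$.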
ISBN (Print): 9781665443326
With the increasing adoption of graph neural networks (GNNs) in the graph-based deep learning community, various graph programming frameworks and models have been developed to improve the productivity of GNNs. Current GNN frameworks rely on the GPU as an essential tool to accelerate GNN training. However, it is still challenging to train GNNs on large graphs with limited GPU memory. Unlike traditional neural networks, generating mini-batch data by sampling in GNNs requires complicated tasks such as traversing the graph to select neighboring nodes and gathering their features. This process takes up most of the training time, and we find that the main bottleneck comes from transferring node features from the CPU to the GPU over limited bandwidth. In this paper, we propose Reusing Batch Data, a method that addresses this data transmission problem by exploiting the similarity between adjacent mini-batches to reduce repeated data transfers from CPU to GPU. Furthermore, to reduce the overhead introduced by this method, we design a fast GPU-based algorithm to detect repeated node data, achieving shorter additional computation time. Evaluations on three representative GNN models show that our method reduces transmission time by up to 60% and speeds up end-to-end GNN training by up to 1.79× over state-of-the-art baselines. In addition, Reusing Batch Data effectively saves GPU memory footprint by about 19% to 40% while still reducing training time compared to a static cache strategy.
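A minimal sketch of the reuse idea follows (plain numpy, with the CPU-to-GPU transfer only simulated; the paper's GPU-side duplicate-detection algorithm and cache policy are not reproduced, and all names here are assumptions): features of nodes that also appeared in the previous mini-batch are served from that batch's device-resident tensor, and only the remaining nodes' features are copied from host memory.

```python
import numpy as np

class BatchFeatureReuser:
    def __init__(self, cpu_features):
        self.cpu_features = cpu_features           # (num_nodes, feat_dim) host-side feature table
        self.prev_ids = np.empty(0, dtype=np.int64)
        self.prev_feats = None                     # "device-resident" features of previous batch

    def gather(self, batch_ids):
        batch_ids = np.asarray(batch_ids, dtype=np.int64)
        # Nodes already present in the previous mini-batch can be reused in place.
        reused_mask = np.isin(batch_ids, self.prev_ids)
        feats = np.empty((len(batch_ids), self.cpu_features.shape[1]),
                         dtype=self.cpu_features.dtype)
        if reused_mask.any():
            # Map reused node ids to their rows in the cached tensor.
            pos = {int(i): r for r, i in enumerate(self.prev_ids)}
            rows = [pos[int(i)] for i in batch_ids[reused_mask]]
            feats[reused_mask] = self.prev_feats[rows]
        # Only the missing nodes cross the (simulated) CPU-to-GPU link.
        feats[~reused_mask] = self.cpu_features[batch_ids[~reused_mask]]
        self.prev_ids, self.prev_feats = batch_ids, feats
        return feats, int(reused_mask.sum())       # features + count of reused rows
```

The larger the overlap between consecutive sampled mini-batches, the more rows are served from the cached tensor and the less data crosses the host-to-device link, which is the effect the transmission-time reduction above relies on.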
In this paper, we present OpenMedIA, an open-source toolbox library containing a rich set of deep learning methods for medical image analysis under heterogeneous Artificial Intelligence (AI) computing platforms. Vario...
Segment Anything Model (SAM) has recently gained much attention for its outstanding generalization to unseen data and tasks. Despite its promising prospect, the vulnerabilities of SAM, especially to universal adversar...