Search results - Inner Mongolia University Library

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Yan Jia, Yuqing Cheng, Kele Xu, Yong Dou, Peng Qiao, Zhouyu He. Affiliations: National Key Laboratory of Parallel and Distributed Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha, China; College of Systems Engineering, National University of Defense Technology, Changsha, China

ISBN (digital): 9798350368741

ISBN (print): 9798350368758

B-mode ultrasound tongue imaging is a non-invasive and real-time method for visualizing vocal tract deformation. However, accurately extracting the tongue’s surface contour remains a significant challenge due to the low signal-to-noise ratio (SNR) and prevalent speckle noise in ultrasound images. Traditional supervised learning models often require large labeled datasets, which are labor-intensive to produce and susceptible to noise interference. To address these limitations, we present a novel Counterfactual Ultrasound Anti-Interference Self-Supervised Network (CUAI-SSN), which integrates self-supervised learning (SSL) with counterfactual data augmentation and progressively disentangles confounding factors, so that the model generalizes well across varied ultrasound conditions. Our approach leverages causal reasoning to decouple noise from relevant features, enabling the model to learn robust representations that focus on essential tongue structures. By generating counterfactual image-label pairs, our method introduces alternative, noise-independent scenarios that enhance model training. Furthermore, we introduce attention mechanisms to enhance the network’s ability to capture fine-grained details even in noisy conditions. Extensive experiments on real ultrasound tongue images demonstrate that CUAI-SSN outperforms existing methods, setting a new benchmark for automated contour extraction in ultrasound tongue imaging. Our code is publicly available at https://***/inexhaustible419/CounterfactualultrasoundAI.

Keywords: Training; Ultrasonic imaging; Tongue; Self-supervised learning; Data augmentation; Data models; Cognition; Data mining; Noise measurement; Signal to noise ratio


A Unified Co-Processor Architecture for Matrix Decomposition

Journal of Computer Science & Technology, 2010, Vol. 25, No. 4, pp. 874-885

Authors: Yong Dou (窦勇), Jie Zhou (周杰), Guiming Wu (邬贵明), Jingfei Jiang (姜晶菲), Yuanwu Lei (雷元武), Shice Ni (倪时策). Affiliation: National Laboratory for Parallel & Distributed Processing, National University of Defense Technology

QR and LU decompositions are the most important matrix decomposition algorithms. Many studies accelerate these algorithms on FPGAs or ASICs on a case-by-case basis. In this paper, we propose a unified framework for matrix decomposition algorithms, combining three QR decomposition algorithms and the LU algorithm with pivoting into a unified linear array structure. The QR and LU decomposition algorithms exhibit the same two-level loop structure and the same data dependency. Exploiting these similarities in loop structure and data dependency, we derive a unified fine-grained algorithm for all four matrix decomposition algorithms. Furthermore, we present a unified co-processor structure with a scalable linear array of processing elements (PEs), in which the four types of PEs share the same memory-channel structure and PE connections and differ only in the internal structure of the data path. Our unified co-processor, which uses IEEE 32-bit floating-point precision, is implemented and mapped onto a Xilinx Virtex5 FPGA chip. Experimental results show that our co-processors achieve speedups of 2.3x to 14.9x compared to a Pentium Dual CPU with double SSE threads.

Keywords: co-processor; matrix decomposition; fine-grained parallel; FPGA
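
For readers unfamiliar with the LU algorithm with pivoting named in the abstract above, the following is a minimal software sketch of LU decomposition with partial (row) pivoting. It is a textbook formulation, not the paper's unified fine-grained algorithm or its FPGA design; the function name `lu_partial_pivot` and the list-of-lists matrix layout are assumptions made for illustration only.

```python
def lu_partial_pivot(a):
    """Factor a square matrix so that the rows of `a` taken in order `perm` equal L*U.

    Illustrative helper (hypothetical name); plain Python lists, no libraries.
    """
    n = len(a)
    u = [list(map(float, row)) for row in a]   # working copy, becomes U
    l = [[0.0] * n for _ in range(n)]          # unit lower-triangular factor
    perm = list(range(n))                      # perm[i] = original index of row i
    for k in range(n):
        # Pivot search: largest magnitude in column k on or below the diagonal.
        p = max(range(k, n), key=lambda i: abs(u[i][k]))
        if u[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            # Swap rows in U, in the already computed part of L, and in perm.
            u[k], u[p] = u[p], u[k]
            l[k], l[p] = l[p], l[k]
            perm[k], perm[p] = perm[p], perm[k]
        l[k][k] = 1.0
        # Eliminate column k from every row below the pivot row.
        for i in range(k + 1, n):
            factor = u[i][k] / u[k][k]
            l[i][k] = factor
            for j in range(k, n):
                u[i][j] -= factor * u[k][j]
    return perm, l, u


if __name__ == "__main__":
    perm, l, u = lu_partial_pivot([[2, 1, 1], [4, 3, 3], [8, 7, 9]])
    print("row order:", perm)
    print("L:", l)
    print("U:", u)
```

The outer loop over pivot columns and the inner row-update loop are the parts a hardware design would typically distribute across processing elements; how the paper actually maps them is not stated in this record.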


Flip-flops soft error rate evaluation approach considering internal single-event transient

Science China (Information Sciences), 2015, Vol. 58, No. 6, pp. 159-170

Authors: SONG RuiQiang, CHEN ShuMing, HE YiBai, DU YanKang. Affiliations: College of Computer Science, National University of Defense Technology; Science and Technology on Parallel and Distributed Processing Laboratory

Upsets in flip-flops induced by internal single-event transients (SETs) are becoming significant as operating frequencies increase. However, the conventional soft error rate (SER) evaluation approach can only approximately predict upsets caused by internal SETs. In this paper, we propose an improved SER evaluation approach based on Monte Carlo simulation. A novel SET-based upset model is implemented in the proposed evaluation approach to accurately predict upsets caused by internal SETs. A test chip was fabricated in a commercial 65 nm bulk process to validate the accuracy of the improved SER evaluation approach. The predicted single-event upset cross-sections are consistent with the experimental data.

Keywords: soft error rate; Monte Carlo; internal SET; single-event upset; flip-flops


Large-scale graph processing systems: a survey

Frontiers of Information Technology & Electronic Engineering, 2020, Vol. 21, No. 3, pp. 384-404

Authors: Ning LIU, Dong-sheng LI, Yi-ming ZHANG, Xiong-lve LI. Affiliation: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410000, China

The graph is an important data structure that describes the relationships between entries. Many real-world application domains depend heavily on graph data. However, graph applications differ vastly from traditional applications. General-purpose platforms are inefficient for graph applications, which has motivated research on dedicated graph processing platforms. In this survey, we systematically categorize graph workloads and applications, and provide a detailed review of existing graph processing platforms by dividing them into general-purpose and specialized systems. We thoroughly analyze the implementation technologies, including programming models, partitioning strategies, communication models, execution models, and fault tolerance strategies. Finally, we analyze recent advances and present four open problems for future research.

Keywords: Graph workloads; Graph applications; Graph processing systems


Representation learning on textual network with personalized PageRank

Science China (Information Sciences), 2021, Vol. 64, No. 11, pp. 95-104

Authors: Teng LI, Yong DOU. Affiliation: National Laboratory for Parallel and Distributed Processing, National University of Defense Technology

Representation learning on textual networks, or textual network embedding, leverages the rich textual information associated with the network structure to learn low-dimensional embeddings of vertices, and has been useful in a variety of tasks. However, most approaches learn textual network embeddings using only direct neighbors. In this paper, we employ a powerful and spatially localized operation, personalized PageRank (PPR), to remove the restriction of using only direct connections. We also analyze the relationship between PPR and spectral-domain theory, which provides insight into the empirical performance boost. Experiments show that the proposed method substantially improves link prediction compared with existing methods, achieving a new state of the art on several real-world benchmark datasets.

Keywords: representation learning; network embedding; PageRank; textual network; personalized PageRank
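
To make the personalized PageRank (PPR) operation in the abstract above concrete, here is a minimal power-iteration sketch on an adjacency-list graph. It only illustrates how PPR spreads score beyond direct neighbors; it is not the paper's embedding method. The function name `personalized_pagerank`, the restart probability `alpha=0.15`, and the choice to send the mass of dangling vertices back to the seed are all assumptions made for this example.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Approximate PPR scores for one seed vertex by power iteration.

    adj   -- adjacency list: adj[u] is the list of out-neighbours of u
    seed  -- vertex the random walk restarts from
    alpha -- restart probability (assumed value, not taken from the paper)
    """
    n = len(adj)
    scores = [0.0] * n
    scores[seed] = 1.0
    for _ in range(iters):
        nxt = [0.0] * n
        for u, nbrs in enumerate(adj):
            if not nbrs:
                # Dangling vertex: return its walking mass to the seed.
                nxt[seed] += (1 - alpha) * scores[u]
                continue
            share = (1 - alpha) * scores[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        # With probability alpha the walk restarts at the seed.
        nxt[seed] += alpha
        scores = nxt
    return scores


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3: vertex 3 is not a direct neighbour of 0.
    path = [[1], [0, 2], [1, 3], [2]]
    print(personalized_pagerank(path, seed=0))
```

On the path graph in the usage example, vertices two and three hops from the seed still receive a nonzero score, which is the sense in which PPR goes beyond the direct connection relationship.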


Demand-Driven Memory Leak Detection Based on Flow- and Context-Sensitive Pointer Analysis

Journal of Computer Science & Technology, 2009, Vol. 24, No. 2, pp. 347-356

Authors: Ji Wang (王戟), Xiaodong Ma (马晓东), Wei Dong (董威), Houfeng Xu (徐厚峰), Wanwei Liu (刘万伟). Affiliation: National Laboratory for Parallel and Distributed Processing, National University of Defense Technology

We present a demand-driven memory leak detection algorithm based on flow- and context-sensitive pointer analysis. The detection algorithm first assumes the presence of a memory leak at some program point and then runs a backward analysis to see whether this assumption can be disproved. Our algorithm computes the memory abstraction of programs based on the points-to graph resulting from flow- and context-sensitive pointer analysis. We have implemented the algorithm in the SUIF2 compiler infrastructure and used the implementation to analyze a set of C benchmark programs. The experimental results show that the approach achieves better precision with satisfactory scalability, as expected.

Keywords: flow-sensitive; memory leak detection; demand-driven; static analysis


Scalability of 3D deterministic particle transport on the Intel MIC architecture

Nuclear Science and Techniques, 2015, Vol. 26, No. 5, pp. 88-97

Authors: Qinglin Wang (王庆林), Jie Liu (刘杰), Chunye Gong (龚春叶), Zuocheng Xing (邢座程). Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; Science and Technology on Space Physics Laboratory

The key to large-scale parallel solutions of the deterministic particle transport problem is single-node computation performance. Hence, single-node computation is often parallelized on multi-core or many-core computer architectures. However, the number of on-chip cores grows quickly as feature sizes scale down in semiconductor technology. In this paper, we present a scalability investigation of one-energy-group, time-independent, deterministic discrete ordinates neutron transport in 3D Cartesian geometry (Sweep3D) on Intel's Many Integrated Core (MIC) architecture, which can provide up to 62 cores with four hardware threads per core now and is expected to offer up to 72 in the future. The parallel programming model OpenMP and vector intrinsic functions are used to exploit thread parallelism and vector parallelism, respectively, for the discrete ordinates method. The results on a 57-core MIC coprocessor show that the implementation of Sweep3D on MIC scales well in performance. In addition, the application of the Roofline model to assess the implementation and a performance comparison between MIC and the Tesla K20C Graphics Processing Unit (GPU) are also reported.

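Keywords: computer architecture; scalability; particle transport; 3D geometry; Intel MIC; discrete ordinates method; computational performance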


Deep reinforcement learning: a survey

Frontiers of Information Technology & Electronic Engineering, 2020, Vol. 21, No. 12, pp. 1726-1744

Authors: Hao-nan WANG, Ning LIU, Yi-yun ZHANG, Da-wei FENG, Feng HUANG, Dong-sheng LI, Yi-ming ZHANG. Affiliation: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410000, China

Deep reinforcement learning (RL) has become one of the most popular topics in artificial intelligence *** has been widely used in various fields, such as end-to-end control, robotic control, recommendation systems, and natural language dialogue *** this survey, we systematically categorize the deep RL algorithms and applications, and provide a detailed review over existing deep RL algorithms by dividing them into model-based methods, model-free methods, and advanced RL *** thoroughly analyze the advances including exploration, inverse RL, and transfer ***, we outline the current representative applications, and analyze four open problems for future research.

Keywords: Reinforcement learning; Deep reinforcement learning; Reinforcement learning applications


Surveying concurrency bug detectors based on types of detected bugs

Science China (Information Sciences), 2017, Vol. 60, No. 3, pp. 5-31

Authors: Zhendong WU, Kai LU, Xiaoping WANG. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; College of Computer, National University of Defense Technology

Concurrency bugs are widespread in concurrent programs and have caused severe failures in the real world. Researchers have made significant progress in detecting concurrency bugs, which improves software reliability. In this paper, we survey the most up-to-date and well-known concurrency bug detectors. We categorize the existing detectors by the types of concurrency bugs they detect, and accordingly analyze data race detectors, atomicity violation detectors, order violation detectors, and deadlock detectors in turn. We also discuss other techniques that are closely related to concurrency bug detection, including schedule bounding techniques, interleaving optimizing techniques, path expanding techniques, and deterministic replay techniques. Additionally, we statistically analyze the reviewed detectors and report some interesting findings: for instance, nearly 86% of previous detectors focus on data races and atomicity violations, and dynamic approaches are popular (74%). We also discuss the limitations of previous detectors, finding that 91% of them suffer from false negatives and 64% suffer from runtime overhead. Based on the reviewed detectors and the statistical analysis, we identify several future research directions, including accuracy, performance, applicability, and integrality.

Keywords: concurrency bug detection; data race; atomicity violation; order violation; deadlock


VirtMan: design and implementation of a fast booting system for homogeneous virtual machines in iVCE

Frontiers of Information Technology & Electronic Engineering, 2016, Vol. 17, No. 2, pp. 110-121

Authors: Zi-yang LI, Yi-ming ZHANG, Dong-sheng LI, Peng-fei ZHANG, Xi-cheng LU. Affiliation: National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology

The Internet-based virtual computing environment (iVCE) has been proposed to combine data centers and other kinds of computing resources on the Internet to provide efficient and economical services. Virtual machines (VMs) have been widely used in iVCE to isolate different users/jobs and ensure trustworthiness, but traditionally VMs take a long time to boot, which cannot meet the requirements of iVCE's large-scale and highly dynamic applications. To address this problem, in this paper we design and implement VirtMan, a fast booting system for a large number of virtual machines in iVCE. VirtMan uses the Linux Small Computer System Interface (SCSI) target to remotely mount the source image in a scalable hierarchy, and leverages the homogeneity of a set of VMs to transfer only the necessary image data at runtime. We have implemented VirtMan both as a standalone system and for OpenStack. In our 100-server testbed, VirtMan boots up 1000 VMs (with a 15 GB image of Windows Server 2008) on 100 physical servers in less than 120 s, which is three orders of magnitude lower than current public clouds.

Keywords: Virtual machine; Fast booting; Homogeneity; Internet-based virtual computing environment (iVCE)
