Search Results - Inner Mongolia University Library

16th IEEE International Wireless Communications and Mobile Computing Conference (IEEE IWCMC)

Authors: Li, Jixi; Bai, Xu; Han, Shuai; Yu, Yue
Affiliations: Harbin Inst Technol, Sch Elect & Informat Engn, Harbin, Peoples R China; China Acad Informat & Commun Technol, Beijing, Peoples R China

ISBN: (print) 9781728131290

As one of the basic components of digital signal processing, digital finite impulse response (FIR) filters are widely used in image processing, speech recognition, and many other fields. This paper proposes an improved distributed algorithm (DA) to implement high-order digital FIR filters with less logical delay and lower hardware utilization. First, the parallel DA is designed and then improved by look-up-table (LUT) decomposition. Second, the improved DA FIR filters are implemented on a Xilinx Kintex-7 FPGA chip and used in a high-speed ground penetrating radar (GPR) system to process radar signals. Finally, the performance of DA filters with different orders and structures is analyzed and compared, taking logical delay and hardware utilization as the key indicators. The paper concludes that the parallel DA with LUT decomposition can implement high-order filters more effectively than traditional structures.

Keywords: FIR filter; look-up table; distributed algorithm; FPGA; parallel processing
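The abstract above gives no implementation details. As a rough illustration of the general idea, the following Python sketch computes one FIR output sample in the style of distributed arithmetic (what DA usually denotes in FIR filter design) with LUT decomposition. It assumes unsigned integer samples and integer coefficients. The names `build_lut` and `da_dot` and the group size of 4 are illustrative choices that do not come from the paper, and the software loops only mimic what an FPGA would evaluate in parallel.

```python
def build_lut(coeffs):
    """Table entry `addr` holds the sum of coeffs[k] for every bit k set in addr."""
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << len(coeffs))]


def da_dot(coeffs, samples, bits=8, group=4):
    """Dot product of coeffs and unsigned `bits`-bit samples without multipliers.

    The taps are split into groups of `group` taps (LUT decomposition), so each
    table has 2**group entries instead of one table with 2**len(coeffs) entries.
    """
    starts = range(0, len(coeffs), group)
    luts = [build_lut(coeffs[s:s + group]) for s in starts]
    acc = 0
    for b in range(bits):                      # one pass per input bit plane
        partial = 0
        for lut, s in zip(luts, starts):
            addr = 0
            for k, x in enumerate(samples[s:s + group]):
                addr |= ((x >> b) & 1) << k    # bit b of each sample forms the address
            partial += lut[addr]
        acc += partial << b                    # weight the bit plane by 2**b
    return acc
```

For an 8-tap filter, `group=8` needs a single 256-entry table, while `group=4` needs two 16-entry tables plus one extra addition per bit plane. That trade is the motivation for decomposing the table as the filter order grows.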

Reconstructing the mind's eye: fMRI-to-image with contrastive learning and diffusion priors

Proceedings of the 37th International Conference on Neural Information Processing Systems

Authors: Paul S. Scotti, Atmadeep Banerjee, Jimmie Goode, Stepan Shabalin, Alex Nguyen, Ethan Cohen, Aidan J. Dempster, Nathalie Verlinde, Elad Yundler, David Weisberg, Kenneth A. Norman, Tanishq Mathew Abraham
Affiliations: Princeton Neuroscience Institute and Medical AI Research Center (MedARC); Medical AI Research Center (MedARC); Princeton Neuroscience Institute; Ecole Normale Supérieure, PSL University; University of Toronto; Hebrew University of Jerusalem; Medical AI Research Center (MedARC) and EleutherAI and Stability AI

We present MindEye, a novel fMRI-to-image approach to retrieve and reconstruct viewed images from brain activity. Our model comprises two parallel submodules that are specialized for retrieval (using contrastive learning) and reconstruction (using a diffusion prior). MindEye can map fMRI brain activity to any high dimensional multimodal latent space, like CLIP image space, enabling image reconstruction using generative models that accept embeddings from this latent space. We comprehensively compare our approach with other existing methods, using both qualitative side-by-side comparisons and quantitative evaluations, and show that MindEye achieves state-of-the-art performance in both reconstruction and retrieval tasks. In particular, MindEye can retrieve the exact original image even among highly similar candidates, indicating that its brain embeddings retain fine-grained image-specific information. This allows us to accurately retrieve images even from large-scale databases like LAION-5B. We demonstrate through ablations that MindEye's performance improvements over previous methods result from specialized submodules for retrieval and reconstruction, improved training techniques, and training models with orders of magnitude more parameters. Furthermore, we show that MindEye can better preserve low-level image features in the reconstructions by using img2img, with outputs from a separate autoencoder. All code is available on https://***/MedARC-AI/fMRI-reconstruction-NSD.

Parallel sampling of diffusion models

Proceedings of the 37th International Conference on Neural Information Processing Systems

Authors: Andy Shih, Suneel Belkhale, Stefano Ermon, Dorsa Sadigh, Nima Anari
Affiliation: Computer Science, Stanford University

Diffusion models are powerful generative models but suffer from slow sampling, often taking 1000 sequential denoising steps for one sample. As a result, considerable efforts have been directed toward reducing the number of denoising steps, but these methods hurt sample quality. Instead of reducing the number of denoising steps (trading quality for speed), in this paper we explore an orthogonal approach: can we run the denoising steps in parallel (trading compute for speed)? In spite of the sequential nature of the denoising steps, we show that surprisingly it is possible to parallelize sampling via Picard iterations, by guessing the solution of future denoising steps and iteratively refining until convergence. With this insight, we present ParaDiGMS, a novel method to accelerate the sampling of pretrained diffusion models by denoising multiple steps in parallel. ParaDiGMS is the first diffusion sampling method that enables trading compute for speed and is even compatible with existing fast sampling techniques such as DDIM and DPM-Solver. Using ParaDiGMS, we improve sampling speed by 2-4x across a range of robotics and image generation models, giving state-of-the-art sampling speeds of 0.2s on 100-step DiffusionPolicy and 14.6s on 1000-step StableDiffusion-v2 with no measurable degradation of task reward, FID score, or CLIP score. Code for our paper can be found at https://***/AndyShih12/paradigms.
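To make the Picard-iteration idea in the abstract concrete, here is a toy Python sketch on a scalar ODE discretized with Euler steps. It is not the authors' ParaDiGMS code: the function names, the drift `f`, the step size, and the tolerance are all assumptions chosen for illustration. Each Picard sweep guesses the whole trajectory, evaluates the drift at every step independently (the part that could run in parallel), and rebuilds the trajectory from prefix sums until it stops changing.

```python
from itertools import accumulate


def sequential(f, x0, steps, h):
    """Baseline: Euler steps applied one after another."""
    x = x0
    for i in range(steps):
        x = x + h * f(x, i)
    return x


def picard(f, x0, steps, h, tol=1e-12):
    """Fixed-point (Picard) sweeps over the whole trajectory.

    Returns the final state and the number of sweeps used. After k sweeps the
    first k states match the sequential result, so at most `steps` sweeps are needed.
    """
    xs = [x0] * (steps + 1)                                # initial guess: constant path
    for sweep in range(1, steps + 1):
        drifts = [h * f(xs[i], i) for i in range(steps)]   # independent evaluations
        new = [x0] + [x0 + s for s in accumulate(drifts)]  # prefix sums rebuild the path
        if max(abs(a - b) for a, b in zip(new, xs)) < tol:
            return new[-1], sweep
        xs = new
    return xs[-1], steps
```

With `f = lambda x, i: -x`, `x0 = 1.0`, 50 steps, and `h = 0.02`, both functions agree to within floating-point tolerance, and the Picard version typically stops after far fewer sweeps than there are steps. Each sweep costs a full batch of drift evaluations, which is the compute-for-speed trade the abstract describes.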

ST-PINN: A Self-Training Physics-Informed Neural Network for Partial Differential Equations

International Joint Conference on Neural Networks (IJCNN)

Authors: Junjun Yan, Xinhai Chen, Zhichao Wang, Enqiang Zhoui, Jie Liu
Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China; Laboratory of Digitizing Software for Frontier Equipment, National University of Defense Technology, Changsha, China

Partial differential equations (PDEs) are an essential computational kernel in physics and engineering. With the advance of deep learning, physics-informed neural networks (PINNs), as a mesh-free method, have shown great potential for fast PDE solving in various applications. To address the low accuracy and convergence problems of existing PINNs, we propose a self-training physics-informed neural network, ST-PINN. Specifically, ST-PINN introduces a pseudo-label-based self-learning algorithm during training. It employs the governing equation as the pseudo-label evaluation index and selects the highest-confidence examples from the sample points to attach the pseudo labels. To the best of our knowledge, we are the first to incorporate a self-training mechanism into physics-informed learning. We conduct experiments on five PDE problems in different fields and scenarios. The results demonstrate that the proposed method allows the network to learn more physical information and benefits convergence. The ST-PINN outperforms existing physics-informed neural network methods and improves the accuracy by a factor of 1.33x-2.54x.

11th International Conference on Future Data and Security Engineering, FDSE 2024

ISBN: (print) 9789819604364

The proceedings contain 57 papers. The special focus in this conference is on Future Data and Security Engineering. The topics include: Predicting Bitcoin’s Price: A Critical Review of Forecasting Models and Methods; Enhancing Explainable Herbal Recognition with Vision Transformer Features and SVM; Sequential Recommendation Using Graph Neuron Networks; Improving the Polyp Image Segmentation Based on Parallel Reverse Attention Network; Adversarial Perturbations for License Plate Information Privacy; IU-VecCert: A Scalable Credentials Issuance Protocol Using Non-interactive Vector Commitment Scheme; A Study on Deep Graph Neural Networks for Security Vulnerabilities Detection in Web Applications; Secure as a Service VNF: A Proposed VNF Architecture for Proactive Malware Detection During File Downloads in Enterprises; Enhancing Efficiency of Multi-Label X-Ray Image Classification with Self-Supervised Learning Based On Compact Swin Transformers; Curated Colon Disease Diagnosis Using Principal Component Analysis and Deep Learning with Integrated Gradients; Clinical Data-Driven Explainable AI for COVID-19 Treatment Outcome Analysis; Medical Signal Analysis For Early ECG Classification Using Machine Learning Models; Detection and Classification of Liver Lesions Using Vision Transformer and Active Learning; Detection and Classification of Osteoarthritis Using Vision Transformer in Distributed Environment; Legal-Onto Model for Efficient Land Law Updates in Vietnam; Exploring Distillation Models for Cultural Heritage Preservation: Traditional Vietnamese Instruments; Optimizing Customer Feedback Analysis with BERT-Based Sentiment Classification: A Case Study of Toyota Dong Sai Gon; Estimating Traffic Density Using Convolutional Neural Networks Based on Crowd Counting; MedQAS: A Medical Question Answering System Based on Finetuning Large Language Models; Machine Learning UHF-RFID to Support Video Tracking and Recommendation for Attendance System; A Pronunciation Practice System Based on Pre-trained

11th International Conference on Future Data and Security Engineering, FDSE 2024

ISBN: (print) 9789819604333

A technique for early detection of cyberattacks using the traffic self-similarity property and a statistical approach

Euromicro Conference on Parallel, Distributed and Network-Based Processing

Authors: Igor Kotenko, Igor Saenko, Aleksander Kribel, Oleg Lauta
Affiliations: St. Petersburg Federal Research Center of the Russian Academy of Sciences; St. Petersburg Signal Academy; Admiral Makarov State University of Maritime and Inland Shipping

The paper discusses a technique for detecting cyberattacks on computer networks, based on identifying anomalies in network traffic by assessing its self-similarity and determining the impact of cyberattacks using statistical methods. The proposed technique has three stages: analysis of the self-similarity property of the reference traffic (using the Dickey-Fuller test, rescaled range, and detrended fluctuation methods); analysis of the self-similarity property of the real traffic (by the same methods); and additional processing of the time series with statistical methods (moving average, Z-score, and CUSUM). The issues of software implementation of the proposed approach and the formation of a dataset containing network packets are considered. The experimental results demonstrated the presence of self-similarity in network traffic and confirmed the high efficiency of the proposed method. This technique allows detecting cyberattacks in real or near real time.

Keywords: Fluctuations; Statistical analysis; Time series analysis; Distributed databases; Telecommunication traffic; Fractals; Computer networks
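The statistical post-processing named in the abstract (moving average, Z-score, CUSUM) can be sketched in a few lines of Python. This is a textbook-style illustration only: the one-sided CUSUM form, the function names, and the `slack` and `threshold` parameters are assumptions and are not taken from the paper, which does not specify its variants or settings here.

```python
from statistics import mean, pstdev


def moving_average(series, window):
    """Simple moving average over a sliding window of fixed length."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def zscores(series, reference):
    """Standardize `series` against the mean and spread of reference traffic."""
    mu, sigma = mean(reference), pstdev(reference)
    return [(x - mu) / sigma for x in series]


def cusum_alarm(series, target, slack, threshold):
    """One-sided CUSUM: index of the first sample where the accumulated
    upward drift above `target + slack` exceeds `threshold`, else None."""
    s = 0.0
    for i, x in enumerate(series):
        s = max(0.0, s + (x - target - slack))
        if s > threshold:
            return i
    return None
```

For example, with reference packet counts `[10, 11, 9, 10, 10, 11, 9, 10]` followed by a burst `[14, 15, 16, 15]`, calling `cusum_alarm(series, target=10, slack=1, threshold=5)` raises the alarm on the second burst sample, while the reference segment alone raises none.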

On GPU optimizations of stencil codes for highly parallel simulations

Euromicro Conference on Parallel, Distributed and Network-Based Processing

Authors: Nikolai Pfisterer, Marco Berghoff, Achim Streit
Affiliation: Steinbuch Centre for Computing, Karlsruhe Institute of Technology, Karlsruhe, Germany

Stencil codes are valuable methods to solve partial differential equations of models in a wide range of applications in science and engineering. Graphics processing units (GPUs) provide a highly parallel architecture with fast directly accessible memory that is desirable to run stencil codes. They enable larger and more complex simulations that are solved faster compared to simulating on CPUs. We provide a solution for users to run highly parallel stencil codes on GPUs, that can also be efficiently used on a distributed GPU *** this work, we present an extension to the multi-disciplinary framework NAStJA originally designed to efficiently run stencil codes on the CPUs in current high-performance computing (HPC) systems, to also be able to run on GPUs. We describe different methods to increase the performance of stencil codes on GPUs like a border exchange method which is transparent to the user, as well as a buffer for gradient values that are needed multiple times. We show their performance using the phase-field method as a stencil code *** this GPU extension and optimizations implemented into the NAStJA framework, we can show highly improved performance compared to the CPU implementation, and an efficiency of 92% (weak scaling) for a large-scale example simulation on 64 GPUs on the ForHLR II HPC system.

Keywords: Partial differential equations; Graphics processing units; Clustering algorithms; Performance gain; Parallel architectures; Central processing unit; Mathematical model
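The border exchange mentioned in the abstract can be illustrated with a one-dimensional toy stencil in Python. This is not NAStJA code and makes no claim about its API. The 3-point diffusion stencil, the names `stencil_step` and `split_step`, and the two-block split are assumptions chosen to show why each block needs a one-cell halo copied from its neighbor before it can update its own cells.

```python
def stencil_step(u, alpha=0.1):
    """One explicit 3-point stencil update; the two end cells are left unchanged."""
    out = list(u)
    for i in range(1, len(u) - 1):
        out[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return out


def split_step(u, alpha=0.1):
    """Same update computed on two blocks, as two devices might do.

    Each block carries one extra halo cell copied from its neighbor (the
    border exchange); the halo is read but dropped from the block's output.
    """
    mid = len(u) // 2
    left = stencil_step(u[:mid + 1], alpha)    # last cell is a halo copy of u[mid]
    right = stencil_step(u[mid - 1:], alpha)   # first cell is a halo copy of u[mid - 1]
    return left[:-1] + right[1:]
```

For any input with at least four cells, `split_step(u)` returns the same values as `stencil_step(u)`. To run several time steps, the halo cells would have to be refreshed from the neighboring block before every step.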

HCGrid: a convolution-based gridding framework for radio astronomy in hybrid computing environments

Monthly Notices of the Royal Astronomical Society, 2021, vol. 501, no. 2, pp. 2734-2744

Authors: Wang, Hao; Yu, Ce; Zhang, Bo; Xiao, Jian; Luo, Qi
Affiliations: Tianjin Univ, Coll Intelligence & Comp, 135 Yaguan Rood, Haihe Educ Pk, Tianjin 300350, Peoples R China; Chinese Acad Sci, Natl Astron Observ, 20 Datun Rd, Beijing 100012, Peoples R China; Chinese Acad Sci, Natl Astron Observ, CAS Key Lab FAST, 20 Datun Rd, Beijing 100012, Peoples R China

Gridding operation, which is to map non-uniform data samples on to a uniformly distributed grid, is one of the key steps in radio astronomical data reduction process. One of the main bottlenecks of gridding is the poor computing performance, and a typical solution for such performance issue is the implementation of multicore CPU platforms. Although such a method could usually achieve good results, in many cases, the performance of gridding is still restricted to an extent due to the limitations of CPU, since the main workload of gridding is a combination of a large number of single instruction, multidata stream operations, which is more suitable for GPU, rather than CPU implementations. To meet the challenge of massive data gridding for the modern large single-dish radio telescopes, e.g. the Five-hundred-meter Aperture Spherical radio Telescope, inspired by existing multicore CPU gridding algorithms such as Cygrid, here we present an easy-to-install, high-performance, and open-source convolutional gridding framework, HCGrid, in CPU-GPU heterogeneous platforms. It optimizes data search by employing multithreading on CPU, and accelerates the convolution process by utilizing massive parallelization of GPU. In order to make HCGrid a more adaptive solution, we also propose the strategies of thread organization and coarsening, as well as optimal parameter settings under various GPU architectures. A thorough analysis of computing time and performance gain with several GPU parallel optimization strategies show that it can lead to excellent performance in hybrid computing environments.

Keywords: methods: data analysis; techniques: image processing; software: public release

GPU-accelerated QPSK Transceiver with FEC over a Flat-fading Channel

6th International Conference on Parallel, Distributed and Grid Computing, PDGC 2020

Authors: Muzammil, Rehan; Wajid, Mohd
Affiliation: Aligarh Muslim University, Department of Electronics Engineering, Z.H.C.E.T, Aligarh, India

ISBN: (print) 9781728171326

Rayleigh flat-fading paths in wireless channels lead to errors, which makes the detection task very difficult. In such cases, forward error correction (FEC) is used to provide good performance. This paper reports the testing of a QPSK transceiver using threshold detection and FEC in the form of (8, 4) block coding-decoding. The whole system was tested by transmitting a known digital image over a flat-fading channel, and detection was performed using the threshold detection process. Very recently, the advent of programmable graphics processing units (GPUs) as massively parallel programming systems has enabled high-performance computation. An NVIDIA GTX 1050 Ti GPU has been used for implementing and testing the transceiver in this work. The image is transmitted over a flat-fading channel along with FEC, and the results are obtained in the form of a Bit Error Rate (BER) versus signal-to-noise ratio (SNR) curve. All the baseband processing is performed in the NVIDIA GPU, and some of the computation is performed in the CPU. The purpose of this paper is to show that a lot of processing time can be saved using a highly parallel computing machine, the GPU, as compared to a sequentially programmed device, the CPU. The speedup is indicated in the results. © 2020 IEEE.

Keywords: Graphics processing unit
