Search results - Inner Mongolia University Library

Commissioning and Operation of the Upgraded Belle II DAQ System With PCI-Express-Based High-Speed Readout

IEEE TRANSACTIONS ON NUCLEAR SCIENCE, 2023, vol. 70, no. 6, pp. 890-897

Authors: Lai, Y.-T. Bessner, M. Biswas, D. Charlet, D. Lau, T. S. Levit, D. Hartbrich, O. Higuchi, T. Itoh, R. Jules, E. Kapusta, P. Kunigo, T. Nakao, M. Nishimura, K. Park, S. Plaige, E. Purwar, H. Robbe, P. Sugiura, R. Suzuki, S. Y. Taurigna, M. Varner, G. Yamada, S. Zhou, Q.-D. Univ Tokyo Kavli Inst Phys & Math Universe IPMU Chiba 2778583 Japan Univ Hawaii Manoa Dept Phys & Astron Honolulu HI 96822 USA Univ Louisville Dept Phys & Astron Louisville KY 40292 USA Lab Phys Deux Infinis Irene Joliot Curie IJCLab F-91898 Orsay France Univ Paris Saclay Lab Phys Deux Infinis Irene Joliot Curie IJCLab CNRS IN2P3 F-91898 Orsay France High Energy Accelerator Res Org KEK Ibaraki 3050801 Japan Polish Acad Sci PAN Henryk Niewodniczanski Inst Nucl Phys IFJ PL-31342 Krakow Poland Univ Tokyo Fac Sci Dept Phys Tokyo 1130033 Japan Univ Tokyo Grad Sch Sci Tokyo 1130033 Japan Nagoya Univ Inst Adv Res Nagoya Aichi 4648601 Japan Nagoya Univ Kobayashi Maskawa Inst Nagoya Aichi 4648601 Japan

The Belle II experiment and the SuperKEKB collider are designed to operate at a higher luminosity than Belle, in order to improve the study of rare $B$ meson decays and the search for new physics. To break the bandwidth bottleneck and to improve the operational stability of the Belle II data acquisition (DAQ) system, a new PCI-express-based readout system has been developed. The new system includes a PCI-express-based high-speed readout board (PCIe40), which was originally developed for the upgrades of the LHCb and ALICE experiments, the PCIe40 firmware, the slow control, and readout software running on a readout PC. The new readout system has been commissioned with most of the Belle II subdetectors, and the readout upgrade is complete for the particle-identification detectors and the neutral kaon and muon detector in Belle II, which have been operating stably with the new system in the beam collision "physics runs." The results of the commissioning and the performance of the global DAQ operation will be reported.

Keywords: Data acquisition (DAQ); data transfer; field-programmable gate arrays; high-energy physics instrumentation

Analysis and architecture design of scalable fractional motion estimation for H.264 encoding

INTEGRATION-THE VLSI JOURNAL, 2012, vol. 45, no. 4, pp. 427-438

Authors: Vasiljevic, Jasmina Ye, Andy Ryerson Univ Dept Elect & Comp Engn Toronto ON M5B 2K3 Canada

Fractional Motion Estimation (FME) is an important part of the H.264/AVC video encoding standard. The algorithm can significantly increase the compression ratio of video encoders while improving video quality. However, it is computationally expensive and can account for over 45% of the total motion estimation runtime. To maximize the performance and utilization of FME implementations on field-programmable gate arrays (FPGAs), one needs to effectively exploit the inherent parallelism in the algorithm. In this work, we explore two approaches to FME algorithm parallelization in order to effectively increase the processing power of the computing hardware. We call the first method vertical scaling and the second horizontal scaling. We implemented six scaled FME designs on a Xilinx XC5VLX85T (Virtex-5) FPGA. We found that scaling vertically within a 4 x 4 sub-block is more efficient than scaling horizontally across several sub-blocks. As a result, we were able to achieve higher video resolutions at lower hardware resource cost. In particular, it is shown that the best vertically scaled design can achieve 30 fps of QSXGA video with 4 reference frames with only 25.5 K LUTs and 28.7 K registers. (C) 2011 Elsevier B.V. All rights reserved.

Keywords: Fractional motion estimation; H.264; field-programmable gate arrays; Scalability

Real-Time Hardware Implementation of a Sound Recognition System with In-field Learning

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, vol. E99D, no. 7, pp. 1885-1894

Authors: Kugler, Mauricio Tossavainen, Teemu Nakatsu, Miku Kuroyanagi, Susumu Iwata, Akira Nagoya Inst Technol Dept Comp Sci & Engn Nagoya Aichi 4668555 Japan Aalto Univ Dept Comp Sci & Engn Sch Sci Konemiehentie 2 Espoo 02150 Finland

The development of assistive devices for automated sound recognition is an important field of research and has been receiving increased attention. However, there are still very few methods specifically developed for identifying environmental sounds. The majority of the existing approaches try to adapt speech recognition techniques for the task, usually incurring high computational complexity. This paper proposes a sound recognition method dedicated to environmental sounds, designed with its main focus on embedded applications. The pre-processing stage is loosely based on the human hearing system, while a robust set of binary features permits a simple k-NN classifier to be used. This gives the system the capability of in-field learning, by which new sounds can be simply added to the reference set in real time, greatly improving its usability. The system was implemented on an FPGA-based platform, developed in-house specifically for this application. The design of the proposed method took into consideration several restrictions imposed by the hardware, such as limited computing power and memory, and supports up to 12 reference sounds of around 5.3 s each. Experiments were performed on a database of 29 sounds. Sensitivity and specificity were evaluated over several random subsets of these signals. The obtained values for sensitivity and specificity, without additional noise, were, respectively, 0.957 and 0.918. With the addition of +6 dB of pink noise, sensitivity and specificity were 0.822 and 0.942, respectively. The in-field learning strategy presented no significant change in sensitivity and a total decrease of 5.4% in specificity when progressively increasing the number of reference sounds from 1 to 9 under noisy conditions. The minimal signal-to-noise ratio required by the prototype to correctly recognize sounds was between -8 dB and 3 dB. These results show that the proposed method and implementation have great potential for several real life applicatio

Keywords: environmental sound recognition; binary features; field-programmable gate arrays; in-field learning
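
The abstract above pairs binary features with a simple k-NN classifier and describes in-field learning as adding new sounds to the reference set. A minimal Python sketch of that general idea follows. It is not the authors' implementation: the class name `BinaryKNN`, the packing of features into integers, the Hamming distance metric, and the example labels are all assumptions made for illustration.

```python
from collections import Counter


def hamming(a: int, b: int) -> int:
    """Count differing bits between two feature vectors packed as ints."""
    return bin(a ^ b).count("1")


class BinaryKNN:
    """Hypothetical k-NN classifier over binary feature vectors."""

    def __init__(self, k: int = 1):
        self.k = k
        self.refs = []  # list of (feature_bits, label) pairs

    def learn(self, bits: int, label: str) -> None:
        # In-field learning: a new sound is just appended to the reference set.
        self.refs.append((bits, label))

    def classify(self, bits: int) -> str:
        # Take the k references closest in Hamming distance, then majority vote.
        nearest = sorted(self.refs, key=lambda r: hamming(bits, r[0]))[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
```

Because learning is only an append, no retraining step is needed when a reference is added, which is what makes this kind of classifier attractive on hardware with limited computing power and memory.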

RNS-Based FPGA Accelerators for High-Quality 3D Medical Image Wavelet Processing Using Scaled Filter Coefficients

IEEE ACCESS, 2022, vol. 10, pp. 19215-19231

Authors: Nagornov, Nikolay N. Lyakhov, Pavel A. Valueva, Maria V. Bergerman, Maxim, V North Caucasus Fed Univ Dept Math Modeling Stavropol 355017 Russia North Caucasus Fed Univ North Caucasus Ctr Math Res Stavropol 355017 Russia

Medical imaging using different modalities has many problems. The main ones are low informativeness, various distortion noises, and a large amount of information. Fusion, denoising, and visual data compression are used to solve them in practice. Discrete wavelet transform is one way to implement various fusion, denoising, and compression methods for 2D and 3D medical image processing. Medical imaging systems produce increasingly accurate images as scanning technology and digital devices develop. These images have improved quality through both higher spatial resolution and greater color bit-depth. Processing a large volume of medical imaging data requires considerable resources and processing time. Modern wavelet-based devices for medical image processing do not meet the current performance demand. Hardware accelerators are being designed to solve this problem. This paper proposes new field-programmable gate array (FPGA) accelerators using wavelet processing (WP) with scaled filter coefficients (SFC) and parallel computing in residue number system (RNS) to improve the performance of high-quality 3D medical image WP systems. The computational complexity is reduced using the developed WP method with SFC and the proposed wavelet filter coefficients scaling algorithm. Parallel computing is organized in RNS using moduli sets of a particular type. Hardware implementation of 3D medical image WP using the proposed FPGA accelerators increases device performance by 2.89-3.59 times, while increasing the hardware resources by 1.18-3.29 times compared to state-of-the-art solutions. The device performance improvement is achieved while maintaining high-quality 3D medical image processing in peak signal-to-noise ratio terms.

Keywords: Biomedical imaging; Three-dimensional displays; Discrete wavelet transforms; Performance evaluation; field programmable gate arrays; Image coding; Wavelet transforms; Medical image processing; discrete wavelet transform; scaled filter coefficients; residue number system; high-performance computing; hardware accelerator; field-programmable gate arrays
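
The abstract above relies on parallel computing in a residue number system (RNS). A minimal Python sketch of generic RNS arithmetic follows, showing how additions and multiplications split into small independent channels. The abstract does not give the moduli set, so the set `(7, 8, 9)` and all function names are assumptions made for illustration.

```python
from math import prod

# Assumption: a small pairwise-coprime moduli set of the form {2^n - 1, 2^n, 2^n + 1}, n = 3.
MODULI = (7, 8, 9)


def to_rns(x, moduli=MODULI):
    """Encode a non-negative integer as its residues modulo each modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    """Add channel by channel; each channel is independent of the others."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def rns_mul(a, b, moduli=MODULI):
    """Multiply channel by channel, again with no carries between channels."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Decode with the Chinese remainder theorem (valid for results below prod(moduli))."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        partial = big_m // m
        total += r * partial * pow(partial, -1, m)  # modular inverse, Python 3.8+
    return total % big_m
```

For instance, `from_rns(rns_add(rns_mul(to_rns(12), to_rns(5)), to_rns(3)))` recovers 12 * 5 + 3, since the result stays below the dynamic range of 504. In hardware, each residue channel can be a narrow arithmetic unit operating in parallel with the others.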

Reconfigurable content-addressable memory (CAM) on FPGAs: A tutorial and survey

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2022, vol. 128, pp. 451-465

Authors: Irfan, Muhammad Sanka, Abdurrashid Ibrahim Ullah, Zahid Cheung, Ray C. C. City Univ Hong Kong Dept Elect Engn Hong Kong Peoples R China Ghulam Ishaq Khan Inst Engn Sci & Technol Fac Elect Engn Topi Pakistan Inst Appl Sci & Technol Pak Austria Fachhsch Haripur Pakistan

Content-addressable memory (CAM) is a massively parallel searching device that returns the address of a given search input in one clock cycle. Field-programmable gate array (FPGA)-based CAMs are becoming popular due to their applications in the latest networking systems, e.g., software-defined networks (SDNs) leading to upcoming 5G networks. Ternary CAM (TCAM) implements a routing table in a network router to classify and forward data packets where don't care bits (X-bits) correspond to multiple addresses. FPGAs do not have a hard-core CAM, although it is a prime element in networking applications. This paper serves as a comprehensive survey on FPGA-based CAM/TCAMs implemented using block random-access memory (BRAM), lookup table RAM (LUTRAM), and flip-flops (FFs). BRAM-based TCAM suffers from the pre-processing of mapping data, requires the data to be in a specific order in some cases, and has a large SRAM/TCAM bit ratio. LUTRAM-based CAM/TCAM suffers from wide bit-wise ANDing and high routing complexity, but has a small SRAM/TCAM bit ratio of 14 compared to 16 in the case of BRAM-based TCAM. Shallow and wide RAM blocks are required to implement large-size RAM-based TCAMs (BRAM-based and LUTRAM-based TCAMs). FF-based TCAMs use FFs as their memory elements and have reduced hardware costs per TCAM bit. However, due to the routing complexity, they suffer from poor scalability and high power consumption. The update latency of BRAM-based TCAM and LUTRAM-based TCAM is proportional to the depth of BRAM and LUTRAM, respectively. However, FF-based CAM updates in 1 or 2 clock cycles depending on the availability of input/output pins on the target FPGA. (C) 2021 Published by Elsevier B.V.

Keywords: field-programmable gate arrays; Content-addressable memory; Reconfigurable computing; Network router; Random-access memory; SRAM-based TCAM
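
The abstract above describes a TCAM as returning the address of a matching entry, with don't care bits (X-bits) letting one entry cover several keys. A minimal Python sketch of that matching rule follows. It is a software model only: the hardware compares all entries in parallel, whereas this loop is sequential. The function name `tcam_lookup`, the (value, care mask) encoding, and the lowest-address-wins priority are assumptions made for illustration.

```python
def tcam_lookup(table, key):
    """Return the address of the first entry matching key, or None.

    Each entry is a (value, care_mask) pair. A 0 bit in care_mask marks a
    don't care (X) bit, so only the positions set in care_mask are compared.
    """
    for address, (value, care_mask) in enumerate(table):
        if ((key ^ value) & care_mask) == 0:
            return address  # assumption: the lowest address has priority
    return None


# Hypothetical 8-bit routing-style table, most specific prefix first.
TABLE = [
    (0b10100000, 0b11110000),  # matches 1010XXXX
    (0b10000000, 0b11000000),  # matches 10XXXXXX
    (0b00000000, 0b00000000),  # matches XXXXXXXX (default entry)
]
```

With this table, `tcam_lookup(TABLE, 0b10101111)` hits the first entry, while a key such as `0b10011111` falls through to the shorter prefix at address 1.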

Towards Complete and Scalable Emulation of Quantum Algorithms on High-Performance Reconfigurable Computers

IEEE TRANSACTIONS ON COMPUTERS, 2023, vol. 72, no. 8, pp. 2350-2364

Authors: El-Araby, Esam Mahmud, Naveed Jeng, Mingyoung Joshua MacGillivray, Andrew Chaudhary, Manu Nobel, Md. Alvir Islam Ul Islam, S. M. Ishraq Levy, David Kneidel, Dylan Watson, Madeline R. Bauer, Jack G. Riachi, Andrew E. Univ Kansas Dept Elect Engn & Comp Sci Lawrence KS 66045 USA Florida Inst Technol Comp Engn & Sci Dept Melbourne FL 32901 USA

Contemporary quantum computers face many critical challenges that limit their usefulness for practical applications. A primary limiting factor is classical-to-quantum (C2Q) data encoding, which requires specific circuits for quantum state initialization. The required state initialization circuits are often complex and violate decoherence constraints, particularly for I/O intensive applications. Existing Noisy Intermediate-Scale Quantum (NISQ) devices are noise-sensitive and have low quantum bit (qubit) counts, thus limiting the applicability of C2Q circuits for encoding large and realistic datasets. This has made the study of complete and realistic circuits that include data encoding challenging and has also led to a heavy dependency on costly and resource-intensive simulations on classical platforms. In this work, we propose a cost-effective, classical-hardware-accelerated framework for realistic and complete emulation of quantum algorithms. The emulation framework incorporates components for the critical C2Q data encoding process, as well as architectures for quantum algorithms such as the quantum Haar transform (QHT). The framework is used to investigate optimizations for C2Q and QHT algorithms, and the corresponding optimized quantum circuits are presented. The framework is implemented on a High-Performance Reconfigurable Computer (HPRC) which emulates the proposed QHT circuits combined with proposed C2Q data encoding methods. For performance benchmarks, CPU-based emulations and simulations on a state-of-the-art quantum computing simulator are also carried out. Results show that the proposed hardware-accelerated emulation framework is more efficient in terms of speed and scalability compared to CPU-based emulation and simulation.

Keywords: field-programmable gate arrays; parallel processing; quantum computing; quantum encoding

FPGA-Based Architecture for Medium Access Techniques in Broadband PLC

IEEE ACCESS, 2018, vol. 6, pp. 9534-9542

Authors: Poudereux, Pablo Hernandez, Alvaro Cruz-Roldan, Fernando Mateos, Raul Univ Alcala Dept Elect E-28805 Alcala De Henares Spain Univ Alcala Signal Theory & Commun Dept E-28805 Alcala De Henares Spain

In this paper, two real-time architectures of medium access techniques useful for future generations of wireline and wireless communication systems are presented. One architecture is based on discrete cosine transform (DCT), while the second approach implements a filter-bank multi-carrier (FBMC) system. A comparative analysis, in terms of resource consumption, performance, and precision, is shown. The comparison considers a floating-point model, a fixed-point model, and experimental tests. These models make it possible to evaluate the effect of the fixed-point precision in the implementation and, in turn, to verify the correctness of the developed architecture. The simulation models and the experimental tests have been carried out in different practical environments in order to achieve a further analysis. The two proposed architectures have been implemented on a field-programmable gate array (FPGA) device. Furthermore, the architectures have been included as advanced peripherals in a system-on-chip, which also integrates a soft microprocessor to monitor the whole system and manage the data transfers. As a communication scenario, the proposed architectures have been particularized to operate in real time while meeting all timing requirements defined by a broadband power line communications standard. For that case, the system has achieved a desired transmission rate of 62.5 Ms/s at the converters, providing mean squared errors, at the output for an ideal channel, below $3 \cdot 10^{-5}$ for both the DCT and FBMC approaches, whereas each transmitter/receiver requires around 50% of the DSP cells available in the Xilinx XC6VLX240T FPGA, the most demanded resource in the device.

Keywords: field-programmable gate arrays; multi-carrier communication (MCM); filter-bank multicarrier (FBMC) systems; broadband power-line communications; discrete cosine transform (DCT)

HIGH-PERFORMANCE HETEROGENEOUS COMPUTING WITH THE CONVEY HC-1

COMPUTING IN SCIENCE & ENGINEERING, 2010, vol. 12, no. 6, pp. 80-87

Authors: Bakos, Jason D. Univ S Carolina Dept Comp Sci & Engn Columbia SC 29208 USA

Unlike other socket-based reconfigurable coprocessors, the Convey HC-1 contains nearly 40 field-programmable gate arrays, scatter-gather memory modules, a high-capacity crossbar switch, and a fully coherent memory sys...

Keywords: Reconfigurable computing; coherent memory system; Convey HC-1; coprocessors; field programmable gate arrays; field-programmable gate arrays; heterogeneous computing; high-capacity crossbar switch; high-performance computing; high-performance heterogeneous computing; hybrid computing; reconfigurable architectures; scatter-gather memory modules; socket-based reconfigurable coprocessors

Enhanced Self-Synchronized Reduced Media-Independent Interface for Robotic and Automotive Applications

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, vol. 18, no. 4, pp. 2274-2286

Authors: Romanov, Alexey Mikhailovitch Gringoli, Francesco MIREA Russian Technol Univ Inst Cybernet Moscow 119454 Russia Univ Brescia Dept Informat Engn I-25123 Brescia Italy Italian Natl Interuniv Consortium Telecommun I-25123 Brescia Italy

The increasing pervasiveness of control systems used in robotic and automotive applications requires the installation of a growing number of sensors and actuators. In parallel to the downsizing of all the components, new techniques for tracing versatile printed circuit boards (PCBs) are emerging: a 3-D molded interconnection device, for example, creates the opportunity to reduce up to 75% of weight by combining a single-layer PCB with mechanical parts. Getting rid of unnecessary wires, hence, becomes indispensable, and new on-board interfaces with fewer pins must be designed. This article proposes a novel encoding scheme and the corresponding interface that reduces the number of wires between automotive Ethernet (100BASE-T1) MAC and PHY down to 2 and corrects up to 37.8% of single-bit errors. As this interface can be clocked at 33.33 MHz, it does not require differential transmitters, receivers, or any other special block, and for this reason, it can be easily implemented on a small-sized field-programmable gate array.

Keywords: Ethernet; field programmable gate arrays; Encoding; Wires; Robots; Clocks; Media; field-programmable gate arrays; network interfaces; reduced media-independent interface; self-synchronization; wire communication interference; 3-D molded interconnection device (3D-MID)

A robust multiplexer-based FPGA inspired by biological systems

JOURNAL OF SYSTEMS ARCHITECTURE, 1997, vol. 43, no. 10, pp. 719-733

Authors: Tempesti, G Mange, D Stauffer, A Logic Systems Laboratory Department of Computer Science Swiss Federal Institute of Technology (EPFL) CH-1015 Lausanne Switzerland

Biological organisms are among the most robust systems known to man. Their robustness is based on a set of processes which cannot be adapted directly to the world of silicon but can provide an inspiration for the design of robust circuits. This paper introduces a multiplexer-based field-programmable gate array (FPGA) which we made capable of self-test and self-repair using an approach loosely based on biological mechanisms at the cellular level. The system is designed to provide on-line self-test and self-repair using a completely distributed system and a minimal amount of additional logic.

Keywords: self-test; self-repair; field-programmable gate arrays; reconfiguration; embryonics
