ISBN (digital): 9798350383638
ISBN (print): 9798350383645
Recent advancements in 3D Convolutional Neural Network (CNN) architectures have demonstrated superior performance across diverse computer vision tasks, albeit at the cost of intense computational and memory demands. Thus, tiling of incoming data becomes mandatory for 3D CNN acceleration on memory-constrained platforms such as Field Programmable Gate Arrays (FPGAs). In this paper, different memory access techniques are explored to reduce the data traffic between on-chip and off-chip memories during the inference stage of a 3D CNN. The most suitable data traffic mode is identified by considering multiple parameters such as latency, on-chip memory utilization, and off-chip memory access. A parameterized and modular design approach for 3D CNNs has been implemented on an FPGA, where the input and weight data mapping modules are designed to minimize the on-chip memory requirements. These modules are parameterized for variable tiling sizes and different memory access modes, while the main computation is performed on a systolic-array-based pipelined architecture. Experiments conducted on three widely adopted 3D networks, I3D, C3D, and R(2+1)D, have shown 16%, 28%, and 10% improvements in latency, respectively. The proposed methodology also results in a lower energy dissipation profile.
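As a rough illustration of what tiling a 3D input means, the sketch below enumerates the index ranges of every tile of a (depth, height, width) volume. The function name `tile_bounds` and the example sizes are assumptions made for illustration only; the abstract does not describe the authors' actual mapping modules, and the sketch ignores the overlap (halo) that convolution kernels would need at tile borders.

```python
from itertools import product
from math import ceil

def tile_bounds(shape, tile):
    """Yield per-axis (start, stop) ranges for every tile of a 3D volume.

    shape and tile are (depth, height, width) tuples. Edge tiles are
    clipped when the volume is not an exact multiple of the tile size.
    """
    counts = [ceil(s / t) for s, t in zip(shape, tile)]
    for idx in product(*(range(c) for c in counts)):
        yield tuple(
            (i * t, min((i + 1) * t, s))
            for i, t, s in zip(idx, tile, shape)
        )

# Hypothetical sizes: a 16x112x112 input split into 8x28x28 tiles.
tiles = list(tile_bounds((16, 112, 112), (8, 28, 28)))
print(len(tiles))
```

Smaller tiles lower the on-chip buffer size but increase the number of off-chip transfers, which is the kind of trade-off the abstract says is evaluated.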
ISBN (print): 9781479914920
LDPC codes have been used intensively in various wireless communication applications because of their improved BER performance. The present paper summarizes state-of-the-art applications of short-length LDPC codes and proposes FPGA-based, application-specific hardware architectures for short-length LDPC decoders. The decoding algorithms considered for implementation are belief propagation and min-sum. Given the improved BER performance, the proposed architecture makes use of the parallel computation capabilities offered by FPGA technology to implement the belief propagation algorithm. In spite of the iterative nature and increased computational complexity of the LDPC decoding algorithm, the proposed architecture achieves the high throughput that is mandatory in real-time applications and data transmission. The architecture for the belief-propagation-based LDPC decoder relies on an approximation of the inverse hyperbolic tangent function used for the check node update.
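The two check node updates named above can be sketched in software as follows. This is a minimal floating-point illustration, not the paper's hardware design: the function names are hypothetical, and the belief propagation version calls the exact `math.atanh` where the paper uses an approximation that the abstract does not specify. Both functions expect at least two incoming log-likelihood ratio (LLR) messages.

```python
import math

def check_node_min_sum(msgs):
    """Min-sum update: sign product times minimum magnitude of the other inputs."""
    out = []
    for i in range(len(msgs)):
        others = msgs[:i] + msgs[i + 1:]
        negatives = sum(1 for m in others if m < 0)
        sign = -1.0 if negatives % 2 else 1.0
        out.append(sign * min(abs(m) for m in others))
    return out

def check_node_bp(msgs):
    """Belief propagation (tanh rule): 2 * atanh(prod(tanh(m / 2)))."""
    out = []
    for i in range(len(msgs)):
        others = msgs[:i] + msgs[i + 1:]
        prod = math.prod(math.tanh(m / 2.0) for m in others)
        # Clamp so atanh stays finite when inputs are very confident.
        prod = max(min(prod, 1.0 - 1e-12), -1.0 + 1e-12)
        out.append(2.0 * math.atanh(prod))
    return out

llrs = [2.0, -1.0, 3.0]
print(check_node_min_sum(llrs))  # [-1.0, 2.0, -1.0]
print(check_node_bp(llrs))
```

Min-sum avoids the transcendental functions entirely, at the price of overestimating message magnitudes relative to the tanh rule.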
ISBN (print): 9781479909735
Sequence alignment is widely used in biological computing. To obtain optimal alignment results, many algorithms adopt dynamic programming. The Smith-Waterman algorithm is a well-known example of this sequence alignment approach. However, such dynamic programming algorithms are computationally expensive, and it is impractical to use them to compare a query sequence against a sequence database such as GenBank or PDB. Recently, GPU computing has been applied to many sequence alignment algorithms to improve performance. In this paper, we propose a GPU-based Smith-Waterman algorithm that combines CPU and GPU computing capabilities to accelerate alignments against a sequence database. In the proposed algorithm, a filtration mechanism based on frequency distance is used to reduce the number of sequences compared. We implemented the Smith-Waterman alignments in CUDA on an NVIDIA Tesla C2050. The experimental results show that the highest speedup is about 80 to 90 times over a CPU-based Smith-Waterman algorithm.
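The filter-then-align idea can be sketched on the CPU as below. This is an illustrative sketch only: the function names, the scoring values (match 2, mismatch -1, linear gap -2), and the rule of discarding database sequences whose frequency distance exceeds a fixed `max_fd` are assumptions, since the abstract does not give the paper's scoring scheme or filter threshold. Frequency distance is taken here as the sum of absolute differences between the character counts of two sequences.

```python
from collections import Counter

def frequency_distance(a, b):
    """Sum of absolute differences between per-character counts."""
    ca, cb = Counter(a), Counter(b)
    return sum(abs(ca[k] - cb[k]) for k in ca.keys() | cb.keys())

def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score, using two rows of the DP matrix."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

def search(query, database, max_fd):
    """Align only the sequences that pass the frequency-distance filter."""
    hits = [
        (smith_waterman_score(query, s), s)
        for s in database
        if frequency_distance(query, s) <= max_fd
    ]
    return sorted(hits, reverse=True)

print(search("ACGT", ["ACGT", "TTTT", "ACGG"], max_fd=2))
# [(8, 'ACGT'), (6, 'ACGG')]
```

The filter costs time linear in the sequence length, while each alignment it avoids costs time proportional to the product of the two lengths.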
ISBN (digital): 9798350386059
ISBN (print): 9798350386066
Network policy plays a crucial role in cloud-native networking, especially in multi-tenant scenarios. It provides precise control over connectivity by specifying source and destination endpoints, traffic types, and other criteria to allow or deny traffic. However, manual configuration of these policies introduces the risk of errors, leading to isolation violations or network service unavailability. Therefore, network policy verification is essential for maintaining security and quality of service in cloud-native networking. Currently, a naïve approach involves individually checking each policy within the cluster, which can take over 100 s for a cluster size of over 100k. Existing verification frameworks, such as Kano and Verikube, improve performance by leveraging pre-filtering and Satisfiability Modulo Theories (SMT) solvers, achieving a 3.12x to 12.99x performance boost over the naïve baseline. However, while network policy changes rapidly, within 100 ms, in real cloud-native networks, both frameworks need over 10 s to perform verification for cluster sizes over 100k, which is far from satisfactory. To overcome these issues, we propose and implement a novel network policy verification framework, NPV, which uses a policy-label pre-filter process with bitwise compression. We further enhance the policy verification algorithm with a policy-namespace divide-and-conquer strategy to improve data-level parallelism. We implement NPV on commodity servers and evaluate its performance using real network policy datasets. Our experiments indicate that, compared with the state-of-the-art methods Kano and Verikube, NPV can achieve up to a 139.00x to 651.06x improvement in verification time, with 65% less memory usage.
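One plausible reading of a label pre-filter with bitwise compression is sketched below: every distinct key=value label gets a bit position, each pod's labels become an integer bitmask, and a selector matches a pod when all selector bits are set in the pod's mask. This is an assumption-laden illustration, not NPV's algorithm; the function names and the sample pods are hypothetical, and only equality-based selectors are handled.

```python
def build_label_index(pods):
    """Assign one bit position to every distinct (key, value) label."""
    index = {}
    for labels in pods.values():
        for kv in labels.items():
            index.setdefault(kv, len(index))
    return index

def to_mask(labels, index):
    """Encode a label dict as an int bitmask; None if a label is unknown."""
    mask = 0
    for kv in labels.items():
        bit = index.get(kv)
        if bit is None:
            return None
        mask |= 1 << bit
    return mask

def select_pods(selector, pods, index):
    """Return pods whose labels include every label in the selector."""
    sel = to_mask(selector, index)
    if sel is None:  # selector needs a label that no pod carries
        return []
    return [
        name for name, labels in pods.items()
        if (to_mask(labels, index) & sel) == sel
    ]

pods = {
    "web": {"app": "web", "tier": "front"},
    "db": {"app": "db", "tier": "back"},
    "api": {"app": "api", "tier": "front"},
}
index = build_label_index(pods)
print(select_pods({"tier": "front"}, pods, index))  # ['web', 'api']
```

An empty selector encodes to mask 0 and therefore matches every pod, which mirrors how an empty pod selector behaves in Kubernetes network policies.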
ISBN (digital): 9798350351330
ISBN (print): 9798350351347
Triangular current mode (TCM) enables the benefit of zero voltage switching, but it is accompanied by two significant functional limitations. First, it often requires expensive FPGA, ASIC, and/or specialized hardware-based sensing. Second, the time delays linked with such sensing and processing can lead to a trade-off between the achievable switching frequency and the precision of the triangular inductor current waveform. This paper delivers the experimental verification of an alternative method in which the inductor current envelopes are measured using straightforward analog quasi-peak detectors. The sampling rate needed for these envelopes is, to a first approximation, independent of the applied switching frequency. The control behavior is largely unaffected by time and signal delays because low-frequency envelope signals are used, allowing an affordable standard DSP to implement the proposed method. Importantly, the same control loop can also facilitate continuous conduction mode (CCM) operation, supporting both normal CCM operation at constant switching frequency and a mode with variable switching frequency and constant ripple current. Using a standard two-level grid-tied inverter configuration as a case study, this paper shows a measurement-based verification of the concept of the envelope-tracking-based TCM (E-TCM) and CCM (E-CCM) method. A prototype is presented to demonstrate the behavior of an envelope tracking hardware solution, including a measurement-based evaluation of the proposed circuits in operation with a 0.5 kVA, two-level SiC-based converter. This converter, together with its envelope tracking circuit, is capable of operating in TCM at frequencies up to several hundred kHz and can dynamically transition to CCM during operation.
ISBN (digital): 9781665473125
ISBN (print): 9781665473132
The current approach to marking attendance in colleges is tedious and time-consuming. I propose AttenFace, a standalone system to analyze, track, and grant attendance in real time using face recognition. Using snapshots of the class from a live camera feed, the system identifies students and marks them as present based on their presence in multiple snapshots taken throughout the class duration. Face recognition for each class is performed independently and in parallel, ensuring that the system scales with the number of concurrent classes. Further, the separation of the face recognition server from the back-end server for attendance calculation allows the face recognition module to be integrated with existing attendance tracking software such as Moodle. The face recognition algorithm runs at 10-minute intervals on classroom snapshots, significantly reducing computation compared to direct processing of the live camera feed. This method also gives students the flexibility to leave class for a short duration (such as for a phone call) without losing attendance for that class. Attendance is granted to a student who remains in class for a number of snapshots above a certain threshold. The system is fully automatic and requires no professor intervention, manual attendance, or even camera set-up, since the back-end directly interfaces with in-class cameras. AttenFace is a first-of-its-kind one-stop solution for face-recognition-enabled attendance in educational institutions that prevents proxy attendance, handling all aspects from students checking attendance, to professors deciding their own attendance policy, to college administration enforcing default attendance rules.
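The snapshot-threshold rule described above can be sketched as follows. The function name, the data layout (one set of recognized student IDs per snapshot), and the use of "at least `min_snapshots`" as the cut-off are assumptions for illustration; the abstract does not specify the exact threshold policy or the back-end interface.

```python
def grant_attendance(snapshots, enrolled, min_snapshots):
    """Mark a student present if recognized in at least min_snapshots snapshots.

    snapshots: list of sets, each holding the student IDs recognized
               in one periodic classroom snapshot.
    enrolled:  iterable of student IDs registered for the class.
    """
    counts = {student: 0 for student in enrolled}
    for recognized in snapshots:
        for student in recognized:
            if student in counts:  # ignore faces not enrolled in this class
                counts[student] += 1
    return {student: n >= min_snapshots for student, n in counts.items()}

# Three snapshots of one class; "c" is recognized but not enrolled.
snaps = [{"a", "b"}, {"a"}, {"a", "c"}]
print(grant_attendance(snaps, ["a", "b", "d"], min_snapshots=2))
# {'a': True, 'b': False, 'd': False}
```

Because the rule counts snapshots rather than requiring presence in every one, a student who misses a single snapshot can still be marked present.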