This special issue of IEEE Micro explores exciting, new ideas in the vast design space of approximate computing. We present articles that range from programming languages to circuits and cover important application domains such as machine learning and the Internet of Things.
Artificial neural networks have been one of the most influential and essential branches of science in the past decades. Neural networks have found applications in various fields, including medical and pharmaceutical services, voice and speech recognition, computer vision, natural language processing, and video and image processing. Neural networks have many layers and consume considerable energy. Approximate computing is a promising way to reduce energy consumption in applications that can tolerate a degree of accuracy reduction. This paper proposes an effective method to prevent accuracy reduction after applying approximate computing methods to CNNs. The method uses the k-means clustering algorithm to label pixels in the first convolutional layer. Then, using one of the existing pruning methods, different pruning amounts are applied to all layers. The experimental results on three CNNs and four different datasets show that the proposed method significantly improves accuracy (by 17%) compared to the baseline network.
Approximate computing (AC) in arithmetic logic has become a viable option in applications requiring error tolerance in energy-efficient architectures. AC relies on approximate arithmetic functions to reduce delay, area, and power consumption at the cost of accuracy. In this research, two novel approximate recursive multipliers (RMul-1, RMul-2) and an approximate adder are designed to reduce power consumption, area, and computational delay in error-tolerant systems. The recursive multipliers combine NOR and AND gates with half adders and full adders to achieve low-power, area-efficient designs. To reduce delay, the approximate adder uses an optimal combination of AND, OR, and MUX gates. The Cadence RTL Compiler synthesizes the proposed multipliers using 28 nm technology, and they are compared with previous approximate multipliers. Image processing applications are simulated, and the performance of the proposed multipliers is verified using the Xilinx ISE 13.2 tool. According to experimental results, the proposed RMul designs outperform current techniques by up to 30.3% in area, 20.2% in power, and 43.9% in delay. In addition, the proposed multipliers outperform existing multipliers in terms of SSIM and PSNR.
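The abstract does not give the gate-level structure of RMul-1 or RMul-2, so the sketch below does not reproduce them. It only illustrates the general idea of a recursive multiplier: an n-bit product is assembled from four half-width products, down to a 2x2 base block. As a stand-in approximation, the base block here uses a well-known simplification that returns 7 instead of 9 for 3 x 3, so the block output fits in 3 bits. The names `approx_mul2` and `recursive_mul` are hypothetical.

```python
def approx_mul2(a, b):
    """Approximate 2x2 block (stand-in, not RMul-1/RMul-2).

    Exact for every input pair except 3 * 3, which yields 7 instead of 9.
    """
    return 7 if (a == 3 and b == 3) else a * b


def recursive_mul(a, b, bits, base=approx_mul2):
    """Multiply two unsigned `bits`-wide integers from half-width products.

    `bits` is assumed to be a power of two, at least 2.
    """
    if bits == 2:
        return base(a, b)
    half = bits // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    # Four half-width partial products, each built recursively.
    ll = recursive_mul(a_lo, b_lo, half, base)
    lh = recursive_mul(a_lo, b_hi, half, base)
    hl = recursive_mul(a_hi, b_lo, half, base)
    hh = recursive_mul(a_hi, b_hi, half, base)
    # Recombine with the appropriate binary weights.
    return ll + ((lh + hl) << half) + (hh << bits)


if __name__ == "__main__":
    print(recursive_mul(5, 6, 4))    # no 3*3 block involved, so exact
    print(recursive_mul(15, 15, 4))  # every 2x2 block is 3*3, so approximate
```

Passing an exact base block (for example `lambda a, b: a * b`) gives the exact product, which makes it easy to measure the error introduced by the approximate block alone.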
The present era has witnessed the wide deployment of reconfigurable hardware, or Field Programmable Gate Arrays (FPGAs), in edge and cloud platforms. With their ability to perform dynamic partial reconfiguration at runtime, FPGAs provide an apt environment for executing a variety of real-time tasks under strict power and timing constraints. However, hardware vulnerabilities such as hardware Trojan horses may cause sudden delays at runtime or may even drain the power budget of the system, preventing tasks from completing before their deadlines. We consider a resource-constrained FPGA-based edge platform with a strict power budget. The platform executes several periodic and nonperiodic hard real-time approximate computing tasks, i.e., tasks whose result can vary within a certain range but which must complete within a prespecified deadline. We show how delay-inducing and power-draining hardware Trojans may jeopardize this scenario. We propose deploying low-overhead agents, or self-aware modules (SAMs), that can facilitate decentralized control and nonintrusive security in such an environment. A SAM is associated with each FPGA that is entrusted with executing a series of tasks or a task schedule. The SAM continuously monitors the performance of its host against prespecified power and timing data. On detecting any anomaly, it outsources the tasks to other SAMs for execution on other FPGAs, so that the tasks can complete before their deadlines. The low resource utilization and timing overhead of the SAM, the high task success rate for periodic tasks, and the low task rejection rate for nonperiodic tasks demonstrate the suitability of our proposed mechanism.
QRS detectors for wearable devices should be energy-efficient, even at the price of some loss of detection accuracy. This paper presents a low-complexity multiplierless QRS complex detection algorithm for mobile long-term ECG monitoring, based on arithmetic that approximates real numbers by a sum of power-of-two components. The proposed hardware-friendly computation technique involves only additions, subtractions, and multiplications or divisions by powers of two, executed simply by bit-shift operations. The performance of the algorithm, called the Multiplierless Dual-Path Preprocessing (ML-DPP) QRS detector, on the MIT-BIH Database is as follows: Se = 99.66 %, PPV = 99.52 %, F1 = 99.59 %, ACC = 99.18 %, and DER = 0.82 %. The deterioration in detection accuracy of the multiplierless version is negligible compared to the optimal algorithm settings with real numbers, while computational complexity is substantially reduced. The proposed methodology for supporting multiplication-free arithmetic can be applied to other QRS detection algorithms.
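The arithmetic described above, where a real coefficient is replaced by a sum of power-of-two components so that multiplication reduces to shifts and additions, can be sketched as follows. This is a minimal illustration and not the ML-DPP detector itself: the greedy decomposition, the term limit, and the names `power_of_two_terms` and `shift_multiply` are assumptions made for the example.

```python
import math


def power_of_two_terms(coeff, max_terms=3, min_exp=-8):
    """Greedily approximate `coeff` as a sum of signed powers of two.

    Returns a list of (sign, exponent) pairs. The greedy rule and the
    limits are illustrative assumptions, not taken from the paper.
    """
    terms = []
    residual = float(coeff)
    for _ in range(max_terms):
        # Stop once the remaining error is below the smallest usable term.
        if abs(residual) < 2.0 ** (min_exp - 1):
            break
        sign = 1 if residual > 0 else -1
        exp = max(round(math.log2(abs(residual))), min_exp)
        terms.append((sign, exp))
        residual -= sign * 2.0 ** exp
    return terms


def shift_multiply(x, terms):
    """Multiply integer `x` by the approximated coefficient using shifts only."""
    total = 0
    for sign, exp in terms:
        shifted = x << exp if exp >= 0 else x >> -exp
        total += sign * shifted
    return total


if __name__ == "__main__":
    terms = power_of_two_terms(0.75)  # 0.75 = 2**0 - 2**-2
    print(terms, shift_multiply(100, terms))
```

Allowing negative terms keeps the number of components small: 0.75 needs two terms as 1 - 0.25, and each extra term costs one more shift and one more addition or subtraction.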
Nanotechnology has been plagued by reliability issues when used in building circuits and systems for computing. Approximate computing, on the other hand, exploits the inherent tolerance of inaccuracies and imperfections in many applications for performance and hardware efficiency. Will approximate computing provide a remedy for emerging nanotechnologies? Five research groups around the globe have made contributions in this special issue to address some key issues in building approximate computing systems using nanotechnologies.
Approximate computing (AxC) has recently emerged as a successful approach for optimizing energy consumption in error-tolerant applications, such as deep neural networks (DNNs). The enormous model size and high computa...
ISBN (digital): 9798331543358
ISBN (print): 9798331543365
In this paper, we propose a novel design methodology for low-cost approximate radix-4 Booth multipliers that significantly reduce energy consumption in error-tolerant signal processing applications. Unlike prior works that focus solely on either partial product generation (PPG) or partial product accumulation (PPA), our approach co-designs these units to balance approximation errors, resulting in an internal error mean that is nearly zero. This integration enables substantial energy savings while maintaining accuracy comparable to existing designs. To further enhance energy efficiency, we incorporate a synthesis-based clock gating technique, reducing unnecessary switching activity in the design. Experimental evaluations on Finite Impulse Response (FIR) filtering and image classification tasks demonstrate that the proposed multiplier achieves up to a 39.3% reduction in total power consumption while maintaining accuracy within an acceptable range for error-tolerant applications. Additionally, delay and area utilisation are reduced by approximately 15.18% and 8.26%, respectively, compared to the existing method. Compared to traditional accurate Booth multipliers, our design provides a better trade-off between power efficiency and computational precision, making it a promising alternative for energy-constrained digital signal processing (DSP) systems.
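For readers unfamiliar with the PPG stage mentioned above, the sketch below shows exact radix-4 Booth recoding, which turns a signed multiplier into digits in {-2, -1, 0, 1, 2} so that only half as many partial products are generated. It is the accurate baseline only: the paper's approximations in PPG and PPA are not described in the abstract and are not reproduced here. The function names and the 8-bit default width are assumptions.

```python
def booth_radix4_digits(y, bits=8):
    """Recode a signed two's-complement integer into radix-4 Booth digits.

    Returns the digits least significant first; each digit is in
    {-2, -1, 0, 1, 2}. `bits` must be even.
    """
    if bits % 2:
        raise ValueError("bits must be even")
    y &= (1 << bits) - 1  # work on the two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        # Overlapping 3-bit window (b1, b0, prev) maps to one signed digit.
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_multiply(x, y, bits=8):
    """Exact product x * y, built from one partial product per Booth digit."""
    digits = booth_radix4_digits(y, bits)
    return sum((d * x) << (2 * i) for i, d in enumerate(digits))


if __name__ == "__main__":
    print(booth_radix4_digits(7))   # 7 = -1 + 2 * 4
    print(booth_multiply(13, -3))
```

Each digit selects 0, ±x, or ±2x as a partial product, which hardware obtains with a shift and an optional negation. An approximate design would simplify how these partial products are generated or accumulated.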
ISBN (digital): 9781665477635
ISBN (print): 9781665477642
Loop perforation is a well-known software-based approximate computing technique that improves efficiency at the expense of accuracy. In this paper, we present UNApprox, an open-source software tool that facilitates the easy application of loop perforation within an application's source code. As an open-source tool, UNApprox can also be extended to implement other approximate computing techniques. We tested the tool on three iterative algorithms, and the results are discussed in this paper.
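As a minimal illustration of loop perforation in general, the sketch below skips a fixed fraction of loop iterations when computing a mean. It does not use or represent the UNApprox tool or its interface, which the abstract does not describe. The function name `perforated_mean` and the `stride` parameter are hypothetical.

```python
def perforated_mean(values, stride=1):
    """Mean of `values`, visiting only every `stride`-th element.

    stride=1 runs the full loop (exact result); larger strides skip
    iterations, trading accuracy for fewer operations.
    """
    if stride < 1:
        raise ValueError("stride must be at least 1")
    total = 0.0
    count = 0
    # Perforated loop: iterations between multiples of `stride` are skipped.
    for i in range(0, len(values), stride):
        total += values[i]
        count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    data = list(range(100))
    print(perforated_mean(data))            # full loop
    print(perforated_mean(data, stride=2))  # half the iterations
```

With `stride=2` the loop does half the work, and the result for this input shifts from 49.5 to 49.0. Whether such an error is acceptable depends on the application, which is why perforation is usually applied selectively to error-tolerant loops.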