scaleTRIM: Scalable TRuncation-Based Integer Approximate Multiplier with Linearization and Compensation

Ali Mahani; Behnam Ghavami; Ebrahim Farahmand; Hassan Ghasemzadeh; Mohammad Javad Askarizadeh; Muhammad Abdullah Hanif; Muhammad Shafique

arxiv: 2303.02495 · v3 · submitted 2023-03-04 · 💻 cs.DC

Pith reviewed 2026-05-24 09:58 UTC · model grok-4.3

classification 💻 cs.DC

keywords approximate multiplier · truncation · linearization · error compensation · integer arithmetic · DNN acceleration · power delay product · hardware efficiency

The pith

scaleTRIM approximates multiplication by fitting linear functions to truncated operands and adding piecewise error compensation from segment averages.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes scaleTRIM, an approximate integer multiplier that first truncates each operand to h bits according to leading-one position, then replaces the full product with a linear function obtained by curve fitting, and finally applies a correction term formed by averaging the approximation error inside each of M segments of the input space. These steps convert the operation into additions and bit shifts plus a small lookup table. The design supports multiple truncation depths and compensation levels to span different accuracy-efficiency points. When compared with prior approximate multipliers, it reports a 15.2 percent reduction in mean relative error distance under an efficiency constraint and a 22.8 percent reduction in power-delay product under joint accuracy and efficiency constraints. The same circuits, when placed inside deep neural networks for image classification, produce a better overall accuracy-efficiency operating point than the reference designs.
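The first step, truncation by leading-one position, can be sketched in a few lines of software. This is a minimal model, assuming that the h bits kept are the ones immediately below the leading one and that shorter operands are zero-padded on the right. The function names are hypothetical, and Python's `bit_length` stands in for the leading-one detector a hardware design would use.

```python
from typing import Tuple


def leading_one_position(value: int) -> int:
    """Index of the most significant set bit (0 for value == 1)."""
    if value <= 0:
        raise ValueError("value must be a positive integer")
    return value.bit_length() - 1


def truncate_operand(value: int, h: int) -> Tuple[int, int]:
    """Split value into (n, frac) so that value ~= 2**n * (1 + frac / 2**h).

    n is the leading-one position; frac holds the h bits that follow the
    leading one (assumption: zero-padded when fewer than h bits exist).
    """
    n = leading_one_position(value)
    below = value - (1 << n)        # bits underneath the leading one
    if n >= h:
        frac = below >> (n - h)     # drop the n - h least significant bits
    else:
        frac = below << (h - n)     # pad so frac is always h bits wide
    return n, frac


def reconstruct(n: int, frac: int, h: int) -> float:
    """Value represented by the truncated pair (n, frac)."""
    return (1 << n) * (1 + frac / (1 << h))


print(truncate_operand(183, 3))   # (7, 3): 183 = 0b10110111 keeps 1.011
print(reconstruct(7, 3, 3))       # 176.0
```

With h = 3, the operand 183 is represented as 176, so truncation alone already introduces a relative error of about 4 percent before any linearization is applied.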

Core claim

Multiplication of two integers can be replaced by a linear function fitted to their h-bit truncated versions together with an M-segment piecewise-constant correction obtained by averaging the residual error inside each segment; the resulting circuit uses only additions, shifts, and a lookup table and yields lower mean relative error distance and lower power-delay product than existing approximate multipliers while preserving usability inside DNN inference workloads.
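The claim can be written out as a short worked equation. The symbols below ($n_X$, $x_h$, $s$, $\alpha$, $\beta$, $c_k$, $S_k$) are notation assumed for this note, not taken from the paper. Fitting the linear function in the sum $s = x_h + y_h$ is also an assumption, suggested only by the Figure 5 caption, which says the compensation value is selected from the sum of Xh and Yh.

```latex
\begin{align}
X &= 2^{n_X}(1 + x), \quad Y = 2^{n_Y}(1 + y), \quad 0 \le x, y < 1 \\
X \cdot Y &= 2^{n_X + n_Y}\,(1 + x + y + x y) \\
x_h y_h &\approx \alpha s + \beta + c_{k(s)}, \quad s = x_h + y_h \\
c_k &= \frac{1}{|S_k|} \sum_{(x_h,\, y_h) \in S_k} \bigl( x_h y_h - \alpha s - \beta \bigr) \\
X \cdot Y &\approx 2^{n_X + n_Y}\,\bigl( 1 + s + \alpha s + \beta + c_{k(s)} \bigr)
\end{align}
```

Here $x_h$ and $y_h$ are the fractions $x$ and $y$ truncated to h bits, $S_k$ is the set of truncated operand pairs that fall in segment $k$, and $k(s)$ is the segment containing $s$. Under this reading, the scaling by $2^{n_X + n_Y}$ is a bit shift, and the only multiplication left is by the constant $\alpha$.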

What carries the argument

Truncation of operands to h bits followed by curve-fitted linearization of the product term and an M-segment piecewise-average error compensation unit realized with a lookup table.

If this is right

Designers can choose among the supported truncation depths and compensation segment counts to reach a desired accuracy-efficiency operating point without redesigning the core arithmetic.
The multiplier improves mean relative error distance relative to prior art while satisfying the efficiency constraint.
The multiplier improves power-delay product relative to prior art while satisfying both the accuracy and the efficiency constraints.
When substituted into deep neural networks for image classification, the design produces a measurably better accuracy-efficiency trade-off than the compared state-of-the-art approximate multipliers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same truncation-plus-linear-fit pattern could be retargeted to other fixed-point arithmetic operations such as squaring or division by deriving new fitting coefficients and segment tables.
An adaptive version that chooses the number of segments or the truncation depth at run time according to operand statistics might further improve the average trade-off without changing the hardware template.
Because nothing remains to be tuned once the linear coefficients and segment table are stored, the design could be instantiated for many different integer widths by regenerating those stored values.

Load-bearing premise

The linear functions obtained by curve fitting on the truncated operands and the piecewise averages used for compensation will continue to produce the reported accuracy-efficiency numbers for the bit widths, input distributions, and DNN workloads examined in the evaluation.

What would settle it

Measuring mean relative error distance and power-delay product on the same bit widths but with input distributions or neural-network models drawn from a source materially different from those used in the paper and finding that the claimed 15.2 percent and 22.8 percent gains disappear or reverse.
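The accuracy half of this check can be prototyped in software before any synthesis run. The sketch below computes mean relative error distance (MRED, taken here as the mean of |approximate - exact| / exact over nonzero products) for one multiplier under two operand distributions. `truncating_mul` is a hypothetical stand-in, not scaleTRIM itself, and the exponential shape of the skewed distribution is an arbitrary choice for illustration. Power-delay product depends on synthesis results and is not modelled.

```python
import random
from typing import Callable, Iterable, Tuple

Pair = Tuple[int, int]


def mred(approx_mul: Callable[[int, int], float], pairs: Iterable[Pair]) -> float:
    """Mean relative error distance over operand pairs with a nonzero product."""
    total, count = 0.0, 0
    for a, b in pairs:
        exact = a * b
        if exact == 0:
            continue  # relative error is undefined for a zero product
        total += abs(approx_mul(a, b) - exact) / exact
        count += 1
    if count == 0:
        raise ValueError("no operand pair with a nonzero product")
    return total / count


def truncating_mul(a: int, b: int, h: int = 3) -> int:
    """Hypothetical stand-in: keep the leading one plus h bits of each operand."""
    def keep_top(v: int) -> int:
        drop = max(v.bit_length() - 1 - h, 0)
        return (v >> drop) << drop
    return keep_top(a) * keep_top(b)


def skewed_operand(rng: random.Random) -> int:
    """Small-magnitude 8-bit operand; the distribution is an assumption."""
    return min(255, max(1, int(rng.expovariate(1 / 12))))


# Distribution 1: every nonzero pair of 8-bit operands, equally weighted.
uniform_pairs = [(a, b) for a in range(1, 256) for b in range(1, 256)]

# Distribution 2: operands concentrated near small values.
rng = random.Random(0)
skewed_pairs = [(skewed_operand(rng), skewed_operand(rng)) for _ in range(20000)]

print("uniform MRED:", mred(truncating_mul, uniform_pairs))
print("skewed  MRED:", mred(truncating_mul, skewed_pairs))
```

Swapping a model of each compared multiplier in for `truncating_mul` and checking whether the ranking holds under both distributions would be one way to run the test described above.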

Figures

Figures reproduced from arXiv: 2303.02495 by Ali Mahani, Behnam Ghavami, Ebrahim Farahmand, Hassan Ghasemzadeh, Mohammad Javad Askarizadeh, Muhammad Abdullah Hanif, Muhammad Shafique.

**Figure 1.** Absolute Relative Error (ARED) of state-of-the-art works (a-c)

**Figure 4.** An Example of Error Value (EV) of proposed approximate

**Figure 5.** The hardware design of scaleTRIM. … constants without the use of memory. Furthermore, the value for compensating the error is selected by an M × 1 multiplexer based on the sum of Xh and Yh. In the last step, the sum of the compensation unit and arithmetic unit outputs is bit-shifted by the amount obtained from adding the LODs of the input operands. Error-configurability: The proposed multiplier is error con…

**Figure 6.** Accuracy and Efficiency of 8-bit Approximate Multipliers

**Figure 7.** Design space of comparison the 8-bit scaleTRIM with the state

Original abstract

In this paper, we propose a scalable approximate multiplier design, scaleTRIM, that approximates the multiplication operation using fitted linear functions, also referred to as linearization. We show that multiplication operations can be completely replaced by low-cost addition and bit-wise shift operations by exploiting linearization. Moreover, our proposed design utilizes a lookup table (LUT)-based compensation unit as a novel error-reduction method. In essence, input operands are truncated to a reduced bit-width representation (i.e., h bits) based on their leading-one positions. Then, a curve-fitting method is employed to map the product term to a linear function. Additionally, a piecewise constant error-correction term is used to reduce the approximation error. To compute the piecewise constant, we divide the function space into M segments and average the errors within each segment. In particular, our multiplier supports various degrees of truncation and error compensation to offer a range of accuracy-efficiency trade-offs. The proposed multiplier improves the Mean Relative Error Distance (MRED) by about 15.2% while satisfying the efficiency constraint and improves the Power Delay Product (PDP) by about 22.8% while satisfying the accuracy and efficiency constraints compared to different state-of-the-art approximate multipliers. From a usability perspective, our evaluation of the proposed design for image classification using Deep Neural Networks (DNNs) demonstrates that scaleTRIM offers a better accuracy-efficiency trade-off than state-of-the-art approximate multiplier designs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

scaleTRIM is a workable approximate multiplier whose reported gains rest on curve-fitting the linear terms and segment averages to the target operation.

The main takeaway is that this paper describes a concrete hardware design for an approximate multiplier. It truncates operands around their leading one to h bits, replaces the product with a linear function implemented by adds and shifts, and adds a small LUT holding average error corrections over M segments. The design is scalable across truncation levels and segment counts, and the abstract reports 15% better MRED and 23% better PDP than some prior approximate multipliers, plus usable accuracy on a DNN image classifier. That combination of truncation, linearization, and piecewise compensation in one unit is the actual new piece here, even if each element has earlier precedents. The paper does a reasonable job showing the accuracy-efficiency points and the DNN use case.

The soft spots are exactly where the stress-test note points. The linear coefficients and compensation values are obtained by fitting and averaging on the multiplication being approximated, so the numbers are specific to the fitting data, the chosen h and M, and the input statistics used. If real DNN operands or other bit widths differ, the gains may shrink or require retuning. The abstract gives no error bars, no description of the fitting procedure, and no cross-distribution checks, which leaves the central empirical claims only weakly supported until the full manuscript is examined.

This work is aimed at hardware people building approximate arithmetic for DNN accelerators. A reader who needs concrete multiplier comparisons and synthesis numbers will find it useful. It deserves peer review because the design is described in enough detail to be built and measured, and the claims are testable. I would send it out rather than desk-reject.

Referee Report

3 major / 2 minor

Summary. The paper proposes scaleTRIM, a scalable approximate integer multiplier that truncates operands to h bits based on leading-one position, replaces multiplication with low-cost linear functions (additions and shifts) obtained via curve-fitting, and applies a LUT-based piecewise constant compensation (M segments, segment-wise error averages) for error reduction. It reports ~15.2% MRED improvement under efficiency constraints and ~22.8% PDP improvement under accuracy/efficiency constraints versus prior approximate multipliers, plus improved accuracy-efficiency trade-offs when used in DNN image classification.

Significance. If the accuracy and efficiency gains prove robust to input distributions and bit-widths beyond the fitting data, the design contributes a tunable truncation-plus-linearization methodology that converts multiplication to add/shift operations with explicit compensation, offering a new point in the approximate-computing design space for energy-constrained accelerators.

major comments (3)

[Abstract / design methodology] Abstract and design section: the linear functions are obtained by curve-fitting directly to the product of the truncated operands being approximated, and the piecewise constants are segment-wise averages computed on the identical error surface; this makes the reported MRED/PDP gains specific to the fitting distribution and segment boundaries rather than intrinsic, undermining the claim of general scalability across bit-widths and DNN workloads without per-application retuning.
[Evaluation / results] Evaluation section: the headline 15.2% MRED and 22.8% PDP improvements are stated without error bars, statistical tests, or explicit description of how the curve-fitting was validated (e.g., hold-out sets, cross-distribution testing); the central empirical claim therefore rests on point estimates whose stability is unquantified.
[DNN evaluation] DNN evaluation: while the paper shows better trade-offs on image classification, it does not report whether the same (h, M) parameters fitted on generic operands were used or whether retuning was performed per network; this directly affects the usability claim.

minor comments (2)

[Design] Notation for the linear function coefficients and the exact definition of the truncation width h should be introduced with an equation in the design section for reproducibility.
[Abstract / results tables] The abstract states concrete percentage improvements; the corresponding tables or figures should explicitly list the exact baseline designs and bit-widths used for each comparison.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We provide point-by-point responses to the major comments below, indicating where we agree and will make revisions.

Referee: [Abstract / design methodology] Abstract and design section: the linear functions are obtained by curve-fitting directly to the product of the truncated operands being approximated, and the piecewise constants are segment-wise averages computed on the identical error surface; this makes the reported MRED/PDP gains specific to the fitting distribution and segment boundaries rather than intrinsic, undermining the claim of general scalability across bit-widths and DNN workloads without per-application retuning.

Authors: The curve-fitting and compensation calculations are performed exhaustively over all possible combinations of the h-bit truncated operands, which constitute the complete function space rather than samples from any particular distribution. This makes the approximation intrinsic to the truncated multiplication operation. The scalability arises from the parametric nature of h and M, allowing the same methodology to be applied to different bit-widths. We will revise the design section to clarify that the fitting uses exhaustive enumeration of the truncated operand space.

revision: partial

Referee: [Evaluation / results] Evaluation section: the headline 15.2% MRED and 22.8% PDP improvements are stated without error bars, statistical tests, or explicit description of how the curve-fitting was validated (e.g., hold-out sets, cross-distribution testing); the central empirical claim therefore rests on point estimates whose stability is unquantified.

Authors: We agree that more details on the validation of the curve-fitting would strengthen the paper. Since the fitting is deterministic and based on the exact mathematical products, hold-out validation is not relevant. We will add an explicit description of the curve-fitting and compensation computation process in the revised manuscript. We will also consider including results across multiple input distributions to quantify stability.

revision: yes

Referee: [DNN evaluation] DNN evaluation: while the paper shows better trade-offs on image classification, it does not report whether the same (h, M) parameters fitted on generic operands were used or whether retuning was performed per network; this directly affects the usability claim.

Authors: We will update the DNN evaluation section to clarify that the (h, M) parameters were selected from the general accuracy-efficiency characterization of the multiplier and used consistently across all DNN experiments without any per-network retuning. This supports the claim of usability as a general-purpose approximate multiplier.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; design uses explicit curve-fitting as stated method

The paper explicitly describes its core technique as employing curve-fitting to obtain linear functions for truncated operands and computing piecewise averages for compensation. These are presented as design choices for an approximate multiplier, not as a first-principles derivation or prediction. Performance claims (MRED/PDP improvements) are empirical comparisons to other designs and exact multiplication, with no reduction of the central result to a self-referential fit or self-citation chain. The approach is self-contained engineering work; the fitting process is the approximation mechanism itself rather than a hidden circular step.

Axiom & Free-Parameter Ledger

3 free parameters · 1 axioms · 0 invented entities

The central claim rests on fitted linear coefficients obtained via curve-fitting and on the choice of M segments for averaging errors; these are free parameters introduced to make the approximation work. The assumption that truncation based on leading-one position preserves enough information for the linear map is a domain assumption.

free parameters (3)

linear function coefficients
Obtained by curve-fitting the product term after truncation; directly determine the approximation.
M (number of segments)
Controls how the error space is divided for piecewise constant compensation; chosen to reduce error.
h (truncation width)
Determines how many bits are kept after leading-one detection; trades accuracy for cost.
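How these three parameters interact can be sketched in software. The code below is a minimal model under stated assumptions, not the authors' implementation: the linear function is fitted by ordinary least squares in the sum of the truncated fractions, segments are equal-width intervals of that sum, and the coefficients are kept as floats rather than shift-and-add constants. All function names are hypothetical. The fit enumerates every pair of h-bit truncated fractions, in line with the exhaustive enumeration described in the simulated rebuttal.

```python
from typing import List, Tuple


def segment_index(sum_int: int, h: int, m_segments: int) -> int:
    """Equal-width segment of xh + yh over [0, 2**(h+1)) (an assumption)."""
    return (sum_int * m_segments) >> (h + 1)


def fit_linear_and_lut(h: int, m_segments: int) -> Tuple[float, float, List[float]]:
    """Least-squares fit of x*y ~ alpha*s + beta, plus per-segment mean residuals."""
    scale = 1 << h
    # Every pair of h-bit fractions: (integer sum, exact fractional product).
    samples = [(xh + yh, (xh * yh) / (scale * scale))
               for xh in range(scale) for yh in range(scale)]
    n = len(samples)
    s_vals = [k / scale for k, _ in samples]
    t_vals = [t for _, t in samples]
    s_mean, t_mean = sum(s_vals) / n, sum(t_vals) / n
    cov = sum((s - s_mean) * (t - t_mean) for s, t in zip(s_vals, t_vals))
    var = sum((s - s_mean) ** 2 for s in s_vals)
    alpha = cov / var
    beta = t_mean - alpha * s_mean
    # Compensation table: average residual of the linear fit in each segment.
    sums, counts = [0.0] * m_segments, [0] * m_segments
    for (k, t), s in zip(samples, s_vals):
        seg = segment_index(k, h, m_segments)
        sums[seg] += t - (alpha * s + beta)
        counts[seg] += 1
    lut = [sums[i] / counts[i] if counts[i] else 0.0 for i in range(m_segments)]
    return alpha, beta, lut


def split(value: int, h: int) -> Tuple[int, int]:
    """Leading-one position and the h fraction bits that follow it."""
    n = value.bit_length() - 1
    below = value - (1 << n)
    frac = below >> (n - h) if n >= h else below << (h - n)
    return n, frac


def approx_mul(a: int, b: int, h: int, alpha: float, beta: float,
               lut: List[float]) -> float:
    """Approximate a*b from truncated operands, linear term, and LUT correction."""
    if a == 0 or b == 0:
        return 0.0
    na, xh = split(a, h)
    nb, yh = split(b, h)
    k = xh + yh
    s = k / (1 << h)
    mantissa = 1.0 + s + alpha * s + beta + lut[segment_index(k, h, len(lut))]
    return mantissa * (1 << (na + nb))   # final scaling is a shift in hardware


ALPHA, BETA, LUT = fit_linear_and_lut(h=4, m_segments=4)
print("alpha, beta:", ALPHA, BETA)
print("approx 100*200:", approx_mul(100, 200, 4, ALPHA, BETA, LUT), "exact:", 100 * 200)
```

Raising h enlarges the enumerated space and reduces truncation error, while raising M adds lookup-table entries and reduces the residual of the linear fit. Because each table entry is the mean residual of its segment, the compensated fit cannot have a larger squared error over the enumerated space than the uncompensated one.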

axioms (1)

domain assumption: Multiplication of truncated operands can be usefully approximated by a linear function of the inputs.
Invoked when the paper states that multiplication operations can be completely replaced by low-cost addition and bit-wise shift operations via linearization.

