# Efficient Co-Approximate Parallel Compressive Depth Reconstruction on FPGA

## Abstract

Efficient depth image reconstruction from sparse samples is crucial for machine perception applications such as robotics, vehicle assistance and autonomy. It demands fast processing with low power consumption for sensing quality and safety, as well as cost reduction for FPGA and solid-state implementations, within the constrained resource budgets of edge devices. A new co-approximate framework is proposed: a parallel approximate compressive depth reconstruction engine on FPGA that uses an ℓ1 solver, proximal gradient descent (PGD), with instrumented frequency and voltage scaling during the iterative optimization process. By evaluating various numbers of parallel approximate processing units for the depth image reconstruction engine, up to 51% further power saving and a 421× parallel-processing speedup over the baseline are achieved, raising the efficiency by over 43×.

## Authors

Yun Wu *School of Electrical, Electronic and Computer Engineering, Queen’s University, Belfast*

John McAllister *School of Electrical, Electronic and Computer Engineering, Queen’s University, Belfast*

## Publication Information

**Journal:** ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

**Year:** 2025

**Pages:** 1-5

**DOI:** [10.1109/ICASSP49660.2025.10888441](https://doi.org/10.1109/ICASSP49660.2025.10888441)

**Article Number:** 10888441

**ISSN:** Electronic ISSN: 2379-190X, Print on Demand (PoD) ISSN: 1520-6149

## Metrics

**Total Downloads:** 140

---

## Keywords

**IEEE Keywords:** Image coding, Power demand, Instruments, Voltage, Robot sensing systems, Throughput, Sensors, Image reconstruction, Field programmable gate arrays, Engines

**Index Terms:** Depth Reconstruction, Instrumentation, Power Consumption, Processing Speed, Processing Unit, Depth Images, Frequency Scale, Edge Devices, Proximal Gradient, Parallel Units, Voltage Scaling, Iterative Optimization Process, Mean Square Error, Fixed Point, Light Detection And Ranging, Data Streams, Approximate Computation, Clock Rate, Phase-locked Loop, Power Scaling, Digital Circuits, Structural Similarity Index Measure, Single Precision, Pixel Block, ARM Processor, Lowest Voltage, Bit-width, Considerable Depth, Dynamic Power

**Author Keywords:** Depth Reconstruction, Compressive Sensing, Convex Optimization, Field-Programmable Gate Array, Approximate Computing

## SECTION I. Introduction

Light detection and ranging (LiDAR) is an important perception technology for autonomous systems [^1]. It measures the time-of-flight (ToF) of emitted and reflected photons to reconstruct the depth information of the surrounding environment. Power is consumed by both emitting and detecting photons, which is crucial to sensing quality and safety [^2]. However, for the latest high-resolution spatial LiDAR depth reconstruction, the power consumption prohibitively limits its application on resource-constrained platforms, especially edge devices with limited battery supply and physical space, such as drone scene mapping with SLAM [^3], augmented reality [^4] and mobile robotics [^5].

Approximate computing (AC) is a promising paradigm for reducing power consumption in resource-constrained devices [^6], with wide applications in signal processing [^7] and machine learning [^8]. Reduced precision (RP), a common AC technique, lowers the accuracy of arithmetic operations by shortening the bit-width of the number representation, saving computational cost in memory and logic units as well as energy consumption [^9]. Power scaling, another AC technique, aggressively undervolts and overclocks the digital circuits, unleashing extra processing performance with further power reductions [^10].

Adopting AC in LiDAR applications not only advances the processing speed but also reduces the power consumption of the data processing unit near the photon detection components, where the energy saving also benefits the sensor through less thermal noise interference [^11]. However, in previous work [^12] [^13], the power consumption is estimated with simulation tools, assuming that real-time performance is achieved with parallel processing units on chip.

In this work, a new co-approximate compressive depth reconstruction with a parallel processing engine using an iterative *ℓ*1 solver, proximal gradient descent (PGD), is proposed for Field Programmable Gate Array (FPGA).

The contributions are summarized as follows:

- An approximate depth reconstruction engine with various numbers of parallel processing units.
- A power scaling approximation on FPGA for the depth reconstruction engine.
- A preliminary study of the co-approximate parallel accelerator in power, performance and efficiency.

In Section II, compressive depth reconstruction is briefly described as well as the co-approximation. In Section III, the architecture of parallel approximate depth reconstruction engine is introduced with the power scaling instrument framework. The co-approximate performance is presented and the energy efficiency is evaluated in Section IV. A conclusion is given in Section V.

## SECTION II. Background

### A. Compressive Depth Reconstruction

To relieve the burden of raw histogram measurements in high-resolution LiDAR, a checkerboard compressive depth sensing (CBCS) solution applies compressive sensing to depth imaging and processes blocks of the depth image independently in parallel [^14]. The depth is reconstructed by solving two lasso problems [^15]: one for reconstructing the *depth-sum*, ${x_Q} \in {{\mathbb{R}}^{{n_B}}}$, where *nB* is the number of pixels in block *B*, and one for reconstructing the *photon count intensity*, ${x_I} \in {{\mathbb{R}}^{{n_B}}}$. Specifically, given ${y_Q} \in {{\mathbb{R}}^{{m_B}}}$ (resp. ${y_I} \in {{\mathbb{R}}^{{m_B}}}$) compressive measurements of the depth-sum (resp. photon count intensity), *xQ* and *xI* are recovered by solving

$$
\begin{align*} & \mathop {\operatorname{minimize} }\limits_{{x_Q}} \frac{1}{2}\left\| {{y_Q} - \bar A{x_Q}} \right\|_2^2 + \lambda {\left\| {F{x_Q}} \right\|_1}\tag{1a} \\ & \mathop {\operatorname{minimize} }\limits_{{x_I}} \frac{1}{2}\left\| {{y_I} - \bar A{x_I}} \right\|_2^2 + \lambda {\left\| {F{x_I}} \right\|_1}\tag{1b}\end{align*}
$$

where $\bar A \in {\{ 0,1\} ^{{m_B} \times {n_B}}}$ is a known binary matrix that encodes the active pixels in each block for each measurement in *yQ* or *yI*, and $F \in {{\mathbb{R}}^{{n_B} \times {n_B}}}$ is a sparsifying matrix, e.g. an invertible DCT matrix. *λ* > 0 is a regularisation parameter, and ∥ · ∥2 and ∥ · ∥1 are, respectively, the *ℓ*2- and *ℓ*1-norms. After recovering *xQ* and *xI*, the final depth image at block *B* is obtained by dividing *xQ* by *xI* point-wise: ${x_D} = {x_Q} \,./\, {x_I} \in {{\mathbb{R}}^{{n_B}}}$. In this work, PGD is adopted to solve the twin lasso problems in (1) efficiently [^12].
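As an illustration of how PGD can handle a problem of the form (1a) or (1b), the following is a minimal floating-point sketch, not the engine described in this paper. It assumes that *F* is an orthonormal DCT-II matrix, so that substituting *z* = *Fx* turns the problem into a standard lasso in *z*; the function names, the step-size rule and the iteration count are illustrative assumptions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix F of size n x n, so that F @ F.T = I."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    F = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    F[0, :] /= np.sqrt(2.0)     # rescale the DC row for orthonormality
    return F

def soft_threshold(z, t):
    """Proximal operator of t * ||z||_1 (element-wise shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def pgd_lasso(A, y, F, lam, n_iter=500):
    """Approximately minimise 0.5*||y - A x||_2^2 + lam*||F x||_1.

    Assumes F is orthonormal, so x = F.T @ z with z = F @ x.
    """
    M = A @ F.T                              # measurement operator acting on z
    step = 1.0 / np.linalg.norm(M, 2) ** 2   # 1 / Lipschitz constant of the gradient
    z = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = M.T @ (M @ z - y)             # gradient of the quadratic term
        z = soft_threshold(z - step * grad, step * lam)
    return F.T @ z                           # map back to the pixel domain
```

Under these assumptions, calling `pgd_lasso` once with *yQ* and once with *yI* and dividing the two results point-wise would give the block depth estimate *xD*.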

![Figure 1](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/10887540/10887541/10888441/wu1-p5-wu-large.gif)

*Fig. 1: Parallel Depth Reconstruction Engine and System Architecture*

### B. Approximate Computing

#### 1) Reduced Precision

Reduced precision (RP) is an approximation technique in which the bit width of a number in binary format is shortened, enabling lower-cost hardware implementations as well as lower energy consumption in computation and data storage [^9]. Only fixed-point (FXP) representation is considered in this preliminary work for the approximate depth reconstruction engine. It stores real numbers with a fixed number of digits for the integer and fraction parts. The value of a fixed-point number is given by:

$$
\begin{equation*}Int + Frac \times {2^{ - {\text{frac}\_\text{bits}}}}\tag{2}\end{equation*}
$$

where *Int* denotes the value held in the integer bits and *Frac* denotes the value held in the fractional bits. Shortening the lengths of *Int* and *Frac* simplifies the arithmetic implementation but introduces inaccuracy into the computation.
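The effect of (2) can be sketched in software as follows. This is only an illustrative model: the default widths (10 integer bits, 12 fractional bits) follow the configuration reported later in Section III-B, while the rounding mode, the saturation behaviour and the convention that the sign bit is counted within the integer bits are assumptions, as are the function names.

```python
def to_fixed(x, int_bits=10, frac_bits=12):
    """Quantise x to a signed fixed-point raw integer with saturation.

    Assumption: the sign bit is counted within int_bits, giving a
    total word length of int_bits + frac_bits.
    """
    scale = 1 << frac_bits
    lo = -(1 << (int_bits + frac_bits - 1))      # most negative raw value
    hi = (1 << (int_bits + frac_bits - 1)) - 1   # most positive raw value
    raw = int(round(x * scale))                  # round to the nearest step
    return max(lo, min(hi, raw))                 # saturate on overflow

def from_fixed(raw, frac_bits=12):
    """Convert a raw fixed-point integer back to a real value."""
    return raw / (1 << frac_bits)
```

Within the representable range, the round-trip error of this model is bounded by half a quantisation step, 2^-(frac_bits+1), which is the inaccuracy that shortening *Frac* introduces.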

#### 2) Power Scaling

Power consumption of modern digital signal processing circuits is the sum of:

$$
\begin{align*} & {P_{{\text{static}}}} = \mathop \sum \limits^C {I_{{\text{leakage}}}} \cdot V,\tag{3a} \\ & {P_{{\text{dynamic}}}} = \alpha \cdot C \cdot F \cdot {V^2},\tag{3b}\end{align*}
$$

where *α* is a constant depending on the process technology, *C* is the capacitance of resource utilization, *F* is the operating frequency, *V* is the supply voltage, and *I*leakage is the leakage current [^16]. The total power of FPGAs can be reduced by decreasing *α*, *C*, *F* and *V*, while throughput can be improved by increasing *F*. Power scaling, using undervolting of *V* [^17] and overclocking of *F* [^18], improves the energy efficiency but makes the digital circuits vulnerable to random bit-flip errors. This work develops a framework for compressive depth reconstruction that combines power scaling with reduced precision, enabling a parallel processing engine on FPGA with less power and better performance for environment sensing.
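To make the scaling in (3) concrete, the sketch below computes the relative change in each term when *V* and *F* are scaled and *α* and *C* are held fixed. It is a first-order model only: the function names are illustrative, and treating the leakage current as independent of voltage is an idealising assumption.

```python
def dynamic_power_ratio(v_new, v_old, f_new, f_old):
    """P_dynamic(new) / P_dynamic(old) from (3b), with alpha and C fixed."""
    return (f_new / f_old) * (v_new / v_old) ** 2

def static_power_ratio(v_new, v_old):
    """P_static(new) / P_static(old) from (3a).

    Assumption: the leakage current itself does not change with voltage.
    """
    return v_new / v_old
```

For example, under this model, lowering the voltage from 0.85 V to 0.65 V at an unchanged frequency gives `dynamic_power_ratio(0.65, 0.85, 1.0, 1.0)` of about 0.58, i.e. roughly a 42% reduction in the dynamic term.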

## SECTION III. Frameworks

### A. Depth Reconstruction Engine

The parallel approximate depth reconstruction engine is implemented through high-level synthesis and deployed on a Xilinx MPSoC, where each unit includes the twin lasso optimization process using the PGD solver.

Figure 1 shows the hardware architecture of the Processing System (PS) and Programmable Logic (PL) on a Xilinx Ultrascale+ FPGA, with the parallel depth reconstruction engine on the PL side.

On the PS side, the AXI Master is configured for streaming data from PS to PL. The AXI Slave is configured to stream data back from the PL. The SD peripheral I/O is configured for Linux booting and data preservation, while the *IIC* I/O is configured to enable PS PMBus access for adjusting the voltage in power scaling approximation.

On the PL side, all the depth reconstruction units are connected to the direct memory access (DMA) controller to form a parallel processing engine. The DMA is configured to write and read data through streaming FIFOs to and from the depth reconstruction units. The multiple depth reconstruction units share the FIFO data path to and from the DMA, so the more parallel depth reconstruction units there are, the more data is streamed over the AXI bus in one package.

Table I shows the resource utilization of the implemented parallel depth reconstruction engine and the system architecture on the Xilinx Ultrascale+ ZCU106, where the latency is measured by profiling the time of data streaming and execution on the PL for a single frame.

![Figure 2](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/10887540/10887541/10888441/wu.t1-p5-wu-large.gif)

*TABLE I:*

As shown, the approximate accelerator has advantages in both resource utilization and processing performance. For a single unit, the resource cost is reduced by about 10% in Look-Up Tables (LUT), over 75% in DSPs and around 20% in Flip-Flops (FF), at the cost of one more BRAM. Processing is more than 3.5× faster, although the approximate engine does not achieve real-time performance, reaching only 22 frames per second. This can be overcome by using more units, up to 172×; however, the resource cost increases linearly, leading to higher power consumption. By co-approximating with the power scaling technique, the energy efficiency can be further improved.

### B. Co-Approximation

Firstly, the fixed-point arithmetic is approximated for the reduced precision implementation under the nominal voltage and frequency, where a 22-bit width is chosen with 10 integer bits and 12 fractional bits. This option is chosen because it gives the least fidelity loss in depth reconstruction. Figure 2 shows the reconstructed depth image using single precision floating point and approximate fixed point, as well as the reference depth image.

![Figure 3](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/10887540/10887541/10888441/wu2-p5-wu-large.gif)

*Fig. 2: Reduced Precision Performance*

The mean square error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) are used to assess the fidelity of the reconstructed depth image against the reference. As shown, the approximate depth image loses less than 0.1 dB in PSNR and 0.05 in SSIM, with a slightly higher mean square error, compared to single precision. Though only a single approximate depth image is shown here, the result is consistent over all cases using various numbers of units.
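For reference, MSE and PSNR can be computed as in the short sketch below (SSIM is omitted because it involves windowed local statistics). The `peak` argument, the maximum possible depth value, is an assumed parameter, since the value used for the reported figures is not stated here.

```python
import numpy as np

def mse(ref, img):
    """Mean square error between a reference and a reconstructed image."""
    ref = np.asarray(ref, dtype=float)
    img = np.asarray(img, dtype=float)
    return float(np.mean((ref - img) ** 2))

def psnr(ref, img, peak):
    """Peak signal-to-noise ratio in dB; `peak` is the assumed maximum value."""
    err = mse(ref, img)
    if err == 0.0:
        return float("inf")   # identical images
    return float(10.0 * np.log10(peak ** 2 / err))
```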

Then the power scaling approximation is implemented by instrumenting a software application to adjust the voltage and frequency of the PL. The power rails are accessed and controlled by the on-chip power management integrated circuit (PMIC) via the *IIC* interface. The output voltage is modified using PMBus protocol commands from the ARM processor. The clock of the PL is adjusted by interfacing with the Xilinx device driver of the Phase Locked Loop (PLL) unit through a Linux system file. Note that this does not set the actual frequency of the PL directly but adjusts the PL frequency in proportion to the default value. For example, the default value read from the system file is 1e6. Tuning it to 1.5e6 boosts the output clock rate from PS to PL by 1.5×. For simplicity, this factor is called the boost ratio in the rest of the paper, indicating the proportion of frequency scaling.

A power management application is developed to monitor and scale the voltage through the *IIC* interface, as well as to scale the clock through the PLL unit, with run-time visualization of voltage, current and power [^19].
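The boost-ratio mechanism can be sketched as follows. This is a hypothetical illustration: `rate_file` stands in for the driver's system file, whose actual path is not given here, and the function names are assumptions rather than part of the authors' application.

```python
from pathlib import Path

def apply_boost_ratio(rate_file, default_value, boost_ratio):
    """Write default_value * boost_ratio to rate_file and return the value written.

    rate_file is a hypothetical placeholder for the PLL driver's system file.
    """
    new_value = int(round(default_value * boost_ratio))
    Path(rate_file).write_text(f"{new_value}\n")
    return new_value

def read_rate(rate_file):
    """Read the integer rate value currently stored in rate_file."""
    return int(Path(rate_file).read_text().strip())
```

With a default value of 1e6, `apply_boost_ratio(path, 1_000_000, 1.5)` writes 1500000, matching the 1.5× example in the text.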

## SECTION IV. Evaluations

The co-approximation is evaluated on a Xilinx MPSoC evaluation board, the Zynq Ultrascale+ ZCU106. The parallel approximate depth reconstruction engine is designed and deployed using the Xilinx Vivado tool-set version 2019.1, targeting a clock rate of 250 MHz. The power scaling application runs on the ARM processor and is accessed from the host via SSH with X11 forwarding. After the voltage scaling, the input data is streamed from PS to PL and the outcome from PL is saved on the SD card for later comparison with the reference depth image data.

### 1) Performance

Firstly, undervolting is evaluated by changing the PL power rail, *V*CCINT, from the nominal 0.85 V down to 0.65 V. Table II shows an example of power consumption for voltage scaling, keeping the nominal boost ratio at 1. As shown, the power consumption decreases with the voltage and increases as the number of units grows. Note that the static and dynamic power can be separated by measuring the power when no design is configured on the FPGA and subtracting it. We intend to show the total power consumption here, as the voltage affects both static and dynamic power.

![Figure 4](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/10887540/10887541/10888441/wu.t2-p5-wu-large.gif)

*TABLE II:*

Next, overclocking is evaluated by adjusting the boost ratio from 1 to 3 in steps of 0.5. Table III shows an example of power consumption for frequency scaling, keeping the lowest voltage of 0.65 V. As shown, the power consumption increases with the boost ratio. Because the frequency affects only the dynamic power, the total power consumption does not grow in direct proportion to it. Note that for 64 units there is no measurement at boost ratio 3 because the device halted.

![Figure 5](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/10887540/10887541/10888441/wu.t3-p5-wu-large.gif)

*TABLE III:*

In Table IV, the speedup of overclocking is also evaluated. FP32 with boost ratio 1 is taken as the baseline. The speedup is calculated as the ratio between the old execution time and the new execution time, using the profiling approach described in Section III-A. As shown, the speedup increases almost linearly with the boost ratio, up to about 421× with 64 parallel units and a boost ratio of 2.5.

![Figure 6](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/10887540/10887541/10888441/wu3-p5-wu-large.gif)

*Fig. 3: Energy Efficiency (Frame/Second/Watt)*

![Figure 7](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/10887540/10887541/10888441/wu.t4-p5-wu-large.gif)

*TABLE IV:*

### 2) Fidelity

In the co-approximation, reduced precision introduces fidelity loss into the reconstructed depth image but significantly improves the resource utilization and processing speed. Power scaling using undervolting and overclocking further reduces the power consumption and increases the speedup, but can introduce random bit-flip errors that affect the accuracy of depth reconstruction, especially when the voltage is scaled down and the frequency is scaled up simultaneously.

Table V shows the reconstructed depth images selected by prioritising lower voltage in co-approximation. The numbers in brackets are the voltage and boost ratio, respectively.

![Figure 8](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/10887540/10887541/10888441/wu.t5-p5-wu-large.gif)

*TABLE V:*

As shown, single precision FP32, using the lowest voltage and highest boost ratio, still maintains the same fidelity as Figure 2b. Taking the fixed-point fidelity in Figure 2c as the reference, a minor degradation occurs when 1 unit is adopted with the lowest voltage and highest boost ratio. By slightly increasing the voltage or reducing the frequency, fidelity similar to that in Figure 2c can be achieved. As the number of units increases, there is a clear trend that less undervolting and overclocking can be applied while maintaining comparable depth image fidelity.

### 3) Efficiency

By measuring the performance and power consumption with various voltages, frequencies and numbers of units, the efficiency of the co-approximate parallel depth reconstruction engine is evaluated as the throughput of the parallel engine against the consumed power, in Frame/Second/Watt. Figure 3 illustrates the calculated efficiency for all possible combinations of factors. Single precision using a single unit with the nominal voltage, 0.85 V, and boost ratio 1 is taken as the baseline. As shown, the higher the boost ratio or the lower the voltage, the higher the efficiency. Furthermore, as the number of parallel processing units grows, a higher Frame/Second/Watt is achieved, up to 3500, which is over 43× that of the baseline.
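The efficiency metric can be written out as below. This is a minimal sketch with illustrative function names; it assumes the throughput is the reciprocal of the measured per-frame latency, and any numbers passed in are placeholders rather than measurements from this work.

```python
def frames_per_second(latency_s):
    """Throughput, assuming one frame completes every latency_s seconds."""
    return 1.0 / latency_s

def efficiency(latency_s, power_w):
    """Energy efficiency in Frame/Second/Watt."""
    return frames_per_second(latency_s) / power_w

def improvement(eff_new, eff_baseline):
    """Efficiency gain of a configuration relative to the baseline."""
    return eff_new / eff_baseline
```

As a purely hypothetical example, a configuration taking 0.01 s per frame at 2 W would score `efficiency(0.01, 2.0)`, i.e. 50 Frame/Second/Watt.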

## SECTION V. Conclusion

In this work, parallel compressive depth reconstruction using co-approximation, with joint reduced precision and power scaling, is introduced. By instrumenting voltage and frequency scaling of a Xilinx Ultrascale+ FPGA at run-time, the power consumption and processing speed are adjusted while maintaining depth image fidelity comparable to the baseline single precision outcome. The results show significant power saving and speedup using the co-approximate approach, and the efficiency in throughput per watt improves by over 43× compared to the single precision implementation.

## Footnotes

[^19]: The application executable and a simple user guide are publicly available at https://github.com/wincle626/ZCU106PowerMonitor

## References

[^1]: K. Park, S. Kim, and K. Sohn, “High-precision depth estimation with the 3d lidar and stereo fusion,” in 2018 IEEE International Conference on Robotics and Automation (ICRA), 2018, pp. 2156–2163. [IEEE](https://ieeexplore.ieee.org/document/8461048)

[^2]: A. Aßmann, Y. Wu, B. Stewart, and A. M. Wallace, “Accelerated 3d image reconstruction for resource constrained systems,” in 2020 28th European Signal Processing Conference (EUSIPCO), 2021, pp. 565–569. [IEEE](https://ieeexplore.ieee.org/document/9287749)

[^3]: R. Latif and A. Saddik, “Slam algorithms implementation in a uav, based on a heterogeneous system: A survey,” in 2019 4th World Conference on Complex Systems (WCCS), 2019, pp. 1–6. [IEEE](https://ieeexplore.ieee.org/document/8930783)

[^4]: L. Qian, J. Y. Wu, S. P. DiMaio, N. Navab, and P. Kazanzides, “A review of augmented reality in robotic-assisted surgery,” IEEE Transactions on Medical Robotics and Bionics, vol. 2, no. 1, pp. 1–16, 2020. [IEEE](https://ieeexplore.ieee.org/document/8918274)

[^5]: M. Liu, Z. Hou, Z. Sun, N. Yin, H. Yang, Y. Wang, Z. Chu, and H. Kong, “Campus guide: A lidar-based mobile robot,” in 2019 European Conference on Mobile Robots (ECMR), 2019, pp. 1–6. [IEEE](https://ieeexplore.ieee.org/document/8870916)

[^6]: W. Liu, C. Gu, M. O’Neill, G. Qu, P. Montuschi, and F. Lombardi, “Security in approximate computing and approximate computing for security: Challenges and opportunities,” Proceedings of the IEEE, vol. 108, no. 12, pp. 2214–2231, 2020. [IEEE](https://ieeexplore.ieee.org/document/9244148)

[^7]: K. Roy and A. Raghunathan, “Approximate computing: An energy-efficient computing technique for error resilient applications,” in 2015 IEEE Computer Society Annual Symposium on VLSI, July 2015, pp. 473–475. [IEEE](https://ieeexplore.ieee.org/document/7309615)

[^8]: A. Ibrahim, M. Osta, M. Alameh, M. Saleh, H. Chible, and M. Valle, “Approximate computing methods for embedded machine learning,” in 2018 25th IEEE International Conference on Electronics, Circuits and Systems (ICECS), 12 2018, pp. 845–848. [IEEE](https://ieeexplore.ieee.org/document/8617877)

[^9]: A. Agrawal, J. Choi, K. Gopalakrishnan, S. Gupta, R. Nair, J. Oh, D. A. Prener, S. Shukla, V. Srinivasan, and Z. Sura, “Approximate computing: Challenges and opportunities,” 2016 IEEE International Conference on Rebooting Computing (ICRC), pp. 1–8, 2016. [IEEE](https://ieeexplore.ieee.org/document/7738674)

[^10]: Y. Wu, J. F. C. Mota, and A. M. Wallace, “Joint undervolting and overclocking power scaling approximation on fpgas,” in 2022 Sensor Signal Processing for Defence Conference (SSPD), 2022, pp. 1–5. [IEEE](https://ieeexplore.ieee.org/document/9896229)

[^11]: E. A. Webster, L. A. Grant, and R. K. Henderson, “A high-performance single-photon avalanche diode in 130-nm cmos imaging technology,” IEEE Electron Device Letters, vol. 33, no. 11, pp. 1589–1591, nov 2012. [IEEE](https://ieeexplore.ieee.org/document/6329401)

[^12]: Y. Wu, A. Assmann, B. Stewart, and A. M. Wallace, “Energy efficient approximate 3d image reconstruction,” IEEE Transactions on Emerging Topics in Computing, pp. 1–1, 2021. [IEEE](https://ieeexplore.ieee.org/document/9559866)

[^13]: Y. Wu, A. M. Wallace, J. F. Mota, A. Aßmann, and B. Stewart, “Efficient reconfigurable mixed precision ℓ1 solver for compressive depth reconstruction,” Journal of Signal Processing Systems, vol. 94, no. 10, pp. 1083–1099, Oct 2022. [DOI](https://doi.org/10.1007/s11265-022-01766-3)

[^14]: A. Aßmann, B. Stewart, J. F. C. Mota, and A. M. Wallace, “Compressive super-pixel lidar for high-framerate 3d depth imaging,” in 2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2019, pp. 1–5. [IEEE](https://ieeexplore.ieee.org/document/8969177)

[^15]: R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 58, pp. 267–288, 1996. [Online]. Available: https://www.jstor.org/stable/2346178

[^16]: R. Woods, J. McAllister, R. Turner, et al., FPGA-Based Implementation of Signal Processing Systems. Wiley Publishing, 2008. [DOI](https://doi.org/10.1002/9780470713785)

[^17]: B. Salami, O. Unsal, and A. Cristal, “Fault characterization through fpga undervolting,” in 2018 28th International Conference on Field Programmable Logic and Applications (FPL), 2018, pp. 85–853. [IEEE](https://ieeexplore.ieee.org/document/8533473)

[^18]: M. Rowlings, A. M. Tyrrell, and M. A. Trefzer, “Operating beyond fpga tool limitations: Nervous systems for embedded runtime management,” in 2021 Design, Automation Test in Europe Conference Exhibition (DATE), 2021, pp. 68–71. [IEEE](https://ieeexplore.ieee.org/document/9474251)
