# FOG: Fast Octree Generator for LiDAR Point Clouds

## Abstract

As the need for realistic and immersive 3-D representations of the environment continues to increase across various industries, finding efficient ways to represent data has become paramount. A well-known approach to partitioning 3-D space into a structured data format is the use of octrees, primarily due to their efficiency in handling both sparse and dense 3-D data. This method is particularly useful in applications involving automotive light detection and ranging (LiDAR) sensors, which are widely used in autonomous driving systems for their ability to capture detailed spatial information in real-time. This letter introduces the fast octree generator (FOG) algorithm, a novel approach for generating octrees from 3-D LiDAR point clouds that leverages hardware acceleration. FOG achieves a performance improvement of up to 88.8% compared to PCL's octree implementation, enabling real-time octree generation for high-end sensors on embedded platforms.

## Authors

Ricardo Roriz *ALGORITMI Research Center/LASI, University of Minho, Guimarães, Portugal* [ORCID: 0000-0002-8543-550X](https://orcid.org/0000-0002-8543-550X)

Diogo Costa *ALGORITMI Research Center/LASI, University of Minho, Guimarães, Portugal*

Mongkol Ekpanyapong *School of Engineering and Technology, Asian Institute of Technology, Pathum Thani, Thailand*

Tiago Gomes *ALGORITMI Research Center/LASI, University of Minho, Guimarães, Portugal* [ORCID: 0000-0002-4071-9015](https://orcid.org/0000-0002-4071-9015)

## Publication Information

**Journal:** IEEE Sensors Letters **Year:** 2025 **Volume:** 9 **Issue:** 1 **Pages:** 1-4 **DOI:** [10.1109/LSENS.2024.3520800](https://doi.org/10.1109/LSENS.2024.3520800) **Article Number:** 10816469 **ISSN:** Electronic ISSN: 2475-1472

## Metrics

**Paper Citations:** 1 **Total Downloads:** 170

---

## Keywords

**IEEE Keywords:** Octrees, Three-dimensional displays, Point cloud compression, Sensors, Image reconstruction, Laser radar, Codes, Real-time systems, PSNR, Field programmable gate arrays

**Index Terms:** Point Cloud, Light Detection And Ranging, 3D Space, Hardware Accelerators, Processing Speed, Real-time Performance, Bounding Box, Root Node, Design Considerations, Level Of Resolution, Peak Signal-to-noise Ratio, Leaf Node, Child Nodes, Obstacle Avoidance, Hardware Resources, Depth-first, Peak Signal-to-noise Ratio Values, Higher Peak Signal-to-noise Ratio, Occupational Codes, Point Cloud Reconstruction, Original Point Cloud

**Author Keywords:** Sensor applications, 3-D light detection and ranging (LiDAR), field-programmable gate array (FPGA), octree

undefined
## SECTION I. Introduction

With the increasing demand for realistic 3-D environments, the need for sophisticated representations has surged across various industries, including car manufacturing. Light detection and ranging (LiDAR) sensors are becoming a key component in modern vehicles to assist autonomous tasks, as they can generate rich 3-D data at high rates. However, processing this massive information can be quite challenging since most onboard computing platforms are composed of embedded systems with limited hardware, while sensors present further advancements every year.

One of the most effective approaches for representing 3-D data is the utilization of octrees [^1]. As depicted in Fig. 1, the octree's hierarchical structure partitions the 3-D space into progressively smaller regions, optimizing the organization, storage, retrieval, and manipulation of spatial data. This makes octrees particularly valuable in driving-related tasks [^2], such as real-time obstacle detection and dynamic environment mapping and planning. In addition, their inherent downsampling structure facilitates the management of large LiDAR datasets, reducing overall data storage and computational demands.

![Figure 1](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/7782634/10790557/10816469/roriz1-3520800-large.gif)

*Fig. 1. Octree visualization in 3-D space.*

Point cloud library (PCL) [^3], a widely used library in robotics and computer vision, provides different octree implementations according to their target application. While some implementations support complex data structures to offer fast search functions, others target data storage, applying entropy-based compression techniques to the output bitstream. Nonetheless, using octrees in real-time applications can still face some limitations. While efficiency and rapid access to data are guaranteed, the complexity and time required to create and update the data structure can become a bottleneck, especially when deployed in resource-constrained devices.

This letter introduces the fast octree generator (FOG) algorithm, a hardware-accelerated approach for the generation of octrees that implements a multitree architecture to parallelize the generation of the final occupancy bitstream. To further improve performance, FOG predefines octrees actuation boundaries and simplifies their computations by utilizing fixed-point arithmetic. The conducted evaluation shows an execution performance improvement of up to nearly 89% compared to PCL's octree implementation, enabling real-time octree generation for high-end sensors on the embedded platform Xilinx Zynq UltraScale+.

## SECTION II. FOG: Fast Octree Generator

FOG is an octree implementation optimized for resource-constrained embedded systems. Its primary goal is to generate memory-efficient octrees and node's occupation code bitstream in hardware to enable the processing of LiDAR data in real-time. FOG design focuses on key aspects that significantly shape the generation of the octree, such as its bounding box (BB), internal structure, and how the bitstream is created.

### A. Design Considerations

*BB:* The BB is the volume that defines the limits of the 3-D data. It sets the minimum and maximum coordinates along all axes, establishing the 3-D space within which the octree is built. Most octree implementations select the BB in a preliminary step [^4], analyzing the point cloud to establish the maximum and minimum boundaries, while other approaches use a dynamic BB, expanding it when new points are inserted into the octree [^5]. FOG constructs the octree using a predefined BB, which often leaves a few points outside the specified region of interest (ROI). These distant points are usually considered noise as they do not contain useful information about the surroundings. Nonetheless, this approach significantly optimizes resources and computations during the creation of the octree.

*Octree structure:* The spatial regions within the 3-D space are represented by nodes inside the octree. The tree structure is defined by the interconnections between these nodes and the information they store. Due to the sparsity of LiDAR data, octree structures generally connect nodes using memory pointers [^6]. This allows trees to expand only where data exists, avoiding unnecessary subdivisions of empty regions and ultimately optimizing memory usage. The smallest subdivisions within the octree are known as leaves, which are determined by a stop condition. In some implementations like OctSqueeze [^7], leaf nodes are determined by the maximum number of points that a node can hold. For nodes containing more points than a threshold, the structure further subdivides them until meeting the criteria. Thus, this structure enables finer subdivisions in dense areas and coarser resolutions in sparse regions. In contrast, OcTr [^8] uses a fixed-depth approach, establishing a fixed number of subdivisions from root to leaf nodes in any branch. This strategy, which is also adopted by FOG, simplifies traversal algorithms, making them faster and easier to implement. Octrees can also include extra information in the leaf nodes, such as memory pointers to point cloud points [^9]. Despite allowing to preserve the original point cloud precision, which can be required for segmentation applications, it significantly increases the memory footprint of the generated octree. FOG does not store any additional information on leaf nodes. Instead, it generates a rough approximation of the original point(s) by calculating the centroid of their voxel.

*Bitstream generation:* Encoding an octree structure into a bitstream typically involves traversing the tree to retrieve the occupation code of each node. One of the most common strategies is the depth first search (DFS), which fully traverses a branch before backtracking to the next one. The sequential nature of DFS ensures that the final bitstream includes independently generated branch bitstreams, concatenated in a predictable order. This makes DFS particularly suitable for parallelized processing, such as deployed in hardware by FOG.

### B. Implementation

*Multitree architecture:* FOG implementation adopted a multitree architecture, dividing the spatial data into eight smaller octrees, each managing an octant from the BB. This approach enables parallel processing, where multiple points can be simultaneously processed and inserted into the octrees. After all points are inserted, the occupation sequence from all octrees is concatenated to form the final bitstream structure, where each octree can be seen as a branch of a larger parent octree.

*Octree instances:* Each octree allocates dedicated memory to store its nodes, organized sequentially and identified by their index. The structure of a node contains only the information of its eight child nodes. To allow backtrack operations, every parent node id is stored in a memory stack, making the overall octree structure more compact and efficient. A level counter complements this process by monitoring the depth of the octree and identifying whether a node is a leaf or an intermediate node. To further reduce memory utilization, leaf nodes are not stored as nodes, which would require the representation of eight empty children. Instead, they are only represented within their parent node. This strategy is represented in Fig. 2, which depicts the data structure of an octree with a maximum depth of 3. The root node (node #1), octree depth 0, contains three children nodes, i.e., node #2, #12, and #24. By its turn, node #2 is the parent node of nodes #3, #12, #26, #70, and #90. Since the second child of node #26 is a leaf node (depth level 3)—marked with “1”—it does not require further memory resources to store empty nodes.

![Figure 2](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/7782634/10790557/10816469/roriz2-3520800-large.gif)

*Fig. 2. Memory data structure of an octree managed by FOG.*

Inserting points into the structure and creating the occupation code is also a task of each tree. Points are inserted into the last inserted node, optimizing the traversal speed by leveraging the spatial proximity of consecutive points. The algorithm evaluates each node's BB to determine if the point lies within it, descending to the child nodes if inside or ascending to the parent node if outside. The insertion process is complete when the point is discarded for being outside the root's BB, or for being successfully added to a leaf node. Once all points are inserted, the occupation code bitstream is created, using the DFS to traverse the octree.

## SECTION III. Evaluation

FOG was implemented and tested using the ALFA framework, a set of tools designed to support developers in creating and evaluating LiDAR-based algorithms on embedded systems with hardware acceleration support. The evaluation setup includes the platform ZCU104 Evaluation Kit from Xilinx, which features the Zynq UltraScale+ MPSoC powered by a quad-core Arm Cortex-A53 processor, field-programmable gate array (FPGA) with 504 K logic cells and 1728 DSP slices, and 2 GB of DDR4 memory.

FOG was first implemented in software (*sFOG*) to validate its design. After, the octree generation and occupation code extraction were implemented in hardware (*hFOG*) in the FPGA. The evaluation uses data from four Velodyne LiDAR sensors: VLP-16, HDL-32, HDL-64, and VLS-128, collected in different environments, such as rough terrain, urban, and highways. For each dataset, the proposed method was evaluated using three different octree resolutions: low (10 cm), medium (5 cm), and high (2.5 cm). These values correspond to the size of the voxel created when the octree reaches its maximum depth, which will directly affect the resolution of the reconstructed point cloud, as shown in Fig. 3.

![Figure 3](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/7782634/10790557/10816469/roriz3-3520800-large.gif)

*Fig. 3. Visual representation of the octree degradation effects on a Velodyne VLS-128 frame across different resolution levels. (a) Original point cloud. (b) Reconstructed point cloud from high resolution (2.5 cm) octree. (c) Reconstructed point cloud from medium resolution (5 cm) octree. (d) Reconstructed point cloud from low resolution (10 cm) octree.*

The evaluation compares FOG with the *octreepointcloudcompression* library from the PCL, whose implementation provides structural and functional features similar to FOG. The configuration parameters are summarized in Table 1, where the *octreeResolution* varies between 10, 5, and 2.5 cm. By its turn, FOG is defined by three parameters: the *numberOfNodes*, which was set to 250 000, that is necessary in the *hFOG* to allocate the required hardware resources to support the octree creation; the *BBCoordinates*, which defines the boundaries (coordinates) of the BB; and the *maxDepth*. Due to hardware limitations, two sets of *BBCoordinates* were defined to accommodate all datasets in the selected platform.

The first set, *Full BB*, is a cube-shaped BB with 200 m, centered on the sensor. For this BB, the *maxDepth* varied between 11, 12, and 13, correspond to resolutions of 10, 5, and 2.5 cm, respectively. The *full BB* was applied to all resolution levels of the VLP-16 and HDL-32 datasets, as well as the low and medium resolutions of the HDL-64 dataset. The second *BBCoordinates* configuration, defines a noncubic *BB* with a specific ROI with a 100 m length and 12 m width. This approach is commonly used with high-resolution sensors, such as the VLS-128, when the hardware resources are limited, and when real-time processing is required. The ROI includes important regions of the point cloud, such as the environment in the direction of the movement of the sensor [^10]. Since the size of the BB was reduced, the same octree resolutions of 10, 5, and 2.5 cm, can be achieved with the *maxDepth* set to 8, 9, and 10, respectively. The *ROI BB* was applied to the high resolution level of the HDL-64 dataset, and to all resolution levels of the VLS-128 dataset.

![Figure 4](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/7782634/10790557/10816469/roriz.t1-3520800-large.gif)

*Table 1*

### A. Performance Evaluation and Hardware Resources

*Execution performance:* The processing time of a point cloud frame for each different dataset/sensor across the different resolution levels for *PCL*, *sFOG*, and *hFOG* are summarized in Table 2, where $\bar{x}$ represents the calculated average and $\sigma$ the standard deviation for the full dataset. Overall, the *hFOG* consistently outperforms both *PCL* and *sFOG* across all datasets and resolutions. The *hFOG* at the medium resolution processes a VLP-16 frame in 13.7 ms, which represents an 87.6% improvement over the *PCL* (111 ms), and an 88.2% improvement over *sFOG* version (116 ms). For the HDL-64 with the medium resolution configuration, the *hFOG* processes one point cloud frame in 39.4 ms, which is 88.8% faster than *PCL*, while for the VLS-128, *hFOG* processes one frame in 47.2 ms, which corresponds to an improvement of 68.5% compared to *PCL*.

![Figure 5](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/7782634/10790557/10816469/roriz.t2-3520800-large.gif)

*Table 2*

*Reconstructed point cloud size and peak signal-to-noise ratio (PSNR):* The reconstructed point cloud size indicates how much information of the original data can be recovered, while the PSNR quantifies the similarity between the original and the reconstructed point cloud. Higher PSNR indicates better quality, meaning the reconstructed image is closer to the original, while lower PSNR indicates higher distortion in the reconstructed image. Typically, a PSNR range of 30–50 dB indicates high-quality point cloud reconstruction. Table 3 summarizes the obtained results, showing that *PCL* and *sFOG* provide similar performance in almost all datasets. At the medium resolution level, the *PCL* is able to recover nearly 64.8% of the original VLP-16 dataset with a peak signal-to-noise ratio (PSNR) of 70.9 dB, while *sFOG* recovers 64.8% of the original point cloud with a PSNR of 70.8 dB. For the same dataset, the *hFOG* reconstructs 60.5% of the point cloud with a PSNR of 57.7 dB. Similar trends are observed in the other datasets, where *hFOG* consistently reconstructs smaller percentages of the point cloud while still achieving good PSNR values.

![Figure 6](https://ieeexplore.ieee.org/mediastore/IEEE/content/media/7782634/10790557/10816469/roriz.t3-3520800-large.gif)

*Table 3*

*Hardware resources:* Deploying hardware accelerators to speed up FOG comes at the cost of FPGA hardware resources. In the selected platform, FOG uses 37.31% of the available 230,400 LUTs, 2.53% of the 460 800 flip–flops, 87.18% of the 312 BRAM tiles, fully occupies all 96 URAM blocks, 5.58% of the 1720 DSP units and 54.95% of the 201 600 MUXes.

### B. Discussion

The results presented in Tables 2 and 3 show that, under similar conditions, *sFOG* and PCL can deliver similar performance results in terms of PSNR and point cloud size, being that the *sFOG* is slightly faster, but without being able to achieve real-time performance. However, the *hFOG* is able to provide real-time performance across all datasets, at the cost of a small PSNR degradation, which is mainly caused by hardware design considerations, such as fixed-point arithmetic instead of the floating-point used by both PCL and *sFOG*. This is particularly noticeable in the HDL-64 and VLS-128 datasets, mainly due to the density of their point clouds. However, when real-time tasks are required, such as obstacle detection and collision avoidance, FOG can be a good alternative to PCL. On the other hand, when the point cloud must be closer to the original data, e.g., for high-definition maps or detailed object classification, the PCL can have the upper hand as it can provide slightly higher PSNR values. In FOG, the resolution of the octree is determined by setting a BB and the maximum depth level. This results in noncubic leaf voxels when non-cubic BBs are applied, meaning that different resolutions exist across the axes, whereas PCL maintains a uniform resolution. Finally, the performance of *hFOG* is significantly influenced by hardware resource constraints, with the fixed number of nodes being the most prominent limitation. When processing high-resolution or densely populated datasets, the required nodes can vary depending on the specific frame or frame region. As *hFOG* ’s predefined BB cannot be dynamically adjusted on a frame-by-frame basis, it is essential to carefully define an appropriate BB in advance in order to find a balance between resources used, processing speed and region covered by the octree.

## SECTION IV. Conclusion

This letter presents FOG: an octree generation method optimized for embedded real-time point cloud processing. While the software version performs comparably to PCL's implementation, the hardware-accelerated FOG showed notable speed improvements over software alternatives. However, its optimization for processing speed led to a slight PSNR reduction in dense or high-resolution scenarios. Future work will focus on improving the memory efficiency of the octree structure, allowing the implementation of more nodes within the same hardware platform. Also, FOG could be further extended to integrate a compression step after retrieving the occupation code, similar to PCL's *OctreePointCloudCompression* library, reducing the memory size required to store point cloud data.

## References

[^1]: D. Meagher, “Geometric modeling using octree encoding,” Comput. Graph. Image Process., vol. 19, no. 2, pp. 129–147, 1982. [DOI](https://doi.org/10.1016/0146-664X(82)90104-6) [Google Scholar](https://scholar.google.com/scholar?as_q=Geometric+modeling+using+octree+encoding&as_occt=title&hl=en&as_sdt=0%2C31)

[^2]: R. Roriz, J. Cabral, and T. Gomes, “Automotive LiDAR technology: A survey,” IEEE Trans. Intell. Transp. Syst., vol. 23, no. 7, pp. 6282–6297, Jul. 2022. [IEEE](https://ieeexplore.ieee.org/document/9455394) [Google Scholar](https://scholar.google.com/scholar?as_q=Automotive+LiDAR+technology%3A+A+survey&as_occt=title&hl=en&as_sdt=0%2C31)

[^3]: R. B. Rusu and S. Cousins, “3D is here: Point cloud library (PCL),” in Proc. 2011 IEEE Int. Conf. Robot. Autom., 2011, pp. 1–4. [IEEE](https://ieeexplore.ieee.org/document/5980567) [Google Scholar](https://scholar.google.com/scholar?as_q=3D+is+here%3A+Point+cloud+library+%28PCL%29&as_occt=title&hl=en&as_sdt=0%2C31)

[^4]: C. Fu, G. Li, R. Song, W. Gao, and S. Liu, “OctAttention: Octree-based large-scale contexts model for point cloud compression,” in Proc. AAAI Conf. Artif. Intell., Jun. 2022, pp. 625–633. [DOI](https://doi.org/10.1609/aaai.v36i1.19942) [Google Scholar](https://scholar.google.com/scholar?as_q=OctAttention%3A+Octree-based+large-scale+contexts+model+for+point+cloud+compression&as_occt=title&hl=en&as_sdt=0%2C31)

[^5]: J. Zhu, H. Li, Z. Wang, S. Wang, and T. Zhang, “i-Octree: A fast, lightweight, and dynamic octree for proximity search,” in Proc. 2024 IEEE Int. Conf. Robot. Autom., 2024, pp. 12290–12296. [IEEE](https://ieeexplore.ieee.org/document/10611019) [Google Scholar](https://scholar.google.com/scholar?as_q=i-Octree%3A+A+fast%2C+lightweight%2C+and+dynamic+octree+for+proximity+search&as_occt=title&hl=en&as_sdt=0%2C31)

[^6]: D. Graziosi, O. Nakagami, S. Kuma, A. Zaghetto, T. Suzuki, and A. Tabatabai, “An overview of ongoing point cloud compression standardization activities: Video-based (V-PCC) and geometry-based (G-PCC),” APSIPA Trans. Signal Inf. Process., vol. 9, no. 1, 2020, Art. no. e13. [DOI](https://doi.org/10.1017/ATSIP.2020.12) [Google Scholar](https://scholar.google.com/scholar?as_q=An+overview+of+ongoing+point+cloud+compression+standardization+activities%3A+Video-based+%28V-PCC%29+and+geometry-based+%28G-PCC%29&as_occt=title&hl=en&as_sdt=0%2C31)

[^7]: L. Huang, S. Wang, K. Wong, J. Liu, and R. Urtasun, “OctSqueeze: Octree-structured entropy model for LiDAR compression,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2020, pp. 1310–1320. [IEEE](https://ieeexplore.ieee.org/document/9157381) [Google Scholar](https://scholar.google.com/scholar?as_q=OctSqueeze%3A+Octree-structured+entropy+model+for+LiDAR+compression&as_occt=title&hl=en&as_sdt=0%2C31)

[^8]: C. Zhou, Y. Zhang, J. Chen, and D. Huang, “OcTr: Octree-based transformer for 3D object detection,” in Proc. 2023 IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2023, pp. 5166–5175. [IEEE](https://ieeexplore.ieee.org/document/10203760) [Google Scholar](https://scholar.google.com/scholar?as_q=OcTr%3A+Octree-based+transformer+for+3D+object+detection&as_occt=title&hl=en&as_sdt=0%2C31)

[^9]: J. Elseberg, D. Borrmann, and A. Nüchter, “One billion points in the cloud–an octree for efficient processing of 3D laser scans,” ISPRS J. Photogrammetry Remote Sens., vol. 76, pp. 76–88, 2013. [DOI](https://doi.org/10.1016/j.isprsjprs.2012.10.004) [Google Scholar](https://scholar.google.com/scholar?as_q=One+billion+points+in+the+cloud%E2%80%93an+octree+for+efficient+processing+of+3D+laser+scans&as_occt=title&hl=en&as_sdt=0%2C31)

[^10]: X. T. Nguyen, K.-T. Nguyen, H.-J. Lee, and H. Kim, “ROI-Based LiDAR sampling algorithm in on-road environment for autonomous driving,” IEEE Access, vol. 7, pp. 90243–90253, 2019. [IEEE](https://ieeexplore.ieee.org/document/8755965) [Google Scholar](https://scholar.google.com/scholar?as_q=ROI-Based+LiDAR+sampling+algorithm+in+on-road+environment+for+autonomous+driving&as_occt=title&hl=en&as_sdt=0%2C31)

### Additional References

