11 June 2013

Octree based point cloud stream compression in PCL

The point cloud produced from Kinect-like sensor is dense and updated at real-time frame rate (30 Hz). At usual case, 240k ~ 270k points are generated per second per Kinect depend on complexity of scene. As the dynamic point clouds are sampled at 30 Hz, it contains a total of about 7.5 million points per second. Since two Kinects have been used in this project, double of the point cloud is required to transmit.
The Point Cloud Library (PCL) has implemented an octree based compression class which enables both spacial and temporal compression at the same time. The full framework of the algorithm is introduced in [1].

Spacial compression:
The point cloud is decomposed into nodes by employing an octree structure which can be briefly illustrated as Figure 1. When constructing the octree, every node occupied by point(s) is set to non-empty (filled black in the figure). For each node contains point(s), we can further subdivide into 8 octants until the number of level or size of node reaches our predefined threshold. The child node configuration can be efficiently represented in a single byte (8-bit for 8 octants). Therefore, the whole octree structure is encoded into a stream, and the size of the stream equals to the number of branch nodes (unit is byte).

Figure 1 Schematic overview of the octree data structure and its serialization [1].

Temporal compression:
Since changes in the point cloud of the captured scene often occur only partially in confined areas, the correlation between currently sampled points and previously encoded point cloud can be exploited to improve compression performance. In [1] authors proposed a double-buffering octree method to solve the problem. Instead of eight child pointers to child nodes, the double-buffering use an additional set of eight child pointers to every branch. As shown in Figure 2, let's call them buffer A and B. Two buffers are alternatively assigned to every new captured point cloud. It is pointed out in the paper that the current point cloud in one octree structure allows us to spatially match, reference and analyse the structural changes within consecutive octree structure [1].

Figure 2 Illustration of differential encoding technique when using double-buffering octree [1].
In order to obtain and output the structural changes for every branch of two consecutive point clouds, the authors applied an exclusive disjunction operation (XOR) on the two bytes representing the node configuration for that branch over the whole octree structure.

Compression architecture:
In addition to the spacial and temporal encoded information as introduced above, for more precise and completed points representation, the authors also encode point detail and component information for their compression architecture as shown in Figure 3. For instance, distance between point and origin of node it belongs to was also calculated during octree construction. The other properties of point such as colour and normal are recorded as well. All the information that will be transmitted and recovered at decoder is encoded and compressed by range encoder [2].

Figure 3 The framework of compression and decompression [1].

Performance evaluation:
pointRes
octreeRes
FrameRate
Color
compression percentage
size of compressed cloud (kBytes)
Config 1
0.001
0.001
30
0
~5.0
~200
0.01
0.001
30
0
~5.0
~200
0.1
0.001
30
0
~5.0
~200
Config 2
0.001
0.01
30
0
~8.9
~370
0.01
0.01
30
0
~1.4
~19
0.1
0.01
30
0
~1.18
~50
Config 3
0.001
0.1
30
0
~15.5
~640
0.01
0.1
30
0
~8.0
~330
0.1
0.1
30
0
~5.4
~1
Config 4
0.01
0.01
5
0
~1.4(2.1)
~19(28)
Config 5
0.01
0.01
30
1
~13.9;~50.5(color)
~190

The table shows the compression performance of the framework in terms of compression percentage and size of compressed cloud under different configurations. The compression percentage equals to size of compressed cloud / size of uncompressed cloud (i.e. lower compression percentage means better compression performance).

Considering the best choice of configuration, we choose the one has low compression percentage and acceptable uncompressed cloud  resolution. For the requirements of this project, the second row of config 2 gives us small compressed cloud (faster to transmit) while achieving useful cloud resolution.
The compression framework also supports other information like colour. Although I haven't used any colour information in the shape model method so far, I experimented the colour encoding. As shown in config 5, compression percentage of colour information is relatively large (50.5%) which means that only half of the information is compressed. Comparing to the second row of config 2, by turning on the colour encoding the total compression percentage and the size of compressed cloud is 10 times larger than non-colour encoding configuration.

It's worth noting the frame rate parameter which control the interval to completed resynchronize the octree structure.

[1] Kammerl J, Blodow N, Rusu R B, et al. Real-time compression of point cloud streams[C]//Robotics and Automation (ICRA), 2012 IEEE International Conference on. IEEE, 2012: 778-785.Reference:

[2] Martin G N N. Range encoding: an algorithm for removing redundancy from a digitised message[C]//Video and Data Recording Conference. 1979: 24-27.

No comments:

Post a Comment