

The active realtime segmenter was developed by Mårten Björkman from KTH Stockholm (Sweden). The details of the method are explained in the following papers:

Summary of the Segmentation Method

Instead of relying on just two hypotheses, figure and ground, the segmentation method implemented in this package adds a third hypothesis: a flat surface. It is assumed that most objects of interest are placed on flat surfaces somewhere in the scene. This simplifies the problem of segregating an object from the surface it stands on when both are very similar in appearance.

The segmentation approach is an iterative two-stage method: the first stage performs pixel-wise labeling using a set of model parameters, and the second stage updates these parameters. As such, the method is similar to Expectation-Maximization. The distinction is that it does not enumerate all combinations of labelings, which become prohibitively many when neighboring pixels depend on each other. Instead, model evidence is summed up on a per-pixel basis using marginal distributions of labels obtained by belief propagation.
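The two-stage loop can be illustrated with a deliberately simplified sketch. It is not the package's implementation: each pixel is reduced to a single scalar value, each hypothesis is modelled as a 1D Gaussian with a prior weight, and pixels are treated as independent, so the per-pixel marginals are computed in closed form rather than by belief propagation. All function names and parameters here are hypothetical.

```python
import math

def gaussian(x, mean, var):
    """Density of a 1D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def label_pixels(values, params):
    """Stage 1: per-pixel marginal probability of each label.

    Assumption: pixels are independent. The real method obtains these
    marginals with belief propagation over neighboring pixels.
    """
    marginals = []
    for x in values:
        scores = [w * gaussian(x, m, v) for (w, m, v) in params]
        total = sum(scores)
        if total == 0.0:
            # All densities underflowed; fall back to a uniform marginal.
            marginals.append([1.0 / len(params)] * len(params))
        else:
            marginals.append([s / total for s in scores])
    return marginals

def update_params(values, marginals, old_params, min_var=1e-4):
    """Stage 2: re-estimate (weight, mean, variance) per label from the marginals."""
    new_params = []
    for k, old in enumerate(old_params):
        mass = sum(m[k] for m in marginals)
        if mass < 1e-9:
            new_params.append(old)  # label explains no pixels; keep old model
            continue
        mean = sum(m[k] * x for m, x in zip(marginals, values)) / mass
        var = sum(m[k] * (x - mean) ** 2 for m, x in zip(marginals, values)) / mass
        new_params.append((mass / len(values), mean, max(var, min_var)))
    return new_params

def segment(values, params, iterations=20):
    """Alternate the two stages, then return hard labels and final parameters."""
    for _ in range(iterations):
        marginals = label_pixels(values, params)
        params = update_params(values, marginals, params)
    marginals = label_pixels(values, params)
    labels = [max(range(len(p)), key=p.__getitem__) for p in marginals]
    return labels, params
```

The initial `params` play the role of the initialization discussed in the next paragraph: the loop only refines a starting guess, so a poor guess can lead to a poor segmentation.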

The most critical phase of any iterative scheme for figure-ground segmentation is initialization. There has to be some prior assumption about what is likely to belong to the foreground. The system to which the segmentation was originally applied (the humanoid head from ARMAR III) is a fixating system, so points close to the center of fixation are assumed most likely to be part of the foreground. An imaginary 3D ball is placed around this fixation point and everything within the ball is initially labeled as foreground. For the flat surface hypothesis, RANSAC is applied to find the most likely plane. The remaining points are initially labeled as background.
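The initialization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: points are plain (x, y, z) tuples, RANSAC is run only on the points outside the ball (the text does not say which points it uses), and the function names, tolerance, and iteration count are hypothetical.

```python
import math
import random

FOREGROUND, SURFACE, BACKGROUND = 0, 1, 2

def fit_plane(p, q, r):
    """Plane through three points as (unit normal, offset), or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    n = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
    norm = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    if norm < 1e-12:
        return None
    n = (n[0] / norm, n[1] / norm, n[2] / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d

def plane_distance(plane, pt):
    """Unsigned distance from a point to the plane."""
    n, d = plane
    return abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] + d)

def ransac_plane(points, iterations=200, tolerance=0.01, seed=0):
    """Return the sampled plane with the most inliers, or None if none was found."""
    if len(points) < 3:
        return None
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_plane, best_count = None, -1
    for _ in range(iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        count = sum(1 for pt in points if plane_distance(plane, pt) <= tolerance)
        if count > best_count:
            best_plane, best_count = plane, count
    return best_plane

def initial_labels(points, fixation, radius, tolerance=0.01):
    """Ball around the fixation point -> foreground, RANSAC plane -> surface,
    everything else -> background."""
    in_ball = [math.dist(pt, fixation) <= radius for pt in points]
    # Assumption: the plane is estimated from points outside the ball only.
    rest = [pt for pt, inside in zip(points, in_ball) if not inside]
    plane = ransac_plane(rest, tolerance=tolerance)
    labels = []
    for pt, inside in zip(points, in_ball):
        if inside:
            labels.append(FOREGROUND)
        elif plane is not None and plane_distance(plane, pt) <= tolerance:
            labels.append(SURFACE)
        else:
            labels.append(BACKGROUND)
    return labels
```

In this sketch the ball takes precedence over the plane, so a surface point that falls inside the ball starts out as foreground and is left for the iterative stages to correct.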

Other ways of initialising the foreground are possible, e.g. using an attention system. On the PR2, an interactive approach has been chosen: object hypotheses are initialised through an rviz plugin implemented in the package object_segmentation_gui.

Using the Active Segmenter

An example ROS node is contained in the package.


This node reads an RGB image and a disparity image and segments them iteratively. It uses the main API functions, which are documented in the code. In particular, it uses the function that converts a ROS image into a 16-bit aligned image suitable for the GPU computation. In this example, only one object is added as a foreground hypothesis. The current implementation can segment up to 6 objects simultaneously. To obtain coloured images from the PR2 narrow stereo cameras, the package rgbd_assembler can be used.

Furthermore, offline data on which the segmentation can be tested is provided with the package. It was collected with the vision system running on the humanoid head from ARMAR III. To get the data, uncomment the following lines at the bottom of the CMakeLists.txt file.

# Demo datasets
rosbuild_download_test_data (http://www.csc.kth.se/~bohg/testImgDisp.tgz demo/testImgDisp.tgz)
rosbuild_untar_file (demo/testImgDisp.tgz demo/data/ ALL)

The code in


can then be run to test the segmentation on this data. This example code is fully documented and similar to the previous example code.

Example Output

Here are example results from the offline testing data that was collected with the ARMAR III humanoid head. Since this is a fixating system, the initial point around which the foreground is initialised is in the center of the image.

Peaches.jpg Tiger.jpg

On the left is an example from Kinect data, using the package object_segmentation_gui to initialise object hypotheses from human input. On the right, RGB-D data from the PR2 vision system has been assembled with the package rgbd_assembler. Initial object hypotheses have again been selected using human input through the object_segmentation_gui. The original segmentation method has been extended to deal with image regions that only have greyscale information.

Kinect_2.jpg PR2Stereo.jpg

2024-07-13 12:37