


This package contains an object localization system that returns 6-DoF poses for textured objects in RGB-D data.


This package combines 2D and 3D recognition algorithms to detect objects in a textured point cloud. Recognition proceeds in five steps:

  1. First, a 2D recognition is run on the input image. It is based on the descriptor-matching functions of the HALCON library (see here and here for more information on the matching algorithms and parameters used). Images of the object taken from different orientations (we call them views) serve as references for the matching with the scene image, so a coarse orientation of the object can already be computed at this stage.

  2. Once a texture (view) of the searched object is found in the image, the point cloud is cropped around the corresponding 3D location using the object's diameter (the correspondence exists because the camera image is registered to the depth image). Both the object's diameter and the center point of each view are set during training. The matching returns a homography that describes the transformation of the trained view into the scene image, so the view's center point can be transformed into the scene image with this homography. For the resulting pixel, the corresponding 3D point in the cloud can be looked up. To reduce errors caused by noise in the point cloud data, the median of several points around this pixel is used instead of the single point: offsets are added to the pixel location, and the 3D points at the shifted locations are included in the median. The point cloud is then reduced with the diameter of the object around this median point.

    Transformation into the scene image (abstracted; view's center depicted as cross; descriptor points depicted as red dots):

    Cloud reduction:

  3. The reduced point cloud, which should contain only the object, is now used for a 3D matching with the trained object model (see here and here for more information).

  4. For a rotation-variant object, the pose resulting from step 3 can be returned directly. Since we also want to recognize rotation-invariant objects (e.g. cylindrical or spherical ones), further steps are necessary to ensure that the found pose is correct. The reason is that step 3 uses only the geometry of the object to find it in the point cloud, not its texture or color information. To ensure that the found pose is also consistent with that information, the 2D and 3D matching results are compared and the pose is adjusted (since we know which view was found and how it is transformed into the image, we can approximate the orientation of the object).

    Rotation-invariance (from left: no rotation, cylindrical, spherical):

    Geometrical view problem (both views on the left appear as the cloud on the right):

  5. The last step is optional and adds another validation layer to the recognition process. The textured mesh of the found object is rendered into an artificial image using the found pose, and the 2D matching from step 1 is run again on this new image. A metric that compares the homographies obtained from matching on the real and on the artificial image is then used to check whether the found pose is valid and can be returned.

    Real scene image left, artificial one right:
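
The center localization and cloud reduction described in step 2 can be sketched as follows. This is a minimal illustration in Python with NumPy, not the package's implementation: the function names, the 3x3 sampling pattern, the default offset, and the choice to keep all points within one object diameter of the median point are assumptions made for this example. The cloud is assumed to be organized (one 3D point per image pixel, with NaN marking invalid points).

```python
import numpy as np


def transform_point(homography, point):
    """Map a 2D point (x, y) of the trained view into the scene image."""
    x, y, w = homography @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])


def median_point(cloud, pixel, offset=2):
    """Median of the valid 3D points at `pixel` and at pixels shifted by +/- offset.

    `cloud` is an organized array of shape (height, width, 3).
    The 3x3 sampling pattern is an assumption for this sketch.
    Returns None if no valid point is found.
    """
    height, width, _ = cloud.shape
    col, row = int(round(pixel[0])), int(round(pixel[1]))
    samples = []
    for d_row in (-offset, 0, offset):
        for d_col in (-offset, 0, offset):
            r, c = row + d_row, col + d_col
            inside = 0 <= r < height and 0 <= c < width
            if inside and np.all(np.isfinite(cloud[r, c])):
                samples.append(cloud[r, c])
    if not samples:
        return None
    return np.median(np.array(samples), axis=0)


def reduce_cloud(cloud, center, diameter):
    """Keep the valid points that lie within `diameter` of `center`.

    Using the full diameter as the radius is an assumption: the median
    point lies on the object's surface, not at its geometric center.
    """
    points = cloud.reshape(-1, 3)
    points = points[np.all(np.isfinite(points), axis=1)]
    distances = np.linalg.norm(points - center, axis=1)
    return points[distances <= diameter]
```

A typical call sequence would be `pixel = transform_point(H, view_center)`, then `center = median_point(cloud, pixel)`, then `reduce_cloud(cloud, center, object_diameter)`, where `H` is the homography returned by the 2D matching and `view_center` and `object_diameter` come from training.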

In addition to the object recognizer, the package offers a graphical training application which is used to add new objects to the list of recognizable entities.


Needed packages

Needed software

Needed hardware

A depth sensor with a registered RGB camera is needed to get textured point clouds as an input for this package. In our scenario we used a Microsoft Kinect and an additional AVT Guppy camera for the RGB images because of the low resolution of the internal RGB-camera of the Kinect.

As this package makes use of the HALCON library provided by the asr_halcon_bridge package, you need to make sure that a valid license is available on your machine. If this license is obtained by using a USB-dongle, make sure that it is plugged in when you use this software or you will get an exception at startup.

Start system

To start the process, call:

roslaunch asr_descriptor_surface_based_recognition descriptor_surface_based_recognition.launch

Now you can call one of the provided services to add or remove objects to/from the list of recognizable ones.

ROS Nodes

Subscribed Topics

Published Topics


There are two types of parameters: static and dynamic. The static ones are set in the .yaml file in the param directory; the dynamic ones are set in the launch file or changed at runtime using dynamic_reconfigure.

Static parameters:

Dynamic parameters:

Needed Services

The process calls services from the asr_object_database to get information about the objects which are recognizable:

Provided Services


2024-07-20 12:40