Kinect colour/ IR/ depth image reading

The Kinect SDK is a development platform that includes several APIs for communicating with the Kinect hardware. In this project we are only concerned with the colour and depth sensors (the microphones are ignored). Most of the SDK sample programs are written in C#, and relatively few resources exist for C++ (the language I am going to use). Here is a useful tutorial for Kinect C++ SDK programming.
Getting data from the Kinect takes just two steps: initialise the Kinect, then get frames from the image stream.
For initialisation, the following code is required:

// sensor (INuiSensor*) and colorStream (HANDLE) are globals declared elsewhere
bool initKinect() {
    // Get a working Kinect sensor
    int numSensors;
    if (NuiGetSensorCount(&numSensors) < 0 || numSensors < 1) return false;
    if (NuiCreateSensorByIndex(0, &sensor) < 0) return false;

    // Initialize sensor; a negative HRESULT means failure
    if (sensor->NuiInitialize(NUI_INITIALIZE_FLAG_USES_DEPTH | NUI_INITIALIZE_FLAG_USES_COLOR) < 0) return false;
    if (sensor->NuiImageStreamOpen(
        NUI_IMAGE_TYPE_COLOR,            // Stream type: colour here, NUI_IMAGE_TYPE_DEPTH for depth
        NUI_IMAGE_RESOLUTION_640x480,    // Image resolution
        0,        // Image stream flags, e.g. near mode
        2,        // Number of frames to buffer
        NULL,     // Event handle
        &colorStream) < 0) return false; // Same handle that getKinectData() reads from
    return true;
}

HRESULT NuiInitialize(DWORD dwFlags);
dwFlags selects which kinds of data the NUI API should capture. The possible values include:

Constant	Description
NUI_INITIALIZE_DEFAULT_HARDWARE_THREAD	This flag was deprecated in version 1.5; it is no longer used.
NUI_INITIALIZE_FLAG_USES_AUDIO	Initialize the sensor to provide audio data.
NUI_INITIALIZE_FLAG_USES_COLOR	Initialize the sensor to provide color data.
NUI_INITIALIZE_FLAG_USES_DEPTH	Initialize the sensor to provide depth data.
NUI_INITIALIZE_FLAG_USES_DEPTH_AND_PLAYER_INDEX	Initialize the sensor to provide depth data with a player index.
NUI_INITIALIZE_FLAG_USES_SKELETON	Initialize the sensor to provide skeleton data.

These flags can be combined together by | (bitwise-OR).
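
As a small illustration of how such flags combine, here is a minimal sketch. The names kUsesColor, kUsesDepth and kUsesAudio and their bit values are stand-ins made up for this example; they are not the SDK's definitions, and real code should use the NUI_INITIALIZE_FLAG_* constants from the SDK headers.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in flag bits, chosen only for this illustration. Real code should use
// the NUI_INITIALIZE_FLAG_* constants from the SDK headers instead.
constexpr std::uint32_t kUsesColor = 1u << 0;
constexpr std::uint32_t kUsesDepth = 1u << 1;
constexpr std::uint32_t kUsesAudio = 1u << 2;

// True if every bit in `wanted` is also set in `flags`.
constexpr bool hasFlags(std::uint32_t flags, std::uint32_t wanted) {
    return (flags & wanted) == wanted;
}

// Combine colour and depth with bitwise OR, then check what was requested.
inline void flagsExample() {
    const std::uint32_t flags = kUsesColor | kUsesDepth;
    assert(hasFlags(flags, kUsesColor));
    assert(hasFlags(flags, kUsesDepth));
    assert(!hasFlags(flags, kUsesAudio)); // audio was never requested
}
```

Because each flag occupies its own bit, OR-ing them never loses information, and each one can be tested independently afterwards.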

HRESULT NuiImageStreamOpen(NUI_IMAGE_TYPE eImageType, NUI_IMAGE_RESOLUTION eResolution, DWORD dwImageFrameFlags, DWORD dwFrameLimit, HANDLE hNextFrameEvent, HANDLE *phStreamHandle);
This method creates a data stream of the specified type for frame grabbing.
Parameters:
eImageType [in]: The type of data stream to open. It must correspond to the dwFlags value passed to NuiInitialize.

eResolution [in]: The resolution of the images to receive. For colour images, the Kinect supports 1280x960 (12 fps) and 640x480 (30 fps). For depth images, it supports 640x480, 320x240 and 80x60.

dwImageFrameFlags [in]: The frame event options (for example, enabling near mode on Kinect for Windows).

dwFrameLimit [in]: The number of frames that the Kinect runtime should buffer. The maximum value is NUI_IMAGE_STREAM_FRAME_LIMIT_MAXIMUM. Most applications should use a frame limit of two.

hNextFrameEvent [in, optional]: A handle to a manual reset event that will be fired when the next frame in the stream is available.

phStreamHandle [out]: A pointer that receives a handle to the opened stream.

Return value: An HRESULT. Returns S_OK if successful; otherwise, returns one of the failure codes.
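
The listings in this post test return values with `< 0`. The sketch below shows why that works. HResult, kOk and kFail are stand-in names defined only for this example (in a real Windows build, HRESULT, S_OK and E_FAIL come from the Windows headers); the values 0 and 0x80004005 are the standard ones for S_OK and E_FAIL.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the Windows HRESULT type (a 32-bit signed integer).
using HResult = std::int32_t;

// Stand-ins for two well-known codes: S_OK (0) and E_FAIL (0x80004005).
constexpr HResult kOk = 0;
constexpr HResult kFail = static_cast<HResult>(0x80004005u);

// Failure codes have the top bit set, so they are negative as signed values.
// This is the same comparison the Windows FAILED() macro performs.
constexpr bool failed(HResult hr) { return hr < 0; }

inline void hresultExample() {
    assert(!failed(kOk));
    assert(failed(kFail));
    assert(!failed(1)); // S_FALSE (1) is also a success code
}
```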

Frames are read from the stream with the following code:

void getKinectData(GLubyte* dest) {
    NUI_IMAGE_FRAME colorFrame;
    NUI_LOCKED_RECT c_LockedRect;

    // Wait up to 10 ms for the next colour frame
    if (sensor->NuiImageStreamGetNextFrame(colorStream, 10, &colorFrame) < 0) return;

    INuiFrameTexture* c_texture = colorFrame.pFrameTexture;
    c_texture->LockRect(0, &c_LockedRect, NULL, 0);

    if (c_LockedRect.Pitch != 0) { // check valid data
        const BYTE* c_buf = (const BYTE*) c_LockedRect.pBits;
        for (int y = 0; y < height; ++y)
        {
            const BYTE* pImage = c_buf;
            for (int x = 0; x < width; ++x)
            {
                // Copy one BGRA pixel
                *dest++ = pImage[0]; // B
                *dest++ = pImage[1]; // G
                *dest++ = pImage[2]; // R
                *dest++ = pImage[3]; // A
                pImage += 4; // Go to next pixel
            }
            c_buf += c_LockedRect.Pitch; // Go to next line (Pitch is the row stride in bytes)
        }
    }
    c_texture->UnlockRect(0);
    sensor->NuiImageStreamReleaseFrame(colorStream, &colorFrame);
}

NuiImageStreamGetNextFrame() retrieves the next frame from a given stream. The returned frame is a NUI_IMAGE_FRAME structure, which contains information such as the frame number, resolution and texture. Its INuiFrameTexture manages the frame data; locking it fills a NUI_LOCKED_RECT, which holds a pointer to the actual pixel data. The Kinect colour data is in BGRA format, so we can copy it to our buffer and use it as an OpenGL texture.
Finally, the frame and the sensor must be released so that they are available for later use or for other programs.
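
The row-by-row copy can be sketched without the SDK as below. This is only an illustration under stated assumptions: copyBgraRows is a hypothetical helper, not part of the Kinect API, and it takes the pixel pointer and pitch as plain arguments (in the listing above they would come from the locked rectangle's pBits and Pitch). Stepping by the pitch instead of width * 4 keeps the copy correct even if the source rows are padded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy a BGRA image whose rows start `pitch` bytes apart into a tightly
// packed buffer (width * 4 bytes per row). `pitch` may be larger than
// width * 4 when the source rows carry padding.
std::vector<std::uint8_t> copyBgraRows(const std::uint8_t* src, std::size_t pitch,
                                       std::size_t width, std::size_t height) {
    const std::size_t rowBytes = width * 4;
    assert(pitch >= rowBytes);
    std::vector<std::uint8_t> dest(rowBytes * height);
    if (rowBytes == 0) return dest; // nothing to copy
    for (std::size_t y = 0; y < height; ++y) {
        // Source advances by the pitch, destination by the packed row size.
        std::memcpy(dest.data() + y * rowBytes, src + y * pitch, rowBytes);
    }
    return dest;
}
```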

Results:

Colour frame

Depth frame (mod by 256)

Depth frame (single region)

To make depth differences easier to see, a common method is to "compress" the depth into several intensity regions. For example, if the depth values range from 800 to 4000 mm, we can take each depth modulo 256 (when the depth is stored in a char) so that it fits in one byte; the image is then divided into repeating regions, each covering 256 mm of depth.
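
A minimal sketch of this mapping follows. The function names are hypothetical, and the depth is assumed to be already available as a 16-bit value in millimetres. The last function, a linear stretch of one range, is an assumed alternative added for comparison; it is not necessarily how the single-region image above was produced.

```cpp
#include <cassert>
#include <cstdint>

// Wrap a depth value (in millimetres) into one byte. Depths that differ by a
// multiple of 256 mm map to the same intensity, which gives repeating bands.
inline std::uint8_t depthToIntensityMod(std::uint16_t depthMm) {
    return static_cast<std::uint8_t>(depthMm % 256);
}

// Index of the 256 mm band that a depth value falls into.
inline std::uint16_t depthBand(std::uint16_t depthMm) {
    return static_cast<std::uint16_t>(depthMm / 256);
}

// Assumed alternative: stretch one range [minMm, maxMm] linearly over 0..255,
// clamping values that fall outside the range.
inline std::uint8_t depthToIntensityLinear(std::uint16_t depthMm,
                                           std::uint16_t minMm,
                                           std::uint16_t maxMm) {
    assert(minMm < maxMm);
    if (depthMm <= minMm) return 0;
    if (depthMm >= maxMm) return 255;
    return static_cast<std::uint8_t>((depthMm - minMm) * 255u / (maxMm - minMm));
}
```

With the 800 to 4000 mm range from the text, the modulo mapping spans bands 3 to 15, so nearby surfaces get clearly different intensities at the cost of the intensity no longer being unique to one depth.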

Raw IR frame

When acquiring the raw IR image, the IR emitter needs to be covered, or turned off through the SDK (not possible on Kinect for Xbox), so that the random dot pattern does not show in the image.

Notes:
- With the same code, the Kinect for Windows requires a delay (800 ms) after initialisation before frames can be acquired from the stream, while the Kinect for Xbox does not need the delay. Without the delay, the frame returned by the SDK may be empty or incorrect. The reason is not known so far.
- An E_NUI_NOTGENUINE error was returned by the Kinect after a few seconds of normal running. It is caused by inadequate bandwidth on the USB controller (multiple devices connected to the same controller). It can be solved by plugging the Kinect into another USB port.
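
For the first note, one possible alternative to a fixed sleep is to poll until a valid frame arrives or a timeout expires. The sketch below is an untested idea, not something verified on the hardware: waitForFirstFrame and the tryGrab callback are hypothetical, and tryGrab is assumed to return true only when a frame was both received and contains valid data.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Call `tryGrab` until it reports success or `timeout` elapses, sleeping for
// `interval` between attempts. Returns true if a grab succeeded in time.
inline bool waitForFirstFrame(const std::function<bool()>& tryGrab,
                              std::chrono::milliseconds timeout,
                              std::chrono::milliseconds interval) {
    assert(tryGrab); // the callback must not be empty
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        if (tryGrab()) return true;
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(interval);
    }
}
```

A caller might pass a timeout somewhat above the 800 ms observed here and a short interval, so that a sensor that is ready early does not wait for the full delay.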

Reference: http://msdn.microsoft.com/en-us/library/jj663864.aspx

TraumaBot - 3D body reconstruction & recognition

17 February 2013
