Calico Kinect

Kinect-depth-gray.gif

This page describes an experimental connection to Microsoft's Kinect 3D camera and microphone.


Skel1.gif

To use this, you will need:

  1. A Microsoft Kinect (about $150) connected to a PC running Windows 7
  2. Microsoft's Kinect SDK, installed on the Windows 7 server
  3. Rolf Lakaemper's Kinect TCP Server, installed and running on the Windows 7 computer (this can be the same computer that runs Calico)
  4. Calico version 1.0.5 or greater, running on the same computer or a different one (a Mac, a Linux computer, or another Windows computer)


Overview

Microsoft's Kinect is a 3D camera that provides depth information and, more interestingly, skeletal joint positions. You can track up to 6 people at a time.

The Calico Kinect project uses a Windows 7 server to provide the data via the official Microsoft Kinect SDK; nothing has been reverse engineered. The Kinect TCP Server, written by Rolf Lakaemper, serves RGB, depth, and skeletal information over a port. Currently, the client must request data from the server each time it wants more. The Calico code runs on a client, which can be the Windows 7 server itself or a different computer running any operating system (Linux, Mac, or Windows).
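Because data arrives only on request, a client typically polls in a loop. The following is a minimal sketch of that pattern, not part of the Kinect module: `poll`, `read_frame`, and `handle_frame` are hypothetical names, and `read_frame` stands for any zero-argument call that asks the server for data (for example `client.readDepth`).

```python
import time

def poll(read_frame, handle_frame, count, delay=0.0):
    """Request `count` frames, issuing one request per frame.

    read_frame   -- zero-argument callable that asks the server for data
    handle_frame -- callable that receives each returned frame
    delay        -- seconds to wait between requests (0 means no wait)
    Returns the number of frames handled.
    """
    handled = 0
    for _ in range(count):
        frame = read_frame()       # one explicit request per frame
        handle_frame(frame)
        handled += 1
        if delay > 0:
            time.sleep(delay)
    return handled
```

A call such as `poll(client.readDepth, process, 100, 0.05)` would request 100 depth frames about 50 ms apart, where `process` is whatever function consumes a frame.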

Setup

Once the Kinect TCP Server is installed and running, connect to it from Calico Python:

import Kinect
client = Kinect.Client("127.0.0.1", 8001)

Client takes the name (or IP address) of a machine and a port number. To make the port available to computers other than the one the server is running on, you need to allow access to that port through the firewall.
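If a remote client cannot connect, it can help to check whether the port is reachable at all before involving Calico. The sketch below uses only Python's standard socket module; `port_is_reachable` is a hypothetical helper name, and it tests only that a TCP connection can be opened, not that a Kinect TCP Server is the program answering.

```python
import socket

def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Refused, filtered by a firewall, unknown host, or timed out.
        return False
    sock.close()
    return True
```

For example, `port_is_reachable("127.0.0.1", 8001)` returning False on the server machine itself suggests the server is not running, while True locally and False from another machine points toward the firewall.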

The parameters in your Calico code must match the parameters the server is running with. For example, if the server is serving 640x480 RGB data, then your client must be initialized for 640x480 RGB data as well.
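One way to keep client settings consistent is to define the frame sizes in a single place. This is only a sketch: `KNOWN_SIZES` and `frame_size` are hypothetical names, and the table lists only the two sizes that appear on this page (640x480 and 320x240). Whether the server offers other sizes is not assumed here.

```python
# Width -> height pairs that appear on this page; other modes are not assumed.
KNOWN_SIZES = {640: 480, 320: 240}

def frame_size(width):
    """Return (width, height) for a width listed in KNOWN_SIZES."""
    if width not in KNOWN_SIZES:
        raise ValueError("width not covered by this sketch: %r" % (width,))
    return width, KNOWN_SIZES[width]
```

The returned width can then be passed to `client.initRGB` or `client.initDepth`, and the same pair reused when creating windows and pictures.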

Kinect-server.gif

Kinect-server2.gif

Many computers can connect to the server simultaneously.

Functions

  • Kinect.Client(server, port) - the main interface
  • client.hello() - internal
  • client.readByte() - internal
  • client.initRGB(resX) - call to initialize the RGB API
  • client.initDepth(resX) - call to initialize the Depth API
  • client.startKinect(default) - set default to True to use the default settings
  • client.readData() - internal
  • client.readDepth() - get the Depth data
  • client.readRGBImageArray() - returns RGB data in a format for display
  • client.readDepthXYZ() - get Depth data as a point cloud
  • client.readSkeleton() - get the Skeletal data
  • client.write(byte, ...) - internal
  • client.read(count) - internal
  • client.close() - closes the stream and connection
  • client.getJointPositions(data, skelIndex) - given Skeletal data, and index (1-based) get Joint positions
  • client.getJointSegments(data, skelIndex) - given Skeletal data, and index (1-based) get Joint segment positions
  • client.convertDepthToImageArray(depth) - returns Depth data in a format for display
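The functions above are usually called in a fixed order: initialize, start, read, close. The sketch below follows the order used in the Camera example on this page and wraps it so that `close()` is always called. `read_depth_once` is a hypothetical helper, and whether both init calls are required before reading depth is an assumption carried over from that example.

```python
from contextlib import closing

def read_depth_once(client, rgb_width=640, depth_width=320):
    """Initialize the streams, read one depth frame, and always close the client."""
    with closing(client):              # calls client.close() on exit, even on error
        client.initRGB(rgb_width)
        client.initDepth(depth_width)
        client.startKinect(False)      # False: do not use the default settings
        return client.readDepth()
```

In Calico this could be used as `read_depth_once(Kinect.Client("127.0.0.1", 8001))`. The function accepts any object with these methods, which also makes it easy to try out with a stand-in client when no Kinect is attached.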

Examples

Camera

Kinect-depth.gif
import Kinect
client = Kinect.Client("127.0.0.1", 8001)
client.initRGB(640)
client.initDepth(320)
client.startKinect(False)
depthValues = client.readDepth()
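To display depth values, they are usually scaled to gray levels. The following is a minimal sketch of that scaling, assuming the depths have already been copied into a plain Python list of non-negative numbers (the array returned by `readDepth()` is indexed differently, as the full example further down shows). `depth_to_gray` is a hypothetical name.

```python
def depth_to_gray(depths):
    """Scale non-negative depth values to integer gray levels in 0..255."""
    depths = list(depths)
    maximum = max(depths) if depths else 0
    if maximum <= 0:
        # Empty frame or all zeros: nothing to scale against.
        return [0] * len(depths)
    # 255.0 forces float division, so small depths are not truncated to 0.
    return [int(d * 255.0 / maximum) for d in depths]
```

With this scaling the farthest point in the frame maps to 255 (white) and a depth of 0 maps to 0 (black).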


Skeleton

The skeleton comes in a 20-element matrix with the following position meanings:

Skel2.gif
Skeleton.gif
  • 0. HipCenter
  • 1. Spine
  • 2. ShoulderCenter
  • 3. Head
  • 4. ShoulderLeft
  • 5. ElbowLeft
  • 6. WristLeft
  • 7. HandLeft
  • 8. ShoulderRight
  • 9. ElbowRight
  • 10. WristRight
  • 11. HandRight
  • 12. HipLeft
  • 13. KneeLeft
  • 14. AnkleLeft
  • 15. FootLeft
  • 16. HipRight
  • 17. KneeRight
  • 18. AnkleRight
  • 19. FootRight
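For readable code it can help to refer to joints by name instead of by number. The table below simply restates the list above as Python data; `JOINT_NAMES` and `JOINT_INDEX` are hypothetical names, and it is not assumed that the Kinect module provides such constants itself.

```python
# Joint names in index order, as listed above.
JOINT_NAMES = [
    "HipCenter", "Spine", "ShoulderCenter", "Head",
    "ShoulderLeft", "ElbowLeft", "WristLeft", "HandLeft",
    "ShoulderRight", "ElbowRight", "WristRight", "HandRight",
    "HipLeft", "KneeLeft", "AnkleLeft", "FootLeft",
    "HipRight", "KneeRight", "AnkleRight", "FootRight",
]

# Reverse lookup: joint name -> index.
JOINT_INDEX = dict((name, i) for i, name in enumerate(JOINT_NAMES))
```

For example, `JOINT_INDEX["Head"]` gives 3 and `JOINT_NAMES[11]` gives "HandRight".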


Example Programs

This example pulls all three components together: depth, RGB, and skeleton. Note that Client takes a DNS name or IP address as a string. This example assumes the server is listening on localhost, 127.0.0.1.

import Kinect
client = Kinect.Client("127.0.0.1", 8001)
depthWidth, depthHeight = 320, 240
rgbWidth, rgbHeight = 640, 480
client.initRGB(rgbWidth)
client.initDepth(depthWidth)
client.startKinect(False)

from Graphics import *
win = Window("Depth", depthWidth, depthHeight)
pic = Picture(depthWidth, depthHeight)
pic.draw(win)

win2 = Window("RGB", rgbWidth, rgbHeight)
pic2 = Picture(rgbWidth, rgbHeight)
pic2.draw(win2)

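def getDepth():
    depthValues = client.readDepth()
    count = depthValues.GetUpperBound(0) + 1
    # Largest depth in the frame; fall back to 1 to avoid dividing by zero.
    maximum = max([depthValues[v,0] for v in range(count)]) or 1
    for v in range(count):
        # 255.0 forces float division, whether the depths are ints or floats.
        gray = int(depthValues[v,0] * 255.0 / maximum)
        # Mirror horizontally; the - 1 keeps x within 0..depthWidth-1.
        pic.setPixel(depthWidth - 1 - v % depthWidth, int(v/depthWidth), Color(gray, gray, gray))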

def getRGB():
    data = client.readRGB()
    pic2.fromArray(data, "BGRX")

lines = []

def getJoints():
    global lines
    s = client.readSkeleton()
    skeletons = s[0]
    for line in lines:
        line.undraw()
    lines = []
    for index in range(1, skeletons + 1):
        joints = client.getJointPositions(s, index)
        segments = client.getJointSegments(s, index)
        z = joints[1,3]
        zscale = z/500.0
        for i in range(segments.GetUpperBound(0) + 1):
            if (segments[i,4] != 0):
                x1 = 640 - int(segments[i,0]/zscale+320.0)
                y1 = 50 + int(-segments[i,1]/zscale+200.0)
                x2 = 640 - int(segments[i,2]/zscale+320.0)
                y2 = 50 + int(-segments[i,3]/zscale+200.0)
                line = Line((x1, y1), (x2, y2))
                line.setWidth(10)
                line.draw(win2)
                lines.append(line)

        headsize = 40
        head = Circle((640 - (segments[0,0]/zscale+320),
                      480 - 50 - (segments[0,1]/zscale+200)), headsize)
        head.fill = Color("yellow")
        head.draw(win2)
        lines.append(head)

def main():
    while win.IsRealized and win2.IsRealized:
        getRGB()
        getDepth()

        getJoints()

win.run(main)
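The depth drawing in the example above treats the depth data as one long row-by-row list and flips it horizontally so the picture behaves like a mirror. That index arithmetic can be sketched on its own as follows; `index_to_pixel` is a hypothetical name, and the row-major layout is an assumption based on how the example indexes the data.

```python
def index_to_pixel(v, width, mirror=True):
    """Map a row-major flat index to (x, y) pixel coordinates.

    With mirror=True the x coordinate is flipped, staying within 0..width-1.
    """
    x = v % width        # column within the row
    y = v // width       # row number
    if mirror:
        x = width - 1 - x
    return x, y
```

For a 320-pixel-wide frame, index 0 lands at the right edge of the top row, and index 320 at the right edge of the second row.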

This example just gets the skeleton data:

# D.S. Blank
# Kinect example: reading skeletons

from Graphics import *
import Kinect

client = Kinect.Client("colossus.brynmawr.edu", 8001)
client.startKinect(False)
rgbWidth, rgbHeight = 640, 480
win = Window("Skeleton", rgbWidth, rgbHeight)

# Global places for graphical objects:
bodies = {}
heads = {}

def getJoints():
    try:
        s = client.readSkeleton()
    except:
        return
    skeletons = s[0]
    for index in range(1, skeletons + 1):
        joints = client.getJointPositions(s, index)
        segments = client.getJointSegments(s, index)

        z = joints[1,3]
        zscale = z/500.0

        bodies[index] = bodies.get(index, {})

        for i in range(segments.GetUpperBound(0) + 1):
            if (segments[i,4] != 0):
                x1 = 640 - int(segments[i,0]/zscale+320.0)
                y1 = 50 + int(-segments[i,1]/zscale+200.0)
                x2 = 640 - int(segments[i,2]/zscale+320.0)
                y2 = 50 + int(-segments[i,3]/zscale+200.0)
                if i in bodies[index]:
                    line = bodies[index][i]
                else:
                    line = Line((x1, y1), (x2, y2))
                    line.setWidth(10)
                    line.outline = Color("black")
                    line.draw(win)
                    bodies[index][i] = line
                line.set_points(Point(x1, y1), Point(x2, y2))
                line.update()

        if index in heads:
            head = heads[index]
        else:
            head = Circle((640 - (segments[0,0]/zscale+320),
                          480 - 50 - (segments[0,1]/zscale+200)), 35)
            head.fill = Color("yellow")
            head.draw(win)
            heads[index] = head
        head.moveTo(640 - (segments[0,0]/zscale+320),
                    480 - 50 - (segments[0,1]/zscale+200))

while win.IsRealized:
    getJoints() 
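Both examples convert skeleton coordinates to window coordinates by dividing by a depth-dependent scale, so that a person farther from the camera is drawn smaller. The sketch below isolates that calculation for line endpoints, using the same constants as the examples (500.0, 320.0, 200.0, and the offset of 50). `project` is a hypothetical name, the units of the inputs are not stated in the source, and returning None for a non-positive z is an added guard, not something the examples do.

```python
def project(x, y, z, width=640, focal=500.0):
    """Project a skeleton point (x, y, z) to mirrored window coordinates.

    Returns (sx, sy), or None when z is not positive and the scale is undefined.
    """
    if z <= 0:
        return None
    zscale = z / focal                        # larger z gives a smaller figure
    sx = width - int(x / zscale + 320.0)      # flip x so the view acts like a mirror
    sy = 50 + int(-y / zscale + 200.0)        # flip y: window y grows downward
    return sx, sy
```

A point on the camera axis, such as `project(0, 0, 500.0)`, lands at (320, 250), and doubling z halves the offset from that point.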

References

  1. http://research.microsoft.com/pubs/145347/BodyPartRecognition.pdf - "Real-Time Human Pose Recognition in Parts from Single Depth Images", by Shotton et al.
  2. http://www.youtube.com/watch?v=HNkbG3KsY84 - video
  3. Kinect.cs - client-side TCP code
  4. https://sites.google.com/a/temple.edu/kinecttcp/ - Kinect TCP Server