Acquire Image and Skeletal Data Using Kinect V1
In Detect the Kinect V1 Devices, you see that the two sensors on
the Kinect® for Windows® device are represented by two device
IDs, one for the color sensor and one for the depth sensor. In that
example, Device 1 is the color sensor and Device 2 is the depth sensor.
This example shows how to create a videoinput object
for the color sensor to acquire RGB images and then for the depth
sensor to acquire skeletal data.
1. Create the videoinput object for the color sensor. DeviceID 1 is used for the color sensor.

    vid = videoinput('kinect',1,'RGB_640x480');

2. Look at the device-specific properties on the source device, which is the color sensor on the Kinect camera.

    src = getselectedsource(vid);
    src

    Display Summary for Video Source Object:

      General Settings:
        Parent = [1x1 videoinput]
        Selected = on
        SourceName = ColorSource
        Tag =
        Type = videosource

      Device Specific Properties:
        Accelerometer = [0.0 -1.0 0.0]
        AutoExposure = on
        AutoWhiteBalance = on
        BacklightCompensation = AverageBrightness
        Brightness = 0.2156
        CameraElevationAngle = 3
        Contrast = 1
        ExposureTime = 1.0
        FrameInterval = 0
        FrameRate = 30
        Gain = 0
        Gamma = 2.2
        Hue = 0
        PowerLineFrequency = Disabled
        Saturation = 1
        Sharpness = 0.5
        WhiteBalance = 2700

As you can see in the output, the color sensor has a set of device-specific properties.
Device-Specific Property – Color Sensor: Description

Accelerometer
  Returns a 3-D vector of acceleration data for both the color and depth sensors. The data is updated while the device is running or previewing.
  This 1 x 3 double represents the x, y, and z values of acceleration in gravity units g (9.81 m/s^2). For example, [0.06 -1.00 -0.09] represents values of x as 0.06 g, y as -1.00 g, and z as -0.09 g.

AutoExposure
  Use to set the exposure automatically. This controls whether other related properties are activated. Values are on (default) and off.
  on means that exposure is set automatically, and these properties cannot be set and will throw a warning: FrameInterval, ExposureTime, and Gain.
  off means that these properties cannot be set and will throw a warning: PowerLineFrequency, BacklightCompensation, and Brightness.

AutoWhiteBalance
  Use to enable or disable automatic white balance setting.
  on (default) means that white balance is configured automatically and the WhiteBalance property cannot be set.
  off means that the WhiteBalance property is settable.

BacklightCompensation
  Configures backlight compensation modes to adjust the camera to capture images dependent on environmental conditions.
  Note that this property is only valid if AutoExposure is set to on. The default is AverageBrightness.
  Values are:
    AverageBrightness favors an average brightness level
    CenterPriority favors the center of the scene
    LowLightsPriority favors a low light level
    CenterOnly favors the center only

Brightness
  Indicates the brightness level. The value range is 0.0 to 1.0, and the default value is 0.2156.
  Note that this property is only valid if AutoExposure is set to on.

CameraElevationAngle
  Controls the angle of the sensor lens. This is the camera angle relative to the ground. The value must be an integer in the range -27 to 27 degrees. The default value is the last set value, since this is a sticky setting. Only set it if you want to change the angle of the camera. This property is shared with the depth sensor.

Contrast
  Indicates contrast level. Values must be in the range 0.5 to 2, with a default value of 1.

ExposureTime
  Indicates the exposure time in increments of 1/10,000 of a second. The value range is 0 to 4000, and the default is 0.
  Note that this property is only valid if AutoExposure is set to off.

FrameInterval
  Indicates the frame interval in units of 1/10,000 of a second. The value range is 0 to 4000, and the default is 0.
  Note that this property is only valid if AutoExposure is set to off.

FrameRate
  Frames per second for the acquisition. This property is read only and the possible values for the color sensor are 12, 15, and 30 (default). It reflects the actual frame rate when running.

Gain
  Indicates a multiplier for the RGB color values. The value range is 1.0 to 16.0, and the default is 1.0.
  Note that this property is only valid if AutoExposure is set to off.

Gamma
  Indicates gamma measurement. Values must be in the range 1 to 2.8, with a default value of 2.2.

Hue
  Indicates hue setting. Values must be in the range -22 to 22, with a default value of 0.

PowerLineFrequency
  Option for reducing flicker caused by the frequency of a power line. Values are Disabled, FiftyHertz, and SixtyHertz. The default is Disabled.
  Note that this property is only valid if AutoExposure is set to on.

Saturation
  Indicates saturation level. Values must be in the range 0 to 2, with a default value of 1.

Sharpness
  Indicates sharpness level. Values must be in the range 0 to 1, with a default value of 0.5.

WhiteBalance
  Indicates color temperature in kelvin. The value range is 2700 to 6500 and the default is 2700.
  Note that this property is only valid if AutoWhiteBalance is set to off.

3. You can optionally set some of the properties shown in the previous step. For example, you might be acquiring images in a low-light situation. You could adjust the acquisition for this by setting the BacklightCompensation property to LowLightsPriority, which favors a low light level.

    src.BacklightCompensation = 'LowLightsPriority';
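The unit convention for ExposureTime (increments of 1/10,000 of a second, range 0 to 4000) is easy to get wrong when setting exposure manually. The following is a minimal sketch of the conversion and is not part of the original example. The names desiredSeconds and exposureUnits are illustrative, and the commented lines at the end assume the src object created above with a Kinect connected.

```matlab
% Convert a desired exposure in seconds to ExposureTime units.
% ExposureTime is expressed in increments of 1/10,000 of a second.
desiredSeconds = 0.02;                         % hypothetical target: 20 ms
exposureUnits  = round(desiredSeconds * 1e4);  % seconds -> 1/10,000 s units

% Clamp to the documented range of 0 to 4000.
exposureUnits = min(max(exposureUnits, 0), 4000);
fprintf('ExposureTime value: %d\n', exposureUnits);

% With a connected device, the value could then be applied as follows.
% AutoExposure must be off first; otherwise setting ExposureTime warns.
% src.AutoExposure = 'off';
% src.ExposureTime = exposureUnits;
```

The same conversion applies to FrameInterval, which uses the same units and range.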
4. Preview the color stream by calling preview on the color sensor object you created.

    preview(vid);

When you are done previewing, close the preview window.

    closepreview(vid);
5. Create the videoinput object for the depth sensor. Note that a second object is created (vid2), and DeviceID 2 is used for the depth sensor.

    vid2 = videoinput('kinect',2,'Depth_640x480');

Look at the device-specific properties on the source device, which is the depth sensor on the Kinect.

    src = getselectedsource(vid2);
    src

    Display Summary for Video Source Object:

      General Settings:
        Parent = [1x1 videoinput]
        Selected = on
        SourceName = DepthSource
        Tag =
        Type = videosource

      Device Specific Properties:
        Accelerometer = [0.0 -1.0 0.0]
        BodyPosture = Standing
        CameraElevationAngle = 4
        DepthMode = Default
        FrameRate = 30
        IREmitter = on
        SkeletonsToTrack = [1x0 double]
        TrackingMode = off

As you can see in the output, the depth sensor has a set of device-specific properties associated with skeletal tracking. These properties are specific to the depth sensor.
Device-Specific Property – Depth Sensor: Description

Accelerometer
  Returns a 3-D vector of acceleration data for both the color and depth sensors. The data is updated while the device is running or previewing.
  This 1 x 3 double represents the x, y, and z values of acceleration in gravity units g (9.81 m/s^2). For example, [0.06 -1.00 -0.09] represents values of x as 0.06 g, y as -1.00 g, and z as -0.09 g.

BodyPosture
  Indicates whether the tracked skeletons are standing or sitting. Values are Standing (gives 20 point skeleton data) and Seated (gives 10 point skeleton data, using joint indices 2 - 11). Standing is the default.
  Note that if BodyPosture is set to Seated mode, and TrackingMode is set to Position, no position is returned, since Position is the location of the hip joint and the hip joint is not tracked in Seated mode.
  See the subsection “BodyPosture Joint Indices” at the end of this example for the list of indices of the 20 skeletal joints.

CameraElevationAngle
  Controls the angle of the sensor lens. This is the camera angle relative to the ground. The value must be an integer in the range -27 to 27 degrees. The default value is the last set value, since this is a sticky setting. Only set it if you want to change the angle of the camera. This property is shared with the color sensor.

DepthMode
  Indicates the range of depth in the depth map. Values are Default (range of 50 to 400 cm) and Near (range of 40 to 300 cm).

FrameRate
  Frames per second for the acquisition. This property is read only and is fixed at 30 for the depth sensor for all formats. It reflects the actual frame rate when running.

IREmitter
  Controls whether the IR emitter is on or off. Values are on and off. Initially, the default value is on. However, this is a sticky property, so the default value is the last set value. If you set it to off, it remains off in future uses until you change the setting.
  This property is useful for avoiding interference when using multiple Kinect devices.

SkeletonsToTrack
  Indicates the Skeleton Tracking ID returned as part of the metadata. Values are:
    [] Default tracking
    [TrackingID1] Track 1 skeleton with Tracking ID = TrackingID1
    [TrackingID1 TrackingID2] Track 2 skeletons with Tracking IDs = TrackingID1 and TrackingID2

TrackingMode
  Indicates tracking state. Values are:
    Skeleton tracks full skeleton with joints
    Position tracks hip joint position only
    Off disables skeleton position tracking (default)
  Note that if BodyPosture is set to Seated mode, and TrackingMode is set to Position, no position is returned, since Position is the location of the hip joint and the hip joint is not tracked in Seated mode.

6. Start the second videoinput object (the depth stream).

    start(vid2);
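According to the table above, TrackingMode defaults to Off, so skeleton tracking presumably has to be enabled on the depth source before the object is started for the skeletal metadata to be populated. The following is a minimal sketch under that assumption and is not part of the original example. It assumes a connected Kinect V1 and the Image Acquisition Toolbox. The names trackingSettings and depthSrc are illustrative, and the try/catch exists only so the sketch reports a message instead of failing when no hardware is present.

```matlab
% Depth-source settings to apply before starting the acquisition.
% Property names and values are taken from the depth sensor table above.
trackingSettings = struct('TrackingMode', 'Skeleton', ...
                          'BodyPosture',  'Standing');

try
    % Reuse vid2 if it already exists; otherwise create it (DeviceID 2).
    if ~exist('vid2', 'var')
        vid2 = videoinput('kinect', 2, 'Depth_640x480');
    end
    % Assumption: stop the object first in case it is already running,
    % so the source properties can be changed safely.
    if isrunning(vid2)
        stop(vid2);
    end
    depthSrc = getselectedsource(vid2);
    names = fieldnames(trackingSettings);
    for k = 1:numel(names)
        set(depthSrc, names{k}, trackingSettings.(names{k}));
    end
    start(vid2);
catch err
    % No Kinect or no toolbox available: report and continue.
    fprintf('Depth sensor not available: %s\n', err.message);
end
```

Keeping the settings in a struct makes it easy to switch to Seated posture or Position tracking by editing one line.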
7. Skeletal data is accessed as metadata on the depth stream using getdata.

    % Get the data on the object.
    [frame, ts, metaData] = getdata(vid2);

    % Look at the metadata to see the parameters in the skeletal data.
    metaData

    metaData =

    10x1 struct array with fields:
        AbsTime: [1x1 double]
        FrameNumber: [1x1 double]
        IsPositionTracked: [1x6 logical]
        IsSkeletonTracked: [1x6 logical]
        JointDepthIndices: [20x2x6 double]
        JointImageIndices: [20x2x6 double]
        JointTrackingState: [20x6 double]
        JointWorldCoordinates: [20x3x6 double]
        PositionDepthIndices: [2x6 double]
        PositionImageIndices: [2x6 double]
        PositionWorldCoordinates: [3x6 double]
        RelativeFrame: [1x1 double]
        SegmentationData: [640x480 double]
        SkeletonTrackingID: [1x6 double]
        TriggerIndex: [1x1 double]

These metadata fields are related to tracking the skeletons.
MetaData: Description

AbsTime
  A 1 x 1 double that represents the full timestamp, including date and time, in MATLAB® clock format.

FrameNumber
  A 1 x 1 double that represents the frame number.

IsPositionTracked
  A 1 x 6 Boolean matrix of true/false values for the tracking of the position of each of the six skeletons. A 1 indicates the position is tracked and a 0 indicates it is not.

IsSkeletonTracked
  A 1 x 6 Boolean matrix of true/false values for the tracked state of each of the six skeletons. A 1 indicates it is tracked and a 0 indicates it is not.

JointDepthIndices
  If the BodyPosture property is set to Standing, this is a 20 x 2 x 6 double matrix of x- and y-coordinates for 20 joints in pixels relative to the depth image, for the six possible skeletons. If BodyPosture is set to Seated, this would be a 10 x 2 x 6 double for 10 joints.

JointImageIndices
  If the BodyPosture property is set to Standing, this is a 20 x 2 x 6 double matrix of x- and y-coordinates for 20 joints in pixels relative to the color image, for the six possible skeletons. If BodyPosture is set to Seated, this would be a 10 x 2 x 6 double for 10 joints.

JointTrackingState
  This 20 x 6 integer matrix contains enumerated values for the tracking accuracy of each joint for all six skeletons. Values include:
    0 not tracked
    1 position inferred
    2 position tracked

JointWorldCoordinates
  A 20 x 3 x 6 double matrix of x-, y- and z-coordinates for 20 joints, in meters from the sensor, for the six possible skeletons, if BodyPosture is set to Standing. If it is set to Seated, this would be a 10 x 3 x 6 double for 10 joints.
  See step 9 for the syntax on how to see this data.

PositionDepthIndices
  A 2 x 6 double matrix of X and Y coordinates of each skeleton in pixels relative to the depth image.

PositionImageIndices
  A 2 x 6 double matrix of X and Y coordinates of each skeleton in pixels relative to the color image.

PositionWorldCoordinates
  A 3 x 6 double matrix of the X, Y and Z coordinates of each skeleton in meters relative to the sensor.

RelativeFrame
  This 1 x 1 double represents the frame number relative to the execution of a trigger if triggering is used.

SegmentationData
  Image size double array with each pixel mapped to a tracked/detected skeleton, represented by numbers 1 to 6. This segmentation map is a bitmap with pixel values corresponding to the index of the person in the field-of-view who is closest to the camera at that pixel position. A value of 0 means there is no tracked skeleton.

SkeletonTrackingID
  This 1 x 6 integer matrix contains the tracking IDs of all six skeletons. These IDs track specific skeletons using the SkeletonsToTrack property in step 5.
  Tracking IDs are generated by the Kinect and change from acquisition to acquisition.

TriggerIndex
  A 1 x 1 double that represents the trigger the event is associated with if triggering is used.

8. Look at any individual property by drilling into the metadata. Because getdata returned 10 frames, metaData is a 10x1 struct array, so index a single frame, such as the first. For example, look at the IsSkeletonTracked property.

    metaData(1).IsSkeletonTracked

    ans =

         1     0     0     0     0     0

In this case the data shows that of the six possible skeletons, there is one skeleton being tracked and it is in the first position. If you have multiple skeletons, this property is useful to confirm which ones are being tracked.
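The IsSkeletonTracked and JointTrackingState fields can be combined to decide which skeletons and joints to trust. The following is a minimal sketch that runs without hardware and is not part of the original example. frameMeta is a hypothetical stand-in for one element of the metaData struct array, filled with made-up values shaped like the fields described above.

```matlab
% Hypothetical single frame of metadata, shaped like the fields above.
frameMeta.IsSkeletonTracked  = logical([1 0 0 0 0 0]);
frameMeta.JointTrackingState = zeros(20, 6);
frameMeta.JointTrackingState(:, 1) = 2;        % all joints tracked ...
frameMeta.JointTrackingState([16 20], 1) = 1;  % ... except both feet (inferred)

% Slots (1 to 6) that currently hold a tracked skeleton.
trackedSlots = find(frameMeta.IsSkeletonTracked);

% For each tracked skeleton, list joints whose position is fully tracked
% (state 2) rather than inferred (1) or not tracked (0).
for s = trackedSlots
    reliableJoints = find(frameMeta.JointTrackingState(:, s) == 2);
    fprintf('Slot %d: %d of 20 joints tracked\n', s, numel(reliableJoints));
end
```

With real data, frameMeta would be replaced by a single frame of the acquired metadata, for example metaData(1).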
9. Get the joint locations for the first person in world coordinates using the JointWorldCoordinates property of the first frame. Since this is the person in position 1, the third index is 1.

    metaData(1).JointWorldCoordinates(:,:,1)

    ans =

       -0.1408   -0.3257    2.1674
       -0.1408   -0.2257    2.1674
       -0.1368   -0.0098    2.2594
       -0.1324    0.1963    2.3447
       -0.3024   -0.0058    2.2574
       -0.3622   -0.3361    2.1641
       -0.3843   -0.6279    1.9877
       -0.4043   -0.6779    1.9877
        0.0301   -0.0125    2.2603
        0.2364    0.2775    2.2117
        0.3775    0.5872    2.2022
        0.4075    0.6372    2.2022
       -0.2532   -0.4392    2.0742
       -0.1869   -0.8425    1.8432
       -0.1869   -1.2941    1.8432
       -0.1969   -1.3541    1.8432
       -0.0360   -0.4436    2.0771
        0.0382   -0.8350    1.8286
        0.1096   -1.2114    1.5896
        0.1196   -1.2514    1.5896

The columns represent the X, Y, and Z coordinates in meters of the 20 points on skeleton 1. The rows follow the joint order listed in the subsection “BodyPosture Joint Indices”.
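Individual joints can be picked out of this matrix by row, using the joint order from the “BodyPosture Joint Indices” subsection. The following is a minimal sketch that runs without hardware and is not part of the original example. The joints matrix is copied from the output above, and the variable names headXYZ, handSpan, and headDistance are illustrative.

```matlab
% Joint coordinates for skeleton 1, copied from the output above (meters).
joints = [ ...
   -0.1408 -0.3257 2.1674;  -0.1408 -0.2257 2.1674; ...
   -0.1368 -0.0098 2.2594;  -0.1324  0.1963 2.3447; ...
   -0.3024 -0.0058 2.2574;  -0.3622 -0.3361 2.1641; ...
   -0.3843 -0.6279 1.9877;  -0.4043 -0.6779 1.9877; ...
    0.0301 -0.0125 2.2603;   0.2364  0.2775 2.2117; ...
    0.3775  0.5872 2.2022;   0.4075  0.6372 2.2022; ...
   -0.2532 -0.4392 2.0742;  -0.1869 -0.8425 1.8432; ...
   -0.1869 -1.2941 1.8432;  -0.1969 -1.3541 1.8432; ...
   -0.0360 -0.4436 2.0771;   0.0382 -0.8350 1.8286; ...
    0.1096 -1.2114 1.5896;   0.1196 -1.2514 1.5896];

% Row indices from the "BodyPosture Joint Indices" list.
Head = 4; Hand_Left = 8; Hand_Right = 12;

headXYZ      = joints(Head, :);                                  % 1 x 3 position
handSpan     = norm(joints(Hand_Right, :) - joints(Hand_Left, :)); % hand-to-hand distance
headDistance = norm(headXYZ);                                    % distance from the sensor

fprintf('Head at [%.4f %.4f %.4f] m\n', headXYZ);
fprintf('Distance between hands: %.2f m\n', handSpan);
fprintf('Head distance from sensor: %.2f m\n', headDistance);
```

Because world coordinates are in meters from the sensor, Euclidean norms of joint differences give physical distances directly.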
10. Optionally view the segmentation data as an image.

    % View the segmentation data of the first frame as an image.
    imagesc(metaData(1).SegmentationData);
    % Set the color map to jet to color code the people detected.
    colormap(jet);
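Because each pixel of the segmentation map holds 0 or a skeleton number from 1 to 6, the map can also be used to measure or isolate individual people. The following is a minimal sketch that runs without hardware and is not part of the original example. The seg array is synthetic, its size is chosen only for illustration, and the names pixelCounts and maskSlot1 are illustrative.

```matlab
% Synthetic segmentation map: 0 = background, 1..6 = skeleton slot.
seg = zeros(480, 640);
seg(100:300, 200:280) = 1;   % hypothetical person in slot 1
seg(150:350, 400:460) = 2;   % hypothetical person in slot 2

% Count pixels per slot; row k holds the pixel count for slot k.
labels = seg(seg > 0);                        % column vector of slot numbers
pixelCounts = accumarray(labels, 1, [6 1]);

% Binary mask for one person, for example to isolate them in the depth frame.
maskSlot1 = (seg == 1);

disp(pixelCounts.');
```

With real data, seg would be replaced by the SegmentationData field of a single metadata frame.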
BodyPosture Joint Indices
The BodyPosture property, in step 5, indicates
whether the tracked skeletons are standing or sitting. Values are Standing (gives
20 point skeleton data) and Seated (gives 10 point
skeleton data, using joint indices 2 - 11).
This is the order of the joints returned by the Kinect adaptor:
    Hip_Center = 1;
    Spine = 2;
    Shoulder_Center = 3;
    Head = 4;
    Shoulder_Left = 5;
    Elbow_Left = 6;
    Wrist_Left = 7;
    Hand_Left = 8;
    Shoulder_Right = 9;
    Elbow_Right = 10;
    Wrist_Right = 11;
    Hand_Right = 12;
    Hip_Left = 13;
    Knee_Left = 14;
    Ankle_Left = 15;
    Foot_Left = 16;
    Hip_Right = 17;
    Knee_Right = 18;
    Ankle_Right = 19;
    Foot_Right = 20;
When BodyPosture is set to Standing,
all 20 indices are returned, as shown above. When BodyPosture is
set to Seated, numbers 2 through 11 are returned,
since this represents the upper body of the skeleton.