detectPeopleACF

Detect people using aggregate channel features (ACF)

detectPeopleACF will be removed in a future release. Use peopleDetectorACF instead.

Description

bboxes = detectPeopleACF(I) returns a matrix, bboxes, that contains the locations of detected upright people in the input image, I. The locations are represented as bounding boxes. The function uses the aggregate channel features (ACF) algorithm.

[bboxes,scores] = detectPeopleACF(I) also returns the detection scores for each bounding box.

[___] = detectPeopleACF(I,roi) detects people within the rectangular search region specified by roi, using either of the previous syntaxes.
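For example, a sketch of restricting detection to a search region. The roi coordinates here are illustrative and assume the region fits inside the image:

I = imread('visionteam1.jpg');
roi = [1 1 480 320];    % illustrative [x,y,width,height] search region
bboxes = detectPeopleACF(I,roi);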

[___] = detectPeopleACF(Name,Value) uses additional options specified by one or more Name,Value pair arguments. Unspecified properties have default values.

Code Generation Support:
Supports Code Generation: No
Supports MATLAB Function block: No

Examples

Read an image.

I = imread('visionteam1.jpg');

Detect people in the image, storing the results as bounding boxes and detection scores.

[bboxes,scores] = detectPeopleACF(I);

Annotate the detected upright people in the image.

I = insertObjectAnnotation(I,'rectangle',bboxes,scores);

Display the results with annotation.

figure
imshow(I)
title('Detected people and detection scores')

Input Arguments

Input image, specified as a truecolor image. The image must be real and nonsparse.

Data Types: uint8 | uint16 | int16 | double | single

Rectangular search region, specified as a four-element vector, [x,y,width,height]. The roi must be fully contained in I.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'Threshold',-1

ACF classification model, specified as the comma-separated pair consisting of 'Model' and either 'inria-100x41' or 'caltech-50x21'. The 'inria-100x41' model was trained using the INRIA Person dataset. The 'caltech-50x21' model was trained using the Caltech Pedestrian dataset.
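For example, a sketch of selecting the Caltech-trained model instead of the default:

% Use the model trained on the Caltech Pedestrian dataset
[bboxes,scores] = detectPeopleACF(I,'Model','caltech-50x21');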

Number of scale levels per octave, specified as the comma-separated pair consisting of 'NumScaleLevels' and an integer. Each octave is a power-of-two downscaling of the image. Increase this number to detect people at finer scale increments. Recommended values are in the range [4,8].

Window stride for the sliding window, specified as the comma-separated pair consisting of 'WindowStride' and an integer. Set this value to the number of pixels to move the window at each step. The sliding window scans the image for object detection. The function uses the same stride for the x and y directions.
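For example, a sketch combining finer scale sampling with a coarser window stride. The specific values are illustrative, not recommendations:

% Sample 8 scales per octave, moving the window 4 pixels per step
[bboxes,scores] = detectPeopleACF(I,'NumScaleLevels',8,'WindowStride',4);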

Select strongest bounding box, specified as the comma-separated pair consisting of 'SelectStrongest' and either true or false. The process, often referred to as nonmaximum suppression, eliminates overlapping bounding boxes based on their scores. Set this property to true to use the selectStrongestBbox function to select the strongest bounding box. Set this property to false to perform a custom selection operation; in that case, the function returns all detected bounding boxes.
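For example, a sketch of retrieving all raw detections and then applying nonmaximum suppression manually with selectStrongestBbox. The overlap threshold value is illustrative:

% Return every detection, then suppress overlaps yourself
[allBboxes,allScores] = detectPeopleACF(I,'SelectStrongest',false);
[bboxes,scores] = selectStrongestBbox(allBboxes,allScores,'OverlapThreshold',0.65);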

Minimum region size in pixels, specified as the comma-separated pair consisting of 'MinSize' and a two-element vector, [height width]. The smallest detectable region is [50 21] for the 'caltech-50x21' model and [100 41] for the 'inria-100x41' model. You can reduce computation time by setting this value to the known minimum region size for detecting a person. By default, MinSize is set to the smallest region size possible for detecting an upright person with the selected classification model.

Maximum region size in pixels, specified as the comma-separated pair consisting of 'MaxSize' and a two-element vector, [height width]. You can reduce computation time by setting this value to the known maximum region size for detecting a person. If you do not set this value, the function uses the height and width of the input image, I.
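For example, a sketch of bounding the detectable person size. The values are illustrative and assume the default 'inria-100x41' model, whose minimum region is [100 41]:

% Only consider regions between 100x41 and 400x164 pixels
[bboxes,scores] = detectPeopleACF(I,'MinSize',[100 41],'MaxSize',[400 164]);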

Classification accuracy threshold, specified as the comma-separated pair consisting of 'Threshold' and a numeric value. Typical values are in the range [-1,1]. During multiscale object detection, the threshold value controls the tradeoff between person/nonperson classification accuracy and speed. Increase this threshold to speed up detection at the risk of missing true detections.
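For example, a sketch of raising the threshold to trade recall for speed. The value 0 is illustrative, chosen from the typical [-1,1] range:

% A higher threshold rejects weak candidates earlier, speeding up detection
[bboxes,scores] = detectPeopleACF(I,'Threshold',0);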

Output Arguments

Locations of people detected using the aggregate channel features (ACF) algorithm, returned as an M-by-4 matrix. The locations are represented as bounding boxes. Each row in bboxes contains a four-element vector, [x,y,width,height]. This vector specifies the upper-left corner and size of a bounding box, in pixels, for a detected person.

Confidence value for the detections, returned as an M-by-1 vector. The vector contains a value for each bounding box in bboxes. The score for each detection is the output of a soft-cascade classifier. The range of score values is [-inf inf]. Greater scores indicate a higher confidence in the detection.
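Because higher scores indicate higher confidence, you can filter weak detections after the fact. A sketch, where the cutoff value is purely illustrative and should be tuned per application:

% Keep only detections whose soft-cascade score exceeds a chosen cutoff
keep = scores > 10;                 % illustrative cutoff
strongBboxes = bboxes(keep,:);
strongScores = scores(keep);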

References

[1] Dollár, P., R. Appel, S. Belongie, and P. Perona. "Fast Feature Pyramids for Object Detection." IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol. 36, Issue 8, 2014, pp. 1532–1545.

[2] Dollár, P., C. Wojek, B. Schiele, and P. Perona. "Pedestrian Detection: An Evaluation of the State of the Art." IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol. 34, Issue 4, 2012, pp. 743–761.

[3] Dollár, P., C. Wojek, B. Schiele, and P. Perona. "Pedestrian Detection: A Benchmark." IEEE Conference on Computer Vision and Pattern Recognition. 2009.

Version History

Introduced in R2016a