Main Content

Performant Monocular Visual-Inertial SLAM

Since R2025a

Simultaneous Localization and Mapping (SLAM) is a sophisticated process that involves identifying a sensor's position and orientation, like a camera, relative to its environment. It also includes generating a map by pinpointing the 3D locations of various points in the area. This example illustrates the effective execution of Visual Inertial SLAM (viSLAM) by integrating images from a monocular camera with data from an Inertial Measurement Unit (IMU) sensor, using the monovslam object which contains a complete viSLAM workflow, including scale estimation. The object offers a practical solution with greatly improved execution speed and code generation, in addition to seemless fusion with IMU data and scale estimation.

MATLAB® viSLAM examples each show one of these viSLAM implementations:

  • Modular ─ Builds a visual inertial SLAM pipeline step-by-step by using functions and objects. For more details and a list of these functions and objects, see the Implement Visual SLAM in MATLAB topic. The approach described in the topic contains modular code, and is designed to teach the details of a viSLAM implementation, that is loosely based on the popular and reliable ORB-SLAM2 algorithm. The code is easily navigable, enabling you to explore the algorithm and test how its parameters can impact the performance of the system. This modular implementation is most suitable for learning, experimentation, and modification to test your own ideas.

  • Performant ─ Uses the monovslam object which contains a complete viSLAM workflow. The object offers a practical solution with greatly improved execution speed and code generation. To generate multi-threaded C/C++ code from monovslam, you can use MATLAB® Coder™. The generated code is portable, and you can deploy it on non-PC hardware as well as a ROS node, as demonstrated in the Build and Deploy Visual SLAM Algorithm with ROS in MATLAB example.

This example shows the performant implementation for processing combined image data from a monocular camera and positional data from an IMU. For more details about the modular implementation, see the Monocular Visual-Inertial SLAM example.

Download and Set-Up Data

This example uses data from the BlackBird UAV dataset, which and contains synchronized RGB images and IMU data. You can download the data set to a temporary folder using this code.

uavData = helperDownloadSLAMData();
images = uavData.images;

Set up the camera intrinsics and IMU parameters.

intrinsics = uavData.intrinsics;
imuParams = factorIMUParameters(SampleRate=100,GyroscopeNoise=0.1, ...
            GyroscopeBiasNoise=3e-3,AccelerometerNoise=0.3, ...
            AccelerometerBiasNoise=1e-3,ReferenceFrame="ENU");

Implement Visual SLAM

Create a monovslam object with both the camera intrinsics object and IMU parameters to activate the camera-IMU fusion. You may also need to modify the properties below, based on the camera and IMU data that you intend to use, to control the scale and gravity vector direction estimation that will run in the background.

  • NumPosesThreshold : Number of estimated camera poses to trigger camera-IMU alignment.

  • AlignmentFraction : Specifies a fraction of the number of recent pose estimates, calculated as round(NumPosesThreshold*AlignmentFraction), for use in the camera-IMU alignment process. It effectively filters out initial, potentially noisy pose estimates, ensuring only the most relevant data contributes to the alignment for improved accuracy.

vslam = monovslam(intrinsics,imuParams,MaxNumPoints=2000,SkipMaxFrames=10,TrackFeatureRange=[20,150],LoopClosureThreshold=100, ...
                  CameraToIMUTransform=rigidtform3d(tform(uavData.camToIMUTransform)),NumPosesThreshold=10,AlignmentFraction=0.9, ...
                  CustomBagOfFeatures=bagOfFeaturesDBoW('bagOfFeatures.bin.gz'));

Extract the IMU data and process each image by using the addFrame function. Use the plot method to visualize the estimated camera trajectory and the 3-D map points.

The scale and gravity vector direction estimation will run in the background, and its results will be displayed once the estimation is successful. You can track its status using the IsIMUAligned parameter.

startFrameIdx = 150; %Frame at which the UAV starts moving
flag = true; %Display the scale estimation result only once

for i=startFrameIdx:length(images)

    if(i>startFrameIdx)
        imuMesurements=helperExtractIMUMeasurements(uavData, i-1, i);
    else
        imuMesurements.gyro=[];
        imuMesurements.accel=[];
    end

    addFrame(vslam, images{i}, imuMesurements.gyro, imuMesurements.accel);

    if vslam.IsIMUAligned && flag
        helperGetAccuracy(vslam,uavData,startFrameIdx);
        flag = false;
    end

    if hasNewKeyFrame(vslam)
        plot(vslam);
    end
end
    "Absolute RMSE for trajectory (m): "    "0.041025"

Figure contains an axes object. The axes object with title SLAM VS GT, xlabel X-axis, ylabel Y-axis contains 3 objects of type line. These objects represent Actual trajectory, Scaled SLAM trajectory, Unscaled SLAM trajectory.

Note that the monovslam object runs several algorithm parts on separate threads, which can introduce a latency in processing of an image frame added by using the addFrame function.

while ~isDone(vslam)
    if hasNewKeyFrame(vslam)
        plot(vslam);
    end
end

Figure contains an axes object. The axes object with xlabel X, ylabel Y contains 12 objects of type line, text, patch, scatter. This object represents Camera trajectory.

Compare the obtained trajectory with the ground truth to evaluate the accuracy of the SLAM pipeline.

xyzPoints = mapPoints(vslam);
[camPoses, ids]  = poses(vslam);

helperGetAccuracy(vslam,uavData,startFrameIdx);
    "Absolute RMSE for trajectory (m): "    "0.22921"

Figure contains an axes object. The axes object with title SLAM VS GT, xlabel X-axis, ylabel Y-axis contains 3 objects of type line. These objects represent Actual trajectory, Scaled SLAM trajectory, Unscaled SLAM trajectory.

Supporting Functions

helperExtractIMUMeasurements extract the IMU measurements that correspond to the current frame.

function imuMesurements = helperExtractIMUMeasurements(data, startFrameIdx, currFrameIdx)

    timeStamps = data.timeStamps;
    
    startTimeStamp = timeStamps.imageTimeStamps(startFrameIdx);
    currTimeStamp  = timeStamps.imageTimeStamps(currFrameIdx);
    
    [~,startIMUIdx] = min(abs(timeStamps.imuTimeStamps - startTimeStamp));
    [~,currIMUIdx]  = min(abs(timeStamps.imuTimeStamps - currTimeStamp));
    
    imuMesurements.accel = data.accelReadings(startIMUIdx:(currIMUIdx-1),:);
    imuMesurements.gyro  = data.gyroReadings(startIMUIdx:(currIMUIdx-1),:);

end

helperGetAccuracy calculates the absolute RMSE and plots a comparison between the SLAM trajectory and the ground truth.

function helperGetAccuracy(vslam,data,startIdx)

    cam2IMU = vslam.CameraToIMUTransform.A;
    gravityRotation = vslam.GravityRotation.A;
    scale = vslam.IMUScale;
    
    [viPoses, ids]  = poses(vslam);

    pred = cat(1,viPoses(:).Translation);

    gt_t = data.gTruth(ids+startIdx, 1:3);
    gt_r = data.gTruth(ids+startIdx, 4:7);

    gravityRotationInv = se3(inv(gravityRotation));
    cam2IMU = se3(cam2IMU(1:3,1:3),[0 0 0]);

    pred = gravityRotationInv.transform(pred);
    pred = cam2IMU.transform(pred);

    initTF = se3(quat2rotm(gt_r(1,:)),gt_t(1,:));
    initTFInv = se3(inv(tform(initTF)));
    gt_t = initTFInv.transform(gt_t);

    pred_rtf = [];
    gt_rtf = [];

    for i=1:height(viPoses)
        pred_rtf = [pred_rtf rigidtform3d(eye(3),pred(i,:))];
        gt_rtf = [gt_rtf rigidtform3d(eye(3),gt_t(i,:))];
    end

    metrics = compareTrajectories(pred_rtf, gt_rtf);
    disp(["Absolute RMSE for trajectory (m): ", metrics.AbsoluteRMSE(2)]);

    figure 
    plot3(gt_t(:,1), gt_t(:,2), gt_t(:,3), 'g','LineWidth',2, 'DisplayName', 'Actual trajectory');
    hold on
    plot3(pred(:,1), pred(:,2), pred(:,3), 'r','LineWidth',2, 'DisplayName', 'Scaled SLAM trajectory');
    hold on
    plot3(pred(:,1)/scale, pred(:,2)/scale, pred(:,3)/scale, 'b','LineWidth',2, 'DisplayName', 'Unscaled SLAM trajectory');
    legend
    view(0, 90);
    xlabel('X-axis');
    ylabel('Y-axis');
    zlabel('Z-axis');
    title('SLAM VS GT')
    hold off
end

helperDownloadData downloads the data set.

function vioData = helperDownloadSLAMData()

    vioDataTarFile =  matlab.internal.examples.downloadSupportFile(...
        'shared_nav_vision/data','BlackbirdVIOData.tar');  

    % Extract the file.
    outputFolder = fileparts(vioDataTarFile);
    if (~exist(fullfile(outputFolder,"BlackbirdVIOData"),"dir"))
        untar(vioDataTarFile,outputFolder);
    end

    vioData = load(fullfile(outputFolder,"BlackbirdVIOData","data.mat"));
end