Performant Monocular Visual-Inertial SLAM
Simultaneous Localization and Mapping (SLAM) is the process of estimating the position and orientation of a sensor, such as a camera, relative to its environment, while simultaneously building a map from the 3-D locations of observed points. This example shows how to perform Visual-Inertial SLAM (viSLAM) by combining images from a monocular camera with data from an Inertial Measurement Unit (IMU) sensor, using the monovslam object, which contains a complete viSLAM workflow. The object offers a practical solution with greatly improved execution speed, support for code generation, and seamless fusion with IMU data, including scale estimation.
MATLAB® viSLAM examples each show one of these viSLAM implementations:
Modular ─ Builds a visual-inertial SLAM pipeline step by step by using functions and objects. For more details and a list of these functions and objects, see the Implement Visual SLAM in MATLAB topic. The modular code in that topic is designed to teach the details of a viSLAM implementation that is loosely based on the popular and reliable ORB-SLAM2 algorithm. The code is easy to navigate, so you can explore the algorithm and test how its parameters affect the performance of the system. This modular implementation is most suitable for learning, experimentation, and modification to test your own ideas.
Performant ─ Uses the monovslam object, which contains a complete viSLAM workflow. The object offers a practical solution with greatly improved execution speed and code generation. To generate multi-threaded C/C++ code from monovslam, you can use MATLAB® Coder™. The generated code is portable, and you can deploy it on non-PC hardware as well as a ROS node, as demonstrated in the Build and Deploy Visual SLAM Algorithm with ROS in MATLAB example.
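The exact entry-point design depends on your application, but a minimal sketch of the code generation workflow follows. The function name helperAddFrameEntryPoint and its interface are hypothetical and not part of this example: it keeps the monovslam object in a persistent variable so that the generated code can process a stream of frames.

function isKeyFrame = helperAddFrameEntryPoint(I, intrinsics) %#codegen
% Hypothetical entry-point function (illustrative sketch, not from this example).
% The monovslam object persists across calls, so each call processes one frame.
persistent vslam
if isempty(vslam)
    vslam = monovslam(intrinsics); % visual-only configuration, for brevity
end
addFrame(vslam, I);
isKeyFrame = hasNewKeyFrame(vslam);
end

Assuming sampleImage is a representative image frame and intrinsics is a cameraIntrinsics object, you could then generate code from the entry point:

codegen helperAddFrameEntryPoint -args {sampleImage, intrinsics}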
This example shows the performant implementation, which processes image data from a monocular camera together with inertial measurements from an IMU. For more details about the modular implementation, see the Monocular Visual-Inertial SLAM example.
Download and Set Up Data
This example uses data from the Blackbird UAV data set, which contains synchronized RGB images and IMU data. You can download the data set to a temporary folder by using this code.
uavData = helperDownloadSLAMData();
images = uavData.images;
Set up the camera intrinsics and IMU parameters.
intrinsics = uavData.intrinsics;
imuParams = factorIMUParameters(SampleRate=100,GyroscopeNoise=0.1, ...
    GyroscopeBiasNoise=3e-3,AccelerometerNoise=0.3, ...
    AccelerometerBiasNoise=1e-3,ReferenceFrame="ENU");
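If you adapt this example to your own camera and your data does not already include an intrinsics object, you can construct one manually. This is a minimal sketch with assumed calibration values that are not part of the Blackbird data set; replace them with the results of your own camera calibration.

% Assumed calibration values, for illustration only.
focalLength    = [600 600];   % [fx fy] in pixels
principalPoint = [320 240];   % [cx cy] in pixels
imageSize      = [480 640];   % [mrows ncols]
intrinsics = cameraIntrinsics(focalLength,principalPoint,imageSize);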
Implement Visual SLAM
Create a monovslam object with both the camera intrinsics object and the IMU parameters to enable camera-IMU fusion. Depending on the camera and IMU data you use, you may also need to modify these properties, which control the scale and gravity vector direction estimation that runs in the background:
NumPosesThreshold : Number of estimated camera poses needed to trigger camera-IMU alignment.
AlignmentFraction : Fraction of the most recent pose estimates, calculated as round(NumPosesThreshold*AlignmentFraction), to use in the camera-IMU alignment process. This filters out the initial, potentially noisy pose estimates, so that only the most relevant data contributes to the alignment, improving accuracy.
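For example, with the values used later in this example, NumPosesThreshold=10 and AlignmentFraction=0.9, the alignment uses the 9 most recent pose estimates:

% Number of recent pose estimates used for camera-IMU alignment.
numPosesThreshold = 10;
alignmentFraction = 0.9;
numPosesForAlignment = round(numPosesThreshold*alignmentFraction) % returns 9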
vslam = monovslam(intrinsics,imuParams,MaxNumPoints=2000,SkipMaxFrames=10, ...
    TrackFeatureRange=[20,150],LoopClosureThreshold=100, ...
    CameraToIMUTransform=rigidtform3d(tform(uavData.camToIMUTransform)), ...
    NumPosesThreshold=10,AlignmentFraction=0.9, ...
    CustomBagOfFeatures=bagOfFeaturesDBoW('bagOfFeatures.bin.gz'));
Extract the IMU data and process each image by using the addFrame function. Use the plot method to visualize the estimated camera trajectory and the 3-D map points.
The scale and gravity vector direction estimation runs in the background, and the example displays its results once the estimation succeeds. You can track the status of the estimation by using the IsIMUAligned property.
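As an illustrative aside, separate from the processing loop below, you can also inspect the alignment results directly from the object properties once the alignment succeeds:

% Query the camera-IMU alignment results after IsIMUAligned becomes true.
if vslam.IsIMUAligned
    estimatedScale  = vslam.IMUScale        % estimated metric scale factor
    gravityRotation = vslam.GravityRotation % estimated gravity rotation
end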
startFrameIdx = 150; % Frame at which the UAV starts moving
flag = true;         % Display the scale estimation result only once
for i = startFrameIdx:length(images)
    if (i > startFrameIdx)
        imuMeasurements = helperExtractIMUMeasurements(uavData, i-1, i);
    else
        imuMeasurements.gyro  = [];
        imuMeasurements.accel = [];
    end
    addFrame(vslam, images{i}, imuMeasurements.gyro, imuMeasurements.accel);
    if vslam.IsIMUAligned && flag
        helperGetAccuracy(vslam,uavData,startFrameIdx);
        flag = false;
    end
    if hasNewKeyFrame(vslam)
        plot(vslam);
    end
end
"Absolute RMSE for trajectory (m): " "0.041025"
Note that the monovslam object runs several parts of the algorithm on separate threads, which can introduce latency in the processing of an image frame added by using the addFrame function. Use isDone to check whether all added frames have been processed, and continue plotting new key frames until processing completes.
while ~isDone(vslam)
    if hasNewKeyFrame(vslam)
        plot(vslam);
    end
end
Compare the obtained trajectory with the ground truth to evaluate the accuracy of the SLAM pipeline.
xyzPoints = mapPoints(vslam);
[camPoses, ids] = poses(vslam);
helperGetAccuracy(vslam,uavData,startFrameIdx);
"Absolute RMSE for trajectory (m): " "0.22921"
Supporting Functions
helperExtractIMUMeasurements extracts the IMU measurements that correspond to the current frame.
function imuMeasurements = helperExtractIMUMeasurements(data, startFrameIdx, currFrameIdx)
% Return the IMU samples recorded between two consecutive image time stamps.
timeStamps = data.timeStamps;
startTimeStamp = timeStamps.imageTimeStamps(startFrameIdx);
currTimeStamp  = timeStamps.imageTimeStamps(currFrameIdx);
[~,startIMUIdx] = min(abs(timeStamps.imuTimeStamps - startTimeStamp));
[~,currIMUIdx]  = min(abs(timeStamps.imuTimeStamps - currTimeStamp));
imuMeasurements.accel = data.accelReadings(startIMUIdx:(currIMUIdx-1),:);
imuMeasurements.gyro  = data.gyroReadings(startIMUIdx:(currIMUIdx-1),:);
end
helperGetAccuracy calculates the absolute RMSE and plots a comparison between the SLAM trajectory and the ground truth.
function helperGetAccuracy(vslam,data,startIdx)
% Transform the estimated trajectory into the ground-truth frame, compute the
% absolute RMSE, and plot the estimated and ground-truth trajectories.
cam2IMU = vslam.CameraToIMUTransform.A;
gravityRotation = vslam.GravityRotation.A;
scale = vslam.IMUScale;

[viPoses, ids] = poses(vslam);
pred = cat(1,viPoses(:).Translation);
gt_t = data.gTruth(ids+startIdx, 1:3);
gt_r = data.gTruth(ids+startIdx, 4:7);

% Rotate the estimated positions to account for gravity direction and the
% camera-to-IMU transformation.
gravityRotationInv = se3(inv(gravityRotation));
cam2IMU = se3(cam2IMU(1:3,1:3),[0 0 0]);
pred = gravityRotationInv.transform(pred);
pred = cam2IMU.transform(pred);

% Express the ground truth relative to its initial pose.
initTF = se3(quat2rotm(gt_r(1,:)),gt_t(1,:));
initTFInv = se3(inv(tform(initTF)));
gt_t = initTFInv.transform(gt_t);

% Compare the trajectories and report the absolute RMSE.
pred_rtf = [];
gt_rtf = [];
for i = 1:height(viPoses)
    pred_rtf = [pred_rtf rigidtform3d(eye(3),pred(i,:))]; %#ok<AGROW>
    gt_rtf   = [gt_rtf rigidtform3d(eye(3),gt_t(i,:))];   %#ok<AGROW>
end
metrics = compareTrajectories(pred_rtf, gt_rtf);
disp(["Absolute RMSE for trajectory (m): ", metrics.AbsoluteRMSE(2)]);

figure
plot3(gt_t(:,1), gt_t(:,2), gt_t(:,3), 'g','LineWidth',2, 'DisplayName', 'Actual trajectory');
hold on
plot3(pred(:,1), pred(:,2), pred(:,3), 'r','LineWidth',2, 'DisplayName', 'Scaled SLAM trajectory');
plot3(pred(:,1)/scale, pred(:,2)/scale, pred(:,3)/scale, 'b','LineWidth',2, 'DisplayName', 'Unscaled SLAM trajectory');
legend
view(0, 90);
xlabel('X-axis'); ylabel('Y-axis'); zlabel('Z-axis');
title('SLAM VS GT')
hold off
end
helperDownloadSLAMData downloads the data set.
function vioData = helperDownloadSLAMData()
vioDataTarFile = matlab.internal.examples.downloadSupportFile(...
    'shared_nav_vision/data','BlackbirdVIOData.tar');

% Extract the file.
outputFolder = fileparts(vioDataTarFile);
if (~exist(fullfile(outputFolder,"BlackbirdVIOData"),"dir"))
    untar(vioDataTarFile,outputFolder);
end
vioData = load(fullfile(outputFolder,"BlackbirdVIOData","data.mat"));
end