Text, Barcode, and Fiducial Marker Detection and Recognition
Computer Vision Toolbox™ supports text, barcode, and fiducial marker detection in images and videos using a combination of deep learning models and classical computer vision techniques. These capabilities are essential for applications such as autonomous driving, industrial automation, document analysis, and augmented reality.
For text detection, you can use a two-step process: first, detect regions in the image that contain text, and then recognize the text within those regions using optical character recognition (OCR). The toolbox offers multiple text detection approaches, including blob analysis, the maximally stable extremal regions (MSER) feature detector, and the deep learning-based CRAFT model. These methods help locate regions containing text in complex scenes. Alternatively, you can use the Image Labeler and Video Labeler apps to perform interactive and AI-assisted annotation of text regions in images.
Once you have identified text regions in an image, you can use OCR to
recognize the text using pretrained language models that support multiple
languages. For custom applications, you can train your own OCR models using the
trainOCR function. For more information, see Getting Started with OCR and Train Custom OCR Model.
For barcode and fiducial marker detection, the toolbox supports reading and decoding 1-D and 2-D barcodes and detecting fiducial markers such as AprilTags and ArUco markers. You can also generate ArUco markers programmatically, which is useful for calibration and tracking tasks in robotics and AR systems.
Apps
| Image Labeler | Label images for computer vision applications |
| Video Labeler | Label video for computer vision applications |
Functions
Topics
Get Started
- Getting Started with OCR
Detect and recognize text in multiple languages, train OCR models to recognize custom text. - Train Custom OCR Model
Train an optical character recognition (OCR) model to recognize custom text. - Install OCR Language Data Files
Support files for optical character recognition (OCR) languages. - Camera Calibration Using Custom Planar Calibration Patterns
Detect and localize AprilTags in a calibration pattern.







