Text, Barcode, and Fiducial Marker Detection and Recognition

Detect and recognize text (OCR), barcodes, and fiducial markers using AI models

Computer Vision Toolbox™ supports text, barcode, and fiducial marker detection in images and videos using a combination of deep learning models and classical computer vision techniques. These capabilities are essential for applications such as autonomous driving, industrial automation, document analysis, and augmented reality.

Extracting text from an image is a two-step process: first, detect the regions in the image that contain text, and then recognize the text within those regions using optical character recognition (OCR). The toolbox offers multiple text detection approaches, including blob analysis, the maximally stable extremal regions (MSER) feature detector, and the deep learning-based CRAFT model. These methods help locate regions containing text in complex scenes. You can also use the Image Labeler and Video Labeler apps to perform interactive and AI-assisted annotation of text regions in images and video.
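The detection step can be sketched as follows. This is a minimal illustration rather than a recommended pipeline: it assumes the sample image `businessCard.png` is available on the MATLAB path, and that the support package required by `detectTextCRAFT` is installed. The MSER output is a set of candidate regions that usually needs further filtering before it is useful for text.

```matlab
% Read a sample image (assumed to be on the MATLAB path) and make a grayscale copy.
I = imread("businessCard.png");
gray = im2gray(I);

% Classical approach: MSER regions serve as text candidates.
% Many of these regions are not text and normally need geometric filtering.
regions = detectMSERFeatures(gray);
fprintf("MSER candidate regions: %d\n", regions.Count);

% Deep learning approach: CRAFT returns M-by-4 boxes in [x y width height] form.
% Assumes the CRAFT text detection support package is installed.
bboxes = detectTextCRAFT(I);
fprintf("CRAFT text regions: %d\n", size(bboxes,1));

% Overlay the CRAFT detections for visual inspection.
annotated = insertShape(I,"rectangle",bboxes,"LineWidth",3);
imshow(annotated)
```

MSER is lightweight and needs no extra download, while CRAFT tends to be more robust in cluttered scenes at a higher computational cost.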

Once you have identified text regions in an image, you can use OCR to recognize the text using pretrained language models that support multiple languages. For custom applications, you can train your own OCR models using the trainOCR function. For more information, see Getting Started with OCR and Train Custom OCR Model.
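As a rough sketch of the recognition step, the following passes a region of interest to `ocr` and keeps only the words recognized with high confidence. The sample image name, the whole-image ROI, and the 0.8 confidence threshold are illustrative assumptions; in practice the ROI would typically come from a text detector.

```matlab
% Read a sample image (assumed to be on the MATLAB path).
I = imread("businessCard.png");

% Region of interest in [x y width height] form.
% Here it covers the whole image; a detector output could be used instead.
roi = [1 1 size(I,2) size(I,1)];

% Recognize text inside the ROI using the default pretrained model.
results = ocr(I,roi);

% Keep words whose confidence (in the range 0 to 1) exceeds an
% arbitrary threshold chosen for this example.
keep  = results.WordConfidences > 0.8;
words = results.Words(keep);
boxes = results.WordBoundingBoxes(keep,:);

% Print each retained word on its own line.
fprintf("%s\n",words{:});
```

Filtering by word confidence is a simple way to discard spurious recognitions before downstream processing.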

For barcode and fiducial marker detection, the toolbox supports reading and decoding 1-D and 2-D barcodes and detecting fiducial markers such as AprilTags and ArUco markers. You can also generate ArUco markers programmatically, which is useful for calibration and tracking tasks in robotics and AR systems.
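The following sketch shows how these functions might be combined. It assumes the sample images `barcodeQR.jpg` and `aprilTagsMulti.jpg` are on the MATLAB path; the tag family, ArUco dictionary name, marker IDs, and marker size are illustrative choices, so check each function's reference page for the values supported in your release.

```matlab
% Decode a barcode and report its format and message.
Ibar = imread("barcodeQR.jpg");            % assumed sample image
[msg,detectedFormat,loc] = readBarcode(Ibar);
fprintf("Decoded %s barcode: %s\n",detectedFormat,msg);

% Detect AprilTags of one family. Without camera intrinsics, the
% function returns tag IDs and corner locations only (no pose).
Itag = imread("aprilTagsMulti.jpg");       % assumed sample image
[id,tagLoc] = readAprilTag(Itag,"tag36h11");
fprintf("Detected %d AprilTag(s)\n",numel(id));

% Generate ArUco marker images for printing or display (R2024a or later).
% Dictionary name, IDs, and size in pixels are assumptions for this example.
markerFamily = "DICT_4X4_250";
markers = generateArucoMarker(markerFamily,0:3,200);
```

To estimate marker pose as well, `readAprilTag` and `readArucoMarker` additionally need the camera intrinsics and the physical marker size.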

Apps

Image Labeler	Label images for computer vision applications
Video Labeler	Label video for computer vision applications

Functions

Text Detection

`detectTextCRAFT`	Detect text in images by using the CRAFT deep learning model (Since R2022a)
`detectMSERFeatures`	Detect MSER features
`vision.BlobAnalysis`	Properties of connected regions
`extractHOGFeatures`	Extract histogram of oriented gradients (HOG) features

Text Recognition

`ocr`	Recognize text using optical character recognition
`ocrText`	Store OCR results
`visionSupportPackages`	Start Installer to download, install, or uninstall Computer Vision Toolbox data

Barcode and Fiducial Marker Detection

`readAprilTag`	Detect and estimate pose for AprilTag in image
`readArucoMarker`	Detect and estimate pose for ArUco marker in image (Since R2024a)
`generateArucoMarker`	Generate ArUco marker images (Since R2024a)
`readBarcode`	Detect and decode 1-D or 2-D barcode in image

Training and Evaluation

`trainOCR`	Train OCR model to recognize text in image (Since R2023a)
`evaluateOCR`	Evaluate OCR results against ground truth (Since R2023a)
`ocrMetrics`	Store OCR quality metrics (Since R2023a)
`ocrTrainingOptions`	Options for training OCR model (Since R2023a)
`ocrTrainingData`	Create training data for OCR from ground truth (Since R2023a)

Quantization

`quantizeOCR`	Quantize OCR model (Since R2023a)

Topics

Get Started

Getting Started with OCR
Detect and recognize text in multiple languages, and train OCR models to recognize custom text.
Train Custom OCR Model
Train an optical character recognition (OCR) model to recognize custom text.
Install OCR Language Data Files
Support files for optical character recognition (OCR) languages.
Camera Calibration Using Custom Planar Calibration Patterns
Detect and localize AprilTags in a calibration pattern.