Computer Vision Toolbox Model for Grounding DINO Object Detection

Grounding DINO is a pre-trained vision-language model (VLM) that enables zero-shot, open-vocabulary, text-prompted object detection.

Grounding DINO detects objects described by a text prompt without requiring any training on the prompted class, so it can detect object classes that are not in its training set. It combines a Transformer-based DINO object detector with grounded pre-training.

MATLAB Release Compatibility

  • Compatible with R2026a to R2026b

Platform Compatibility

  • Windows
  • macOS (Apple Silicon)
  • macOS (Intel)
  • Linux