How to make OCR recognize Upper and Lower case letters with similar shapes

Jacob Ebilane on 6 Jun 2022
I'm having trouble recognizing upper- and lower-case letters. For instance, I'm trying to read 'c' but it keeps being returned as 'C'; the same goes for O/o and any other letter whose upper- and lower-case shapes are similar. My classifier is trained on a dataset that contains both upper- and lower-case samples from the EMNIST ByClass dataset.

Answers (1)

Ayush on 3 Jan 2024
Hi Jacob,
I understand that you want to distinguish between lowercase and uppercase letters that have similar shapes, such as "c" versus "C" and "o" versus "O", during classification.
One way to do this is to use contour-based features. The steps below outline the approach:
  • Normalize the character size. Scale every character consistently while preserving its aspect ratio. This can help the classifier learn size-based differences between upper- and lower-case letters.
  • Use contour feature extraction. Extract features from the contours of the characters that can help distinguish letter case. For example, the relative size of the character within its bounding box can be a useful feature, because upper-case letters are typically larger than their lower-case counterparts.
  • You can also use additional features such as the topological structure of the letters. When the aspect ratio of the characters is preserved, you can compare structural properties; for example, the hole in "O" is larger than the hole in "o".
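The features described in the steps above can be sketched roughly as follows. This is only an illustration in plain Python, not code from the original answer: the helper names (`bounding_box`, `hole_pixels`, `case_features`, `square_ring`) and the `line_height` parameter are hypothetical, and the sketch assumes a binary glyph image (1 = ink, 0 = background) whose features are measured before any resizing, since resizing discards absolute size. In MATLAB, `regionprops` with the 'BoundingBox' and 'FilledArea' properties provides comparable measurements.

```python
from collections import deque


def bounding_box(img):
    """Return (top, left, bottom, right) of the ink pixels, inclusive."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:
        raise ValueError("image contains no ink pixels")
    return rows[0], cols[0], rows[-1], cols[-1]


def hole_pixels(img):
    """Count background pixels that are not 4-connected to the image border."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the flood fill with every background pixel on the border.
    for r in range(h):
        for c in range(w):
            if (r in (0, h - 1) or c in (0, w - 1)) and img[r][c] == 0:
                seen[r][c] = True
                queue.append((r, c))
    # Spread through the outer background; enclosed holes are never reached.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not seen[nr][nc] and img[nr][nc] == 0:
                seen[nr][nc] = True
                queue.append((nr, nc))
    return sum(1 for r in range(h) for c in range(w)
               if img[r][c] == 0 and not seen[r][c])


def case_features(img, line_height):
    """Size and topology features; line_height is an assumed text-line height in pixels."""
    top, left, bottom, right = bounding_box(img)
    height = bottom - top + 1
    width = right - left + 1
    return {
        "rel_height": height / line_height,      # glyph height relative to the line
        "aspect_ratio": width / height,          # shape of the bounding box
        "rel_hole_area": hole_pixels(img) / line_height ** 2,
    }


def square_ring(size, canvas=12):
    """Toy glyph: a hollow square of the given side, resting on the bottom row."""
    img = [[0] * canvas for _ in range(canvas)]
    top = canvas - size
    for r in range(top, canvas):
        for c in range(size):
            if r in (top, canvas - 1) or c in (0, size - 1):
                img[r][c] = 1
    return img


upper = case_features(square_ring(10), line_height=12)  # stands in for "O"
lower = case_features(square_ring(6), line_height=12)   # stands in for "o"
print(upper["rel_height"] > lower["rel_height"])        # True
print(upper["rel_hole_area"] > lower["rel_hole_area"])  # True
```

The two toy rings have the same aspect ratio, so only the size-relative features separate them. Such features could be appended to the classifier input, or used as a post-processing rule for ambiguous pairs such as c/C and o/O; which option works better would need to be checked on real data.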
For more information on the contour formation, refer to the link below:
Regards,
Ayush

Release

R2022a
