How to make custom OCR recognize decimals

Jacob Ebilane on 29 Apr 2022
Answered: Ayush on 14 Dec 2023
I've got a running program to read handwriting. It's all the basic stuff and I'm using the EMNIST dataset. I encountered a problem when I tried reading a line with numbers and decimals. I don't really know how to make it so my program reads the decimal points without making it think that any dot is a decimal.

Answers (1)

Ayush on 14 Dec 2023
Hi Jacob,
I understand that you want to detect decimal points when reading a line containing numbers and decimals, without treating every dot as a decimal point.
You can follow the steps below:
  1. Segment the digits in the line. Segmenting the individual digits in the line will help you do contextual analysis.
  2. Identify the position of digits and their relationship to the nearby dots. For instance, if a dot is located between two digits, it is more likely to represent a decimal point.
  3. Find patterns that suggest the presence of a decimal number. For instance, search for digit sequences that are followed by a dot and then additional digits, as this is a typical way to identify a decimal number.
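Steps 1 and 2 can also be approached on the image side, before any character is classified. This matters because, as far as I know, EMNIST contains only digits and letters, so a classifier trained on it has no class for a dot. The following is a minimal sketch, not a definitive method: it assumes you already have one bounding box per connected component in the `[x y width height]` format that `regionprops` returns for `'BoundingBox'`. The sample boxes, the 0.35 size threshold, and the "lower half of the left neighbour" rule are all illustrative assumptions that would need tuning on real handwriting.

```matlab
% Hypothetical bounding boxes [x y width height], one row per component.
% Image coordinates: y grows downward.
boxes = [ 10 20 18 30;   % full-size glyph
          32 44  5  5;   % small blob near the baseline
          41 20 18 30;   % full-size glyph
          63 20 18 30;   % full-size glyph
          85 22  5  5];  % small blob at the end of the line, near the top

boxes = sortrows(boxes, 1);            % order components left to right
refHeight = median(boxes(:, 4));       % typical glyph height in this line
sizeLimit = 0.35 * refHeight;          % assumed threshold for "dot-sized"
isSmall = (boxes(:, 3) < sizeLimit) & (boxes(:, 4) < sizeLimit);

isDecimalPoint = false(size(boxes, 1), 1);
for k = 2:size(boxes, 1) - 1
    % A candidate must be small and sit between two full-size glyphs.
    if ~isSmall(k) || isSmall(k-1) || isSmall(k+1)
        continue
    end
    left = boxes(k-1, :);
    dotCentreY = boxes(k, 2) + boxes(k, 4) / 2;
    leftMidline = left(2) + left(4) / 2;
    % Keep the candidate only if it lies in the lower half of its neighbour.
    isDecimalPoint(k) = dotCentreY > leftMidline;
end

disp(find(isDecimalPoint).')           % index of the component kept as a decimal point
```

The sketch only checks that the neighbours are full-size glyphs. In a real pipeline you would also confirm that the classifier labels both neighbours as digits before accepting the dot.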
Refer to the following pseudocode for a better understanding:
% Assumes the input line has already been segmented and recognized as individual characters.
% See the MATLAB documentation on image segmentation for ways to do that.
inputLine = 'Total 3.14 kg. Next 20.5'; % stand-in for your recognized line
% num2cell is only a placeholder that splits the text into one character per cell;
% replace it with the segmentation/recognition function written for your use case.
characters = num2cell(inputLine);
isDecimalPoint = false(1, numel(characters));
for i = 1:numel(characters)
    currentChar = characters{i};
    % Perform contextual analysis to distinguish decimal points from regular dots
    if strcmp(currentChar, '.') % Check if the current character is a dot
        % A dot needs a neighbour on both sides to be a decimal point
        if i > 1 && i < numel(characters)
            previousChar = characters{i-1};
            nextChar = characters{i+1};
            if isstrprop(previousChar, 'digit') && isstrprop(nextChar, 'digit')
                % The dot is likely a decimal point
                % Perform appropriate processing for decimal numbers
                isDecimalPoint(i) = true;
            end
        end
    else
        % Perform processing for characters that are not dots
    end
end
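Once the line is available as recognized text, the same neighbour check (step 3) can be written more compactly with a regular expression. This is a sketch under the assumption that your recognizer already emits a `.` character for every dot; the sample string is made up for illustration.

```matlab
recognized = 'Total 3.14 kg. Next 20.5';   % hypothetical recognized line

% (?<=\d) and (?=\d) are lookarounds: the dot must have a digit on each side.
% With no extra option, regexp returns the start index of each match.
dotIdx = regexp(recognized, '(?<=\d)\.(?=\d)');

% Whole decimal numbers: one or more digits, a dot, one or more digits.
numbers = regexp(recognized, '\d+\.\d+', 'match');
values = str2double(numbers);              % convert the matched text to doubles

disp(dotIdx)    % positions of dots treated as decimal points: 8 and 23
disp(values)    % 3.14 and 20.5
```

The sentence-ending dot after `kg` is skipped because it is not followed by a digit.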
For more information on image segmentation techniques and the "ocr" function, refer to the MathWorks documentation.
Regards,
Ayush

Release

R2021a
