Spliing text between characters
2 vues (au cours des 30 derniers jours)
Afficher commentaires plus anciens
Hello everyone, I have a very long list of questions with their answers. but their pattern is the same. I took a photo ocr. This is the pattern question number, question, three dots and and answer. I have hundrerds of question and answer like that. I want to separete it into question and answer for csv. is it possible to do it?
I used extractbetween for question.
extractbetween(text,'.','...')
"640. En selektif etkili antibiyotik...Penisilinler"
but with very long list of text I am stucked with both question and the answer .
can someone help me or tell is it possible.
0 commentaires
Réponses (2)
Mathieu NOE
le 1 Fév 2021
hello Ongun
I created a small txt file containing these lines (as example) :
"640. En selektif etkili antibiyotik...Penisilinler"
"641. En selektif etkili antibiyotik...Penisilinler1"
"642. En selektif etkili antibiyotik...Penisilinler2"
"643. En selektif etkili antibiyotik...Penisilinler3"
"644. En selektif etkili antibiyotik...Penisilinler4"
then I tested this code that generated the attached xlsx file
lines = readlines('Document1.txt');
for ci =1:numel(lines)
Qnumber_str = extractBefore(lines(ci),'.'); % extract question number (with double quote)
Qnumber_str = strrep(Qnumber_str, '"', ''); % remove start double quote
Qnumber{ci} = str2num(Qnumber_str); % convert to num
Question{ci} = extractBetween(lines(ci),'.','...'); % extract question
Answer_str = extractAfter(lines(ci),'...'); % extract answer
Answer_str = strrep(Answer_str, '"', ''); % remove end double quote
Answer{ci} = Answer_str;
end
A = [Qnumber' Question' Answer'];
T = array2table(A, 'VariableNames',{'Q number','Question','Answer'})
writetable(T,'test.xlsx')
0 commentaires
Walter Roberson
le 1 Fév 2021
S = fileread('Document1.txt');
tokens = regexp(S, '^(?<Q>.+)\.{3}(?<A>).+)$', 'lineanchors', 'names', 'dotexceptnewline');
Questions = vertcat(tokens.Q);
Answers = vertcat(tokens.A);
T = table(Questions, Answers);
writetable(T, 'QandA.csv')
0 commentaires
Voir également
Catégories
En savoir plus sur Text Data Preparation dans Help Center et File Exchange
Produits
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!