Textscan with '@' as delimiter

AMM on 7 May 2020
I'm working with an inherited script that calls TEXTSCAN as follows:
allData = textscan(fid,'%s','Delimiter','@');
What does the at-sign delimiter parameter do, and is this documented anywhere?
I don't see anything in the TEXTSCAN help for this, but when I parse the same text file with and without that parameter specified, I get different results. The input file contains no explicit at-sign characters anywhere. Is TEXTSCAN treating the @ as some special control character?
Walter Roberson on 9 May 2020
Please attach your data file, and also the code you use to reproduce the problem.
In my tests, there is nothing special about using @. The effect I get when I use any character not found in the file is exactly the same as if I use
textscan(fid, '%s', 'Delimiter', '\n', 'MultipleDelimsAsOne', true)
or
textscan(fid, '%s', 'whitespace', '\n')
and the effect is:
  • each time the %s fires, skip all leading spaces and newlines
  • once the %s starts reading something non-blank, continue until the first newline
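The equivalence described above can be checked on a small in-memory sample. This is only a sketch: the sample text is made up (it is not the attached file), it deliberately avoids leading spaces and blank lines, and the variable names are illustrative.

```matlab
% Hypothetical three-line sample that contains no '@' characters.
txt = sprintf('alpha beta\ngamma delta\nepsilon');

% A delimiter that never occurs in the text.
viaAt = textscan(txt, '%s', 'Delimiter', '@');
viaAt = viaAt{1};

% Newline as the delimiter, with repeated delimiters merged.
viaNewline = textscan(txt, '%s', 'Delimiter', '\n', 'MultipleDelimsAsOne', true);
viaNewline = viaNewline{1};

% Newline as the only white-space character.
viaWhitespace = textscan(txt, '%s', 'Whitespace', '\n');
viaWhitespace = viaWhitespace{1};

% Report how many cells each call produced and whether the contents agree.
fprintf('cells: %d / %d / %d\n', numel(viaAt), numel(viaNewline), numel(viaWhitespace));
fprintf('''@'' matches newline delimiter: %d\n', isequal(viaAt, viaNewline));
fprintf('''@'' matches newline whitespace: %d\n', isequal(viaAt, viaWhitespace));
```

Passing the text as a character vector instead of a file identifier keeps the check self-contained; results on a real file with leading blanks or empty lines may differ between the three calls.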
AMM on 12 May 2020
Edited: AMM on 12 May 2020
Hi Walter,
Here you go. Here is what I'm seeing with the attached file:
>> fid=fopen('textscan_test.txt','rt');
>> out1=textscan(fid,'%s'); out1=out1{1}; frewind(fid);
>> out2=textscan(fid,'%s','Delimiter','@'); out2=out2{1}; frewind(fid);
>> out3=textscan(fid,'%s','whitespace','\n'); out3=out3{1}; fclose(fid);
>> whos
  Name         Size      Bytes  Class   Attributes
  ans          1x1           8  double
  fid          1x1           8  double
  out1      2700x1      351730  cell
  out2       538x1      134220  cell
  out3       538x1      134220  cell
As you can see, the attached file contains no at-signs.
Indeed, what seems to be happening is exactly what you describe: if textscan is given a delimiter that doesn't occur in the input, it falls back to the line-by-line behavior you describe above.


Accepted Answer

per isakson on 13 May 2020
I reproduced your result on R2018b. I think it is consistent with the textscan documentation.
  • out1 is a cell array of character vectors with one whitespace-separated item per cell
  • out2 is a cell array of character vectors with one data row (one line of the file) per cell
Case 1. No delimiter is specified, so runs of one or more spaces act as the delimiter. That is the default, and it applies regardless of the value of 'MultipleDelimsAsOne'. The documentation says: "If you do not specify a delimiter, then the delimiter characters are the same as the white-space characters."
Case 2. '@' is the delimiter. Since no '@' occurs in the file, '%s' matches the entire row. (I could not find a sentence in the documentation to quote for this; the row-oriented behavior seems to be taken for granted.)
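The two cases can be contrasted on a small made-up sample. This is a sketch under the assumption that the text has no leading blanks or empty lines; the sample and the variable names are hypothetical and are not taken from the attached file.

```matlab
% Hypothetical sample: three rows, five whitespace-separated items, no '@'.
txt = sprintf('alpha beta\ngamma delta\nepsilon');

% Case 1: no delimiter given, so white space separates the fields.
byItem = textscan(txt, '%s');
byItem = byItem{1};

% Case 2: '@' as delimiter; it never occurs, so each field runs to the end of its row.
byRow = textscan(txt, '%s', 'Delimiter', '@');
byRow = byRow{1};

% Show the number of cells and the contents produced by each call.
fprintf('Case 1 produced %d cells, Case 2 produced %d cells\n', numel(byItem), numel(byRow));
disp(byItem);
disp(byRow);
```

If the intent of an inherited call like this is simply to read whole lines, spelling the delimiter as '\n' may communicate that more clearly than a character that happens not to appear in the data.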

Release: R2020a
