readtable of csv file with opts.DataLines =[n1 n2] and n1>2 doesn't work as expected

Hello,
I'm trying to read a CSV file in blocks. According to the documentation, this should work:
dir_load='some_dir';
file='some_file';
filename=fullfile(dir_load,file);
opts = detectImportOptions(filename);
opts.DataLines = [1 10];
T1=readtable(filename,opts);
opts.DataLines = [11 20];
T2=readtable(filename,opts);
opts.DataLines = [1 20];
T=readtable(filename,opts);
So T should equal [T1; T2], but what I get is that T1 contains lines 1 to 10 while T2 contains lines 6 to 15. What am I doing wrong? You can find the file here.
T =
20x4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 514 0.104
21.57 22.65 502 0.106
21.57 22.65 498 0.114
21.57 22.65 491 0.121
21.57 22.65 486 0.118
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
21.63 22.65 491 0.13
21.63 22.65 489 0.127
21.63 22.65 493 0.134
21.63 22.65 497 0.135
21.63 22.65 496 0.131
21.63 22.63 502 0.135
21.63 22.63 503 0.139
21.63 22.63 503 0.134
21.63 22.63 508 0.136
21.63 22.63 505 0.142
T1 =
10x4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 514 0.104
21.57 22.65 502 0.106
21.57 22.65 498 0.114
21.57 22.65 491 0.121
21.57 22.65 486 0.118
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
T2 =
10x4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
21.63 22.65 491 0.13
21.63 22.65 489 0.127
21.63 22.65 493 0.134
21.63 22.65 497 0.135
21.63 22.65 496 0.131
edited by Guillaume to attach the file to the question. Please don't use external file sharing sites

Accepted Answer

Guillaume on 25 Jun 2019
If you look at the actual content of the file, you see that it has a blank line between each line of data. Although blank lines are ignored by default during reading, they still count for the purpose of line counting, so it's normal that line 11 is only the 6th line of data (because of the 5 blank lines ignored).
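To make the line counting concrete, here is a minimal sketch. It builds a small demo file with a blank line after every data line, then lists which physical lines actually hold data. The file name `datalines_demo.txt` and its contents are made up for illustration; this is not the file from the question.

```matlab
% Hypothetical demo file: 10 data lines, each followed by a blank line.
fname = fullfile(tempdir, 'datalines_demo.txt');
fid = fopen(fname, 'w');
for k = 1:10
    fprintf(fid, '%d,%d\n\n', k, 10*k);
end
fclose(fid);

% Split the raw text into physical lines and flag the non-blank ones.
txt = fileread(fname);
lines = regexp(txt, '\r?\n', 'split');
isData = ~cellfun(@isempty, strtrim(lines));
dataLineNumbers = find(isData);   % physical line number of each data row

% Which data row sits on physical line 11, and how many data rows
% fall inside each physical block of 10 lines?
rowOnLine11  = find(dataLineNumbers == 11);
rowsIn1to10  = nnz(dataLineNumbers >= 1  & dataLineNumbers <= 10);
rowsIn11to20 = nnz(dataLineNumbers >= 11 & dataLineNumbers <= 20);
fprintf('Line 11 holds data row %d; lines 1-10 hold %d rows, lines 11-20 hold %d rows.\n', ...
    rowOnLine11, rowsIn1to10, rowsIn11to20);
```

In this layout the data sit on the odd-numbered lines, so each block of 10 physical lines holds only 5 data rows, and physical line 11 is the 6th data row.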
Now, there is indeed a bug with the end point of DataLines. For me (R2019a), I get
>> opts.DataLines = [1 10];
>> readtable('ex.txt', opts)
ans =
5×4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 514 0.104
21.57 22.65 502 0.106
21.57 22.65 498 0.114
21.57 22.65 491 0.121
21.57 22.65 486 0.118
>> opts.DataLines = [11 20];
>> readtable('ex.txt', opts)
ans =
10×4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
21.63 22.65 491 0.13
21.63 22.65 489 0.127
21.63 22.65 493 0.134
21.63 22.65 497 0.135
21.63 22.65 496 0.131
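The result with DataLines = [1 10] is what I expected: 5 rows, since lines 1 to 10 of the file hold 5 lines of data and 5 blank lines. The result with DataLines = [11 20] has too many rows; it should also contain only 5.
I will investigate a bit more, then report it to MathWorks.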
2 Comments
kira on 25 Jun 2019
So odd, I thought I had created the file without blank lines. Without blank lines it works fine and I don't see too many rows, but I'm on R2018b...
Guillaume on 25 Jun 2019
Yes, the problem only shows if there are blank lines (or any skipped lines under the EmptyLineRule of the importoptions).
I've reported the bug.
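Until the bug is fixed, one possible workaround is to strip the blank lines before importing, since the problem reportedly does not show up without them. This is a minimal sketch that assumes the file fits in memory. The file names and demo data are hypothetical, and the behaviour has not been checked on every release.

```matlab
% Hypothetical demo input: 20 data lines, each followed by a blank line.
src = fullfile(tempdir, 'with_blanks.txt');
dst = fullfile(tempdir, 'no_blanks.txt');
fid = fopen(src, 'w');
for k = 1:20
    fprintf(fid, '%d,%d,%d\n\n', k, 10*k, 100*k);
end
fclose(fid);

% Copy the file, dropping blank lines.
lines = regexp(fileread(src), '\r?\n', 'split');
lines = lines(~cellfun(@isempty, strtrim(lines)));
fid = fopen(dst, 'w');
fprintf(fid, '%s\n', lines{:});
fclose(fid);

% With no blank lines left, physical line numbers equal data row numbers,
% so DataLines can be used to read the file block by block.
opts = detectImportOptions(dst);
opts.DataLines = [1 10];
T1 = readtable(dst, opts);
opts.DataLines = [11 20];
T2 = readtable(dst, opts);
```

If the file is small enough, reading it once with readtable and splitting the table by row index (for example T(1:10,:) and T(11:20,:)) avoids DataLines altogether.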
