MATLAB Answers

Problems in reading large matrix with large empty cells

Hello:
I am having trouble reading a large matrix with a header, where each column contains both values and missing data (empty cells). A few days ago I asked about a similar but simpler matrix, which could be read with fixedWidthImportOptions: https://de.mathworks.com/matlabcentral/answers/451154-problems-with-empty-cell-in-a-large-matrix?s_tid=srchtitle
In this case the matrix has more columns and a more complex layout. Could anyone please suggest how to read it, and in particular how to set 'VariableWidths'? I am attaching a sample matrix for your reference.
  3 Comments
Poulomi Ganguli on 7 May 2021
I could read up to column 19 using the following code, but I am not able to read the values after that.
opts = fixedWidthImportOptions('NumVariables',36,'DataLines',151,...
'VariableNames',["INDEX" "YEAR" "MN" "HR" "DT" "SLP" "MSLP" "DBT" "WBT" "DPT" "RH" "VP" "DD" "FFF" "AW" "VV" "Cl" "A" "Cm" "A" "Ch" "A" "Dl" "Dm" "Dh" "TC" "h" "c" "a" "Ht" "R/F" "EVP" "DW" "P" "H" "WAT"],...
'VariableWidths',[5 5 3 4 3 6 7 6 6 6 4 5 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 5 3 3 3 3 3],...
'VariableTypes',repmat("double",1,36));
T = readtable(fullfile(Data_path,'tab3.txt'),opts);


Accepted Answer

per isakson on 7 May 2021
Edited: per isakson on 7 May 2021
This script reads your sample file
%%
opts = fixedWidthImportOptions('NumVariables',36,'DataLines',4,...
'VariableNames',{'INDEX','YEAR','MN','HR','DT','SLP','MSLP','DBT','WBT','DPT','RH','VP','DD','FFF','AW','VV','Cl','A','Cm','A','Ch','A','Dl','Dm','Dh','TC','h','c','a','Ht','R_F','EVP','DW','P','H','WAT'},...
'VariableWidths',[5,5,3,3,3,7,7,6,6,6,4,5,3,4,3,3,2,3,2,3,2,3,2,3,3,3,3,2,2,3,6,5,3,2,2,5],...
'VariableTypes',repmat("double",1,36));
T = readtable('Test_data.txt', opts );
T
T = 5×36 table
INDEX  YEAR  MN  HR  DT    SLP    MSLP   DBT   WBT   DPT  RH    VP  DD  FFF  AW  VV   Cl    A   Cm  A_1   Ch  A_2   Dl   Dm   Dh   TC    h    c    a   Ht   R_F  EVP   DW    P    H  WAT
_____  ____  __  __  __  _____  ______  ____  ____  ____  __  ____  __  ___  __  __  ___  ___  ___  ___  ___  ___  ___  ___  ___  ___  ___  ___  ___  ___  ____  ___  ___  ___  ___  ___
42182  1970   1   3  20  987.1  1012.5  15.6    15  14.6  93  16.6  14    4  12  96    9    5    2    2    0    0    9    9    0    7    5    9    1   30  10.5  1.5  NaN  NaN  NaN  NaN
42182  1970   1   3  21  988.8  1014.6    12  11.7  11.4  97  13.5   0    0   6  91  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN   0.5    0  NaN  NaN  NaN  NaN
42182  1970   1   3  22  990.2  1016.3     9   8.8   8.6  97  11.2  27    4   5  90  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN     0    0  NaN  NaN  NaN  NaN
42182  1970   1   3  23  990.7  1016.7   9.8   9.4     9  95  11.5   0    0   2  93    0    0    3    2    1    3    0    9    9    5    9    3    2   60     0  0.5  NaN  NaN  NaN  NaN
42182  1970   1   3  24  989.6  1015.7     9   8.6   8.2  94  10.9  14    4   0  93    0    0    0    0    0    0    0    0    0    0    9    0    0   99     0  1.8  NaN  NaN  NaN  NaN
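The NaN entries in the table come from blank fields in the fixed-width file. A minimal sketch of that behaviour on a toy file follows; the file contents, the variable names A, B, C and the widths [3 5 3] are made up for illustration and are not taken from the sample data.

```matlab
% Write a small fixed-width file (hypothetical data) to a temporary location.
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '  A    B  C\n');     % header line, skipped via DataLines
fprintf(fid, '  1 10.5  7\n');     % all three fields filled
fprintf(fid, '  2        8\n');    % middle field left blank
fclose(fid);

% Same option pattern as in the answer, reduced to three columns.
opts = fixedWidthImportOptions('NumVariables',3,'DataLines',2,...
    'VariableNames',{'A','B','C'},...
    'VariableWidths',[3,5,3],...
    'VariableTypes',repmat("double",1,3));
T = readtable(fname, opts);
disp(T)

delete(fname);                     % clean up the temporary file
```

If the widths are right, the blank field in the second data row should come back as NaN in column B while the neighbouring columns stay intact; a wrong width typically shifts digits into the adjacent column instead.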
Determine the column widths from the header line (the second line of the file). Note that the widths of the columns " Cl A" differ between the header and the data. The widths of the first and last columns must be determined visually and added by hand.
%%
hdr = 'INDEX YEAR MN HR DT ...SLP ..MSLP ..DBT ..WBT ..DPT .RH ..VP DD FFF AW VV Cl A Cm A Ch A Dl Dm Dh TC h c a Ht ..R/F .EVP DW P H .WAT';
pos = strfind( hdr, ' ' );
dpos = diff(pos)
dpos = 1×34
5 3 3 3 7 7 6 6 6 4 5 3 4 3 3 3 2 3 2 3 2 3 3 3 3 2 2 2 3 6 5 3 2 2
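The header-based widths can be compared with the widths that were actually passed to fixedWidthImportOptions. The sketch below assumes that the header string starts in column 1 and has no trailing blanks, so that the first and last widths can be taken from the ends of the string rather than read off by eye. The names hdrWidths, usedWidths and mismatch are introduced here for illustration only.

```matlab
% Header line as given in the answer; dots stand in for padding blanks.
hdr = 'INDEX YEAR MN HR DT ...SLP ..MSLP ..DBT ..WBT ..DPT .RH ..VP DD FFF AW VV Cl A Cm A Ch A Dl Dm Dh TC h c a Ht ..R/F .EVP DW P H .WAT';
pos = strfind(hdr, ' ');            % every blank marks the start of a new field

% Assumption: header begins in column 1 and ends with the last field,
% so padding pos with 1 and numel(hdr)+1 yields all 36 widths.
hdrWidths = diff([1, pos, numel(hdr) + 1]);

% Widths used in the script of the answer (corrected against the data).
usedWidths = [5,5,3,3,3,7,7,6,6,6,4,5,3,4,3,3,2,3,2,3,2,3,2,3,3,3,3,2,2,3,6,5,3,2,2,5];

% Columns where header and data layouts disagree.
mismatch = find(hdrWidths ~= usedWidths);
disp(mismatch)
```

Both vectors add up to the same line length; they disagree only inside the group from Cl to h (columns 17 to 27), which is where the header and the data are aligned differently and the widths have to be checked against a data line.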
A ruler like this is useful, especially if the header line is not as good as this one.
%%
chr = '42182 1970 01 03 20 0987.1 1012.5 15.6 15.0 14.6 093 16.6 14 004 12 96 9 5 2 2 0 0 9 9 0 7 5 9 1 30 010.5 01.5';
db_ruler( chr )
         1         2         3         4         5         6         7         8         9         0         1         2
123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
42182 1970 01 03 20 0987.1 1012.5 15.6 15.0 14.6 093 16.6 14 004 12 96 9 5 2 2 0 0 9 9 0 7 5 9 1 30 010.5 01.5
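db_ruler is not a built-in MATLAB function; it appears to be the answerer's own helper, and its source is not shown in this thread. A minimal stand-in that prints a similar ruler above a line of text could look like the sketch below (the variable names are assumptions, not the original implementation).

```matlab
% One data line, as quoted in the answer.
chr = '42182 1970 01 03 20 0987.1 1012.5 15.6 15.0 14.6 093 16.6 14 004 12 96 9 5 2 2 0 0 9 9 0 7 5 9 1 30 010.5 01.5';

% Round the ruler length up to the next multiple of 10.
n = 10 * ceil(numel(chr) / 10);

% Tens line: blanks, with the tens digit at positions 10, 20, 30, ...
tens = repmat(' ', 1, n);
tens(10:10:n) = char('0' + mod(1:n/10, 10));

% Units line: 1234567890 repeated.
units = repmat('1234567890', 1, n/10);

% Print the ruler and the text so that columns can be counted by eye.
fprintf('%s\n%s\n%s\n', tens, units, chr);
```

For real use, chr would be read directly from the file (for example with fgetl) so that the original blanks are preserved; the character position at which each field ends then gives the cumulative sum of 'VariableWidths'.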
