Effacer les filtres
Effacer les filtres

How to figure out the number of rows for headers?

13 vues (au cours des 30 derniers jours)
Leon
Leon le 30 Mar 2020
Commenté : Leon le 2 Avr 2020
I have Excel files with unknown number of headers. For exmple, some files only have one row of header info. Another one could have 12 header lines.
I would need to know the accurate count of the number of rows of headerlines, so that my program would know the Excel Row# for a cetain data value. Right now, it only knows the relative Row # of the numerical portion of the Excel file.
What makes things more complicated is that my data also routinely contain columns that are made of text-strings.
Right now, I use readtable to read the Excel file. It basically assume the first Row is the headerline, and the rest of them are data. Here is my question. Is there a straightforward way to identify the number of headerlines in my case?
Thanks!
  1 commentaire
Leon
Leon le 2 Avr 2020
The below code works for flat csv or txt files. It does a good job of reading all of the headers. Is there an easy way to make it work for xlsx files as well?
header_end = false;
header_count = 0;
% preallocate 50 rows of header text
header = cell(50, 1);
%% Loop through header lines
while ~header_end
%% Get a whole line from the file
aline = fgetl(FID);
%% Use a regular expression to look for the data block header
tokens = regexp(aline, ...
['(\d+)' ... % capture a number
'\s+!\s+' ... % followed by an !
'(\w+\s+)+'],... % and capture a series of strings
'tokens', 'warnings');
%% Parse data block header if found
if ~isempty(tokens) && numel(tokens{:}) == 2
% parse numeric token using string to double
num_datum = str2double(tokens{1}(1));
% parse variable string
varNameStr = deblank(tokens{1}(2));
names = strsplit(varNameStr{:});
%names = names{1};
% stop looping
header_end = true;
else
% Increment header line count and add new line
header_count = header_count+1;
header{header_count} = aline;
end
end

Connectez-vous pour commenter.

Réponses (2)

Fangjun Jiang
Fangjun Jiang le 30 Mar 2020
If you use [Num, Txt, Raw]=xlsread(), can you figure it out based on the numerical, text and raw data?
  1 commentaire
Leon
Leon le 30 Mar 2020
Thanks. Unfortunately, I can't.

Connectez-vous pour commenter.


Josh Zagorski
Josh Zagorski le 30 Mar 2020
[A DELIM NHEADERLINES] = importdata(...) returns the detected number of header
lines in the input ASCII file.
So, [Matrix_converted,'delimiter',No_headerlines = importdata('file.xlsx');
  5 commentaires
Josh Zagorski
Josh Zagorski le 31 Mar 2020
Do you have the correct filepath?
Otherwise, I glossed over your "text-strings" data comment - wondering if data needs to be double/numeric "data contains a double array."
Leon
Leon le 2 Avr 2020
Yes, I do.
Just tried a different file, below is what I get:
>> [A,B,C]= importdata('G01.xlsx')
A =
struct with fields:
data: [1×1 struct]
textdata: [1×1 struct]
colheaders: [1×1 struct]
B =
NaN
C =
0

Connectez-vous pour commenter.

Produits


Version

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by